Re: Debugging Ethernet issues
Florian Fainelliwrites: > On 11/14/2016 11:00 AM, Måns Rullgård wrote: >> Florian Fainelli writes: >> >>> On 11/14/2016 10:20 AM, Florian Fainelli wrote: On 11/14/2016 09:59 AM, Sebastian Frias wrote: > On 11/14/2016 06:32 PM, Florian Fainelli wrote: >> On 11/14/2016 07:33 AM, Mason wrote: >>> On 14/11/2016 15:58, Mason wrote: >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx vs nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off I'm not sure whether "flow control" is relevant... >>> >>> Based on phy_print_status() >>> phydev->pause ? "rx/tx" : "off" >>> I added the following patch. >>> >>> diff --git a/drivers/net/ethernet/aurora/nb8800.c >>> b/drivers/net/ethernet/aurora/nb8800.c >>> index defc22a15f67..4e758c1cfa4e 100644 >>> --- a/drivers/net/ethernet/aurora/nb8800.c >>> +++ b/drivers/net/ethernet/aurora/nb8800.c >>> @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct >>> net_device *dev) >>> struct phy_device *phydev = priv->phydev; >>> int change = 0; >>> >>> + printk("%s from %pf\n", __func__, __builtin_return_address(0)); >>> + >>> if (phydev->link) { >>> if (phydev->speed != priv->speed) { >>> priv->speed = phydev->speed; >>> @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) >>> nb8800_writeb(priv, NB8800_PQ2, val & 0xff); >>> >>> /* Auto-negotiate by default */ >>> - priv->pause_aneg = true; >>> - priv->pause_rx = true; >>> - priv->pause_tx = true; >>> + priv->pause_aneg = false; >>> + priv->pause_rx = false; >>> + priv->pause_tx = false; >>> >>> nb8800_mc_init(dev, 0); >>> >>> >> >> [...] >> >> And the time difference is clearly accounted for auto-negotiation time >> here, as you can see it takes about 3 seconds for Gigabit Ethernet to >> auto-negotiate and that seems completely acceptable and normal to me >> since it is a more involved process than lower speeds. >> >>> >>> >>> OK, so now it works (by accident?) even on 100 Mbps switch, but it still >>> prints "flow control rx/tx"... >> >> Because your link partner advertises flow control, and that's what >> phydev->pause and phydev->asym_pause report (I know it's confusing, but >> that's what it is at the moment). > > Thanks. > Could you confirm that Mason's patch is correct and/or that it does not > has negative side-effects? The patch is not correct nor incorrect per-se, it changes the default policy of having pause frames advertised by default to not having them advertised by default. >> >> I was advised to advertise flow control by default back when I was >> working on the driver, and I think it makes sense to do so. >> This influences both your Ethernet MAC and the link partner in that the result is either flow control is enabled (before) or it is not (with the patch). There must be something amiss if you see packet loss or some kind of problem like that with an early exchange such as DHCP. Flow control tend to kick in under higher packet rates (at least, that's what you expect). > > Right now we know that Mason's patch makes this work, but we do not > understand why nor its implications. You need to understand why, right now, the way this problem is presented, you came up with a workaround, not with the root cause or the solution. What does your link partner (switch?) reports, that is, what is the ethtool output when you have a link up from your nb8800 adapter? >>> >>> Actually, nb8800_pause_config() seems to be doing a complete MAC/DMA >>> reconfiguration when pause frames get auto-negotiated while the link is >>> UP, >> >> This is due to a silly hardware limitation. The register containing the >> flow control bits can't be written while rx is enabled. > > You do a DMA stop, but you don't disable the MAC receiver unlike what > nb8800_stop() does, why is not calling nb8800_mac_rx() necessary here? Oh, right. That's because the RXC_CR register (where the flow control bits are) can't be modified when the RCR_EN bit (rx dma enable) is set. The MAC core register controlled by nb8800_mac_rx() doesn't matter here. There is no way of changing the flow control setting without briefly stopping rx dma. None of this should be relevant here though since everything should be all set up before dma is enabled the first time. >>> and it does not differentiate being called from >>> ethtool::set_pauseparam or the PHYLIB adjust_link callback (which it >>> probably should), >> >> Differentiate how? > >
Re: Debugging Ethernet issues
Hi Florian, On 11/14/2016 10:00 PM, Florian Fainelli wrote: > On 11/14/2016 12:27 PM, Mason wrote: >> On 14/11/2016 19:20, Florian Fainelli wrote: >> >>> On 11/14/2016 09:59 AM, Sebastian Frias wrote: >>> Could you confirm that Mason's patch is correct and/or that it does not has negative side-effects? >>> >>> The patch is not correct nor incorrect per-se, it changes the default >>> policy of having pause frames advertised by default to not having them >>> advertised by default. This influences both your Ethernet MAC and the >>> link partner in that the result is either flow control is enabled >>> (before) or it is not (with the patch). There must be something amiss if >>> you see packet loss or some kind of problem like that with an early >>> exchange such as DHCP. Flow control tend to kick in under higher packet >>> rates (at least, that's what you expect). >> >> Did you note that, without the change under discussion (i.e. with >> the eth driver as it is upstream), when the board is connected to >> a 100 Mbps switch, then *nothing* works *systematically (no ping, >> no DHCP; are there other relevant low-level network tools?). > > No I missed that, way too many emails, really. So how about you compare > the register settings that could be (that is, all that could be modified > by the PHYLIB adjust_link function) and try to spot where things could > go wrong? The registers that make this work or fail are modified in 'nb8800_pause_config()' which is called from 'adjust_link', see below. >Any other register that can be influenced by the link speed? It may not be the link-speed per se, but the way these "pause" parameters are interacting. Indeed, on the 1000 Mbps switch we get (1): "Link partner advertised pause frame use: Symmetric Receive-only" And on the 100 Mbps switch we get (2): "Link partner advertised pause frame use: Symmetric" Since the 100 Mbps is advertising "symmetric", the code in 'drivers/net/ethernet/aurora/nb8800.c:nb8800_pause_config()' enables the flow control bit: nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx); If we force this to be disabled, the network works on 100 Mbps. Keep in mind that the same driver (net and phy) work on a different board, could it be just by chance? > > It seems like a possible (yet after re-reading, very unlikely) scenario, > considering that priv->speed, priv->duplex and priv->link are initially > zero-initialized (because nb8800_priv is zero initialized) may not force > a correct link transition and a full MAC reconfiguration in > nb8800_link_reconfigure() where some of the cached values are used. > > NB: you will see most drivers initialize the previous link, Could you suggest another driver to compare this with? It is not easy to know which drivers follow the conventions so that they can be used as examples. >speed, > duplex values to -1, because those are outside of the range of values > that PHYLIB would assign to phydev->{link,duplex,speed}, and therefore, > this is guaranteed to make the adjust_link callback that tries to > minimize these settings to force a transition. > >> >> Also, maybe this comment was lost in my own noise: >> >> If I manually set the link up, then down, then run udhcpc >> => then nothing works, as if something is wedged somewhere >> (a kernel thread gets borked by a race condition?) > > Well then start seriously debugging the problem: firs thing you need to > check is is the RUNNING flag set on the interface (which indicates a > carrier on?) without that, the networking stack won't even send packets. 'ifconfig' reports "eth0: flags=4163" Packets get sent, as the statistics show for example the following values increasing: tx_bytes: 756 tx_frames: 3 However, the statistics show that RX values increase and then stop. Basically, if we do: # ethtool -S eth0 > /tmp/toto # udhcpc (ctrl-C because it won't return because it keeps trying) # ethtool -S eth0 > /tmp/titi # diff /tmp/toto /tmp/titi we can see that the RX values go from 0 to X, but get stuck at X, while the TX values keep increasing: rx_bytes_ok: 887 rx_frames_ok: 8 ... tx_bytes: 1576 tx_frames: 7 Like if the Flow Control is not working properly. However we don't know how to double check it, since the older driver (for kernel 3.4) does not seem to use the feature. > If it is not set, why is not it set? Did nb8800_mac_config() get called > in the first place to configure the MAC wrt. the link settings? > > When you transmit, do transmit counters increase? That would indicate > the TX DMA does its job. When transmission occurs, it is successful or > is it reporting errors? If the PHY supports it, can you access PHY > counters and look for success/error counters changing? Finally, try to > put another golden (working) host and if your switch supports it, > configure port mirroring to look at packets. If the switch does not > support it, then try different link
Re: Debugging Ethernet issues
On 14/11/2016 22:00, Florian Fainelli wrote: > No I missed that, way too many emails, really. Sorry, I was trying to be thorough, but I went overboard. I'll start a new thread tomorrow, with a smaller CC list (only those who have participated so far) and I'll try to remain concise. Regards.
Re: Debugging Ethernet issues
On 11/14/2016 12:27 PM, Mason wrote: > On 14/11/2016 19:20, Florian Fainelli wrote: > >> On 11/14/2016 09:59 AM, Sebastian Frias wrote: >> >>> Could you confirm that Mason's patch is correct and/or that it does not >>> has negative side-effects? >> >> The patch is not correct nor incorrect per-se, it changes the default >> policy of having pause frames advertised by default to not having them >> advertised by default. This influences both your Ethernet MAC and the >> link partner in that the result is either flow control is enabled >> (before) or it is not (with the patch). There must be something amiss if >> you see packet loss or some kind of problem like that with an early >> exchange such as DHCP. Flow control tend to kick in under higher packet >> rates (at least, that's what you expect). > > Did you note that, without the change under discussion (i.e. with > the eth driver as it is upstream), when the board is connected to > a 100 Mbps switch, then *nothing* works *systematically (no ping, > no DHCP; are there other relevant low-level network tools?). No I missed that, way too many emails, really. So how about you compare the register settings that could be (that is, all that could be modified by the PHYLIB adjust_link function) and try to spot where things could go wrong? Any other register that can be influenced by the link speed? It seems like a possible (yet after re-reading, very unlikely) scenario, considering that priv->speed, priv->duplex and priv->link are initially zero-initialized (because nb8800_priv is zero initialized) may not force a correct link transition and a full MAC reconfiguration in nb8800_link_reconfigure() where some of the cached values are used. NB: you will see most drivers initialize the previous link, speed, duplex values to -1, because those are outside of the range of values that PHYLIB would assign to phydev->{link,duplex,speed}, and therefore, this is guaranteed to make the adjust_link callback that tries to minimize these settings to force a transition. > > Also, maybe this comment was lost in my own noise: > > If I manually set the link up, then down, then run udhcpc > => then nothing works, as if something is wedged somewhere > (a kernel thread gets borked by a race condition?) Well then start seriously debugging the problem: firs thing you need to check is is the RUNNING flag set on the interface (which indicates a carrier on?) without that, the networking stack won't even send packets. If it is not set, why is not it set? Did nb8800_mac_config() get called in the first place to configure the MAC wrt. the link settings? When you transmit, do transmit counters increase? That would indicate the TX DMA does its job. When transmission occurs, it is successful or is it reporting errors? If the PHY supports it, can you access PHY counters and look for success/error counters changing? Finally, try to put another golden (working) host and if your switch supports it, configure port mirroring to look at packets. If the switch does not support it, then try different link partners. > > Could not advertising pause frames result in making such a > race condition impossible? (I don't really believe in a race, > due to the 100% nature of the problem.) > >>> Right now we know that Mason's patch makes this work, but we do not >>> understand >>> why nor its implications. >> >> You need to understand why, right now, the way this problem is >> presented, you came up with a workaround, not with the root cause or the >> solution. What does your link partner (switch?) reports, that is, what >> is the ethtool output when you have a link up from your nb8800 adapter? > > Isn't that what ethtool -a eth0 prints? No, ethtool -a prints the local pause settings. > How do I get the link partner information? ethtool eth0: # ethtool eth0 Settings for eth0: Supported ports: [ TP MII ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Link partner advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full ^== Link partner advertised pause frame use: Symmetric Link partner advertised auto-negotiation: Yes ^ Speed: 100Mb/s Duplex: Full Port: MII PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: gs Wake-on: d SecureOn password: 00:00:00:00:00:00 Current message level: 0x0007 (7) drv probe link Link detected: yes # > Just
Re: Debugging Ethernet issues
On 14/11/2016 19:20, Florian Fainelli wrote: > On 11/14/2016 09:59 AM, Sebastian Frias wrote: > >> Could you confirm that Mason's patch is correct and/or that it does not >> has negative side-effects? > > The patch is not correct nor incorrect per-se, it changes the default > policy of having pause frames advertised by default to not having them > advertised by default. This influences both your Ethernet MAC and the > link partner in that the result is either flow control is enabled > (before) or it is not (with the patch). There must be something amiss if > you see packet loss or some kind of problem like that with an early > exchange such as DHCP. Flow control tend to kick in under higher packet > rates (at least, that's what you expect). Did you note that, without the change under discussion (i.e. with the eth driver as it is upstream), when the board is connected to a 100 Mbps switch, then *nothing* works *systematically (no ping, no DHCP; are there other relevant low-level network tools?). Also, maybe this comment was lost in my own noise: If I manually set the link up, then down, then run udhcpc => then nothing works, as if something is wedged somewhere (a kernel thread gets borked by a race condition?) Could not advertising pause frames result in making such a race condition impossible? (I don't really believe in a race, due to the 100% nature of the problem.) >> Right now we know that Mason's patch makes this work, but we do not >> understand >> why nor its implications. > > You need to understand why, right now, the way this problem is > presented, you came up with a workaround, not with the root cause or the > solution. What does your link partner (switch?) reports, that is, what > is the ethtool output when you have a link up from your nb8800 adapter? Isn't that what ethtool -a eth0 prints? How do I get the link partner information? Just ethtool eth0? Regards.
Re: Debugging Ethernet issues
On 11/14/2016 11:00 AM, Måns Rullgård wrote: > Florian Fainelliwrites: > >> On 11/14/2016 10:20 AM, Florian Fainelli wrote: >>> On 11/14/2016 09:59 AM, Sebastian Frias wrote: On 11/14/2016 06:32 PM, Florian Fainelli wrote: > On 11/14/2016 07:33 AM, Mason wrote: >> On 14/11/2016 15:58, Mason wrote: >> >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control >>> rx/tx >>> vs >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off >>> >>> I'm not sure whether "flow control" is relevant... >> >> Based on phy_print_status() >> phydev->pause ? "rx/tx" : "off" >> I added the following patch. >> >> diff --git a/drivers/net/ethernet/aurora/nb8800.c >> b/drivers/net/ethernet/aurora/nb8800.c >> index defc22a15f67..4e758c1cfa4e 100644 >> --- a/drivers/net/ethernet/aurora/nb8800.c >> +++ b/drivers/net/ethernet/aurora/nb8800.c >> @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct >> net_device *dev) >> struct phy_device *phydev = priv->phydev; >> int change = 0; >> >> + printk("%s from %pf\n", __func__, __builtin_return_address(0)); >> + >> if (phydev->link) { >> if (phydev->speed != priv->speed) { >> priv->speed = phydev->speed; >> @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) >> nb8800_writeb(priv, NB8800_PQ2, val & 0xff); >> >> /* Auto-negotiate by default */ >> - priv->pause_aneg = true; >> - priv->pause_rx = true; >> - priv->pause_tx = true; >> + priv->pause_aneg = false; >> + priv->pause_rx = false; >> + priv->pause_tx = false; >> >> nb8800_mc_init(dev, 0); >> >> > > [...] > > And the time difference is clearly accounted for auto-negotiation time > here, as you can see it takes about 3 seconds for Gigabit Ethernet to > auto-negotiate and that seems completely acceptable and normal to me > since it is a more involved process than lower speeds. > >> >> >> OK, so now it works (by accident?) even on 100 Mbps switch, but it still >> prints "flow control rx/tx"... > > Because your link partner advertises flow control, and that's what > phydev->pause and phydev->asym_pause report (I know it's confusing, but > that's what it is at the moment). Thanks. Could you confirm that Mason's patch is correct and/or that it does not has negative side-effects? >>> >>> The patch is not correct nor incorrect per-se, it changes the default >>> policy of having pause frames advertised by default to not having them >>> advertised by default. > > I was advised to advertise flow control by default back when I was > working on the driver, and I think it makes sense to do so. > >>> This influences both your Ethernet MAC and the link partner in that >>> the result is either flow control is enabled (before) or it is not >>> (with the patch). There must be something amiss if you see packet >>> loss or some kind of problem like that with an early exchange such as >>> DHCP. Flow control tend to kick in under higher packet rates (at >>> least, that's what you expect). >>> Right now we know that Mason's patch makes this work, but we do not understand why nor its implications. >>> >>> You need to understand why, right now, the way this problem is >>> presented, you came up with a workaround, not with the root cause or the >>> solution. What does your link partner (switch?) reports, that is, what >>> is the ethtool output when you have a link up from your nb8800 adapter? >> >> Actually, nb8800_pause_config() seems to be doing a complete MAC/DMA >> reconfiguration when pause frames get auto-negotiated while the link is >> UP, > > This is due to a silly hardware limitation. The register containing the > flow control bits can't be written while rx is enabled. You do a DMA stop, but you don't disable the MAC receiver unlike what nb8800_stop() does, why is not calling nb8800_mac_rx() necessary here? > >> and it does not differentiate being called from >> ethtool::set_pauseparam or the PHYLIB adjust_link callback (which it >> probably should), > > Differentiate how? Differentiate in that when you are called from adjust_link, why bother checking with netif_running() since you are only configuring the pause settings when phydev->link is set. Not that this matters much, but that's something the caller can tell you. > >> wondering if there is a not a remote chance you can get the reply to >> arrive right when you just got signaled a link UP? > > If you're attempting to send or receive things before you get the link > up notification, you shouldn't expect anything to work reliably. No kidding. -- Florian
Re: Debugging Ethernet issues
Florian Fainelliwrites: > On 11/14/2016 10:20 AM, Florian Fainelli wrote: >> On 11/14/2016 09:59 AM, Sebastian Frias wrote: >>> On 11/14/2016 06:32 PM, Florian Fainelli wrote: On 11/14/2016 07:33 AM, Mason wrote: > On 14/11/2016 15:58, Mason wrote: > >> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control >> rx/tx >> vs >> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off >> >> I'm not sure whether "flow control" is relevant... > > Based on phy_print_status() > phydev->pause ? "rx/tx" : "off" > I added the following patch. > > diff --git a/drivers/net/ethernet/aurora/nb8800.c > b/drivers/net/ethernet/aurora/nb8800.c > index defc22a15f67..4e758c1cfa4e 100644 > --- a/drivers/net/ethernet/aurora/nb8800.c > +++ b/drivers/net/ethernet/aurora/nb8800.c > @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device > *dev) > struct phy_device *phydev = priv->phydev; > int change = 0; > > + printk("%s from %pf\n", __func__, __builtin_return_address(0)); > + > if (phydev->link) { > if (phydev->speed != priv->speed) { > priv->speed = phydev->speed; > @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) > nb8800_writeb(priv, NB8800_PQ2, val & 0xff); > > /* Auto-negotiate by default */ > - priv->pause_aneg = true; > - priv->pause_rx = true; > - priv->pause_tx = true; > + priv->pause_aneg = false; > + priv->pause_rx = false; > + priv->pause_tx = false; > > nb8800_mc_init(dev, 0); > > [...] And the time difference is clearly accounted for auto-negotiation time here, as you can see it takes about 3 seconds for Gigabit Ethernet to auto-negotiate and that seems completely acceptable and normal to me since it is a more involved process than lower speeds. > > > OK, so now it works (by accident?) even on 100 Mbps switch, but it still > prints "flow control rx/tx"... Because your link partner advertises flow control, and that's what phydev->pause and phydev->asym_pause report (I know it's confusing, but that's what it is at the moment). >>> >>> Thanks. >>> Could you confirm that Mason's patch is correct and/or that it does not >>> has negative side-effects? >> >> The patch is not correct nor incorrect per-se, it changes the default >> policy of having pause frames advertised by default to not having them >> advertised by default. I was advised to advertise flow control by default back when I was working on the driver, and I think it makes sense to do so. >> This influences both your Ethernet MAC and the link partner in that >> the result is either flow control is enabled (before) or it is not >> (with the patch). There must be something amiss if you see packet >> loss or some kind of problem like that with an early exchange such as >> DHCP. Flow control tend to kick in under higher packet rates (at >> least, that's what you expect). >> >>> >>> Right now we know that Mason's patch makes this work, but we do not >>> understand why nor its implications. >> >> You need to understand why, right now, the way this problem is >> presented, you came up with a workaround, not with the root cause or the >> solution. What does your link partner (switch?) reports, that is, what >> is the ethtool output when you have a link up from your nb8800 adapter? > > Actually, nb8800_pause_config() seems to be doing a complete MAC/DMA > reconfiguration when pause frames get auto-negotiated while the link is > UP, This is due to a silly hardware limitation. The register containing the flow control bits can't be written while rx is enabled. > and it does not differentiate being called from > ethtool::set_pauseparam or the PHYLIB adjust_link callback (which it > probably should), Differentiate how? > wondering if there is a not a remote chance you can get the reply to > arrive right when you just got signaled a link UP? If you're attempting to send or receive things before you get the link up notification, you shouldn't expect anything to work reliably. -- Måns Rullgård
Re: Debugging Ethernet issues
On 11/14/2016 10:20 AM, Florian Fainelli wrote: > On 11/14/2016 09:59 AM, Sebastian Frias wrote: >> On 11/14/2016 06:32 PM, Florian Fainelli wrote: >>> On 11/14/2016 07:33 AM, Mason wrote: On 14/11/2016 15:58, Mason wrote: > nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx > vs > nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off > > I'm not sure whether "flow control" is relevant... Based on phy_print_status() phydev->pause ? "rx/tx" : "off" I added the following patch. diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c index defc22a15f67..4e758c1cfa4e 100644 --- a/drivers/net/ethernet/aurora/nb8800.c +++ b/drivers/net/ethernet/aurora/nb8800.c @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device *dev) struct phy_device *phydev = priv->phydev; int change = 0; + printk("%s from %pf\n", __func__, __builtin_return_address(0)); + if (phydev->link) { if (phydev->speed != priv->speed) { priv->speed = phydev->speed; @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) nb8800_writeb(priv, NB8800_PQ2, val & 0xff); /* Auto-negotiate by default */ - priv->pause_aneg = true; - priv->pause_rx = true; - priv->pause_tx = true; + priv->pause_aneg = false; + priv->pause_rx = false; + priv->pause_tx = false; nb8800_mc_init(dev, 0); Connected to 1000 Mbps switch: # time udhcpc | while read LINE; do date; echo $LINE; done Thu Jan 1 00:00:22 UTC 1970 udhcpc (v1.22.1) started Thu Jan 1 00:00:22 UTC 1970 Sending discover... [ 24.565346] nb8800_link_reconfigure from phy_state_machine Thu Jan 1 00:00:25 UTC 1970 Sending discover... [ 26.575402] nb8800_link_reconfigure from phy_state_machine [ 26.580972] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx Thu Jan 1 00:00:28 UTC 1970 Sending discover... Thu Jan 1 00:00:29 UTC 1970 Sending select for 172.27.64.58... Thu Jan 1 00:00:29 UTC 1970 Lease of 172.27.64.58 obtained, lease time 604800 Thu Jan 1 00:00:29 UTC 1970 deleting routers Thu Jan 1 00:00:29 UTC 1970 adding dns 172.27.0.17 real0m7.388s user0m0.040s sys 0m0.090s Connected to 100 Mbps switch: # time udhcpc | while read LINE; do date; echo $LINE; done Thu Jan 1 00:00:14 UTC 1970 udhcpc (v1.22.1) started Thu Jan 1 00:00:15 UTC 1970 Sending discover... [ 16.968621] nb8800_link_reconfigure from phy_state_machine [ 17.975359] nb8800_link_reconfigure from phy_state_machine [ 17.980923] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx Thu Jan 1 00:00:18 UTC 1970 Sending discover... Thu Jan 1 00:00:19 UTC 1970 Sending select for 172.27.64.58... Thu Jan 1 00:00:19 UTC 1970 Lease of 172.27.64.58 obtained, lease time 604800 Thu Jan 1 00:00:19 UTC 1970 deleting routers Thu Jan 1 00:00:19 UTC 1970 adding dns 172.27.0.17 real0m4.355s user0m0.043s sys 0m0.083s >>> >>> And the time difference is clearly accounted for auto-negotiation time >>> here, as you can see it takes about 3 seconds for Gigabit Ethernet to >>> auto-negotiate and that seems completely acceptable and normal to me >>> since it is a more involved process than lower speeds. >>> OK, so now it works (by accident?) even on 100 Mbps switch, but it still prints "flow control rx/tx"... >>> >>> Because your link partner advertises flow control, and that's what >>> phydev->pause and phydev->asym_pause report (I know it's confusing, but >>> that's what it is at the moment). >> >> Thanks. >> Could you confirm that Mason's patch is correct and/or that it does not >> has negative side-effects? > > The patch is not correct nor incorrect per-se, it changes the default > policy of having pause frames advertised by default to not having them > advertised by default. This influences both your Ethernet MAC and the > link partner in that the result is either flow control is enabled > (before) or it is not (with the patch). There must be something amiss if > you see packet loss or some kind of problem like that with an early > exchange such as DHCP. Flow control tend to kick in under higher packet > rates (at least, that's what you expect). > > >> >> Right now we know that Mason's patch makes this work, but we do not >> understand >> why nor its implications. > > You need to understand why, right now, the way this problem is >
Re: Debugging Ethernet issues
On 11/14/2016 09:59 AM, Sebastian Frias wrote: > On 11/14/2016 06:32 PM, Florian Fainelli wrote: >> On 11/14/2016 07:33 AM, Mason wrote: >>> On 14/11/2016 15:58, Mason wrote: >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx vs nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off I'm not sure whether "flow control" is relevant... >>> >>> Based on phy_print_status() >>> phydev->pause ? "rx/tx" : "off" >>> I added the following patch. >>> >>> diff --git a/drivers/net/ethernet/aurora/nb8800.c >>> b/drivers/net/ethernet/aurora/nb8800.c >>> index defc22a15f67..4e758c1cfa4e 100644 >>> --- a/drivers/net/ethernet/aurora/nb8800.c >>> +++ b/drivers/net/ethernet/aurora/nb8800.c >>> @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device >>> *dev) >>> struct phy_device *phydev = priv->phydev; >>> int change = 0; >>> >>> + printk("%s from %pf\n", __func__, __builtin_return_address(0)); >>> + >>> if (phydev->link) { >>> if (phydev->speed != priv->speed) { >>> priv->speed = phydev->speed; >>> @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) >>> nb8800_writeb(priv, NB8800_PQ2, val & 0xff); >>> >>> /* Auto-negotiate by default */ >>> - priv->pause_aneg = true; >>> - priv->pause_rx = true; >>> - priv->pause_tx = true; >>> + priv->pause_aneg = false; >>> + priv->pause_rx = false; >>> + priv->pause_tx = false; >>> >>> nb8800_mc_init(dev, 0); >>> >>> >>> Connected to 1000 Mbps switch: >>> >>> # time udhcpc | while read LINE; do date; echo $LINE; done >>> Thu Jan 1 00:00:22 UTC 1970 >>> udhcpc (v1.22.1) started >>> Thu Jan 1 00:00:22 UTC 1970 >>> Sending discover... >>> [ 24.565346] nb8800_link_reconfigure from phy_state_machine >>> Thu Jan 1 00:00:25 UTC 1970 >>> Sending discover... >>> [ 26.575402] nb8800_link_reconfigure from phy_state_machine >>> [ 26.580972] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow >>> control rx/tx >>> Thu Jan 1 00:00:28 UTC 1970 >>> Sending discover... >>> Thu Jan 1 00:00:29 UTC 1970 >>> Sending select for 172.27.64.58... >>> Thu Jan 1 00:00:29 UTC 1970 >>> Lease of 172.27.64.58 obtained, lease time 604800 >>> Thu Jan 1 00:00:29 UTC 1970 >>> deleting routers >>> Thu Jan 1 00:00:29 UTC 1970 >>> adding dns 172.27.0.17 >>> >>> real0m7.388s >>> user0m0.040s >>> sys 0m0.090s >>> >>> >>> >>> Connected to 100 Mbps switch: >>> >>> # time udhcpc | while read LINE; do date; echo $LINE; done >>> Thu Jan 1 00:00:14 UTC 1970 >>> udhcpc (v1.22.1) started >>> Thu Jan 1 00:00:15 UTC 1970 >>> Sending discover... >>> [ 16.968621] nb8800_link_reconfigure from phy_state_machine >>> [ 17.975359] nb8800_link_reconfigure from phy_state_machine >>> [ 17.980923] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow >>> control rx/tx >>> Thu Jan 1 00:00:18 UTC 1970 >>> Sending discover... >>> Thu Jan 1 00:00:19 UTC 1970 >>> Sending select for 172.27.64.58... >>> Thu Jan 1 00:00:19 UTC 1970 >>> Lease of 172.27.64.58 obtained, lease time 604800 >>> Thu Jan 1 00:00:19 UTC 1970 >>> deleting routers >>> Thu Jan 1 00:00:19 UTC 1970 >>> adding dns 172.27.0.17 >>> >>> real0m4.355s >>> user0m0.043s >>> sys 0m0.083s >>> >> >> And the time difference is clearly accounted for auto-negotiation time >> here, as you can see it takes about 3 seconds for Gigabit Ethernet to >> auto-negotiate and that seems completely acceptable and normal to me >> since it is a more involved process than lower speeds. >> >>> >>> >>> OK, so now it works (by accident?) even on 100 Mbps switch, but it still >>> prints "flow control rx/tx"... >> >> Because your link partner advertises flow control, and that's what >> phydev->pause and phydev->asym_pause report (I know it's confusing, but >> that's what it is at the moment). > > Thanks. > Could you confirm that Mason's patch is correct and/or that it does not > has negative side-effects? The patch is not correct nor incorrect per-se, it changes the default policy of having pause frames advertised by default to not having them advertised by default. This influences both your Ethernet MAC and the link partner in that the result is either flow control is enabled (before) or it is not (with the patch). There must be something amiss if you see packet loss or some kind of problem like that with an early exchange such as DHCP. Flow control tend to kick in under higher packet rates (at least, that's what you expect). > > Right now we know that Mason's patch makes this work, but we do not understand > why nor its implications. You need to understand why, right now, the way this problem is presented, you came up with a workaround, not with the root cause or the solution. What does your link partner (switch?) reports, that is, what is the ethtool output when you have a link up from your nb8800
Re: Debugging Ethernet issues
On 11/14/2016 06:32 PM, Florian Fainelli wrote: > On 11/14/2016 07:33 AM, Mason wrote: >> On 14/11/2016 15:58, Mason wrote: >> >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx >>> vs >>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off >>> >>> I'm not sure whether "flow control" is relevant... >> >> Based on phy_print_status() >> phydev->pause ? "rx/tx" : "off" >> I added the following patch. >> >> diff --git a/drivers/net/ethernet/aurora/nb8800.c >> b/drivers/net/ethernet/aurora/nb8800.c >> index defc22a15f67..4e758c1cfa4e 100644 >> --- a/drivers/net/ethernet/aurora/nb8800.c >> +++ b/drivers/net/ethernet/aurora/nb8800.c >> @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device >> *dev) >> struct phy_device *phydev = priv->phydev; >> int change = 0; >> >> + printk("%s from %pf\n", __func__, __builtin_return_address(0)); >> + >> if (phydev->link) { >> if (phydev->speed != priv->speed) { >> priv->speed = phydev->speed; >> @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) >> nb8800_writeb(priv, NB8800_PQ2, val & 0xff); >> >> /* Auto-negotiate by default */ >> - priv->pause_aneg = true; >> - priv->pause_rx = true; >> - priv->pause_tx = true; >> + priv->pause_aneg = false; >> + priv->pause_rx = false; >> + priv->pause_tx = false; >> >> nb8800_mc_init(dev, 0); >> >> >> Connected to 1000 Mbps switch: >> >> # time udhcpc | while read LINE; do date; echo $LINE; done >> Thu Jan 1 00:00:22 UTC 1970 >> udhcpc (v1.22.1) started >> Thu Jan 1 00:00:22 UTC 1970 >> Sending discover... >> [ 24.565346] nb8800_link_reconfigure from phy_state_machine >> Thu Jan 1 00:00:25 UTC 1970 >> Sending discover... >> [ 26.575402] nb8800_link_reconfigure from phy_state_machine >> [ 26.580972] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow >> control rx/tx >> Thu Jan 1 00:00:28 UTC 1970 >> Sending discover... >> Thu Jan 1 00:00:29 UTC 1970 >> Sending select for 172.27.64.58... >> Thu Jan 1 00:00:29 UTC 1970 >> Lease of 172.27.64.58 obtained, lease time 604800 >> Thu Jan 1 00:00:29 UTC 1970 >> deleting routers >> Thu Jan 1 00:00:29 UTC 1970 >> adding dns 172.27.0.17 >> >> real0m7.388s >> user0m0.040s >> sys 0m0.090s >> >> >> >> Connected to 100 Mbps switch: >> >> # time udhcpc | while read LINE; do date; echo $LINE; done >> Thu Jan 1 00:00:14 UTC 1970 >> udhcpc (v1.22.1) started >> Thu Jan 1 00:00:15 UTC 1970 >> Sending discover... >> [ 16.968621] nb8800_link_reconfigure from phy_state_machine >> [ 17.975359] nb8800_link_reconfigure from phy_state_machine >> [ 17.980923] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow >> control rx/tx >> Thu Jan 1 00:00:18 UTC 1970 >> Sending discover... >> Thu Jan 1 00:00:19 UTC 1970 >> Sending select for 172.27.64.58... >> Thu Jan 1 00:00:19 UTC 1970 >> Lease of 172.27.64.58 obtained, lease time 604800 >> Thu Jan 1 00:00:19 UTC 1970 >> deleting routers >> Thu Jan 1 00:00:19 UTC 1970 >> adding dns 172.27.0.17 >> >> real0m4.355s >> user0m0.043s >> sys 0m0.083s >> > > And the time difference is clearly accounted for auto-negotiation time > here, as you can see it takes about 3 seconds for Gigabit Ethernet to > auto-negotiate and that seems completely acceptable and normal to me > since it is a more involved process than lower speeds. > >> >> >> OK, so now it works (by accident?) even on 100 Mbps switch, but it still >> prints "flow control rx/tx"... > > Because your link partner advertises flow control, and that's what > phydev->pause and phydev->asym_pause report (I know it's confusing, but > that's what it is at the moment). Thanks. Could you confirm that Mason's patch is correct and/or that it does not has negative side-effects? Right now we know that Mason's patch makes this work, but we do not understand why nor its implications. > >> >> # ethtool -a eth0 >> Pause parameters for eth0: >> Autonegotiate: off >> RX: off >> TX: off >> >> These values make sense considering my changes in the driver. >> >> Are 100 Mbps switches supposed to support these pause features, >> and are they supposed to be able to auto-negotiate them? > > Yes, switches can support flow control aka pause frames, and unless they > are configurable, they typically advertise what their EEPROM has defined > for them, so most likely the full auto-negotiated spectrum: > 10/100/1000Mbps and support for flow control, but your mileage may vary > of course. >
Re: Debugging Ethernet issues
On 11/14/2016 07:33 AM, Mason wrote: > On 14/11/2016 15:58, Mason wrote: > >> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx >> vs >> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off >> >> I'm not sure whether "flow control" is relevant... > > Based on phy_print_status() > phydev->pause ? "rx/tx" : "off" > I added the following patch. > > diff --git a/drivers/net/ethernet/aurora/nb8800.c > b/drivers/net/ethernet/aurora/nb8800.c > index defc22a15f67..4e758c1cfa4e 100644 > --- a/drivers/net/ethernet/aurora/nb8800.c > +++ b/drivers/net/ethernet/aurora/nb8800.c > @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device > *dev) > struct phy_device *phydev = priv->phydev; > int change = 0; > > + printk("%s from %pf\n", __func__, __builtin_return_address(0)); > + > if (phydev->link) { > if (phydev->speed != priv->speed) { > priv->speed = phydev->speed; > @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) > nb8800_writeb(priv, NB8800_PQ2, val & 0xff); > > /* Auto-negotiate by default */ > - priv->pause_aneg = true; > - priv->pause_rx = true; > - priv->pause_tx = true; > + priv->pause_aneg = false; > + priv->pause_rx = false; > + priv->pause_tx = false; > > nb8800_mc_init(dev, 0); > > > Connected to 1000 Mbps switch: > > # time udhcpc | while read LINE; do date; echo $LINE; done > Thu Jan 1 00:00:22 UTC 1970 > udhcpc (v1.22.1) started > Thu Jan 1 00:00:22 UTC 1970 > Sending discover... > [ 24.565346] nb8800_link_reconfigure from phy_state_machine > Thu Jan 1 00:00:25 UTC 1970 > Sending discover... > [ 26.575402] nb8800_link_reconfigure from phy_state_machine > [ 26.580972] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow > control rx/tx > Thu Jan 1 00:00:28 UTC 1970 > Sending discover... > Thu Jan 1 00:00:29 UTC 1970 > Sending select for 172.27.64.58... > Thu Jan 1 00:00:29 UTC 1970 > Lease of 172.27.64.58 obtained, lease time 604800 > Thu Jan 1 00:00:29 UTC 1970 > deleting routers > Thu Jan 1 00:00:29 UTC 1970 > adding dns 172.27.0.17 > > real0m7.388s > user0m0.040s > sys 0m0.090s > > > > Connected to 100 Mbps switch: > > # time udhcpc | while read LINE; do date; echo $LINE; done > Thu Jan 1 00:00:14 UTC 1970 > udhcpc (v1.22.1) started > Thu Jan 1 00:00:15 UTC 1970 > Sending discover... > [ 16.968621] nb8800_link_reconfigure from phy_state_machine > [ 17.975359] nb8800_link_reconfigure from phy_state_machine > [ 17.980923] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow > control rx/tx > Thu Jan 1 00:00:18 UTC 1970 > Sending discover... > Thu Jan 1 00:00:19 UTC 1970 > Sending select for 172.27.64.58... > Thu Jan 1 00:00:19 UTC 1970 > Lease of 172.27.64.58 obtained, lease time 604800 > Thu Jan 1 00:00:19 UTC 1970 > deleting routers > Thu Jan 1 00:00:19 UTC 1970 > adding dns 172.27.0.17 > > real0m4.355s > user0m0.043s > sys 0m0.083s > And the time difference is clearly accounted for auto-negotiation time here, as you can see it takes about 3 seconds for Gigabit Ethernet to auto-negotiate and that seems completely acceptable and normal to me since it is a more involved process than lower speeds. > > > OK, so now it works (by accident?) even on 100 Mbps switch, but it still > prints "flow control rx/tx"... Because your link partner advertises flow control, and that's what phydev->pause and phydev->asym_pause report (I know it's confusing, but that's what it is at the moment). > > # ethtool -a eth0 > Pause parameters for eth0: > Autonegotiate: off > RX: off > TX: off > > These values make sense considering my changes in the driver. > > Are 100 Mbps switches supposed to support these pause features, > and are they supposed to be able to auto-negotiate them? Yes, switches can support flow control aka pause frames, and unless they are configurable, they typically advertise what their EEPROM has defined for them, so most likely the full auto-negotiated spectrum: 10/100/1000Mbps and support for flow control, but your mileage may vary of course. -- Florian
Re: Debugging Ethernet issues
On 14/11/2016 15:58, Mason wrote: > nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx > vs > nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off > > I'm not sure whether "flow control" is relevant... Based on phy_print_status() phydev->pause ? "rx/tx" : "off" I added the following patch. diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c index defc22a15f67..4e758c1cfa4e 100644 --- a/drivers/net/ethernet/aurora/nb8800.c +++ b/drivers/net/ethernet/aurora/nb8800.c @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device *dev) struct phy_device *phydev = priv->phydev; int change = 0; + printk("%s from %pf\n", __func__, __builtin_return_address(0)); + if (phydev->link) { if (phydev->speed != priv->speed) { priv->speed = phydev->speed; @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev) nb8800_writeb(priv, NB8800_PQ2, val & 0xff); /* Auto-negotiate by default */ - priv->pause_aneg = true; - priv->pause_rx = true; - priv->pause_tx = true; + priv->pause_aneg = false; + priv->pause_rx = false; + priv->pause_tx = false; nb8800_mc_init(dev, 0); Connected to 1000 Mbps switch: # time udhcpc | while read LINE; do date; echo $LINE; done Thu Jan 1 00:00:22 UTC 1970 udhcpc (v1.22.1) started Thu Jan 1 00:00:22 UTC 1970 Sending discover... [ 24.565346] nb8800_link_reconfigure from phy_state_machine Thu Jan 1 00:00:25 UTC 1970 Sending discover... [ 26.575402] nb8800_link_reconfigure from phy_state_machine [ 26.580972] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx Thu Jan 1 00:00:28 UTC 1970 Sending discover... Thu Jan 1 00:00:29 UTC 1970 Sending select for 172.27.64.58... Thu Jan 1 00:00:29 UTC 1970 Lease of 172.27.64.58 obtained, lease time 604800 Thu Jan 1 00:00:29 UTC 1970 deleting routers Thu Jan 1 00:00:29 UTC 1970 adding dns 172.27.0.17 real0m7.388s user0m0.040s sys 0m0.090s Connected to 100 Mbps switch: # time udhcpc | while read LINE; do date; echo $LINE; done Thu Jan 1 00:00:14 UTC 1970 udhcpc (v1.22.1) started Thu Jan 1 00:00:15 UTC 1970 Sending discover... [ 16.968621] nb8800_link_reconfigure from phy_state_machine [ 17.975359] nb8800_link_reconfigure from phy_state_machine [ 17.980923] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx Thu Jan 1 00:00:18 UTC 1970 Sending discover... Thu Jan 1 00:00:19 UTC 1970 Sending select for 172.27.64.58... Thu Jan 1 00:00:19 UTC 1970 Lease of 172.27.64.58 obtained, lease time 604800 Thu Jan 1 00:00:19 UTC 1970 deleting routers Thu Jan 1 00:00:19 UTC 1970 adding dns 172.27.0.17 real0m4.355s user0m0.043s sys 0m0.083s OK, so now it works (by accident?) even on 100 Mbps switch, but it still prints "flow control rx/tx"... # ethtool -a eth0 Pause parameters for eth0: Autonegotiate: off RX: off TX: off These values make sense considering my changes in the driver. Are 100 Mbps switches supposed to support these pause features, and are they supposed to be able to auto-negotiate them? Regards.
Re: Debugging Ethernet issues
On 14/11/2016 14:34, Andrew Lunn wrote: > Mason wrote: > >> How exactly does one use the generic PHY driver? > > If there is no specific driver available for the PHY, linux will fall > back to the generic driver. So just don't compile the specific driver. > You can see under /sys/bus/mdio_bus/devices which driver is being > used. I've removed the Atheros driver from the kernel config. # ls /sys/bus/mdio_bus/drivers Generic 10G PHY/ Generic PHY/ # ls -l "/sys/bus/mdio_bus/devices/26000.nb8800-mii:04/" lrwxrwxrwx1 root root 0 Jan 1 00:00 of_node -> ../../../../../../../firmware/devicetree/base/soc/ethernet@26000/ethernet-phy@4/ -r--r--r--1 root root 4096 Jan 1 00:00 phy_has_fixups -r--r--r--1 root root 4096 Jan 1 00:00 phy_id -r--r--r--1 root root 4096 Jan 1 00:00 phy_interface lrwxrwxrwx1 root root 0 Jan 1 00:00 subsystem -> ../../../../../../../bus/mdio_bus/ -rw-r--r--1 root root 4096 Jan 1 00:00 uevent (I don't know why/how it finds so many ".." in the symlink path) [1.170994] libphy: Fixed MDIO Bus: probed [1.181794] libphy: nb8800-mii: probed [1.192759] nb8800 26000.ethernet eth0: MAC address 00:16:e8:4b:b0:7d # time udhcpc | while read LINE; do date; echo $LINE; done Mon Nov 14 15:39:50 UTC 2016 udhcpc (v1.22.1) started Mon Nov 14 15:39:50 UTC 2016 Sending discover... [ 125.468634] nb8800_link_reconfigure from phy_state_machine [ 126.475300] nb8800_link_reconfigure from phy_state_machine Mon Nov 14 15:39:53 UTC 2016 Sending discover... [ 126.682096] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx Mon Nov 14 15:39:57 UTC 2016 Sending discover... Mon Nov 14 15:40:20 UTC 2016 Sending discover... Mon Nov 14 15:40:23 UTC 2016 Sending discover... Mon Nov 14 15:40:26 UTC 2016 Sending discover... Mon Nov 14 15:40:49 UTC 2016 Sending discover... Mon Nov 14 15:40:52 UTC 2016 Sending discover... Mon Nov 14 15:40:55 UTC 2016 Sending discover... ^C real1m12.040s user0m0.067s sys 0m0.057s # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 15:39:54.214574 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:39:55.215830 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:39:57.247862 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:39:57.248027 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:20.307556 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:20.307732 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:23.334155 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:23.334313 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:26.360807 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:26.360976 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:49.420473 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:49.420645 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:52.447121 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:52.447284 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 15:40:55.473721 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 15:40:55.473891 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 # ethtool -S eth0 NIC statistics: rx_bytes_ok: 975 rx_frames_ok: 8 rx_undersize_frames: 0 rx_fragment_frames: 0 rx_64_byte_frames: 6 rx_127_byte_frames: 0 rx_255_byte_frames: 1 rx_511_byte_frames: 1 rx_1023_byte_frames: 0 rx_max_size_frames: 0 rx_oversize_frames: 0 rx_bad_fcs_frames: 0 rx_broadcast_frames: 4 rx_multicast_frames: 0 rx_control_frames: 0 rx_pause_frames: 0 rx_unsup_control_frames: 0 rx_align_error_frames: 0 rx_overrun_frames: 0 rx_jabber_frames: 0 rx_bytes: 975 rx_frames: 8 tx_bytes_ok: 2960 tx_frames_ok: 11 tx_64_byte_frames: 3 tx_127_byte_frames: 0 tx_255_byte_frames: 0 tx_511_byte_frames: 8 tx_1023_byte_frames: 0 tx_max_size_frames: 0 tx_oversize_frames: 0 tx_broadcast_frames: 8 tx_multicast_frames: 0 tx_control_frames: 0 tx_pause_frames: 0 tx_underrun_frames: 0 tx_single_collision_frames: 0 tx_multi_collision_frames: 0 tx_deferred_collision_frames: 0 tx_late_collision_frames: 0
Re: Debugging Ethernet issues
> How exactly does one use the generic PHY driver? If there is no specific driver available for the PHY, linux will fall back to the generic driver. So just don't compile the specific driver. You can see under /sys/bus/mdio_bus/devices which driver is being used. Andrew
Re: Debugging Ethernet issues
On 13/11/2016 20:55, Florian Fainelli wrote: > Le 13/11/2016 à 11:51, Mason a écrit : >> On 13/11/2016 04:09, Andrew Lunn wrote: >> >>> Mason wrote: >>> When connected to a Gigabit switch 3.4 negotiates a LAN DHCP setup instantly 4.7 requires over 5 seconds to do so >>> >>> When you run tcpdump on the DHCP server, are you noticing the first >>> request is missing? >>> >>> What can happen is the dhclient gets started immediately and sends out >>> its first request before auto-negotiation has finished. So this first packet >>> gets lost. The retransmit after a few seconds is then successful. >> >> I will run tcpdump on the server as I run udhcpc on the client >> for Linux 3.4 vs 4.7 >> >> Do you know what would make auto-negotiation fail at 100 Mbps >> on 4.7? (whereas it succeeds on 3.4) >> >> (Thinking out loud) If the problem were in auto-negotiation, >> then if should work if I hard-code speed and duplex using >> ethtool, right? (IIRC, hard-coding doesn't help.) > > I would start with checking basic things: > > - does your Ethernet driver get a link UP being reported correctly > (netif_carrier_ok returns 1)? I don't see any calls to netif_carrier_ok() in the network driver ( drivers/net/ethernet/aurora/nb8800.c ) Maybe it is using some generic infrastructure? > - if you let the bootloader configure the PHY and utilize the Generic > PHY driver instead of the Atheros PHY driver, does the problem appear as > well? How exactly does one use the generic PHY driver? Do you mean the following? Documentation/devicetree/bindings/net/fixed-link.txt Regards.
Re: Debugging Ethernet issues
On 11/13/2016 08:55 PM, Florian Fainelli wrote: > Le 13/11/2016 à 11:51, Mason a écrit : >> On 13/11/2016 04:09, Andrew Lunn wrote: >> >>> Mason wrote: >>> When connected to a Gigabit switch 3.4 negotiates a LAN DHCP setup instantly 4.7 requires over 5 seconds to do so >>> >>> When you run tcpdump on the DHCP server, are you noticing the first >>> request is missing? >>> >>> What can happen is the dhclient gets started immediately and sends out >>> its first request before auto-negotiation has finished. So this first packet >>> gets lost. The retransmit after a few seconds is then successful. >> >> I will run tcpdump on the server as I run udhcpc on the client >> for Linux 3.4 vs 4.7 >> >> Do you know what would make auto-negotiation fail at 100 Mbps >> on 4.7? (whereas it succeeds on 3.4) >> >> (Thinking out loud) If the problem were in auto-negotiation, >> then if should work if I hard-code speed and duplex using >> ethtool, right? (IIRC, hard-coding doesn't help.) > > I would start with checking basic things: > > - does your Ethernet driver get a link UP being reported correctly > (netif_carrier_ok returns 1)? > - if you let the bootloader configure the PHY and utilize the Generic > PHY driver instead of the Atheros PHY driver, does the problem appear as > well? Would using a "fixed-link" serve the same? It appears that using a fixed-link { #address-cells = <1>; #size-cells = <0>; #ifdef WITH_FIXED_LINK phy-connection-type = "rgmii"; fixed-link { speed = <100>; full-duplex; }; #else phy-connection-type = "rgmii"; phy-handle = <_phy>; /* Atheros AR8035 */ eth0_phy: ethernet-phy@4 { interrupt-parent = <>; compatible = "ethernet-phy-id004d.d072", "ethernet-phy-ieee802.3-c22"; interrupts = <37 IRQ_TYPE_EDGE_RISING>; reg = <4>; }; #endif }; works. For what is worth, the patch that Mason was talking about earlier in the thread: "...After much hair-pulling, it turned out that *some* of the breakage was caused by a local patch..." was setting changing the following delay in 'drivers/net/phy/phy.c:phy_state_machine()' /* Only re-schedule a PHY state machine change if we are polling the * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving * between states from phy_mac_interrupt() */ if (phydev->irq == PHY_POLL) queue_delayed_work(system_power_efficient_wq, >state_queue, PHY_STATE_TIME * HZ); from "PHY_STATE_TIME * HZ" to "0". That caused 2 of 3 types of boards to fail, while one of them always worked regardless of the delay. In a nutshell: - Board A, chip X: works with delay "PHY_STATE_TIME * HZ" or "0". - Board B, chip X: does not work with delay "0" - Board C, chip Y: does not work with delay "0" Does board A works by "luck" when this delay is "0"? (this delay has always been there, but it is not clear why) > - what do transmit/receive counters on the Ethernet driver/MAC return? >
Re: Debugging Ethernet issues
On 14/11/2016 13:13, Mason wrote: > This is a different log which I got earlier, but can no longer reproduce: > > # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d > tcpdump: verbose output suppressed, use -v or -vv for full protocol decode > listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 > bytes > 11:08:09.610662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from > 00:16:e8:4b:b0:7d, length 300 > 11:08:10.642852 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, > length 391 > 11:08:10.643276 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from > 00:16:e8:4b:b0:7d, length 300 > 11:08:10.790526 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, > length 391 > 11:08:11.638146 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 > 11:08:11.638156 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, > seq 0, length 28 > 11:08:11.638345 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq > 0, length 28 > 11:08:16.642811 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46 > 11:08:16.642822 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28 Additional tests on v4.7 If I set the link up *BEFORE* running the DHCP client, then I get: # ip link set eth0 up /* Wait 4-5 seconds */ [ 69.815303] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx # time udhcpc | while read LINE; do date; echo $LINE; done Mon Nov 14 13:27:30 UTC 2016 udhcpc (v1.22.1) started Mon Nov 14 13:27:30 UTC 2016 Sending discover... Mon Nov 14 13:27:31 UTC 2016 Sending select for 172.27.64.58... Mon Nov 14 13:27:32 UTC 2016 Lease of 172.27.64.58 obtained, lease time 604800 Mon Nov 14 13:27:32 UTC 2016 deleting routers Mon Nov 14 13:27:32 UTC 2016 adding dns 172.27.0.17 real0m1.292s user0m0.037s sys 0m0.087s # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 13:27:30.922880 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 13:27:30.923055 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28 13:27:31.924151 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 13:27:31.936221 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 13:27:32.061869 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 13:27:35.933946 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28 13:27:35.934079 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 I did see (once) the 9-packet trace (with the ping echo/reply) # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 13:17:31.494117 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 13:17:32.495374 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 13:17:32.510753 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 13:17:32.591262 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 13:17:33.494093 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 13:17:33.494107 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28 13:17:33.494242 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq 0, length 28 13:17:38.500651 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46 13:17:38.500663 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28 Since the ping request is sent from the DHCP server, perhaps the server is checking its ARP table. I think it is not an important difference. Experiment #2 Set link up, then down, then up. Then send DHCP request. # ip link set eth0 up [ 39.185326] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [root@buildroot ~]# ip link set eth0 down /* WAIT A LONG TIME FOR "Link is Down" MESSAGE */ # ip link set eth0 up [ 102.818598] nb8800 26000.ethernet eth0: Link is Down [ 104.828632] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx Note: isn't it weird that I have to set link up before "link down" message appears? # time udhcpc | while read LINE; do date; echo $LINE; done Mon Nov 14 13:40:08 UTC 2016 udhcpc (v1.22.1) started Mon Nov 14 13:40:08 UTC 2016 Sending discover... Mon Nov 14 13:40:11 UTC 2016 Sending discover... Mon Nov 14 13:40:14 UTC 2016 Sending discover... Mon Nov 14 13:40:37 UTC 2016 Sending discover... Mon Nov 14 13:40:40 UTC 2016 Sending discover... Mon Nov 14 13:40:43 UTC 2016 Sending discover... Mon Nov 14 13:41:06 UTC 2016 Sending discover... Mon Nov 14 13:41:09 UTC 2016 Sending discover... Mon Nov 14 13:41:12 UTC 2016 Sending
Re: Debugging Ethernet issues
On 13/11/2016 04:09, Andrew Lunn wrote: > Mason wrote: > >> When connected to a Gigabit switch >> 3.4 negotiates a LAN DHCP setup instantly >> 4.7 requires over 5 seconds to do so > > When you run tcpdump on the DHCP server, are you noticing the first > request is missing? > > What can happen is the dhclient gets started immediately and sends out > its first request before auto-negotiation has finished. So this first packet > gets lost. The retransmit after a few seconds is then successful. This is what happens on 3.4 # time udhcpc | while read LINE; do date; echo $LINE; done Mon Nov 14 11:57:12 UTC 2016 udhcpc (v1.22.1) started Mon Nov 14 11:57:12 UTC 2016 Sending discover... [ 50.15] tangox-enet.0: link up (1000 Mbps - Full Duplex) Mon Nov 14 11:57:15 UTC 2016 Sending discover... Mon Nov 14 11:57:16 UTC 2016 Sending select for 172.27.64.58... Mon Nov 14 11:57:17 UTC 2016 Lease of 172.27.64.58 obtained, lease time 604800 Mon Nov 14 11:57:17 UTC 2016 deleting routers Mon Nov 14 11:57:17 UTC 2016 adding dns 172.27.0.17 real0m4.704s user0m0.030s sys 0m0.550s The corresponding log on the DHCP server was # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 11:57:16.095474 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:57:16.095638 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28 11:57:17.096740 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:57:17.097182 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:57:17.202842 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:57:21.101946 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28 11:57:21.102182 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 This is a different log which I got earlier, but can no longer reproduce: # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 11:08:09.610662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:08:10.642852 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:08:10.643276 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:08:10.790526 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:08:11.638146 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 11:08:11.638156 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28 11:08:11.638345 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq 0, length 28 11:08:16.642811 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46 11:08:16.642822 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28 This is what happens on v4.7 # time udhcpc | while read LINE; do date; echo $LINE; done Mon Nov 14 11:51:25 UTC 2016 udhcpc (v1.22.1) started Mon Nov 14 11:51:25 UTC 2016 Sending discover... Mon Nov 14 11:51:28 UTC 2016 Sending discover... [ 342.658572] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx Mon Nov 14 11:51:32 UTC 2016 Sending discover... Mon Nov 14 11:51:33 UTC 2016 Sending select for 172.27.64.58... Mon Nov 14 11:51:33 UTC 2016 Lease of 172.27.64.58 obtained, lease time 604800 Mon Nov 14 11:51:33 UTC 2016 deleting routers Mon Nov 14 11:51:33 UTC 2016 adding dns 172.27.0.17 real0m7.348s user0m0.053s sys 0m0.077s The corresponding log on the DHCP server was # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes 11:51:31.957245 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:51:31.957409 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28 11:51:32.958514 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:51:32.970538 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300 11:51:33.038205 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391 11:51:36.957949 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28 11:51:36.958112 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46 For reference, here are the ethtool statistics on 4.7 after the DHCP setup: # ethtool -S eth0 NIC statistics: rx_bytes_ok: 1747 rx_frames_ok: 9 rx_undersize_frames: 0 rx_fragment_frames: 0 rx_64_byte_frames: 3 rx_127_byte_frames: 2 rx_255_byte_frames: 1 rx_511_byte_frames: 3 rx_1023_byte_frames: 0
Re: Debugging Ethernet issues
Le 13/11/2016 à 11:51, Mason a écrit : > On 13/11/2016 04:09, Andrew Lunn wrote: > >> Mason wrote: >> >>> When connected to a Gigabit switch >>> 3.4 negotiates a LAN DHCP setup instantly >>> 4.7 requires over 5 seconds to do so >> >> When you run tcpdump on the DHCP server, are you noticing the first >> request is missing? >> >> What can happen is the dhclient gets started immediately and sends out >> its first request before auto-negotiation has finished. So this first packet >> gets lost. The retransmit after a few seconds is then successful. > > I will run tcpdump on the server as I run udhcpc on the client > for Linux 3.4 vs 4.7 > > Do you know what would make auto-negotiation fail at 100 Mbps > on 4.7? (whereas it succeeds on 3.4) > > (Thinking out loud) If the problem were in auto-negotiation, > then if should work if I hard-code speed and duplex using > ethtool, right? (IIRC, hard-coding doesn't help.) I would start with checking basic things: - does your Ethernet driver get a link UP being reported correctly (netif_carrier_ok returns 1)? - if you let the bootloader configure the PHY and utilize the Generic PHY driver instead of the Atheros PHY driver, does the problem appear as well? - what do transmit/receive counters on the Ethernet driver/MAC return? -- Florian
Re: Debugging Ethernet issues
On 13/11/2016 04:09, Andrew Lunn wrote: > Mason wrote: > >> When connected to a Gigabit switch >> 3.4 negotiates a LAN DHCP setup instantly >> 4.7 requires over 5 seconds to do so > > When you run tcpdump on the DHCP server, are you noticing the first > request is missing? > > What can happen is the dhclient gets started immediately and sends out > its first request before auto-negotiation has finished. So this first packet > gets lost. The retransmit after a few seconds is then successful. I will run tcpdump on the server as I run udhcpc on the client for Linux 3.4 vs 4.7 Do you know what would make auto-negotiation fail at 100 Mbps on 4.7? (whereas it succeeds on 3.4) (Thinking out loud) If the problem were in auto-negotiation, then if should work if I hard-code speed and duplex using ethtool, right? (IIRC, hard-coding doesn't help.) Regards.
Re: Debugging Ethernet issues
> When connected to a Gigabit switch > 3.4 negotiates a LAN DHCP setup instantly > 4.7 requires over 5 seconds to do so When you run tcpdump on the DHCP server, are you noticing the first request is missing? What can happen is the dhclient gets started immediately and sends out its first request before auto-negotiation has finished. So this first packet gets lost. The retransmit after a few seconds is then successful. Andrew
Debugging Ethernet issues
Hello everyone, In a past thread ("Ethernet not working on a different SoC with same eth HW") I was struggling getting Ethernet to work at all on a new board using a recent 4.7 kernel. After much hair-pulling, it turned out that *some* of the breakage was caused by a local patch which should have been guarded by a preprocessor macro. But even after reverting that patch, Ethernet does not work well on this board with kernel 4.7 whereas if I use an ancient 3.4 kernel, then Ethernet works much better. Differences: When connected to a 100 Mbps switch 3.4 negotiates a LAN DHCP setup instantly 4.7 times out When connected to a Gigabit switch 3.4 negotiates a LAN DHCP setup instantly 4.7 requires over 5 seconds to do so (In case it matters, my board is using an Atheros 8035 PHY.) I am aware that there have been hundreds of patches to the phy and net frameworks in the 3.4 to 4.7 time-frame. I'm wondering if there are important events I can log, to see what is going wrong in the 4.7 case? Are there kernel debugging options I might turn on, to better understand what is going wrong? I would be extremely grateful for any insight on this subject. Regards.