Re: [PATCH RFC] net: phy: Introduced the PHY_AN_PENDING state
Resending as I didn't reply all. On Tue, Jun 09, 2015 at 10:17:59PM -0700, Florian Fainelli wrote: > > The typical way to work around these problems are to fix them at the PHY > driver level, see below. > My first attempt of work around is to target on the PHY driver but I couldn't figure out a good way in doing that because 1. We use RT kernel. The jitter performance is very sensitive to the time we spend in the kernel. Hence polling for auto-negotiation to complete before starting a new one could hurt the system performance. Moreover, I've seen the PHY takes up to 11s to complete auto-negotiation. 2. Ruling out status polling mean I have to switch to use some work scheduling mechanism to let the PHY driver come back and check the auto-negotiation status (which is what the state machine is doing now) but this seem to even complicate the solution because the PHY state machine has moved to PHY_AN state at the same time, but the auto-negotiation has not really been fired yet. There are more conditions that need to be consider to sync the PHY driver and the PHY state machine after the previous auto-negotiation finish. It looks like a dead end to me for continuing down the path of modifying PHY driver. Do let me know if you have better idea to achieve the same objective. > > That sounds like a bug in the PHY state machine and/or the PHY driver if > you are allowed to restart auto-negotiation while one is pending. Now > that the PHY state machine has debug prints built-in, could you capture > a trace of this failing case? > > Is this observed with the generic PHY driver or a custom PHY driver? > It's not really a problem in the state machine or PHY driver. A very common scenario for an auto-negotiation to start before previous complete is at system boot up where the previous auto-negotiation, either triggered by hardware (because PHY come out from reset and auto-negotiate by itself) or software (U-Boot triggering PHY software reset). The other scenario that I'm able to induce the same effect is by doing mii-tool -r in Linux. > As usual with state machines, introducing a new state needs to be > carefully done in order to make sure that all transitions are correct, > so far I would rather work on finding the root cause/extending the > timeout and/or making it configurable on a PHY-driver basis rather than > having this additional state which is more error prone. > I agree that introducing changes to the state machine will need careful review and it's my last option within the constraint I have. I tried to serialize the PHY_AN_PENDING and PHY_AN state to minimize the disruption introduced to the state machine. There is only one entry to the PHY_AN_PENDING state (via phy_start_aneg) and there is only one exit point to continue with PHY_AN. The state machine would otherwise identical with the original state machine. Should the PHY drop to any state other than PHY_AN_PENDING, they will always transition their state as usual. Should the PHY state machine requires an auto-negotiation, it will always enter PHY_AN_PENDING and always continue with PHY_AN as usual. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFC] net: phy: Introduced the PHY_AN_PENDING state
Le 06/09/15 21:36, Keng Soon Cheah a écrit : > The PHY_AN_PENDING state is put as a gate to enter the PHY_AN state > where it will wait for any uncomplete auto-negotiation session to > finish before starting a new one. > > This extra state could be used to workaround some auto-negotation > issues from certain vendors. The typical way to work around these problems are to fix them at the PHY driver level, see below. > > an_pending_timeout module parameter is used to enable the AN_PENDING > transition state. Set it to 0 to disable AN_PENDING state transition, > set it to any non-zero value to specify the timeout period for > PHY_AN_PENDING state in second. The default value is 0. > > an_pending_guard module parameter serves as a guard band to delay > the auto-negotiation firing after the previous auto-negotiation > finish. > > Signed-off-by: Keng Soon Cheah > > Conflicts: > > drivers/net/phy/phy.c > --- > We observed failure in the ethernet link operation when our board pairs > with some network switch model. The problem happens when an > auto-negotiation is started around the time the previous auto-negotiation > complete. We believe this might be an interoperatibility issue between > the PHYs but we need a short-term solution in software to workaround the > issue. > > We found that we are able to avoid from hitting the problem by waiting any > pending auto-negotiation to complete before starting a new one and this > patch is designed to serve the purpose. That sounds like a bug in the PHY state machine and/or the PHY driver if you are allowed to restart auto-negotiation while one is pending. Now that the PHY state machine has debug prints built-in, could you capture a trace of this failing case? Is this observed with the generic PHY driver or a custom PHY driver? > > A PHY_AN_PENDING state is introduced and it will act as a gate to enter > the PHY_AN state. This state will check for auto-negotiation completion > or timeout after an_pending_timeout period, then it will wait for > an_pending_guard before triggering another auto-negotiation. > > The following diagram shows the timing diagram > > >an_pending_timeout an_pending_guard >V V auto-nego >|->|| > ^ >auto-negotiation complete/timeout > > We do not have plan to submit this patch upstream (unless the community > feels this patch is useful in general) but we would like to seek for > feedback or advice if this patch could introduce new problems. As usual with state machines, introducing a new state needs to be carefully done in order to make sure that all transitions are correct, so far I would rather work on finding the root cause/extending the timeout and/or making it configurable on a PHY-driver basis rather than having this additional state which is more error prone. Thanks! > > --- > drivers/net/phy/phy.c | 44 +++- > include/linux/phy.h |3 ++- > 2 files changed, 45 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index b2197b5..35e6484 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -38,6 +38,16 @@ > > #include > > +static unsigned int an_pending_timeout; > +module_param(an_pending_timeout, uint, 0644); > +MODULE_PARM_DESC(an_pending_timeout, > + "Timeout period for PHY_AN_PENDING state in second. 0 to disable > PHY_AN_PENDING state (default)"); > + > +static unsigned int an_pending_guard; > +module_param(an_pending_guard, uint, 0644); > +MODULE_PARM_DESC(an_pending_guard, > + "Guard band period before firing auto-negotiation from PHY_AN_PENDING > state in second. Default to 0"); > + > static const char *phy_speed_to_str(int speed) > { > switch (speed) { > @@ -82,7 +92,6 @@ static const char *phy_state_to_str(enum phy_state st) > return NULL; > } > > - > /** > * phy_print_status - Convenience function to print out the current phy > status > * @phydev: the phy_device struct > @@ -485,6 +494,18 @@ int phy_start_aneg(struct phy_device *phydev) > > /* Invalidate LP advertising flags */ > phydev->lp_advertising = 0; > + if (an_pending_timeout) { > + switch (phydev->state) { > + case PHY_AN_PENDING: > + case PHY_HALTED: > + break; > + default: > + phydev->state = PHY_AN_PENDING; > + phydev->link_timeout = an_pending_timeout; > + goto out_unlock; > + } > + > + } > > err = phydev->drv->config_aneg(phydev); > if (err < 0) > @@ -831,6 +852,27 @@ void phy_state_machine(struct work_struct *work) > phydev->link_timeout = PHY_AN_TIMEOUT; > > break; > + case PHY_AN_PENDING: > +
[PATCH RFC] net: phy: Introduced the PHY_AN_PENDING state
The PHY_AN_PENDING state is put as a gate to enter the PHY_AN state where it will wait for any uncomplete auto-negotiation session to finish before starting a new one. This extra state could be used to workaround some auto-negotation issues from certain vendors. an_pending_timeout module parameter is used to enable the AN_PENDING transition state. Set it to 0 to disable AN_PENDING state transition, set it to any non-zero value to specify the timeout period for PHY_AN_PENDING state in second. The default value is 0. an_pending_guard module parameter serves as a guard band to delay the auto-negotiation firing after the previous auto-negotiation finish. Signed-off-by: Keng Soon Cheah Conflicts: drivers/net/phy/phy.c --- We observed failure in the ethernet link operation when our board pairs with some network switch model. The problem happens when an auto-negotiation is started around the time the previous auto-negotiation complete. We believe this might be an interoperatibility issue between the PHYs but we need a short-term solution in software to workaround the issue. We found that we are able to avoid from hitting the problem by waiting any pending auto-negotiation to complete before starting a new one and this patch is designed to serve the purpose. A PHY_AN_PENDING state is introduced and it will act as a gate to enter the PHY_AN state. This state will check for auto-negotiation completion or timeout after an_pending_timeout period, then it will wait for an_pending_guard before triggering another auto-negotiation. The following diagram shows the timing diagram an_pending_timeout an_pending_guard V V auto-nego |->|| ^ auto-negotiation complete/timeout We do not have plan to submit this patch upstream (unless the community feels this patch is useful in general) but we would like to seek for feedback or advice if this patch could introduce new problems. --- drivers/net/phy/phy.c | 44 +++- include/linux/phy.h |3 ++- 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index b2197b5..35e6484 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -38,6 +38,16 @@ #include +static unsigned int an_pending_timeout; +module_param(an_pending_timeout, uint, 0644); +MODULE_PARM_DESC(an_pending_timeout, + "Timeout period for PHY_AN_PENDING state in second. 0 to disable PHY_AN_PENDING state (default)"); + +static unsigned int an_pending_guard; +module_param(an_pending_guard, uint, 0644); +MODULE_PARM_DESC(an_pending_guard, + "Guard band period before firing auto-negotiation from PHY_AN_PENDING state in second. Default to 0"); + static const char *phy_speed_to_str(int speed) { switch (speed) { @@ -82,7 +92,6 @@ static const char *phy_state_to_str(enum phy_state st) return NULL; } - /** * phy_print_status - Convenience function to print out the current phy status * @phydev: the phy_device struct @@ -485,6 +494,18 @@ int phy_start_aneg(struct phy_device *phydev) /* Invalidate LP advertising flags */ phydev->lp_advertising = 0; + if (an_pending_timeout) { + switch (phydev->state) { + case PHY_AN_PENDING: + case PHY_HALTED: + break; + default: + phydev->state = PHY_AN_PENDING; + phydev->link_timeout = an_pending_timeout; + goto out_unlock; + } + + } err = phydev->drv->config_aneg(phydev); if (err < 0) @@ -831,6 +852,27 @@ void phy_state_machine(struct work_struct *work) phydev->link_timeout = PHY_AN_TIMEOUT; break; + case PHY_AN_PENDING: + /* Check if negotiation is done. Break if there's an error */ + err = phy_aneg_done(phydev); + if (err < 0) + break; + + /* If AN is done, we'll proceed with the real aneg triggering */ + if (err > 0) { + if (phydev->link_timeout > 0) + phydev->link_timeout = -(an_pending_guard); + else if (phydev->link_timeout < 0) + phydev->link_timeout++; + } else + phydev->link_timeout--; + + if (0 == phydev->link_timeout) { + needs_aneg = true; + + phydev->link_timeout = PHY_AN_TIMEOUT; + } + break; case PHY_AN: err = phy_read_status(phydev); if (err < 0) diff --git a/include/linux/phy.h b/include/linux/phy.h index a26c3f8..a63afdc 10