Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-21 Thread Florian Fainelli
On 03/21/2017 03:09 AM, Roger Quadros wrote:
> On 20/03/17 18:41, Florian Fainelli wrote:
>> On 03/16/2017 12:46 AM, Roger Quadros wrote:
>>> On 15/03/17 17:49, Andrew Lunn wrote:
 On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote:
> Andrew,
>
> On 15/03/17 16:08, Andrew Lunn wrote:
>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state 
>>> change and not polling.")
>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
>>> interrupts because the phy state machine is never triggered after a 
>>> phy_stop().
>>>
>>> Explicitly trigger the PHY state machine so that it can
>>> see the new PHY state (HALTED) and suspend the PHY.
>>>
>>> Signed-off-by: Roger Quadros 
>>
>> Hi Roger
>>
>> This seems sensible. It mirrors what phy_start() does.
>>
>> Reviewed-by: Andrew Lunn 
>
> The reason for this being an RFC was the following comment just before
> where I add the phy_trigger_machine()
>
> /* Cannot call flush_scheduled_work() here as desired because
>  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
>  * will not reenable interrupts.
>  */
>
> Is this comment still applicable? If yes, is it OK to call
> phy_trigger_machine() there?

 Humm, good question.

 My _guess_ would be, calling it with sync=True could
 deadlock. sync=False is probably safe. But lets see what Florian says.
>>>
>>> I agree that we should use phy_trigger_machine() with sync=False.
>>>

>
>>
>> It does however lead to a follow up question. Are there other places
>> phydev->state is changed and it is missing a phy_trigger_machine()?
>>
>
> One other place I think we should add phy_trigger_machine() is 
> phy_start_aneg().

 Humm, that might get us into a tight loop.

 phy_start_aneg() kicks the phy driver to start autoneg and sets
 phydev->state = PHY_AN.

 phy_trigger_machine() triggers the state machine immediately. 

 In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg
 = true. At the end of the state machine, this then calls
 phy_start_aneg(), and it all starts again.

 We are missing the 1s delay we have with polling. So for
 phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which
 waits a second before doing anything?
>>>
>>> I think that should do the trick.
>>>
>>> How about this?
>>
>> This sounds like a possible fix indeed, however I would like to better
>> assess the impact on non interrupt driven PHYs before rolling such a change.
> 
> Is it safer if I add a check to do this only for interrupt driven PHYs?

Yes I think this is a good solution that would not impact polled PHYs.
Can you submit a formal patch for that change?

Thanks!

> 
> e.g.
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 4b855f2..e0f5755 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -630,6 +630,9 @@ int phy_start_aneg(struct phy_device *phydev)
>  
>  out_unlock:
>   mutex_unlock(>lock);
> + if (!err && phy_interrupt_is_valid(phydev))
> + queue_delayed_work(system_power_efficient_wq, 
> >state_queue, HZ);
> +
>   return err;
>  }
>  EXPORT_SYMBOL(phy_start_aneg);
> 


-- 
Florian


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-21 Thread Roger Quadros
On 20/03/17 18:41, Florian Fainelli wrote:
> On 03/16/2017 12:46 AM, Roger Quadros wrote:
>> On 15/03/17 17:49, Andrew Lunn wrote:
>>> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote:
 Andrew,

 On 15/03/17 16:08, Andrew Lunn wrote:
> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state 
>> change and not polling.")
>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
>> interrupts because the phy state machine is never triggered after a 
>> phy_stop().
>>
>> Explicitly trigger the PHY state machine so that it can
>> see the new PHY state (HALTED) and suspend the PHY.
>>
>> Signed-off-by: Roger Quadros 
>
> Hi Roger
>
> This seems sensible. It mirrors what phy_start() does.
>
> Reviewed-by: Andrew Lunn 

 The reason for this being an RFC was the following comment just before
 where I add the phy_trigger_machine()

 /* Cannot call flush_scheduled_work() here as desired because
  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
  * will not reenable interrupts.
  */

 Is this comment still applicable? If yes, is it OK to call
 phy_trigger_machine() there?
>>>
>>> Humm, good question.
>>>
>>> My _guess_ would be, calling it with sync=True could
>>> deadlock. sync=False is probably safe. But lets see what Florian says.
>>
>> I agree that we should use phy_trigger_machine() with sync=False.
>>
>>>

>
> It does however lead to a follow up question. Are there other places
> phydev->state is changed and it is missing a phy_trigger_machine()?
>

 One other place I think we should add phy_trigger_machine() is 
 phy_start_aneg().
>>>
>>> Humm, that might get us into a tight loop.
>>>
>>> phy_start_aneg() kicks the phy driver to start autoneg and sets
>>> phydev->state = PHY_AN.
>>>
>>> phy_trigger_machine() triggers the state machine immediately. 
>>>
>>> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg
>>> = true. At the end of the state machine, this then calls
>>> phy_start_aneg(), and it all starts again.
>>>
>>> We are missing the 1s delay we have with polling. So for
>>> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which
>>> waits a second before doing anything?
>>
>> I think that should do the trick.
>>
>> How about this?
> 
> This sounds like a possible fix indeed, however I would like to better
> assess the impact on non interrupt driven PHYs before rolling such a change.

Is it safer if I add a check to do this only for interrupt driven PHYs?

e.g.

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 4b855f2..e0f5755 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -630,6 +630,9 @@ int phy_start_aneg(struct phy_device *phydev)
 
 out_unlock:
mutex_unlock(>lock);
+   if (!err && phy_interrupt_is_valid(phydev))
+   queue_delayed_work(system_power_efficient_wq, 
>state_queue, HZ);
+
return err;
 }
 EXPORT_SYMBOL(phy_start_aneg);
-- 
2.7.4

-- 
cheers,
-roger


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-20 Thread Florian Fainelli
On 03/15/2017 08:00 AM, Roger Quadros wrote:
> Andrew,
> 
> On 15/03/17 16:08, Andrew Lunn wrote:
>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
>>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change 
>>> and not polling.")
>>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
>>> interrupts because the phy state machine is never triggered after a 
>>> phy_stop().
>>>
>>> Explicitly trigger the PHY state machine so that it can
>>> see the new PHY state (HALTED) and suspend the PHY.
>>>
>>> Signed-off-by: Roger Quadros 
>>
>> Hi Roger
>>
>> This seems sensible. It mirrors what phy_start() does.
>>
>> Reviewed-by: Andrew Lunn 
> 
> The reason for this being an RFC was the following comment just before
> where I add the phy_trigger_machine()
> 
> /* Cannot call flush_scheduled_work() here as desired because
>  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
>  * will not reenable interrupts.
>  */
> 
> Is this comment still applicable? If yes, is it OK to call
> phy_trigger_machine() there?

I think the comment is still applicable, most PHYLIB consumers will call
this with rtnl_lock() held from ndo_close() for instance.

> 
>>
>> It does however lead to a follow up question. Are there other places
>> phydev->state is changed and it is missing a phy_trigger_machine()?
>>
> 
> One other place I think we should add phy_trigger_machine() is 
> phy_start_aneg().
> 
> Since commit 3c293f4e08b5 I was not getting my ethernet link to come up if
> ethernet cable was plugged before the ethernet interface was brought up.
> The PHY seems to be stuck from RUNNING to AN state with no new interrupts 
> from the
> PHY.
> 
> I could workaround it on my board by doing either of the following:
> 
> 1) explicitly halt the PHY at ethernet driver probe time. Then when
> network interface is brought up it resumes the PHY and we see the
> Link/ANEG done interrupt.
> 
> 2) force BMCR_ANRESTART every time from .config_aneg in the PHY driver.
> 
> I will still keep approach 1 as it is desirable to put the PHY in low power 
> mode
> for us but I need a second opinion if we should call phy_trigger_machine()
> from phy_start_aneg() or not.
> 


-- 
Florian


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-20 Thread Florian Fainelli
On 03/16/2017 12:46 AM, Roger Quadros wrote:
> On 15/03/17 17:49, Andrew Lunn wrote:
>> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote:
>>> Andrew,
>>>
>>> On 15/03/17 16:08, Andrew Lunn wrote:
 On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state 
> change and not polling.")
> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
> interrupts because the phy state machine is never triggered after a 
> phy_stop().
>
> Explicitly trigger the PHY state machine so that it can
> see the new PHY state (HALTED) and suspend the PHY.
>
> Signed-off-by: Roger Quadros 

 Hi Roger

 This seems sensible. It mirrors what phy_start() does.

 Reviewed-by: Andrew Lunn 
>>>
>>> The reason for this being an RFC was the following comment just before
>>> where I add the phy_trigger_machine()
>>>
>>> /* Cannot call flush_scheduled_work() here as desired because
>>>  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
>>>  * will not reenable interrupts.
>>>  */
>>>
>>> Is this comment still applicable? If yes, is it OK to call
>>> phy_trigger_machine() there?
>>
>> Humm, good question.
>>
>> My _guess_ would be, calling it with sync=True could
>> deadlock. sync=False is probably safe. But lets see what Florian says.
> 
> I agree that we should use phy_trigger_machine() with sync=False.
> 
>>
>>>

 It does however lead to a follow up question. Are there other places
 phydev->state is changed and it is missing a phy_trigger_machine()?

>>>
>>> One other place I think we should add phy_trigger_machine() is 
>>> phy_start_aneg().
>>
>> Humm, that might get us into a tight loop.
>>
>> phy_start_aneg() kicks the phy driver to start autoneg and sets
>> phydev->state = PHY_AN.
>>
>> phy_trigger_machine() triggers the state machine immediately. 
>>
>> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg
>> = true. At the end of the state machine, this then calls
>> phy_start_aneg(), and it all starts again.
>>
>> We are missing the 1s delay we have with polling. So for
>> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which
>> waits a second before doing anything?
> 
> I think that should do the trick.
> 
> How about this?

This sounds like a possible fix indeed, however I would like to better
assess the impact on non interrupt driven PHYs before rolling such a change.

> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 8fef03b..162061c 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev)
>  
>  out_unlock:
>   mutex_unlock(>lock);
> +
> + if (!err)
> + queue_delayed_work(system_power_efficient_wq, 
> >state_queue, HZ);
> +
>   return err;
>  }
>  EXPORT_SYMBOL(phy_start_aneg);
> 


-- 
Florian


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-16 Thread Roger Quadros
On 15/03/17 17:49, Andrew Lunn wrote:
> On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote:
>> Andrew,
>>
>> On 15/03/17 16:08, Andrew Lunn wrote:
>>> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
 Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state 
 change and not polling.")
 phy_suspend() doesn't get called as part of phy_stop() for PHYs using
 interrupts because the phy state machine is never triggered after a 
 phy_stop().

 Explicitly trigger the PHY state machine so that it can
 see the new PHY state (HALTED) and suspend the PHY.

 Signed-off-by: Roger Quadros 
>>>
>>> Hi Roger
>>>
>>> This seems sensible. It mirrors what phy_start() does.
>>>
>>> Reviewed-by: Andrew Lunn 
>>
>> The reason for this being an RFC was the following comment just before
>> where I add the phy_trigger_machine()
>>
>> /* Cannot call flush_scheduled_work() here as desired because
>>  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
>>  * will not reenable interrupts.
>>  */
>>
>> Is this comment still applicable? If yes, is it OK to call
>> phy_trigger_machine() there?
> 
> Humm, good question.
> 
> My _guess_ would be, calling it with sync=True could
> deadlock. sync=False is probably safe. But lets see what Florian says.

I agree that we should use phy_trigger_machine() with sync=False.

> 
>>
>>>
>>> It does however lead to a follow up question. Are there other places
>>> phydev->state is changed and it is missing a phy_trigger_machine()?
>>>
>>
>> One other place I think we should add phy_trigger_machine() is 
>> phy_start_aneg().
> 
> Humm, that might get us into a tight loop.
> 
> phy_start_aneg() kicks the phy driver to start autoneg and sets
> phydev->state = PHY_AN.
> 
> phy_trigger_machine() triggers the state machine immediately. 
> 
> In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg
> = true. At the end of the state machine, this then calls
> phy_start_aneg(), and it all starts again.
> 
> We are missing the 1s delay we have with polling. So for
> phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which
> waits a second before doing anything?

I think that should do the trick.

How about this?

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 8fef03b..162061c 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev)
 
 out_unlock:
mutex_unlock(>lock);
+
+   if (!err)
+   queue_delayed_work(system_power_efficient_wq, 
>state_queue, HZ);
+
return err;
 }
 EXPORT_SYMBOL(phy_start_aneg);

-- 
cheers,
-roger


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-15 Thread Andrew Lunn
On Wed, Mar 15, 2017 at 05:00:08PM +0200, Roger Quadros wrote:
> Andrew,
> 
> On 15/03/17 16:08, Andrew Lunn wrote:
> > On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
> >> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state 
> >> change and not polling.")
> >> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
> >> interrupts because the phy state machine is never triggered after a 
> >> phy_stop().
> >>
> >> Explicitly trigger the PHY state machine so that it can
> >> see the new PHY state (HALTED) and suspend the PHY.
> >>
> >> Signed-off-by: Roger Quadros 
> > 
> > Hi Roger
> > 
> > This seems sensible. It mirrors what phy_start() does.
> > 
> > Reviewed-by: Andrew Lunn 
> 
> The reason for this being an RFC was the following comment just before
> where I add the phy_trigger_machine()
> 
> /* Cannot call flush_scheduled_work() here as desired because
>  * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
>  * will not reenable interrupts.
>  */
> 
> Is this comment still applicable? If yes, is it OK to call
> phy_trigger_machine() there?

Humm, good question.

My _guess_ would be, calling it with sync=True could
deadlock. sync=False is probably safe. But lets see what Florian says.

> 
> > 
> > It does however lead to a follow up question. Are there other places
> > phydev->state is changed and it is missing a phy_trigger_machine()?
> > 
> 
> One other place I think we should add phy_trigger_machine() is 
> phy_start_aneg().

Humm, that might get us into a tight loop.

phy_start_aneg() kicks the phy driver to start autoneg and sets
phydev->state = PHY_AN.

phy_trigger_machine() triggers the state machine immediately. 

In state PHY_AN, we check if aneg is done. If not, it sets needs_aneg
= true. At the end of the state machine, this then calls
phy_start_aneg(), and it all starts again.

We are missing the 1s delay we have with polling. So for
phy_start_aneg(), we might need a phy_delayed_trigger_machine(), which
waits a second before doing anything?

  Andrew


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-15 Thread Roger Quadros
Andrew,

On 15/03/17 16:08, Andrew Lunn wrote:
> On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
>> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change 
>> and not polling.")
>> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
>> interrupts because the phy state machine is never triggered after a 
>> phy_stop().
>>
>> Explicitly trigger the PHY state machine so that it can
>> see the new PHY state (HALTED) and suspend the PHY.
>>
>> Signed-off-by: Roger Quadros 
> 
> Hi Roger
> 
> This seems sensible. It mirrors what phy_start() does.
> 
> Reviewed-by: Andrew Lunn 

The reason for this being an RFC was the following comment just before
where I add the phy_trigger_machine()

/* Cannot call flush_scheduled_work() here as desired because
 * of rtnl_lock(), but PHY_HALTED shall guarantee phy_change()
 * will not reenable interrupts.
 */

Is this comment still applicable? If yes, is it OK to call
phy_trigger_machine() there?

> 
> It does however lead to a follow up question. Are there other places
> phydev->state is changed and it is missing a phy_trigger_machine()?
> 

One other place I think we should add phy_trigger_machine() is phy_start_aneg().

Since commit 3c293f4e08b5 I was not getting my ethernet link to come up if
ethernet cable was plugged before the ethernet interface was brought up.
The PHY seems to be stuck from RUNNING to AN state with no new interrupts from 
the
PHY.

I could workaround it on my board by doing either of the following:

1) explicitly halt the PHY at ethernet driver probe time. Then when
network interface is brought up it resumes the PHY and we see the
Link/ANEG done interrupt.

2) force BMCR_ANRESTART every time from .config_aneg in the PHY driver.

I will still keep approach 1 as it is desirable to put the PHY in low power mode
for us but I need a second opinion if we should call phy_trigger_machine()
from phy_start_aneg() or not.

-- 
cheers,
-roger


Re: [RFC PATCH] net: phy: Don't miss phy_suspend() on PHY_HALTED for PHYs with interrupts

2017-03-15 Thread Andrew Lunn
On Wed, Mar 15, 2017 at 03:51:27PM +0200, Roger Quadros wrote:
> Since commit 3c293f4e08b5 ("net: phy: Trigger state machine on state change 
> and not polling.")
> phy_suspend() doesn't get called as part of phy_stop() for PHYs using
> interrupts because the phy state machine is never triggered after a 
> phy_stop().
> 
> Explicitly trigger the PHY state machine so that it can
> see the new PHY state (HALTED) and suspend the PHY.
> 
> Signed-off-by: Roger Quadros 

Hi Roger

This seems sensible. It mirrors what phy_start() does.

Reviewed-by: Andrew Lunn 

It does however lead to a follow up question. Are there other places
phydev->state is changed and it is missing a phy_trigger_machine()?

  Andrew