On 2019/7/31 13:44, Heiner Kallweit wrote:
> On 31.07.2019 05:33, liuyonglong wrote:
>>
>>
>> On 2019/7/31 3:04, Heiner Kallweit wrote:
>>> On 30.07.2019 08:35, liuyonglong wrote:
>>>> :/sys/kernel/debug/tracing$ cat trace
>>>> # tracer: nop
>>>> #
>>>> # entries-in-buffer/entries-written: 45/45 #P:128
>>>> #
>>>> # _-----=> irqs-off
>>>> # / _----=> need-resched
>>>> # | / _---=> hardirq/softirq
>>>> # || / _--=> preempt-depth
>>>> # ||| / delay
>>>> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
>>>> # | | | |||| | |
>>>> kworker/64:2-1028 [064] .... 172.295687: mdio_access:
>>>> mii-0000:bd:00.0 read phy:0x01 reg:0x02 val:0x001c
>>>> kworker/64:2-1028 [064] .... 172.295726: mdio_access:
>>>> mii-0000:bd:00.0 read phy:0x01 reg:0x03 val:0xc916
>>>> kworker/64:2-1028 [064] .... 172.296902: mdio_access:
>>>> mii-0000:bd:00.0 read phy:0x01 reg:0x01 val:0x79ad
>>>> kworker/64:2-1028 [064] .... 172.296938: mdio_access:
>>>> mii-0000:bd:00.0 read phy:0x01 reg:0x0f val:0x2000
>>>> kworker/64:2-1028 [064] .... 172.321213: mdio_access:
>>>> mii-0000:bd:00.0 read phy:0x01 reg:0x00 val:0x1040
>>>> kworker/64:2-1028 [064] .... 172.343209: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x02 val:0x001c
>>>> kworker/64:2-1028 [064] .... 172.343245: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x03 val:0xc916
>>>> kworker/64:2-1028 [064] .... 172.343882: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x79ad
>>>> kworker/64:2-1028 [064] .... 172.343918: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x0f val:0x2000
>>>> kworker/64:2-1028 [064] .... 172.362658: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040
>>>> kworker/64:2-1028 [064] .... 172.385961: mdio_access:
>>>> mii-0000:bd:00.2 read phy:0x05 reg:0x02 val:0x001c
>>>> kworker/64:2-1028 [064] .... 172.385996: mdio_access:
>>>> mii-0000:bd:00.2 read phy:0x05 reg:0x03 val:0xc916
>>>> kworker/64:2-1028 [064] .... 172.386646: mdio_access:
>>>> mii-0000:bd:00.2 read phy:0x05 reg:0x01 val:0x79ad
>>>> kworker/64:2-1028 [064] .... 172.386681: mdio_access:
>>>> mii-0000:bd:00.2 read phy:0x05 reg:0x0f val:0x2000
>>>> kworker/64:2-1028 [064] .... 172.411286: mdio_access:
>>>> mii-0000:bd:00.2 read phy:0x05 reg:0x00 val:0x1040
>>>> kworker/64:2-1028 [064] .... 172.433225: mdio_access:
>>>> mii-0000:bd:00.3 read phy:0x07 reg:0x02 val:0x001c
>>>> kworker/64:2-1028 [064] .... 172.433260: mdio_access:
>>>> mii-0000:bd:00.3 read phy:0x07 reg:0x03 val:0xc916
>>>> kworker/64:2-1028 [064] .... 172.433887: mdio_access:
>>>> mii-0000:bd:00.3 read phy:0x07 reg:0x01 val:0x79ad
>>>> kworker/64:2-1028 [064] .... 172.433922: mdio_access:
>>>> mii-0000:bd:00.3 read phy:0x07 reg:0x0f val:0x2000
>>>> kworker/64:2-1028 [064] .... 172.452862: mdio_access:
>>>> mii-0000:bd:00.3 read phy:0x07 reg:0x00 val:0x1040
>>>> ifconfig-1324 [011] .... 177.325585: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040
>>>> kworker/u257:0-8 [012] .... 177.325642: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x04 val:0x01e1
>>>> kworker/u257:0-8 [012] .... 177.325654: mdio_access:
>>>> mii-0000:bd:00.1 write phy:0x03 reg:0x04 val:0x05e1
>>>> kworker/u257:0-8 [012] .... 177.325708: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x79ad
>>>> kworker/u257:0-8 [012] .... 177.325744: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200
>>>> kworker/u257:0-8 [012] .... 177.325779: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040
>>>> kworker/u257:0-8 [012] .... 177.325788: mdio_access:
>>>> mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1240
>>>> kworker/u257:0-8 [012] .... 177.325843: mdio_access:
>>>> mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x798d
>>>
>>> What I think that happens here:
>>> Writing 0x1240 to BMCR starts aneg. When reading BMSR immediately after
>>> that then the PHY seems to have cleared
>>> the "aneg complete" bit already, but not yet the "link up" bit. This
>>> results in the false "link up" notification.
>>> The following patch is based on the fact that in case of enabled aneg we
>>> can't have a valid link if aneg isn't
>>> finished. Could you please test whether this works for you?
>>>
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 6b5cb87f3..7ddd91df9 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -1774,6 +1774,12 @@ int genphy_update_link(struct phy_device *phydev)
>>> phydev->link = status & BMSR_LSTATUS ? 1 : 0;
>>> phydev->autoneg_complete = status & BMSR_ANEGCOMPLETE ? 1 : 0;
>>>
>>> + /* Consider the case that autoneg was started and "aneg complete"
>>> + * bit has been reset, but "link up" bit not yet.
>>> + */
>>> + if (phydev->autoneg == AUTONEG_ENABLE && !phydev->autoneg_complete)
>>> + phydev->link = 0;
>>> +
>>> return 0;
>>> }
>>> EXPORT_SYMBOL(genphy_update_link);
>>>
>>
>> This patch can solve the issue! Will it be upstream?
>>
> I'll check for side effects, but in general: yes.
>
>> So it's nothing to do with the bios, and just the PHY's own behavior,
>> the "link up" bit can not reset immediately,right?
>>
> Yes, it's the PHY's own behavior, and to a certain extent it may depend on
> speed
> of the MDIO bus. At least few network chips require a delay of several
> microseconds
> after each MDIO bus access. This may be sufficient for the PHY to reset the
> link-up bit in time.
>
>> ps: It will take 1 second more to make the link up for RTL8211F when 0x798d
>> happend.
>>
> In polling mode link-up is detected up to 1s after it happened.
> You could switch to interrupt mode to reduce the aneg time.
>
>>
>>
> Heiner
>
> .
>
Thanks very much!