On Sat, 20 Jul 2019, Heiner Kallweit wrote:

On 19.07.2019 23:12, Thomas Voegtle wrote:
On Fri, 19 Jul 2019, Heiner Kallweit wrote:

On 18.07.2019 20:50, Thomas Voegtle wrote:

Hello,

I'm having network problems with the commits on r8169 since v5.2. There are 
ping packet loss, sometimes 100%, sometimes 50%. In the end network is unusable.

v5.2 is fine, I bisected it down to:

a2928d28643e3c064ff41397281d20c445525032 is the first bad commit
commit a2928d28643e3c064ff41397281d20c445525032
Author: Heiner Kallweit <hkallwe...@gmail.com>
Date:   Sun Jun 2 10:53:49 2019 +0200

    r8169: use paged versions of phylib MDIO access functions

    Use paged versions of phylib MDIO access functions to simplify
    the code.

    Signed-off-by: Heiner Kallweit <hkallwe...@gmail.com>
    Signed-off-by: David S. Miller <da...@davemloft.net>


Reverting that commit on top of v5.2-11564-g22051d9c4a57 fixes the problem
for me (had to adjust the renaming to r8169_main.c).

I have a:
04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev
0c)
        Subsystem: Biostar Microtech Int'l Corp Device [1565:2400]
        Kernel driver in use: r8169

on a BIOSTAR H81MG motherboard.

Interesting. I have the same chip version (RTL8168g) and can't reproduce
the issue. Can you provide a full dmesg output and test the patch below
on top of linux-next? I'd be interested in the WARN_ON stack traces
(if any) and would like to know whether the experimental change to
__phy_modify_changed helps.


greetings,

  Thomas


Heiner


diff --git a/drivers/net/ethernet/realtek/r8169_main.c 
b/drivers/net/ethernet/realtek/r8169_main.c
index 8d7dd4c5f..26be73000 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -1934,6 +1934,8 @@ static int rtl_get_eee_supp(struct rtl8169_private *tp)
    struct phy_device *phydev = tp->phydev;
    int ret;

+    WARN_ON(phy_read(phydev, 0x1f));
+
    switch (tp->mac_version) {
    case RTL_GIGA_MAC_VER_34:
    case RTL_GIGA_MAC_VER_35:
@@ -1957,6 +1959,8 @@ static int rtl_get_eee_lpadv(struct rtl8169_private *tp)
    struct phy_device *phydev = tp->phydev;
    int ret;

+    WARN_ON(phy_read(phydev, 0x1f));
+
    switch (tp->mac_version) {
    case RTL_GIGA_MAC_VER_34:
    case RTL_GIGA_MAC_VER_35:
@@ -1980,6 +1984,8 @@ static int rtl_get_eee_adv(struct rtl8169_private *tp)
    struct phy_device *phydev = tp->phydev;
    int ret;

+    WARN_ON(phy_read(phydev, 0x1f));
+
    switch (tp->mac_version) {
    case RTL_GIGA_MAC_VER_34:
    case RTL_GIGA_MAC_VER_35:
@@ -2003,6 +2009,8 @@ static int rtl_set_eee_adv(struct rtl8169_private *tp, 
int val)
    struct phy_device *phydev = tp->phydev;
    int ret = 0;

+    WARN_ON(phy_read(phydev, 0x1f));
+
    switch (tp->mac_version) {
    case RTL_GIGA_MAC_VER_34:
    case RTL_GIGA_MAC_VER_35:
diff --git a/drivers/net/phy/phy-core.c b/drivers/net/phy/phy-core.c
index 16667fbac..1aa1142b8 100644
--- a/drivers/net/phy/phy-core.c
+++ b/drivers/net/phy/phy-core.c
@@ -463,12 +463,10 @@ int __phy_modify_changed(struct phy_device *phydev, u32 
regnum, u16 mask,
        return ret;

    new = (ret & ~mask) | set;
-    if (new == ret)
-        return 0;

-    ret = __phy_write(phydev, regnum, new);
+    __phy_write(phydev, regnum, new);

-    return ret < 0 ? ret : 1;
+    return new != ret;
}
EXPORT_SYMBOL_GPL(__phy_modify_changed);



Took your patch on top of next-20190719.
See attached dmesg.
It didn't work. Same thing, lots of ping drops, no usable network.

like that:
44 packets transmitted, 2 received, 95% packet loss, time 44005ms


Maybe important:
I build a kernel with no modules.

I have to power off when I booted a kernel which doesn't work, a (soft) reboot 
into a older kernel (e.g. 4.9.y)  doesn't
fix the problem. Powering off and on does.


Then, what you could do is reversing the hunks of the patch step by step.
Or make them separate patches and bisect.
Relevant are the hunks from point 1 and 2.

1. first 5 hunks (I don't think you have to reverse them individually)
  EEE-related

2. rtl8168g_disable_aldps, rtl8168g_phy_adjust_10m_aldps, 
rtl8168g_1_hw_phy_config
  all of these hunks are in the path for RTL8168g

3. rtl8168h_1_hw_phy_config, rtl8168h_2_hw_phy_config, 
rtl8168ep_1_hw_phy_config,
  rtl8168ep_2_hw_phy_config
  not in the path for RTL8168g


this is the minimal revert:

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index efef5453b94f..267995a614b5 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -3249,12 +3249,14 @@ static void rtl8168g_1_hw_phy_config(struct rtl8169_private *tp)
        else
                phy_modify_paged(tp->phydev, 0x0bcc, 0x12, 0, BIT(15));

-       ret = phy_read_paged(tp->phydev, 0x0a46, 0x13);
-       if (ret & BIT(8))
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
-       else
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
-
+       rtl_writephy(tp, 0x1f, 0x0a46);
+       if (rtl_readphy(tp, 0x13) & 0x0100) {
+               rtl_writephy(tp, 0x1f, 0x0c41);
+               rtl_w0w1_phy(tp, 0x15, 0x0002, 0x0000);
+       } else {
+               rtl_writephy(tp, 0x1f, 0x0c41);
+               rtl_w0w1_phy(tp, 0x15, 0x0000, 0x0002);
+       }
        /* Enable PHY auto speed down */
        phy_modify_paged(tp->phydev, 0x0a44, 0x11, 0, BIT(3) | BIT(2));



Could it be, that there is just a typo?

        if (ret & BIT(8))
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
+               phy_modify_paged(tp->phydev, 0x0c41, 0x15, 0, BIT(1));
        else
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
+               phy_modify_paged(tp->phydev, 0x0c41, 0x15, BIT(1), 0);




greetings,

      Thomas

Reply via email to