Re: r8169: Crash after reloading driver if network hangs
Hello, * Francois Romieu [EMAIL PROTECTED] [2007-01-29 08:56]: Bernhard Walle [EMAIL PROTECTED] : @@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer return; init_timer(timer); - timer-expires = jiffies + RTL8169_PHY_TIMEOUT; timer-data = (unsigned long)(dev); timer-function = rtl8169_phy_timer; - add_timer(timer); + mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT); } But I think _this_ change is unnecessary, ... add_timer() is not supposed to modify an existing timer whereas mod_timer() encompasses both that's clear. and the race exists in both direction as netif_running() is true as soon as dev_open() starts but way before dev-open() completes. But rtl8169_request_timer() is only called from change_mtu() and dev-open(). And, if you call init_timer(), you always have a new timer, and the reference to an existing tp-timer will be lost. [...] ... but that looks good (better than my patch) and should resolve the issue, too. I can't test because it's triggered only if the network hangs and you know, the last one isn't reproducable. There will be something to test in the merge of realtek's stuff #2. Great. Regards, Bernhard -- SUSE LINUX Products GmbHE-Mail: [EMAIL PROTECTED] Maxfeldstr. 5 Phone: +49 (911) 74053-0 90409 Nürnberg, Germany OpenPGP DDAF6454: F61F 34CC 09CA FB82 C9F6 BA4B 8865 3696 DDAF 6454 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
r8169: Crash after reloading driver if network hangs
Hello, also after applying the latest patch that was posted on that mailing list, I have still the problem described in http://bugzilla.kernel.org/show_bug.cgi?id=5137. So after a network hang with several NETDEV WATCHDOG: eth0: transmit timed out in the kernel log, I removed the driver ('rmmod r8169') and wanted to reload the driver ('modprobe r8169'). Now the system crashed: kernel BUG at kernel/timer.c:407! invalid opcode: [1] SMP CPU 1 Modules linked in: r8169 i915 drm deflate zlib_deflate twofish twofish_common serpent blowfi sh des cbc ecb blkcipher aes xcbc sha256 w83627ehf hwmon i2c_isa sha1 ipv6 md5 eeprom crypto _null af_key snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device nfs lockd nfs_acl sunrpc af_pa cket cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave acpi_cpufreq freq_table button battery ac loop rfcomm hidp l2cap hci_usb bluetooth dm_mod usb_storage fus e snd_hda_intel snd_hda_codec snd_pcm iTCO_wdt ehci_hcd uhci_hcd intel_agp i2c_i801 i2c_core snd_timer snd soundcore snd_page_alloc iTCO_vendor_support usbcore floppy lp parport_pc ppd ev parport ext3 mbcache jbd sg sr_mod cdrom edd fan ata_piix libata piix thermal processor s d_mod scsi_mod ide_disk ide_core Pid: 5724, comm: modprobe Not tainted 2.6.20-rc6-default #1 RIP: 0010:[80241dfe] [80241dfe] mod_timer+0x7/0x22 RSP: 0018:8100566c9c20 EFLAGS: 00010246 RAX: 0006 RBX: 810061d8c000 RCX: 0001 RDX: 03e8 RSI: a58a RDI: 810061d8d1e0 RBP: 810061d8c4c0 R08: c2022000 R09: 810061d8c4c0 R10: 0046 R11: 0202 R12: R13: ff01 R14: 8100566c03e8 R15: 810061d8c000 FS: 2b38153566f0() GS:81007db9abc0() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 557fc870 CR3: 540af000 CR4: 06e0 Process modprobe (pid: 5724, threadinfo 8100566c8000, task 810037fde850) Stack: 881b6d8b 0064 810061d8c4c0 c2022000 881b88ef 00ff 00ff 81ff6r8169: eth1: link up 81ff 8111 81007d0079c8 0296 Call Trace: [881b6d8b] :r8169:rtl8169_set_speed+0x4b/0x53 [881b88ef] :r8169:rtl8169_init_one+0x954/0x9c4 [8031b440] pci_device_probe+0xe5/0x151 [8036d620] really_probe+0x87/0x106 [8036d83f] __driver_attach+0x6f/0xaf [8036d7d0] __driver_attach+0x0/0xaf [8036d7d0] __driver_attach+0x0/0xaf [8036cb86] bus_for_each_dev+0x43/0x6e [8036ce7c] bus_add_driver+0x6b/0x18d [8031b63b] __pci_register_driver+0x75/0xaa [80298300] sys_init_module+0x1793/0x1900 [8025511e] system_call+0x7e/0x83 Code: 0f 0b eb fe 48 39 77 10 75 06 48 83 3f 00 75 05 e9 73 85 fd RIP [80241dfe] mod_timer+0x7/0x22 RSP 8100566c9c20 netif_running(dev) returns true although open() hasn't been called that sets the function of the timer. dev-state is 6 (I checked that). Simple fix is attached. Although that seems to fix the symptom and not the cause, please apply it if you don't have a better solution. --- r8169.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/drivers/net/r8169.c 2007-01-28 18:19:56.0 +0100 +++ b/drivers/net/r8169.c 2007-01-28 18:22:50.0 +0100 @@ -810,7 +810,8 @@ static int rtl8169_set_speed(struct net_ ret = tp-set_speed(dev, autoneg, speed, duplex); - if (netif_running(dev) (tp-phy_1000_ctrl_reg ADVERTISE_1000FULL)) + if (netif_running(dev) tp-timer.function + (tp-phy_1000_ctrl_reg ADVERTISE_1000FULL)) mod_timer(tp-timer, jiffies + RTL8169_PHY_TIMEOUT); return ret; - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: r8169: Crash after reloading driver if network hangs
Bernhard Walle [EMAIL PROTECTED] : [...] Simple fix is attached. Although that seems to fix the symptom and not the cause, please apply it if you don't have a better solution. What about the patch below ? diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c index 577babd..4e22af7 100644 --- a/drivers/net/r8169.c +++ b/drivers/net/r8169.c @@ -1373,7 +1373,7 @@ static inline void rtl8169_request_timer(struct net_device *dev) timer-expires = jiffies + RTL8169_PHY_TIMEOUT; timer-data = (unsigned long)(dev); timer-function = rtl8169_phy_timer; - add_timer(timer); + mod_timer(timer); } #ifdef CONFIG_NET_POLL_CONTROLLER @@ -1448,7 +1448,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp) rtl8169_phy_reset(dev, tp); - rtl8169_set_speed(dev, autoneg, speed, duplex); + tp-set_speed(dev, autoneg, speed, duplex); if ((RTL_R8(PHYstatus) TBI_Enable) netif_msg_link(tp)) printk(KERN_INFO PFX %s: TBI auto-negotiating\n, dev-name); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: r8169: Crash after reloading driver if network hangs
* Francois Romieu [EMAIL PROTECTED] [2007-01-28 20:04]: Bernhard Walle [EMAIL PROTECTED] : [...] Simple fix is attached. Although that seems to fix the symptom and not the cause, please apply it if you don't have a better solution. What about the patch below ? diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c index 577babd..4e22af7 100644 --- a/drivers/net/r8169.c +++ b/drivers/net/r8169.c @@ -1373,7 +1373,7 @@ static inline void rtl8169_request_timer(struct net_device *dev) timer-expires = jiffies + RTL8169_PHY_TIMEOUT; timer-data = (unsigned long)(dev); timer-function = rtl8169_phy_timer; - add_timer(timer); + mod_timer(timer); } Doesn't compile, I think you mean this? @@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer return; init_timer(timer); - timer-expires = jiffies + RTL8169_PHY_TIMEOUT; timer-data = (unsigned long)(dev); timer-function = rtl8169_phy_timer; - add_timer(timer); + mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT); } But I think _this_ change is unnecessary, ... #ifdef CONFIG_NET_POLL_CONTROLLER @@ -1448,7 +1448,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp) rtl8169_phy_reset(dev, tp); - rtl8169_set_speed(dev, autoneg, speed, duplex); + tp-set_speed(dev, autoneg, speed, duplex); if ((RTL_R8(PHYstatus) TBI_Enable) netif_msg_link(tp)) printk(KERN_INFO PFX %s: TBI auto-negotiating\n, dev-name); ... but that looks good (better than my patch) and should resolve the issue, too. I can't test because it's triggered only if the network hangs and you know, the last one isn't reproducable. Regards, Bernhard - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: r8169: Crash after reloading driver if network hangs
Bernhard Walle [EMAIL PROTECTED] : [...] Doesn't compile, I think you mean this? Yes. @@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer return; init_timer(timer); - timer-expires = jiffies + RTL8169_PHY_TIMEOUT; timer-data = (unsigned long)(dev); timer-function = rtl8169_phy_timer; - add_timer(timer); + mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT); } But I think _this_ change is unnecessary, ... add_timer() is not supposed to modify an existing timer whereas mod_timer() encompasses both and the race exists in both direction as netif_running() is true as soon as dev_open() starts but way before dev-open() completes. [...] ... but that looks good (better than my patch) and should resolve the issue, too. I can't test because it's triggered only if the network hangs and you know, the last one isn't reproducable. There will be something to test in the merge of realtek's stuff #2. -- Ueimor - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html