RE: My vote against eepro* removal
From: Jesse Brandeburg > On Mon, 23 Jan 2006, kus Kusche Klaus wrote: > > Here are my results: > > > > If the watchdog doesn't get interrupted, preempted, or whatever, > > it spends 340 us in its body: > > * 303 us in the mii code > > * 36 us in the following code up to e100_adjust_adaptive_ifs > > * 1 us in the remaining code (I think my chip doesn't need any > > of those chip-specific fixups) > > > > The 303 us in the mii code are divided in the following way: > > * 101 us in mii_ethtool_gset > > * 135 us in the whole if > > * 67 us in mii_check_link > > > > This is with the udelay(2) instead of udelay(20) hack applied. > > With udelay(20), the mii times are 128 + 170 + 85 us, > > i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog. > > Thank you very much for that detailed analysis! okay, so > calls to mii.c > take too long, but those depend on mmio_read in e100 to do > the work, so > this patch attempts to minimize the latency. > > This patch is against linus-2.6.git, I compile and ssh/ping > tested it. > Would you be willing to send your instrumentation patches? I > could then > test any fixes easier. No deep magic behind my instrumentation: A few global variables and some rdtscl in the watchdog: unsigned long my_tsc_1 = 0; unsigned long my_tsc_2 = 0; unsigned long my_tsc_3 = 0; unsigned long my_tsc_4 = 0; EXPORT_SYMBOL(my_tsc_1); EXPORT_SYMBOL(my_tsc_2); EXPORT_SYMBOL(my_tsc_3); EXPORT_SYMBOL(my_tsc_4); static void e100_watchdog(unsigned long data) { struct nic *nic = (struct nic *)data; struct ethtool_cmd cmd; DPRINTK(TIMER, DEBUG, "right now = %ld\n", jiffies); /* mii library handles link maintenance tasks */ rdtscl(my_tsc_1); mii_ethtool_gset(&nic->mii, &cmd); rdtscl(my_tsc_2); if(mii_link_ok(&nic->mii) && !netif_carrier_ok(nic->netdev)) { DPRINTK(LINK, INFO, "link up, %sMbps, %s-duplex\n", cmd.speed == SPEED_100 ? "100" : "10", cmd.duplex == DUPLEX_FULL ? "full" : "half"); } else if(!mii_link_ok(&nic->mii) && netif_carrier_ok(nic->netdev)) { DPRINTK(LINK, INFO, "link down\n"); } rdtscl(my_tsc_3); mii_check_link(&nic->mii); rdtscl(my_tsc_4); /* Software generated interrupt to recover from (rare) Rx * allocation failure. ... And a small module which prints the timings periodically when loaded: /* Example module, built after LDD book release 3 */ #include #include #include #include #include MODULE_LICENSE("GPL"); /* Output interval, in jiffies */ #define INTERVAL 2111 /* Output scaling: TSC ==> microseconds */ #define SCALE(x) ((x)/500) extern unsigned long my_tsc_1; extern unsigned long my_tsc_2; extern unsigned long my_tsc_3; extern unsigned long my_tsc_4; static struct timer_list my_timer; static void timer_func(unsigned long dummy) { printk(KERN_NOTICE "my_timer: diff = %lu / %lu / %lu\n", SCALE(my_tsc_2 - my_tsc_1), SCALE(my_tsc_3 - my_tsc_2), SCALE(my_tsc_4 - my_tsc_3)); my_timer.expires += INTERVAL; add_timer(&my_timer); } static int __init mymod_init(void) { init_timer(&my_timer); my_timer.function = timer_func; my_timer.expires = jiffies + INTERVAL; add_timer(&my_timer); printk(KERN_NOTICE "Started mymod...\n"); return 0; } static void __exit mymod_exit(void) { del_timer_sync(&my_timer); printk(KERN_NOTICE "Finished mymod...\n"); } module_init(mymod_init); module_exit(mymod_exit); > > e100: attempt a shorter delay for mdio reads > > Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]> > > Simply reorder our write/read sequence for mdio reads to > minimize latency > as well as delay a shorter interval for each loop. > > diff --git a/drivers/net/e100.c b/drivers/net/e100.c > --- a/drivers/net/e100.c > +++ b/drivers/net/e100.c > @@ -891,23 +891,25 @@ static u16 mdio_ctrl(struct nic *nic, u3 >* procedure it should be done under lock. >*/ > spin_lock_irqsave(&nic->mdio_lock, flags); > - for (i = 100; i; --i) { > + for (i = 1000; i; --i) { > if (readl(&nic->csr->mdi_ctrl) & mdi_ready) > break; > - udelay(20); > + udelay(2); > } > if (unlikely(!i)) { > - printk("e100.mdio_ctrl(%s) won't go Ready\n", > + DPRINTK(PROBE, ERR, "e100.mdio_ctrl(%s) won't > go Ready\n", > nic->netdev->name ); > spin_unlock_irqrestore(&nic->mdio_lock, flags); > return 0; /* No way to indicate > timeout error */ > } The piece of code above is not yet present in my version of e100. (I'm still at 2.6.15, there is no -rt patch for 2.6.16 yet) > writel((reg << 16) | (addr << 21) | dir | data, > &nic->csr->mdi_ctrl); > > - for (i = 0; i < 100; i++) { > - udelay(20); > + /* to avoid latency,
RE: My vote against eepro* removal
On Mon, 23 Jan 2006, kus Kusche Klaus wrote: From: John Ronciak > Can we try a couple of things? 1) just comment out all the check for > link code in the e100 driver and give that a try and 2) just comment > out the update stats call and see if that works. These seem to be the > differences and we need to know which one is causing the problem. First of all, I am still unable to get any traces of this in the latency tracer. Moreover, as I told before, removing parts of the watchdog usually made my eth0 nonfunctional (which is bad - this is an embedded system with ssh access). Hence, I explicitely instrumented the watchdog function with tsc. Output of the timings is done by a background thread, so the timings should not increase the runtime of the watchdog. Here are my results: If the watchdog doesn't get interrupted, preempted, or whatever, it spends 340 us in its body: * 303 us in the mii code * 36 us in the following code up to e100_adjust_adaptive_ifs * 1 us in the remaining code (I think my chip doesn't need any of those chip-specific fixups) The 303 us in the mii code are divided in the following way: * 101 us in mii_ethtool_gset * 135 us in the whole if * 67 us in mii_check_link This is with the udelay(2) instead of udelay(20) hack applied. With udelay(20), the mii times are 128 + 170 + 85 us, i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog. As the RTC runs with 8192 Hz during my tests, the watchdog is hit by 2-3 interrupts, which adds another 75 - 110 us to its total execution time, i.e. the time it blocks other rtprio 1 threads. Thank you very much for that detailed analysis! okay, so calls to mii.c take too long, but those depend on mmio_read in e100 to do the work, so this patch attempts to minimize the latency. This patch is against linus-2.6.git, I compile and ssh/ping tested it. Would you be willing to send your instrumentation patches? I could then test any fixes easier. e100: attempt a shorter delay for mdio reads Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]> Simply reorder our write/read sequence for mdio reads to minimize latency as well as delay a shorter interval for each loop. --- drivers/net/e100.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/net/e100.c b/drivers/net/e100.c --- a/drivers/net/e100.c +++ b/drivers/net/e100.c @@ -891,23 +891,25 @@ static u16 mdio_ctrl(struct nic *nic, u3 * procedure it should be done under lock. */ spin_lock_irqsave(&nic->mdio_lock, flags); - for (i = 100; i; --i) { + for (i = 1000; i; --i) { if (readl(&nic->csr->mdi_ctrl) & mdi_ready) break; - udelay(20); + udelay(2); } if (unlikely(!i)) { - printk("e100.mdio_ctrl(%s) won't go Ready\n", + DPRINTK(PROBE, ERR, "e100.mdio_ctrl(%s) won't go Ready\n", nic->netdev->name ); spin_unlock_irqrestore(&nic->mdio_lock, flags); return 0; /* No way to indicate timeout error */ } writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl); - for (i = 0; i < 100; i++) { - udelay(20); + /* to avoid latency, read to flush the write, then delay, and only +* delay 2us per loop, manual says read should complete in < 64us */ + for (i = 0; i < 1000; i++) { if ((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) break; + udelay(2); } spin_unlock_irqrestore(&nic->mdio_lock, flags); DPRINTK(HW, DEBUG,
RE: My vote against eepro* removal
From: John Ronciak > Can we try a couple of things? 1) just comment out all the check for > link code in the e100 driver and give that a try and 2) just comment > out the update stats call and see if that works. These seem to be the > differences and we need to know which one is causing the problem. First of all, I am still unable to get any traces of this in the latency tracer. Moreover, as I told before, removing parts of the watchdog usually made my eth0 nonfunctional (which is bad - this is an embedded system with ssh access). Hence, I explicitely instrumented the watchdog function with tsc. Output of the timings is done by a background thread, so the timings should not increase the runtime of the watchdog. Here are my results: If the watchdog doesn't get interrupted, preempted, or whatever, it spends 340 us in its body: * 303 us in the mii code * 36 us in the following code up to e100_adjust_adaptive_ifs * 1 us in the remaining code (I think my chip doesn't need any of those chip-specific fixups) The 303 us in the mii code are divided in the following way: * 101 us in mii_ethtool_gset * 135 us in the whole if * 67 us in mii_check_link This is with the udelay(2) instead of udelay(20) hack applied. With udelay(20), the mii times are 128 + 170 + 85 us, i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog. As the RTC runs with 8192 Hz during my tests, the watchdog is hit by 2-3 interrupts, which adds another 75 - 110 us to its total execution time, i.e. the time it blocks other rtprio 1 threads. -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, 2006-01-20 at 18:01 -0800, John Ronciak wrote: > There is a timer routine in the eepro100 driver which does the check > for link as well as a check for on of the hang conditions (with > work-around). It does the check for link in a different way than > e100. e100 uses mii call where eepro100 does it manually. Another > difference is that eepro100 doesn't get stats unless called by the > system. It's not in the timer routine at all. > > Can we try a couple of things? 1) just comment out all the check for > link code in the e100 driver and give that a try and 2) just comment > out the update stats call and see if that works. These seem to be the > differences and we need to know which one is causing the problem. Heh, FWIW, Microsoft found this exact same bug with 2 different chipsets in their latency testing (see section 4.4): http://research.microsoft.com/~mbj/papers/tr-98-29.html Lee - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On 1/20/06, Lee Revell <[EMAIL PROTECTED]> wrote: > > Why don't these cause excessive scheduling delays in eepro100 then? > Can't we just copy the eepro100 behavior? Reports still float around from time to time about hangs with the eepro100 which go away when the e100 driver is used. We don't maintain the eepro100 driver so I can't tell you much about it other than the reports we get sometimes. There is a timer routine in the eepro100 driver which does the check for link as well as a check for on of the hang conditions (with work-around). It does the check for link in a different way than e100. e100 uses mii call where eepro100 does it manually. Another difference is that eepro100 doesn't get stats unless called by the system. It's not in the timer routine at all. Can we try a couple of things? 1) just comment out all the check for link code in the e100 driver and give that a try and 2) just comment out the update stats call and see if that works. These seem to be the differences and we need to know which one is causing the problem. -- Cheers, John - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, 2006-01-20 at 17:19 -0800, John Ronciak wrote: > On 1/20/06, Lee Revell <[EMAIL PROTECTED]> wrote: > > Seems like the important question is, why does e100 need a watchdog if > > eepro100 works fine without one? Isn't the point of a watchdog in this > > context to work around other bugs in the driver (or the hardware)? > There are a number of things that the watchdog in e100 does. It > checks link (up, down), reads the hardware stats, adjusts the adaptive > IFS and checks to 3 known hang conditions based on certain types of > the hardware. You might be able to get around without doing the > work-arounds (as long as you don't' see hangs happening with the > hardware being used) but the checking of the link and the stats are > probably needed. Why don't these cause excessive scheduling delays in eepro100 then? Can't we just copy the eepro100 behavior? Lee - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On 1/20/06, Lee Revell <[EMAIL PROTECTED]> wrote: > Seems like the important question is, why does e100 need a watchdog if > eepro100 works fine without one? Isn't the point of a watchdog in this > context to work around other bugs in the driver (or the hardware)? There are a number of things that the watchdog in e100 does. It checks link (up, down), reads the hardware stats, adjusts the adaptive IFS and checks to 3 known hang conditions based on certain types of the hardware. You might be able to get around without doing the work-arounds (as long as you don't' see hangs happening with the hardware being used) but the checking of the link and the stats are probably needed. -- Cheers, John - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: My vote against eepro* removal
On Fri, 2006-01-20 at 11:19 +0100, kus Kusche Klaus wrote: > For a non-full preemption kernel, your patch moves the 500 us > piece of code from kernel to thread context, so it really > improves things. But is 500 us something to worry about in a > non-full preemption kernel? Yes, absolutely. Once exit_mmap (a latency regression which was introduced in 2.6.14) and rt_run_flush/rt_garbage_collect (which have always been problematic) are fixed, 500usecs will stick out like a sore thumb even on a regular PREEMPT kernel. Also, you should be able to capture this latency in /proc/latency trace by configuring an -rt kernel with PREEMPT_DESKTOP and hard/softirq preemption disabled. Lee - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, 2006-01-20 at 12:55 +0300, Evgeniy Polyakov wrote: > > Analysis of e100: > > * If I comment out the whole body of e100_watchdog except for the > > timer re-registration, the delays are gone (so it is really the > > body of e100_watchdog). However, this makes eth0 non-functional. > > * Commenting out parts of it, I found out that most of the time > > goes into its first half: The code from mii_ethtool_gset to > > mii_check_link (including) makes the big difference, as far as > > I can tell especially mii_ethtool_gset. > > Each MDIO read can take upto 2 msecs (!) and at least 20 usecs in > e100, > and this runs in timer handler. > Concider attaching (only compile tested) patch which moves e100 > watchdog > into workqueue. Seems like the important question is, why does e100 need a watchdog if eepro100 works fine without one? Isn't the point of a watchdog in this context to work around other bugs in the driver (or the hardware)? Lee - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: My vote against eepro* removal
From: Evgeniy Polyakov > Just a hack: > > --- drivers/net/e100.c.1 2006-01-20 13:39:19.0 +0300 > +++ drivers/net/e100.c2006-01-20 14:15:40.0 +0300 > @@ -879,8 +879,8 @@ > > writel((reg << 16) | (addr << 21) | dir | data, > &nic->csr->mdi_ctrl); > > - for(i = 0; i < 100; i++) { > - udelay(20); > + for(i = 0; i < 1000; i++) { > + udelay(2); > if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) > break; > } My test environment and software is not precise enough for small improvements, but I'd say this results in a 10-15 % improvement (i.e. something like 50 us shorter delay) on the average. To be sure, one would have to take and print tsc timestamps directly in the watchdog code, but printk's mess up my timings. -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, Jan 20, 2006 at 11:19:20AM +0100, kus Kusche Klaus ([EMAIL PROTECTED]) wrote: > From: Evgeniy Polyakov > > Each MDIO read can take upto 2 msecs (!) and at least 20 > > usecs in e100, > > and this runs in timer handler. > > Concider attaching (only compile tested) patch which moves > > e100 watchdog > > into workqueue. > > Hmmm, I don't think moving it around is worth the trouble > (nevertheless, I will test later if I find time). > > For a full preemption kernel, both timer code and workqueue code > are executed in a thread of their own. If I know that there is a > 500 us piece of code in either of them, I have to adjust the prio > of the corresponding thread (and all others) accordingly anyway. > > For a non-full preemption kernel, your patch moves the 500 us > piece of code from kernel to thread context, so it really > improves things. But is 500 us something to worry about in a > non-full preemption kernel? In the worst case it will be milliseconds, which is not very nice timeout for timer handler... P.S. above patch works without problems with my e100 card. > -- > Klaus Kusche (Software Development - Control Systems) > KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) > Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 > E-Mail: [EMAIL PROTECTED]WWW: www.keba.com -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, Jan 20, 2006 at 11:51:23AM +0100, kus Kusche Klaus ([EMAIL PROTECTED]) wrote: > From: Evgeniy Polyakov [mailto:[EMAIL PROTECTED] > > Each MDIO read can take upto 2 msecs (!) and at least 20 > > usecs in e100, > > and this runs in timer handler. > > Concider attaching (only compile tested) patch which moves > > e100 watchdog > > into workqueue. > > Tested the patch. Works and has the expected effects: > > Fully preemptible kernel: > No change: 500 us delay at rtprio 1, no delay at higher rtprio. > (you just moved the 500 us piece of code from one rtprio 1 kernel > thread to another rtprio 1 kernel thread). > > Kernel with desktop preemption: > Originally: Threads at any rtprio suffered from 500 us delay. > With your patch: Only rtprio 1 threads suffer from 500 us delay, > no delay at higher rtprio. Just a hack: --- drivers/net/e100.c.12006-01-20 13:39:19.0 +0300 +++ drivers/net/e100.c 2006-01-20 14:15:40.0 +0300 @@ -879,8 +879,8 @@ writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl); - for(i = 0; i < 100; i++) { - udelay(20); + for(i = 0; i < 1000; i++) { + udelay(2); if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) break; } > -- > Klaus Kusche (Software Development - Control Systems) > KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) > Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 > E-Mail: [EMAIL PROTECTED]WWW: www.keba.com > -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: My vote against eepro* removal
From: Evgeniy Polyakov [mailto:[EMAIL PROTECTED] > Each MDIO read can take upto 2 msecs (!) and at least 20 > usecs in e100, > and this runs in timer handler. > Concider attaching (only compile tested) patch which moves > e100 watchdog > into workqueue. Tested the patch. Works and has the expected effects: Fully preemptible kernel: No change: 500 us delay at rtprio 1, no delay at higher rtprio. (you just moved the 500 us piece of code from one rtprio 1 kernel thread to another rtprio 1 kernel thread). Kernel with desktop preemption: Originally: Threads at any rtprio suffered from 500 us delay. With your patch: Only rtprio 1 threads suffer from 500 us delay, no delay at higher rtprio. -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: My vote against eepro* removal
From: Evgeniy Polyakov > Each MDIO read can take upto 2 msecs (!) and at least 20 > usecs in e100, > and this runs in timer handler. > Concider attaching (only compile tested) patch which moves > e100 watchdog > into workqueue. Hmmm, I don't think moving it around is worth the trouble (nevertheless, I will test later if I find time). For a full preemption kernel, both timer code and workqueue code are executed in a thread of their own. If I know that there is a 500 us piece of code in either of them, I have to adjust the prio of the corresponding thread (and all others) accordingly anyway. For a non-full preemption kernel, your patch moves the 500 us piece of code from kernel to thread context, so it really improves things. But is 500 us something to worry about in a non-full preemption kernel? -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Fri, Jan 20, 2006 at 10:37:43AM +0100, kus Kusche Klaus ([EMAIL PROTECTED]) wrote: > > From: John Ronciak > > During the watchdog the e100 driver reads all of the status registers > > from the actual hardware. There are 26 (worst case) register reads. > > There is also a spin lock for another check in the watchdog. It would > > still surprise me that all of this would take 500 usec. If you are > > seeing this delay, you can comment out the scheduling of the watchdog > > to see if this goes away. We'll need to narrow down exactly what in > > the watchdog is causing the delay > > Retested it. ... > Effect of the e100 driver: > * There is no measurable effect when my test program is running > at prio 2 - 99. > * For prio 1, I get an interval of 500-650 us every 2 seconds, > which indicates a scheduling latency of 380-530 us. > Hence, some piece of code is running for ~500 us at rt prio 1. > > Analysis of e100: > * If I comment out the whole body of e100_watchdog except for the > timer re-registration, the delays are gone (so it is really the > body of e100_watchdog). However, this makes eth0 non-functional. > * Commenting out parts of it, I found out that most of the time > goes into its first half: The code from mii_ethtool_gset to > mii_check_link (including) makes the big difference, as far as > I can tell especially mii_ethtool_gset. Each MDIO read can take upto 2 msecs (!) and at least 20 usecs in e100, and this runs in timer handler. Concider attaching (only compile tested) patch which moves e100 watchdog into workqueue. Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]> diff --git a/drivers/net/e100.c b/drivers/net/e100.c index 22cd045..e884578 100644 --- a/drivers/net/e100.c +++ b/drivers/net/e100.c @@ -549,7 +549,7 @@ struct nic { enum phy phy; struct params params; struct net_device_stats net_stats; - struct timer_list watchdog; + struct work_struct watchdog; struct timer_list blink_timer; struct mii_if_info mii; struct work_struct tx_timeout_task; @@ -1495,7 +1495,7 @@ static void e100_adjust_adaptive_ifs(str } } -static void e100_watchdog(unsigned long data) +static void e100_watchdog(void *data) { struct nic *nic = (struct nic *)data; struct ethtool_cmd cmd; @@ -1539,7 +1539,7 @@ static void e100_watchdog(unsigned long else nic->flags &= ~ich_10h_workaround; - mod_timer(&nic->watchdog, jiffies + E100_WATCHDOG_PERIOD); + schedule_delayed_work(&nic->watchdog, E100_WATCHDOG_PERIOD); } static inline void e100_xmit_prepare(struct nic *nic, struct cb *cb, @@ -2004,7 +2004,7 @@ static int e100_up(struct nic *nic) goto err_clean_cbs; e100_set_multicast_list(nic->netdev); e100_start_receiver(nic, NULL); - mod_timer(&nic->watchdog, jiffies); + schedule_work(&nic->watchdog); if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ, nic->netdev->name, nic->netdev))) goto err_no_irq; @@ -2016,7 +2016,7 @@ static int e100_up(struct nic *nic) return 0; err_no_irq: - del_timer_sync(&nic->watchdog); + cancel_rearming_delayed_work(&nic->watchdog); err_clean_cbs: e100_clean_cbs(nic); err_rx_clean_list: @@ -2031,7 +2031,7 @@ static void e100_down(struct nic *nic) netif_stop_queue(nic->netdev); e100_hw_reset(nic); free_irq(nic->pdev->irq, nic->netdev); - del_timer_sync(&nic->watchdog); + flush_scheduled_work(); netif_carrier_off(nic->netdev); e100_clean_cbs(nic); e100_rx_clean_list(nic); @@ -2570,9 +2570,7 @@ static int __devinit e100_probe(struct p pci_set_master(pdev); - init_timer(&nic->watchdog); - nic->watchdog.function = e100_watchdog; - nic->watchdog.data = (unsigned long)nic; + INIT_WORK(&nic->watchdog, e100_watchdog, nic); init_timer(&nic->blink_timer); nic->blink_timer.function = e100_blink_led; nic->blink_timer.data = (unsigned long)nic; > -- > Klaus Kusche (Software Development - Control Systems) > KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) > Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 > E-Mail: [EMAIL PROTECTED]WWW: www.keba.com -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: My vote against eepro* removal
> From: John Ronciak > During the watchdog the e100 driver reads all of the status registers > from the actual hardware. There are 26 (worst case) register reads. > There is also a spin lock for another check in the watchdog. It would > still surprise me that all of this would take 500 usec. If you are > seeing this delay, you can comment out the scheduling of the watchdog > to see if this goes away. We'll need to narrow down exactly what in > the watchdog is causing the delay Retested it. Software: 2.6.15.1-rt6, no additional patches, configured for full preemption. Hardware: 500 MHz Pentium 3 According to lspci: 82443+82371 Chipset (440 BX + PIIX4) According to lspci: 82559ER (09) e100 100 Mbit full duplex Measurement software: * I start the rtc at 8192 Hz. * I read from /dev/rtc in a usermode realtime process. * After each read, I read the tsc and calculate the interval. * It should be 122 us most of the time, and indeed it is. At rtprio 2-99, practically all measurements are < 160-175 us on an otherwise idle machine (160 for 99, 175 for 2). At rtprio 1, there are some intervals of up to 300 us on an idle machine. * The program records the intervals in internal arrays and prints them at the end, it doesn't perform any I/O while it is running. Basically, my software measures how long it takes for a realtime process at prio n to get executed after it became ready: The actual interval minus 122 is the time the process was waiting for the CPU (didn't get scheduled). This means that there was continuous CPU activity for at least that long, either at a higher or the same rt prio, on in non-thread kernel code. (the rtc threaded interrupt handler is running above all others) Effect of the e100 driver: * There is no measurable effect when my test program is running at prio 2 - 99. * For prio 1, I get an interval of 500-650 us every 2 seconds, which indicates a scheduling latency of 380-530 us. Hence, some piece of code is running for ~500 us at rt prio 1. Analysis of e100: * If I comment out the whole body of e100_watchdog except for the timer re-registration, the delays are gone (so it is really the body of e100_watchdog). However, this makes eth0 non-functional. * Commenting out parts of it, I found out that most of the time goes into its first half: The code from mii_ethtool_gset to mii_check_link (including) makes the big difference, as far as I can tell especially mii_ethtool_gset. Effect of desktop preemption: With desktop preemption insted of full preemption, the 400-500 us scheduling delay hits at any rt priority, even at 99, i.e. the CPU is occupied by non-thread kernel code. Again, commenting out e100_watchdog brings the delays back to well below 400 us. Latency traces: None. The latency tracer does not record this scheduling latency. My max. traces are around 35 us with full preemption and around 70 us with desktop preemption, and are not related to e100. -- Klaus Kusche (Software Development - Control Systems) KEBA AG Gewerbepark Urfahr, A-4041 Linz, Austria (Europe) Tel: +43 / 732 / 7090-3120 Fax: +43 / 732 / 7090-6301 E-Mail: [EMAIL PROTECTED]WWW: www.keba.com - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
During the watchdog the e100 driver reads all of the status registers from the actual hardware. There are 26 (worst case) register reads. There is also a spin lock for another check in the watchdog. It would still surprise me that all of this would take 500 usec. If you are seeing this delay, you can comment out the scheduling of the watchdog to see if this goes away. We'll need to narrow down exactly what in the watchdog is causing the delay Thanks. -- Cheers, John - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: My vote against eepro* removal
On Thu, Jan 19, 2006 at 11:26:51AM +0100, kus Kusche Klaus wrote: > > From: Lee Revell > > On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote: > > > Last time I tested (around 2.6.12), eepro100 worked much better > > > in -rt kernels w.r.t. latencies than e100: > > > > > > e100 caused a periodic latency of about 500 microseconds > > > exactly every 2 seconds, no matter what the load on the interface > > > was (i.e. even on an idle interface). > > > > > > eepro100 did not show any latencies that long, it worked much > > > smoother w.r.t. latencies. > > > > > > Of course I would prefer to have e100 fixed over keeping eepro100 > > > around forever, but the last time I checked, it still wasn't fixed. > > > > Please provide latency traces to illustrate the problematic code path. > > It's not a "latency": As far as I can tell, interrupts or preemption > are not disabled, the latency tracer doesn't show anything. > > I just noticed that low-pri rt processes did not get scheduled for > about 500 microseconds when e100 was active (even if the net was > idle), and that there were no such breaks with eepro100. > > I didn't analyze it in detail at that time, I believed that the e100 > interrupt handler thread was running every 2 seconds for 500 > microseconds, because the interrupt count of eth0 incremented every > 2 seconds, exactly when my rt processes paused. > > This would be bad: That irq thread is at rt prio 47 on my system, > above many importent things. > > However, I checked more closely now, and found out that only a small > portion of the 500 microseconds is spent in the irq thread. Most of > it is spent in the timer thread, at rt prio 1, so the whole thing > is a much smaller problem than I originally believed. > > Must be the function e100_watchdog. >... Is this with 2.6.12 or 2.6.16-rc1? If it's the former, please check whether the problem is still presnt in the latter. If it's the latter, I'm sure the e100 developers (Cc'ed) are interested in your problem. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html