Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-29 Thread Ray Lee
On Nov 29, 2007 9:11 AM,  <[EMAIL PROTECTED]> wrote:
>
> These are good points. However, on the Slack 10.2 box I repeated these
> measurements with all userspace code quiesced. No daemons running except
> for those that are kernel threads. Secondly, I do run dynticks kernels on
> other Slackware 10.2 boxen without these issues. The hardware may not be
> identical, e.g. Xeons with Intel E7501 chipsets or Opterons with AMD 8131
> chipsets, but I don't see any of this weirdness. Maybe I'll fire up Slack
> 10.2 on a spare partition on the other (almost) identical machine and see if
> it exhibits this problem.

Any way you can narrow down the problem space will help, as there are
a lot of variables right now.

Also, please keep the kernel list CC:d so others who are lurking can
see what's going on, and not ask for duplicate data.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-29 Thread Ray Lee
On Nov 28, 2007 6:44 PM, <[EMAIL PROTECTED]> wrote:
> I built the same dynticks-enabled 2.6.23.9 kernel on a nearly identical
> system with minor changes to reflect the slightly different hardware.
> These two systems have identical MSI E7210 MasterX FA6R motherboards (same
> model and revision.) The differences are as follows:
>
> behemoth (using Slackware 10.2)
> -
> dual 2.4 GHz Xeons 400 MHz FSB
> LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
> Newer SATA/PATA Intel PIIX drivers
>
> titan (using Slackware 11.0)
> -
> dual 2.0 GHz Xeons 533 MHz FSB
> Creative Labs SB Audigy LS (using ALSA driver)
> Older IDE PATA Intel PIIX drivers
>
> The result is that "behemoth" continues to exhibit 155,000 context
> switches per second at idle while "titan" shows about 25 - 30 context
> switches per second. Note that motherboard BIOSes are at the same
> revision and configured identically.
>
> I guess (ugh) it's time for me to pull the MPT-Fusion U320 HBA and the
> SATA disks out of "behemoth" and configure it with old style IDE drives to
> be as close as possible to "titan." Then I can add parts back and see when
> the problem occurs.

Well, the first thing that seems obvious is that you're using
different versions of userspace, and the newer userspace is on the
system that behaves better.

The second thing that pops to mind: are you doing all these
measurements booted into single user mode (init=/bin/bash or
somesuch)? If not, then I don't think we can pin this on the hardware
quite yet.
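
For anyone repeating the measurement in a stripped-down environment where
vmstat may not be handy, here is a minimal sketch. It relies on the kernel's
cumulative context-switch counter, the "ctxt" line in /proc/stat, and takes
two readings a known interval apart. The function names are illustrative
and do not come from this thread.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, seconds):
    """Average context switches per second between two readings."""
    return (after - before) / seconds


def sample(interval=1.0, path="/proc/stat"):
    """Take two readings `interval` seconds apart (Linux only)."""
    with open(path) as f:
        first = read_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = read_ctxt(f.read())
    return switches_per_second(first, second, interval)
```

Calling sample() on the affected box should roughly track the cs column
that vmstat prints, since both are derived from the same kernel counter.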


Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-28 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 26 Nov 2007 20:36:32 -0600 (CST) [EMAIL PROTECTED] wrote:
> 
> > Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU on an idle
> > system? The problem occurs if I config the kernel with tickless
> > support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to "oprofile" for putting me
> > onto this.)
> 
> beware that oprofile can provide misleading results on a partially-idle
> system.  You may have discovered that ksoftirqd is consuming 5-10% of the
> non-idle time on that idle system, which is less surprising.
> 
> > I have noted this same problem on kernel versions: 2.6.23.1, 2.6.23.8 and
> > 2.6.23.9
> > 
> > **
> > *** Output from "vmstat -n 1 10" -- Note very high context switch rate ***
> > *** This is on a idle machine! ***
> > **
> > 
> > procs ---memory-- ---swap-- -io --system--
> > cpu
> >  r  b   swpd   free   buff  cache   si   sobibo   incs us sy
> > id wa
> >  0  0  0 1925556   4768 11610400   124 26  7538  1  2
> > 96  1
> >  0  0  0 1925556   4768 11610400 0 02 147329  0  1
> > 99  0
> >  0  0  0 1925548   4768 11610400 0 00 154515  0  1
> > 99  0
> >  0  0  0 1925548   4768 11610400 0 01 153898  0  2
> > 98  0
> >  0  0  0 1925548   4780 11610400 0163 155216  0  1
> > 99  0
> >  0  0  0 1925548   4780 11610400 0 01 161718  0  1
> > 99  0
> >  0  0  0 1925548   4780 11610400 0 00 147587  0  2
> > 98  0
> >  0  0  0 1925548   4780 11610400 0 01 153524  0  2
> > 98  0
> >  0  0  0 1925448   4780 11610400 0 00 153434  0  1
> > 99  0
> >  0  0  0 1925448   4792 11609200 0164 153527  0  2
> > 98  0
> 
> So what piece of code is scheduling so much?  What does `top' say?  
> What does the (sorted) output of oprofile look like?
> 
> Did you try shutting down as much userspace code as possible to find 
> out if some userspace task is misbehaving?

Such 'what the heck is happening' problems can also be debugged via the
tracer. Here's a quickstart:

  http://redhat.com/~mingo/latency-tracing-patches/tracing-QuickStart.txt

Ingo


Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-27 Thread Andrew Morton
On Mon, 26 Nov 2007 22:36:17 -0600 Robert Hancock <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
> > Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU on an idle
> > system? The problem occurs if I config the kernel with tickless
> > support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to "oprofile" for putting me
> > onto this.)
> > 
> > I have noted this same problem on kernel versions: 2.6.23.1, 2.6.23.8 and
> > 2.6.23.9
> > 
> > **
> > *** Output from "vmstat -n 1 10" -- Note very high context switch rate ***
> > *** This is on a idle machine! ***
> > **
> > 
> > procs ---memory-- ---swap-- -io --system--
> > cpu
> >  r  b   swpd   free   buff  cache   si   sobibo   incs us sy
> > id wa
> >  0  0  0 1925556   4768 11610400   124 26  7538  1  2
> > 96  1
> >  0  0  0 1925556   4768 11610400 0 02 147329  0  1
> > 99  0
> 
> What did oprofile show? It should be able to narrow down what 
> function(s) are responsible for the CPU usage..

Sigh.  I just asked a similar thing.   Let's look at the mail headers:



Message-ID: <[EMAIL PROTECTED]>
...
From: [EMAIL PROTECTED]


From: Robert Hancock <[EMAIL PROTECTED]>
...
In-reply-to: 


Please fix your email client so as to not break threading?


Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-27 Thread Andrew Morton
On Mon, 26 Nov 2007 20:36:32 -0600 (CST) [EMAIL PROTECTED] wrote:

> Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU on an idle
> system? The problem occurs if I config the kernel with tickless
> support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to "oprofile" for putting me
> onto this.)

beware that oprofile can provide misleading results on a partially-idle
system.  You may have discovered that ksoftirqd is consuming 5-10% of the
non-idle time on that idle system, which is less surprising.

> I have noted this same problem on kernel versions: 2.6.23.1, 2.6.23.8 and
> 2.6.23.9
> 
> **
> *** Output from "vmstat -n 1 10" -- Note very high context switch rate ***
> *** This is on a idle machine! ***
> **
> 
> procs ---memory-- ---swap-- -io --system--
> cpu
>  r  b   swpd   free   buff  cache   si   sobibo   incs us sy
> id wa
>  0  0  0 1925556   4768 11610400   124 26  7538  1  2
> 96  1
>  0  0  0 1925556   4768 11610400 0 02 147329  0  1
> 99  0
>  0  0  0 1925548   4768 11610400 0 00 154515  0  1
> 99  0
>  0  0  0 1925548   4768 11610400 0 01 153898  0  2
> 98  0
>  0  0  0 1925548   4780 11610400 0163 155216  0  1
> 99  0
>  0  0  0 1925548   4780 11610400 0 01 161718  0  1
> 99  0
>  0  0  0 1925548   4780 11610400 0 00 147587  0  2
> 98  0
>  0  0  0 1925548   4780 11610400 0 01 153524  0  2
> 98  0
>  0  0  0 1925448   4780 11610400 0 00 153434  0  1
> 99  0
>  0  0  0 1925448   4792 11609200 0164 153527  0  2
> 98  0

So what piece of code is scheduling so much?  What does `top' say?  What
does the (sorted) output of oprofile look like?

Did you try shutting down as much userspace code as possible to find out if
some userspace task is misbehaving?
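
One way to approach the "what is scheduling so much" question is sketched
below. It assumes the running kernel exposes the voluntary_ctxt_switches and
nonvoluntary_ctxt_switches fields in /proc/<pid>/status, which not every
kernel of this era does; where the fields are missing the snapshot simply
comes back empty. The helper names are hypothetical.

```python
import os


def parse_ctxt_fields(status_text):
    """Extract the name and switch counters from /proc/<pid>/status text."""
    wanted = {"Name", "voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"}
    result = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            result[key] = value.strip()
    return result


def snapshot(proc_root="/proc"):
    """Map pid -> (name, total switches) for tasks exposing the counters."""
    tasks = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                fields = parse_ctxt_fields(f.read())
        except OSError:
            continue  # task exited between listdir() and open()
        if "voluntary_ctxt_switches" not in fields:
            continue  # this kernel does not expose the counters
        total = int(fields["voluntary_ctxt_switches"]) + int(
            fields.get("nonvoluntary_ctxt_switches", "0"))
        tasks[int(entry)] = (fields.get("Name", "?"), total)
    return tasks


def busiest(before, after, limit=5):
    """Rank tasks by the number of switches gained between two snapshots."""
    deltas = []
    for pid, (name, total) in after.items():
        if pid in before:
            deltas.append((total - before[pid][1], pid, name))
    return sorted(deltas, reverse=True)[:limit]
```

Taking one snapshot(), sleeping a second, taking another and passing both to
busiest() gives a rough per-task ranking. Kernel threads such as ksoftirqd
have ordinary entries under /proc, so they show up in the ranking as well.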



Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-27 Thread Robert Hancock

[EMAIL PROTECTED] wrote:

> Hello Robert,
>
> I've attached additional detail on the config of the misbehaving system
> including output from oprofile and PowerTop. PowerTop output leads me to
> believe that maybe this is an interaction between my bridged ethernet
> setup and dynticks? Hmmm...


I don't know about that. Your top wakeups are from br_stp_enable_bridge,
but that is only 26 a second, which doesn't explain a context switch
rate of 150,000 a second.
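
To put that mismatch in numbers, here is a quick calculation. It uses the cs
column from the vmstat output in the original post and the figure of 26
wakeups per second mentioned above. The variable names are illustrative.

```python
# cs values copied from the vmstat output in the original post. The first
# vmstat line reports averages since boot, so it is left out here.
cs_samples = [147329, 154515, 153898, 155216, 161718,
              147587, 153524, 153434, 153527]

# Wakeups per second attributed to br_stp_enable_bridge in this message.
bridge_wakeups_per_s = 26

mean_cs = sum(cs_samples) / len(cs_samples)   # about 153,416 per second
share = bridge_wakeups_per_s / mean_cs        # roughly 0.017 percent

print(f"mean context switches/s: {mean_cs:.0f}")
print(f"bridge timer share:      {share:.4%}")
```

Even if every one of those wakeups caused a context switch, the bridge timer
would account for only a tiny fraction of the observed rate.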


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-26 Thread Arjan van de Ven
On Mon, 26 Nov 2007 22:36:17 -0600
Robert Hancock <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
> > Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU
> > on an idle system? The problem occurs if I config the kernel with
> > tickless support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to
> > "oprofile" for putting me onto this.)
> > 
> > I have noted this same problem on kernel versions: 2.6.23.1,
> > 2.6.23.8 and 2.6.23.9
> > 
> > **
> > *** Output from "vmstat -n 1 10" -- Note very high context switch
> > rate *** *** This is on a idle
> > machine! ***
> > **
> > 
> > procs ---memory-- ---swap-- -io --system--
> > cpu
> >  r  b   swpd   free   buff  cache   si   sobibo   incs
> > us sy id wa
> >  0  0  0 1925556   4768 11610400   124 26
> > 7538  1  2 96  1
> >  0  0  0 1925556   4768 11610400 0 02
> > 147329  0  1 99  0
> 
> What did oprofile show? It should be able to narrow down what 
> function(s) are responsible for the CPU usage..
> 

Or better, what does powertop version 1.9 show?
It tends to show tickless wakeup artifacts quite nicely.


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


Re: Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-26 Thread Robert Hancock

[EMAIL PROTECTED] wrote:

> Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU on an idle
> system? The problem occurs if I config the kernel with tickless
> support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to "oprofile" for putting me
> onto this.)
>
> I have noted this same problem on kernel versions: 2.6.23.1, 2.6.23.8 and
> 2.6.23.9
>
> **
> *** Output from "vmstat -n 1 10" -- Note very high context switch rate ***
> *** This is on a idle machine! ***
> **
>
> procs ---memory-- ---swap-- -io --system--
> cpu
>  r  b   swpd   free   buff  cache   si   sobibo   incs us sy
> id wa
>  0  0  0 1925556   4768 11610400   124 26  7538  1  2
> 96  1
>  0  0  0 1925556   4768 11610400 0 02 147329  0  1
> 99  0


What did oprofile show? It should be able to narrow down what 
function(s) are responsible for the CPU usage..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Dynticks Causing High Context Switch Rate in ksoftirqd

2007-11-26 Thread bdupree
Question: Why is ksoftirqd eating about 5 to 10 percent of my CPU on an idle
system? The problem occurs if I config the kernel with tickless
support (i.e. CONFIG_TICK_ONESHOT=y).  (Thanks to "oprofile" for putting me
onto this.)

I have noted this same problem on kernel versions: 2.6.23.1, 2.6.23.8 and
2.6.23.9

**
*** Output from "vmstat -n 1 10" -- Note very high context switch rate ***
*** This is on an idle machine! ***
**

procs ---memory-- ---swap-- -io --system-- cpu
 r  b   swpd   free   buff  cache   si   so bi bo   in cs us sy id wa
 0  0  0 1925556   4768 11610400   124 26  7538  1  2 96  1
 0  0  0 1925556   4768 11610400 0 02 147329  0  1 99  0
 0  0  0 1925548   4768 11610400 0 00 154515  0  1 99  0
 0  0  0 1925548   4768 11610400 0 01 153898  0  2 98  0
 0  0  0 1925548   4780 11610400 0163 155216  0  1 99  0
 0  0  0 1925548   4780 11610400 0 01 161718  0  1 99  0
 0  0  0 1925548   4780 11610400 0 00 147587  0  2 98  0
 0  0  0 1925548   4780 11610400 0 01 153524  0  2 98  0
 0  0  0 1925448   4780 11610400 0 00 153434  0  1 99  0
 0  0  0 1925448   4792 11609200 0164 153527  0  2 98  0


*** System Stats ***


 Distro: Slackware 10.2
 Mobo:   MSI MasterX FA6R E7210
 CPUs:   Dual 2.4 GHz P4 Xeons 400 MHz FSB - Hyperthreading enabled
 Mem:    2 GB ECC DDR PC 266


**
*** PCI Config ***
**

00:00.0 Host bridge: Intel Corporation 82875P/E7210 Memory Controller Hub
(rev 02)
00:03.0 PCI bridge: Intel Corporation 82875P/E7210 Processor to PCI to CSA
Bridge (rev 02)
00:06.0 System peripheral: Intel Corporation 82875P/E7210 Processor to I/O
Memory Interface (rev 02)
00:1c.0 PCI bridge: Intel Corporation 6300ESB 64-bit PCI-X Bridge (rev 02)
00:1d.0 USB Controller: Intel Corporation 6300ESB USB Universal Host
Controller (rev 02)
00:1d.1 USB Controller: Intel Corporation 6300ESB USB Universal Host
Controller (rev 02)
00:1d.4 System peripheral: Intel Corporation 6300ESB Watchdog Timer (rev 02)
00:1d.5 PIC: Intel Corporation 6300ESB I/O Advanced Programmable Interrupt
Controller (rev 02)
00:1d.7 USB Controller: Intel Corporation 6300ESB USB2 Enhanced Host
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 0a)
00:1f.0 ISA bridge: Intel Corporation 6300ESB LPC Interface Controller
(rev 02)
00:1f.1 IDE interface: Intel Corporation 6300ESB PATA Storage Controller
(rev 02)
00:1f.2 IDE interface: Intel Corporation 6300ESB SATA Storage Controller
(rev 02)
00:1f.3 SMBus: Intel Corporation 6300ESB SMBus Controller (rev 02)
01:01.0 Ethernet controller: Intel Corporation 82547GI Gigabit Ethernet
Controller
02:02.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
Fusion-MPT Dual Ultra320 SCSI (rev 08)
03:09.0 Mass storage controller: Silicon Image, Inc. SiI 3114
[SATALink/SATARaid] Serial ATA Controller (rev 02)
03:0a.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet
Controller
03:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)

