Re: CPU load

2007-02-26 Thread Randy Dunlap
On Mon, 26 Feb 2007 13:42:50 +0300 (MSK) malc wrote:

> On Mon, 26 Feb 2007, Pavel Machek wrote:
> 
> > Hi!
> >
> >> [..snip..]
> >>
>  The current situation ought to be documented. Better yet some flag
>  can
> >>>
> >>> It probably _is_ documented, somewhere :-). If you find a nice place
> >>> to document it (top manpage?) go ahead with the patch.
> >>
> >>
> >> How about this:
> >
> > Looks okay to me. (You should probably add your name to it, and I do
> > not like html-like markup... plus please don't add extra spaces
> > between words)...
> 
> Thanks. The html-like markup was added to clearly mark the boundaries of
> the message and the text. Extra spaces courtesy of emacs' C-0 M-q.
> 
> >
> > You probably want to send it to akpm?
> 
> Any pointers on how to do that and perhaps preferred submission
> format?
> 
> [..snip..]

Well, he wrote it up and posted it at
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CPU load

2007-02-26 Thread malc

On Mon, 26 Feb 2007, Pavel Machek wrote:


Hi!


[..snip..]


The current situation ought to be documented. Better yet some flag
can


It probably _is_ documented, somewhere :-). If you find a nice place
to document it (top manpage?) go ahead with the patch.



How about this:


Looks okay to me. (You should probably add your name to it, and I do
not like html-like markup... plus please don't add extra spaces
between words)...


Thanks. The html-like markup was added to clearly mark the boundaries of
the message and the text. Extra spaces courtesy of emacs' C-0 M-q.



You probably want to send it to akpm?


Any pointers on how to do that and perhaps preferred submission
format?

[..snip..]

--
vale


Re: CPU load

2007-02-26 Thread Pavel Machek
Hi!

> [..snip..]
> 
> >>The current situation ought to be documented. Better yet some flag
> >>can
> >
> >It probably _is_ documented, somewhere :-). If you find a nice place
> >to document it (top manpage?) go ahead with the patch.
> 
> 
> How about this:

Looks okay to me. (You should probably add your name to it, and I do
not like html-like markup... plus please don't add extra spaces
between words)...

You probably want to send it to akpm?
Pavel

> 
> CPU load
> 
> 
> Linux exports various bits of information via `/proc/stat' and
> `/proc/uptime' that userland tools, such as top(1), use to calculate
> the average time the system spent in a particular state, for example:
> 
> 
> $ iostat
> Linux 2.6.18.3-exp (linmac) 02/20/2007
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>   10.010.002.925.440.00   81.63
> 
> ...
> 
> 
> Here the system thinks that over the default sampling period the
> system spent 10.01% of the time doing work in user space, 2.92% in the
> kernel, and was overall 81.63% of the time idle.
> 
> In most cases the `/proc/stat' information reflects reality quite
> closely; however, due to the nature of how/when the kernel collects
> this data, it sometimes cannot be trusted at all.
> 
> So how is this information collected?  Whenever a timer interrupt is
> signalled, the kernel looks at what kind of task was running at that
> moment and increments the counter that corresponds to this task's
> kind/state.  The problem with this is that the system could have
> switched between various states multiple times between two timer
> interrupts, yet the counter is incremented only for the last state.
> 
> 
> Example
> ---
> 
> If we imagine a system with one task that periodically burns cycles
> in the following manner:
> 
>  time line between two timer interrupts
> |----------------------------------|
>  ^                                ^
>  |_ something begins working      |
>                                   |_ something goes to sleep
>                                      (only to be awakened quite soon)
> 
> In the above situation the system will be 0% loaded according to
> `/proc/stat' (since the timer interrupt will always happen when the
> system is executing the idle handler), but in reality the load is
> closer to 99%.
> 
> One can imagine many more situations where this behavior of the kernel
> will lead to quite erratic information inside `/proc/stat'.
> 
> 
> /* gcc -o hog smallhog.c */
> #include <signal.h>
> #include <sys/time.h>
> #include <limits.h>
> #include <unistd.h>
> #define HIST 10
> 
> static volatile sig_atomic_t stop;
> 
> static void sighandler (int signr)
> {
>  (void) signr;
>  stop = 1;
> }
> static unsigned long hog (unsigned long niters)
> {
>  stop = 0;
>  while (!stop && --niters);
>  return niters;
> }
> int main (void)
> {
>  int i;
>  struct itimerval it = { .it_interval = { .tv_sec = 0, .tv_usec = 1 },
>  .it_value = { .tv_sec = 0, .tv_usec = 1 } };
>  sigset_t set;
>  unsigned long v[HIST];
>  double tmp = 0.0;
>  unsigned long n;
>  signal (SIGALRM, &sighandler);
>  setitimer (ITIMER_REAL, &it, NULL);
> 
>  hog (ULONG_MAX);
>  for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog (ULONG_MAX);
>  for (i = 0; i < HIST; ++i) tmp += v[i];
>  tmp /= HIST;
>  n = tmp - (tmp / 3.0);
> 
>  sigemptyset (&set);
>  sigaddset (&set, SIGALRM);
> 
>  for (;;) {
>  hog (n);
>  sigwait (&set, &i);
>  }
>  return 0;
> }
> 
> 
> References
> --
> 
> http://lkml.org/lkml/2007/2/12/6
> Documentation/filesystems/proc.txt (1.8)
> 
> 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: CPU load

2007-02-25 Thread malc

On Wed, 14 Feb 2007, Pavel Machek wrote:


Hi!


[..snip..]


The current situation ought to be documented. Better yet some flag
can


It probably _is_ documented, somewhere :-). If you find a nice place
to document it (top manpage?) go ahead with the patch.



How about this:


CPU load


Linux exports various bits of information via `/proc/stat' and
`/proc/uptime' that userland tools, such as top(1), use to calculate
the average time the system spent in a particular state, for example:


$ iostat
Linux 2.6.18.3-exp (linmac) 02/20/2007

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
  10.010.002.925.440.00   81.63

...


Here the system thinks that over the default sampling period the
system spent 10.01% of the time doing work in user space, 2.92% in the
kernel, and was overall 81.63% of the time idle.

In most cases the `/proc/stat' information reflects reality quite
closely; however, due to the nature of how/when the kernel collects
this data, it sometimes cannot be trusted at all.

So how is this information collected?  Whenever a timer interrupt is
signalled, the kernel looks at what kind of task was running at that
moment and increments the counter that corresponds to this task's
kind/state.  The problem with this is that the system could have
switched between various states multiple times between two timer
interrupts, yet the counter is incremented only for the last state.


Example
---

If we imagine a system with one task that periodically burns cycles
in the following manner:

 time line between two timer interrupts
|----------------------------------|
 ^                                ^
 |_ something begins working      |
                                  |_ something goes to sleep
                                     (only to be awakened quite soon)

In the above situation the system will be 0% loaded according to
`/proc/stat' (since the timer interrupt will always happen when the
system is executing the idle handler), but in reality the load is
closer to 99%.

One can imagine many more situations where this behavior of the kernel
will lead to quite erratic information inside `/proc/stat'.


/* gcc -o hog smallhog.c */
#include <signal.h>
#include <sys/time.h>
#include <limits.h>
#include <unistd.h>
#define HIST 10

static volatile sig_atomic_t stop;

static void sighandler (int signr)
{
 (void) signr;
 stop = 1;
}
static unsigned long hog (unsigned long niters)
{
 stop = 0;
 while (!stop && --niters);
 return niters;
}
int main (void)
{
 int i;
 struct itimerval it = { .it_interval = { .tv_sec = 0, .tv_usec = 1 },
 .it_value = { .tv_sec = 0, .tv_usec = 1 } };
 sigset_t set;
 unsigned long v[HIST];
 double tmp = 0.0;
 unsigned long n;
 signal (SIGALRM, &sighandler);
 setitimer (ITIMER_REAL, &it, NULL);

 hog (ULONG_MAX);
 for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog (ULONG_MAX);
 for (i = 0; i < HIST; ++i) tmp += v[i];
 tmp /= HIST;
 n = tmp - (tmp / 3.0);

 sigemptyset (&set);
 sigaddset (&set, SIGALRM);

 for (;;) {
 hog (n);
 sigwait (&set, &i);
 }
 return 0;
}


References
--

http://lkml.org/lkml/2007/2/12/6
Documentation/filesystems/proc.txt (1.8)


--
vale


Re: CPU load

2007-02-14 Thread Pavel Machek
Hi!
> 
> >>>I have (had?) code that 'exploits' this. I believe I could eat 90% of cpu
> >>>without being noticed.
> >>
> >>Slightly changed version of hog(around 3 lines in total changed) does that
> >>easily on 2.6.18.3 on PPC.
> >>
> >>http://www.boblycat.org/~malc/apc/load-hog-ppc.png
> >
> >I guess it's worth mentioning this is _only_ about displaying the cpu 
> >usage to
> >userspace, as the cpu scheduler knows the accounting of each task in
> >different ways. This behaviour can not be used to exploit the cpu scheduler
> >into a starvation situation. Using the discrete per process accounting to
> >accumulate the displayed values to userspace would fix this problem, but
> >would be expensive.
> 
> Guess you are right, but, once again, the problem is not so much about
> fooling the system to do something or other, but confusing the user:
> 
> a. Everything is fine - the load is 0%, the fact that the system is
>overheating and/or that some processes do not do as much as they
>could is probably due to the bad hardware.
> 
> b. The weird load pattern must be the result of bugs in my code.
>(And then a whole lot of time/effort is poured into fixing the
> problem which is simply not there)
> 
> The current situation ought to be documented. Better yet some flag
> can

It probably _is_ documented, somewhere :-). If you find a nice place
to document it (top manpage?) go ahead with the patch.

> be introduced somewhere in the system so that it exports real values to
> /proc, not the estimates that are inaccurate in some cases (like hog)

Patch would be welcome, but I do not think it will be easy.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: CPU load

2007-02-14 Thread Con Kolivas
On Wednesday 14 February 2007 18:28, malc wrote:
> On Wed, 14 Feb 2007, Con Kolivas wrote:
> > On Wednesday 14 February 2007 09:01, malc wrote:
> >> On Mon, 12 Feb 2007, Pavel Machek wrote:
> >>> Hi!
>
> [..snip..]
>
> >>> I have (had?) code that 'exploits' this. I believe I could eat 90% of
> >>> cpu without being noticed.
> >>
> >> Slightly changed version of hog(around 3 lines in total changed) does
> >> that easily on 2.6.18.3 on PPC.
> >>
> >> http://www.boblycat.org/~malc/apc/load-hog-ppc.png
> >
> > I guess it's worth mentioning this is _only_ about displaying the cpu
> > usage to userspace, as the cpu scheduler knows the accounting of each
> > task in different ways. This behaviour can not be used to exploit the cpu
> > scheduler into a starvation situation. Using the discrete per process
> > accounting to accumulate the displayed values to userspace would fix this
> > problem, but would be expensive.
>
> Guess you are right, but, once again, the problem is not so much about
> fooling the system to do something or other, but confusing the user:

Yes and I certainly am not arguing against that.

>
> a. Everything is fine - the load is 0%, the fact that the system is
> overheating and/or that some processes do not do as much as they
> could is probably due to the bad hardware.
>
> b. The weird load pattern must be the result of bugs in my code.
> (And then a whole lot of time/effort is poured into fixing the
>  problem which is simply not there)
>
> The current situation ought to be documented. Better yet some flag can
> be introduced somewhere in the system so that it exports real values to
> /proc, not the estimates that are inaccurate in some cases (like hog)

I wouldn't argue against any of those either. schedstats with userspace tools 
to understand the data will give better information I believe.

-- 
-ck


Re: CPU load

2007-02-13 Thread malc

On Wed, 14 Feb 2007, Con Kolivas wrote:


On Wednesday 14 February 2007 09:01, malc wrote:

On Mon, 12 Feb 2007, Pavel Machek wrote:

Hi!


[..snip..]


I have (had?) code that 'exploits' this. I believe I could eat 90% of cpu
without being noticed.


Slightly changed version of hog(around 3 lines in total changed) does that
easily on 2.6.18.3 on PPC.

http://www.boblycat.org/~malc/apc/load-hog-ppc.png


I guess it's worth mentioning this is _only_ about displaying the cpu usage to
userspace, as the cpu scheduler knows the accounting of each task in
different ways. This behaviour can not be used to exploit the cpu scheduler
into a starvation situation. Using the discrete per process accounting to
accumulate the displayed values to userspace would fix this problem, but
would be expensive.


Guess you are right, but, once again, the problem is not so much about
fooling the system to do something or other, but confusing the user:

a. Everything is fine - the load is 0%, the fact that the system is
   overheating and/or that some processes do not do as much as they
   could is probably due to the bad hardware.

b. The weird load pattern must be the result of bugs in my code.
   (And then a whole lot of time/effort is poured into fixing the
problem which is simply not there)

The current situation ought to be documented. Better yet some flag can
be introduced somewhere in the system so that it exports real values to
/proc, not the estimates that are inaccurate in some cases (like hog)

--
vale


Re: CPU load

2007-02-13 Thread Con Kolivas
On Wednesday 14 February 2007 09:01, malc wrote:
> On Mon, 12 Feb 2007, Pavel Machek wrote:
> > Hi!
> >
> >> The kernel looks at what is using cpu _only_ during the timer
> >> interrupt. Which means if your HZ is 1000 it looks at what is running
> >> at precisely the moment those 1000 timer ticks occur. It is
> >> theoretically possible using this measurement system to use >99% cpu
> >> and record 0 usage if you time your cpu usage properly. It gets even
> >> more inaccurate at lower HZ values for the same reason.
> >
> > I have (had?) code that 'exploits' this. I believe I could eat 90% of cpu
> > without being noticed.
>
> Slightly changed version of hog(around 3 lines in total changed) does that
> easily on 2.6.18.3 on PPC.
>
> http://www.boblycat.org/~malc/apc/load-hog-ppc.png

I guess it's worth mentioning this is _only_ about displaying the cpu usage to 
userspace, as the cpu scheduler knows the accounting of each task in 
different ways. This behaviour can not be used to exploit the cpu scheduler 
into a starvation situation. Using the discrete per process accounting to 
accumulate the displayed values to userspace would fix this problem, but 
would be expensive.

-- 
-ck


Re: CPU load

2007-02-13 Thread malc

On Mon, 12 Feb 2007, Pavel Machek wrote:


Hi!


The kernel looks at what is using cpu _only_ during the timer
interrupt. Which means if your HZ is 1000 it looks at what is running
at precisely the moment those 1000 timer ticks occur. It is
theoretically possible using this measurement system to use >99% cpu
and record 0 usage if you time your cpu usage properly. It gets even
more inaccurate at lower HZ values for the same reason.


I have (had?) code that 'exploits' this. I believe I could eat 90% of cpu
without being noticed.


Slightly changed version of hog(around 3 lines in total changed) does that
easily on 2.6.18.3 on PPC.

http://www.boblycat.org/~malc/apc/load-hog-ppc.png

--
vale


Re: CPU load

2007-02-13 Thread Pavel Machek
Hi!

> The kernel looks at what is using cpu _only_ during the timer
> interrupt. Which means if your HZ is 1000 it looks at what is running
> at precisely the moment those 1000 timer ticks occur. It is
> theoretically possible using this measurement system to use >99% cpu
> and record 0 usage if you time your cpu usage properly. It gets even
> more inaccurate at lower HZ values for the same reason.

I have (had?) code that 'exploits' this. I believe I could eat 90% of cpu
without being noticed.
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: CPU load

2007-02-12 Thread malc

On Mon, 12 Feb 2007, Andrew Burgess wrote:


On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:


How does the kernel calculates the value it places in `/proc/stat' at
4th position (i.e. "idle: twiddling thumbs")?


..


Later a small kernel module was developed that tried to time how much
time is spent in the idle handler inside the kernel and exported this
information to user space. The results were consistent with our
expectations and the output of the test utility.

..

http://www.boblycat.org/~malc/apc


Vassili

Could you rewrite this code as a kernel patch for
discussion/inclusion in mainline? I and maybe others would
appreciate having idle statistics be more accurate.


I really don't know how to approach that; what I do in itc.c is ugly
to say the least (it's less ugly on PPC, but still).

There's stuff there that is very dangerous, i.e. entering the idle handler
on SMP and simultaneously rmmod'ing the module (which surprisingly
never actually caused any bad things on the kernels I had (starting with
2.6.17.3), but panicked on Debian's 2.6.8). Safety nets were added but I
don't know whether they are sufficient. All in all what I have is a
gross hack, but it works for my purposes.

Another thing that keeps bothering me (again discovered with this
Debian kernel) is the fact that PREEMPT preempts the idle handler; this
just doesn't add up in my head.

So to summarize: I don't know how to properly do that (so that it
works on all/most architectures, is less of a hack, has no negative
impact on performance, etc.)

But I guess what the innocent `smallhog.c' posted earlier demonstrated
is that something probably ought to be done about it, or at least that
the current situation should be documented.

--
vale


Re: CPU load

2007-02-12 Thread malc

On Mon, 12 Feb 2007, Con Kolivas wrote:


On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:

Hello,


[..snip..]



The kernel looks at what is using cpu _only_ during the timer
interrupt. Which means if your HZ is 1000 it looks at what is running
at precisely the moment those 1000 timer ticks occur. It is
theoretically possible using this measurement system to use >99% cpu
and record 0 usage if you time your cpu usage properly. It gets even
more inaccurate at lower HZ values for the same reason.


And indeed it appears to be possible to do just that. Example:

/* gcc -o hog smallhog.c */
#include <signal.h>
#include <sys/time.h>
#include <limits.h>
#include <unistd.h>

#define HIST 10

static volatile sig_atomic_t stop;

static void sighandler (int signr)
{
(void) signr;
stop = 1;
}

static unsigned long hog (unsigned long niters)
{
stop = 0;
while (!stop && --niters);
return niters;
}

int main (void)
{
int i;
struct itimerval it = { .it_interval = { .tv_sec = 0, .tv_usec = 1 },
.it_value = { .tv_sec = 0, .tv_usec = 1 } };
sigset_t set;
unsigned long v[HIST];
double tmp = 0.0;
unsigned long n;

signal (SIGALRM, &sighandler);
setitimer (ITIMER_REAL, &it, NULL);

for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog (ULONG_MAX);
for (i = 0; i < HIST; ++i) tmp += v[i];
tmp /= HIST;
n = tmp - (tmp / 3.0);

sigemptyset (&set);
sigaddset (&set, SIGALRM);

for (;;) {
hog (n);
sigwait (&set, &i);
}
return 0;
}
/* end smallhog.c */

Might need some adjustment for a particular system but ran just fine here
on:
2.4.30   + Athlon tbird (1Ghz)
2.6.19.2 + Athlon X2 3800+ (2Ghz)

Showing next to zero load in top(1) and a whole lot more in APC.

http://www.boblycat.org/~malc/apc/load-tbird-hog.png
http://www.boblycat.org/~malc/apc/load-x2-hog.png

Not quite 99% but nevertheless scary.

--
vale


Re: CPU load

2007-02-12 Thread Andrew Burgess
On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:
>
> How does the kernel calculates the value it places in `/proc/stat' at
> 4th position (i.e. "idle: twiddling thumbs")?
>
..
>
> Later a small kernel module was developed that tried to time how much
> time is spent in the idle handler inside the kernel and exported this
> information to user space. The results were consistent with our
> expectations and the output of the test utility.
..
> http://www.boblycat.org/~malc/apc

Vassili

Could you rewrite this code as a kernel patch for
discussion/inclusion in mainline? I and maybe others would
appreciate having idle statistics be more accurate.

Thanks for your work
Andrew



Re: CPU load

2007-02-11 Thread Con Kolivas
On Monday 12 February 2007 18:10, malc wrote:
> On Mon, 12 Feb 2007, Con Kolivas wrote:
> > Lots of confusion comes from this, and often people think their pc
> > suddenly uses a lot less cpu when they change from 1000HZ to 100HZ and
> > use this as an argument/reason for changing to 100HZ when in fact the
> > massive _reported_ difference is simply worse accounting. Of course there
> > is more overhead going from 100 to 1000 but it doesn't suddenly make your
> > apps use 10 times more cpu.
>
> Yep. This, I believe, is what made the mplayer developers incorrectly conclude
> that utilizing RTC suddenly made the code run slower, after all /proc/stat
> now claims that CPU load is higher, while in reality it stayed the same -
> it's the accuracy that has improved (somewhat)
>
> But back to the original question, does it look at what's running on timer
> interrupt only or any IRQ? (something which is more in line with my own
> observations)

During the timer interrupt only. However if you create any form of timer, they 
will of course have some periodicity relationship with the timer interrupt.

-- 
-ck


Re: CPU load

2007-02-11 Thread malc

On Mon, 12 Feb 2007, Con Kolivas wrote:


On Monday 12 February 2007 16:54, malc wrote:

On Mon, 12 Feb 2007, Con Kolivas wrote:

On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:


[..snip..]


The kernel looks at what is using cpu _only_ during the timer
interrupt. Which means if your HZ is 1000 it looks at what is running
at precisely the moment those 1000 timer ticks occur. It is
theoretically possible using this measurement system to use >99% cpu
and record 0 usage if you time your cpu usage properly. It gets even
more inaccurate at lower HZ values for the same reason.


Thank you very much. This somewhat contradicts what I saw (and outlined
in the usenet article), namely the mplayer+/dev/rtc case. Unless of
course the /dev/rtc interrupt is considered to be the same as the
interrupt from the PIT (on x86, that is)

P.S. Perhaps it's worth documenting this? It caused me, and perhaps
  quite a few other people, a great deal of pain and frustration.


Lots of confusion comes from this, and often people think their pc suddenly
uses a lot less cpu when they change from 1000HZ to 100HZ and use this as an
argument/reason for changing to 100HZ when in fact the massive _reported_
difference is simply worse accounting. Of course there is more overhead going
from 100 to 1000 but it doesn't suddenly make your apps use 10 times more
cpu.


Yep. This, I believe, is what made the mplayer developers incorrectly conclude
that utilizing RTC suddenly made the code run slower, after all /proc/stat
now claims that CPU load is higher, while in reality it stayed the same -
it's the accuracy that has improved (somewhat)

But back to the original question, does it look at what's running on timer
interrupt only or any IRQ? (something which is more in line with my own
observations)

--
vale


Re: CPU load

2007-02-11 Thread malc

On Mon, 12 Feb 2007, Con Kolivas wrote:


On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:


[..snip..]


The kernel looks at what is using cpu _only_ during the timer
interrupt. Which means if your HZ is 1000 it looks at what is running
at precisely the moment those 1000 timer ticks occur. It is
theoretically possible using this measurement system to use >99% cpu
and record 0 usage if you time your cpu usage properly. It gets even
more inaccurate at lower HZ values for the same reason.


Thank you very much. This somewhat contradicts what I saw (and outlined
in the usenet article), namely the mplayer+/dev/rtc case. Unless of
course the /dev/rtc interrupt is considered to be the same as the
interrupt from the PIT (on x86, that is)

P.S. Perhaps it's worth documenting this? It caused me, and perhaps
 quite a few other people, a great deal of pain and frustration.

--
vale


Re: CPU load

2007-02-11 Thread Con Kolivas
On Monday 12 February 2007 16:54, malc wrote:
> On Mon, 12 Feb 2007, Con Kolivas wrote:
> > On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:
>
> [..snip..]
>
> > The kernel looks at what is using cpu _only_ during the timer
> > interrupt. Which means if your HZ is 1000 it looks at what is running
> > at precisely the moment those 1000 timer ticks occur. It is
> > theoretically possible using this measurement system to use >99% cpu
> > and record 0 usage if you time your cpu usage properly. It gets even
> > more inaccurate at lower HZ values for the same reason.
>
> Thank you very much. This somewhat contradicts what I saw (and outlined
> in the usenet article), namely the mplayer+/dev/rtc case. Unless of
> course the /dev/rtc interrupt is considered to be the same as the
> interrupt from the PIT (on x86, that is)
>
> P.S. Perhaps it's worth documenting this? It caused me, and perhaps
>   quite a few other people, a great deal of pain and frustration.

Lots of confusion comes from this, and often people think their pc suddenly 
uses a lot less cpu when they change from 1000HZ to 100HZ and use this as an 
argument/reason for changing to 100HZ when in fact the massive _reported_ 
difference is simply worse accounting. Of course there is more overhead going 
from 100 to 1000 but it doesn't suddenly make your apps use 10 times more 
cpu.

-- 
-ck


Re: CPU load

2007-02-11 Thread Con Kolivas
On Monday 12 February 2007 16:55, Stephen Rothwell wrote:
> On Mon, 12 Feb 2007 16:44:22 +1100 "Con Kolivas" <[EMAIL PROTECTED]> wrote:
> > The kernel looks at what is using cpu _only_ during the timer
> > interrupt. Which means if your HZ is 1000 it looks at what is running
> > at precisely the moment those 1000 timer ticks occur. It is
> > theoretically possible using this measurement system to use >99% cpu
> > and record 0 usage if you time your cpu usage properly. It gets even
> > more inaccurate at lower HZ values for the same reason.
>
> That is not true on all architectures, some do more accurate accounting by
> recording the times at user/kernel/interrupt transitions ...

Indeed. It's certainly the way the common more boring pc architectures do it 
though.

-- 
-ck


Re: CPU load

2007-02-11 Thread Stephen Rothwell
On Mon, 12 Feb 2007 16:44:22 +1100 "Con Kolivas" <[EMAIL PROTECTED]> wrote:
>
> The kernel looks at what is using cpu _only_ during the timer
> interrupt. Which means if your HZ is 1000 it looks at what is running
> at precisely the moment those 1000 timer ticks occur. It is
> theoretically possible using this measurement system to use >99% cpu
> and record 0 usage if you time your cpu usage properly. It gets even
> more inaccurate at lower HZ values for the same reason.

That is not true on all architectures, some do more accurate accounting by
recording the times at user/kernel/interrupt transitions ...

--
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: CPU load

2007-02-11 Thread Con Kolivas

On 12/02/07, Vassili Karpov <[EMAIL PROTECTED]> wrote:

Hello,

How does the kernel calculate the value it places in `/proc/stat' at
the 4th position (i.e. "idle: twiddling thumbs")?
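To make the field concrete, here is a rough sketch (not from the thread; it assumes the 2.6-era field order user/nice/system/idle/iowait/irq/softirq) of how a tool such as top can turn that 4th column into an idle percentage: the values are cumulative ticks, so only deltas between two samples are meaningful.

```python
# Sketch: deriving idle% from two samples of the "cpu" line in /proc/stat.
# Field 4 is the cumulative idle tick count; all fields are in USER_HZ ticks.

def idle_percent(sample_a, sample_b):
    """Idle percentage between two 'cpu ...' lines taken some time apart."""
    a = [int(x) for x in sample_a.split()[1:]]
    b = [int(x) for x in sample_b.split()[1:]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    idle = deltas[3]                      # 4th field: "twiddling thumbs"
    return 100.0 * idle / total if total else 0.0

# Two hypothetical samples, 100 ticks apart (e.g. one second at HZ=100):
s1 = "cpu  1000 50 300 8000 20 5 10"
s2 = "cpu  1030 50 320 8040 25 5 15"
print(round(idle_percent(s1, s2)))        # 40 -> 40% of new ticks were idle
```

On a live system the two samples would come from reading /proc/stat twice; the accounting caveats discussed later in this thread apply to the tick counters themselves.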

For background on why this question arose in the first place, read on.

While writing code dealing with video acquisition/processing at work,
I noticed that top(1) (and every other tool that uses `/proc/stat' or
`/proc/uptime') shows some very strange results.

Top claimed that the system running one version of the code (code[A])
was idling more often than with code[B], which does the same thing but
more cleverly. After some head scratching, one of my colleagues
suggested a simple test that was implemented in a few minutes.

The test consisted of a counter that was incremented in an endless
loop; after a certain period of time had elapsed, it printed the value
of the counter.  Running this test (with priority set to the lowest
possible level) with code[A] and code[B] confirmed that code[B] is
indeed faster than code[A], in the sense that the test made more
forward progress while code[B] was running.

Hard-coding some things (e.g. the value of the counter after counting
for the duration of one period on a completely idle system), we
extended the test to show the percentage of CPU being utilized. This
never matched the value that top presented us with.
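A user-space sketch of that counter test (names and the baseline value are hypothetical; the original code is not shown in the thread): count as fast as possible for a fixed interval, then compare against a baseline count calibrated on an idle system.

```python
# Sketch of the busy-loop counter test: forward progress of the counter,
# relative to an idle-system baseline, estimates how much CPU everything
# else consumed, independently of the kernel's tick-based accounting.

import time

def count_for(seconds):
    """Increment a counter in a tight loop for roughly `seconds` seconds."""
    n = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        n += 1
    return n

def load_percent(observed, idle_baseline):
    """CPU used by other work = the share of counting we failed to do."""
    return max(0.0, 100.0 * (1.0 - observed / idle_baseline))

# With a (hard-coded) baseline of 1_000_000 counts per interval on an idle
# machine, observing 700_000 suggests about 30% of the CPU went elsewhere:
print(round(load_percent(700_000, 1_000_000), 1))   # 30.0
```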

Later, a small kernel module was developed that tried to time how much
time is spent in the idle handler inside the kernel and exported this
information to user-space. The results were consistent with our
expectations and with the output of the test utility.

Two more points.

a. In the past (again in a video processing context) I have witnessed
   `/proc/stat' claiming that CPU utilization is 0% for, say, 20
   seconds, followed by 5 seconds of 30% load, and then the cycle
   repeated. According to the methods outlined above, the load was
   always at 30%.

b. In my personal experience the difference between `/proc/stat' and
   "reality" can easily reach 40% (I think I saw even more than that)

The module and the graphical application that uses it, along with a
short README and a link to a Usenet article dealing with the same
subject, are available at:
http://www.boblycat.org/~malc/apc


The kernel looks at what is using cpu _only_ during the timer
interrupt. Which means if your HZ is 1000 it looks at what is running
at precisely the moment those 1000 timer ticks occur. It is
theoretically possible using this measurement system to use >99% cpu
and record 0 usage if you time your cpu usage properly. It gets even
more inaccurate at lower HZ values for the same reason.
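A toy simulation of that sampling blind spot (illustrative only, not kernel code): the accounting only asks "who is running?" at each tick, so a task that gives up the CPU just before every tick is charged nothing.

```python
# Toy model of tick-sampled CPU accounting (HZ=1000): a task is charged a
# full tick only if it happens to be on the CPU at the tick instant.

TICK_US = 1000   # microseconds between timer interrupts at HZ=1000

def charged_ticks(busy_us_per_tick, ticks):
    """Ticks charged to a task that runs for busy_us_per_tick right after
    each tick and sleeps for the remainder of the tick period."""
    charged = 0
    for _ in range(ticks):
        still_running_at_sample = busy_us_per_tick >= TICK_US
        if still_running_at_sample:
            charged += 1
    return charged

# Burn 990us out of every 1000us, but always sleep across the sample point:
print(charged_ticks(990, 1000))   # 0 ticks charged, despite ~99% real usage
```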

--
-ck


Re: cpu load balancing problem on smp

2007-02-09 Thread Marc Donner
On Thursday 08 February 2007 09:42, you wrote:
> On Wed, 7 Feb 2007, Arjan van de Ven wrote:
> > Marc Donner wrote:
> >> 501: 215717 209388 209430 202514   PCI-MSI-edge 
> >> eth10 502:927   1019   1053888   PCI-MSI-edge   
> >>   eth11
> >
> > this is odd, this is not an irq distribution that irqbalance should give
> > you 1
> >
> >> NMI:451 39 42 46
> >> LOC: 170899 170864 170846 170788
> >> ERR:  0
> >>
> >> top output:
> >>
> >> top - 01:45:32 up 16 min,  2 users,  load average: 1.04, 0.92, 0.50
> >> Tasks:  81 total,   3 running,  78 sleeping,   0 stopped,   0 zombie
> >> Cpu0  :  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,
> >> 100.0% si
> >
> > and this doesn't match the irq output...
> > sounds as if something has a real bug; can you send an lsmod ? maybe some
> > driver keeps doing si's
>
> since this only happens when he adds more iptables rules, is it possible
> that there is some locking, or other data structure access, that's
> serializing things under load?
>
> Marc, since you don't use modules, send your .config.
>
> David Lang

I've inserted some more iptables rules, and got 'soft lockup detected' 
messages.

BUG: soft lockup detected on CPU#1!

Call Trace:
   [] softlockup_tick+0xda/0xec
 [] update_process_times+0x42/0x68
 [] smp_local_timer_interrupt+0x32/0x55
 [] ip_rcv+0x0/0x48e
 [] smp_apic_timer_interrupt+0x4f/0x67
 [] apic_timer_interrupt+0x66/0x70
 [] ip_rcv+0x0/0x48e
 [] ipt_do_table+0x2cd/0x315
 [] nf_iterate+0x3f/0x7b
 [] ip_forward_finish+0x0/0x3e
 [] nf_hook_slow+0x5f/0xca
 [] ip_forward_finish+0x0/0x3e
 [] ip_forward+0x16e/0x212
 [] ip_rcv+0x447/0x48e
 [] e1000_clean_rx_irq+0x41e/0x4ea
 [] e1000_clean+0x2f7/0x4b0
 [] task_rq_lock+0x3d/0x6f
 [] net_rx_action+0x78/0x14a
 [] __do_softirq+0x56/0xd3
 [] ksoftirqd+0x0/0x8f
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x2c/0x7d
 [] smp_apic_timer_interrupt+0x54/0x67
 [] apic_timer_interrupt+0x66/0x70
   [] do_softirq+0x7b/0x7d
 [] ksoftirqd+0x4f/0x8f
 [] kthread+0xcb/0xf5
 [] child_rip+0xa/0x12
 [] kthread+0x0/0xf5
 [] child_rip+0x0/0x12


Marc
BUG: soft lockup detected on CPU#1!

Call Trace:
   [] softlockup_tick+0xda/0xec
 [] update_process_times+0x42/0x68
 [] smp_local_timer_interrupt+0x32/0x55
 [] ip_rcv+0x0/0x48e
 [] smp_apic_timer_interrupt+0x4f/0x67
 [] apic_timer_interrupt+0x66/0x70
 [] ip_rcv+0x0/0x48e
 [] ipt_do_table+0x2cd/0x315
 [] nf_iterate+0x3f/0x7b
 [] ip_forward_finish+0x0/0x3e
 [] nf_hook_slow+0x5f/0xca
 [] ip_forward_finish+0x0/0x3e
 [] ip_forward+0x16e/0x212
 [] ip_rcv+0x447/0x48e
 [] e1000_clean_rx_irq+0x41e/0x4ea
 [] e1000_clean+0x2f7/0x4b0
 [] task_rq_lock+0x3d/0x6f
 [] net_rx_action+0x78/0x14a
 [] __do_softirq+0x56/0xd3
 [] ksoftirqd+0x0/0x8f
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x2c/0x7d
 [] smp_apic_timer_interrupt+0x54/0x67
 [] apic_timer_interrupt+0x66/0x70
   [] do_softirq+0x7b/0x7d
 [] ksoftirqd+0x4f/0x8f
 [] kthread+0xcb/0xf5
 [] child_rip+0xa/0x12
 [] kthread+0x0/0xf5
 [] child_rip+0x0/0x12



BUG: soft lockup detected on CPU#1!

Call Trace:
   [] softlockup_tick+0xda/0xec
 [] update_process_times+0x42/0x68
 [] smp_local_timer_interrupt+0x32/0x55
 [] ip_rcv+0x0/0x48e
 [] smp_apic_timer_interrupt+0x4f/0x67
 [] apic_timer_interrupt+0x66/0x70
 [] ip_rcv+0x0/0x48e
 [] ipt_do_table+0xaf/0x315
 [] nf_iterate+0x3f/0x7b
 [] ip_forward_finish+0x0/0x3e
 [] nf_hook_slow+0x5f/0xca
 [] ip_forward_finish+0x0/0x3e
 [] ip_forward+0x16e/0x212
 [] ip_rcv+0x447/0x48e
 [] e1000_clean_rx_irq+0x41e/0x4ea
 [] e1000_clean+0x2f7/0x4b0
 [] lock_timer_base+0x1b/0x3c
 [] __mod_timer+0xa6/0xb4
 [] net_rx_action+0x78/0x14a
 [] __do_softirq+0x56/0xd3
 [] ksoftirqd+0x0/0x8f
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x2c/0x7d
 [] smp_apic_timer_interrupt+0x54/0x67
 [] apic_timer_interrupt+0x66/0x70
   [] do_softirq+0x7b/0x7d
 [] ksoftirqd+0x4f/0x8f
 [] kthread+0xcb/0xf5
 [] child_rip+0xa/0x12
 [] kthread+0x0/0xf5
 [] child_rip+0x0/0x12




BUG: soft lockup detected on CPU#1!

Call Trace:
   [] softlockup_tick+0xda/0xec
 [] update_process_times+0x42/0x68
 [] smp_local_timer_interrupt+0x32/0x55
 [] ip_rcv+0x0/0x48e
 [] smp_apic_timer_interrupt+0x4f/0x67
 [] apic_timer_interrupt+0x66/0x70
 [] ip_rcv+0x0/0x48e
 [] ipt_do_table+0xbf/0x315
 [] nf_iterate+0x3f/0x7b
 [] ip_forward_finish+0x0/0x3e
 [] nf_hook_slow+0x5f/0xca
 [] ip_forward_finish+0x0/0x3e
 [] ip_forward+0x16e/0x212
 [] ip_rcv+0x447/0x48e
 [] e1000_clean_rx_irq+0x41e/0x4ea
 [] e1000_clean+0x2f7/0x4b0
 [] handle_edge_irq+0x106/0x12f
 [] handle_edge_irq+0x0/0x12f
 [] do_IRQ+0x137/0x159
 [] net_rx_action+0x78/0x14a
 [] __do_softirq+0x56/0xd3
 [] ksoftirqd+0x0/0x8f
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x2c/0x7d
 [] smp_apic_timer_interrupt+0x54/0x67
 [] apic_timer_interrupt+0x66/0x70
   [] do_softirq+0x7b/0x7d
 [] ksoftirqd+0x4f/0x8f
 [] kthread+0xcb/0xf5
 [] child_rip+0xa/0x12
 [] kthread+0x0/0xf5
 [] child_rip+0x0/0x12



BUG: soft lockup detected on CPU#1!

Call

Re: cpu load balancing problem on smp

2007-02-08 Thread David Lang

On Wed, 7 Feb 2007, Arjan van de Ven wrote:



Marc Donner wrote:

501: 215717 209388 209430 202514   PCI-MSI-edge  eth10
502:927   1019   1053888   PCI-MSI-edge  eth11


this is odd, this is not an irq distribution that irqbalance should give you
1

NMI:451 39 42 46
LOC: 170899 170864 170846 170788
ERR:  0

top output:

top - 01:45:32 up 16 min,  2 users,  load average: 1.04, 0.92, 0.50
Tasks:  81 total,   3 running,  78 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 100.0% 
si



and this doesn't match the irq output...
sounds as if something has a real bug; can you send an lsmod ? maybe some 
driver keeps doing si's


since this only happens when he adds more iptables rules, is it possible that 
there is some locking, or other data structure access, that's serializing things 
under load?


Marc, since you don't use modules, send your .config.

David Lang


Re: cpu load balancing problem on smp

2007-02-07 Thread Marc Donner
On Wednesday 07 February 2007 06:59, you wrote:
> Marc Donner wrote:
> > 501: 215717 209388 209430 202514   PCI-MSI-edge 
> > eth10 502:927   1019   1053888   PCI-MSI-edge
> >  eth11
>
> this is odd, this is not an irq distribution that irqbalance should
> give you

I think this is OK, because only eth10 is receiving packets; traffic is 
flowing from eth10 to eth11.

> and this doesn't match the irq output...
> sounds as if something has a real bug; can you send an lsmod ? maybe
> some driver keeps doing si's

lsmod
Module  Size  Used by
thermal16780  0
fan 6280  0
button  9696  0
processor  29576  1 thermal
ac  6664  0
battery11720  0

The drivers are built directly into the kernel. I have attached the config file.

I can also give access to the test setup, if you want.

I have also tested kernels 2.6.18.3 and 2.6.19.2 on other hardware, with the 
same effect.

regards 
marc
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.20
# Tue Feb  6 00:17:31 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION="dell2950-router"
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_CPUSETS=y
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
CONFIG_BLK_DEV_IO_TRACE=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_X86_HT=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
# CONFIG_NUMA is not set
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_RESOURCES_64BIT=y
CONFIG_NR_CPUS=8
# CONFIG_HOTPLUG_CPU is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_AMD=y
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x20
CONFIG_SECCOMP=y
# CONFIG_CC_STACKPROTECTOR is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_REORDER is not set
CONFIG_K8_NB=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_ISA_DMA_API=y
CONFIG_GENERIC_PENDING_IRQ=y

#
# Power management options
#
CONFIG_PM=y
# CONFIG_PM_LEGACY is not set
# CONFIG_PM_DEBUG is not set
# 

Re: cpu load balancing problem on smp

2007-02-06 Thread Arjan van de Ven

Marc Donner wrote:

501: 215717 209388 209430 202514   PCI-MSI-edge  eth10
502:927   1019   1053888   PCI-MSI-edge  eth11


this is odd, this is not an irq distribution that irqbalance should 
give you

1

NMI:451 39 42 46
LOC: 170899 170864 170846 170788
ERR:  0

top output:

top - 01:45:32 up 16 min,  2 users,  load average: 1.04, 0.92, 0.50
Tasks:  81 total,   3 running,  78 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 100.0% si


and this doesn't match the irq output...
sounds as if something has a real bug; can you send an lsmod ? maybe 
some driver keeps doing si's



Re: cpu load balancing problem on smp

2007-02-06 Thread Marc Donner
> can you send me the output of
>
> cat /proc/interrupts
 
here it is:
irqbalance is running.
network loaded with 600Mbit/s for about 5 minutes.

  CPU0   CPU1   CPU2   CPU3
  0:  37713  41667  41673  49914   IO-APIC-edge  timer
  1:  0  0  2  0   IO-APIC-edge  i8042
  8:  0  0  1  0   IO-APIC-edge  rtc
  9:  0  0  0  0   IO-APIC-fasteoi   acpi
 12:  2  0  2  0   IO-APIC-edge  i8042
 14: 11  9  9  8   IO-APIC-edge  ide0
 20:  0  0  0  0   IO-APIC-fasteoi   
uhci_hcd:usb3
 21: 62 52 37 46   IO-APIC-fasteoi   
ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb4
 78:665581344351   IO-APIC-fasteoi   megasas
501: 215717 209388 209430 202514   PCI-MSI-edge  eth10
502:927   1019   1053888   PCI-MSI-edge  eth11
NMI:451 39 42 46
LOC: 170899 170864 170846 170788
ERR:  0

top output:

top - 01:45:32 up 16 min,  2 users,  load average: 1.04, 0.92, 0.50
Tasks:  81 total,   3 running,  78 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 100.0% si
Cpu1  :  0.0% us,  0.0% sy,  0.0% ni, 99.0% id,  1.0% wa,  0.0% hi,  0.0% si
Cpu2  :  0.0% us,  0.0% sy,  0.0% ni, 99.7% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu3  :  0.0% us,  0.0% sy,  0.0% ni, 99.7% id,  0.0% wa,  0.3% hi,  0.0% si
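As an aside, the skew visible in /proc/interrupts output like the above can be pulled out mechanically; a small sketch (not from the thread) that splits one row into its per-CPU counters:

```python
# Sketch: parsing one /proc/interrupts row into (irq, per-CPU counts), using
# a line captured from the output above rather than a live read of /proc.

def per_cpu_counts(line, ncpus):
    """Return (irq-or-label, [count for each of ncpus CPUs]) for one row."""
    fields = line.split()
    irq = fields[0].rstrip(':')
    counts = [int(x) for x in fields[1:1 + ncpus]]
    return irq, counts

line = "501: 215717 209388 209430 202514   PCI-MSI-edge  eth10"
irq, counts = per_cpu_counts(line, 4)
print(irq, max(counts) - min(counts))   # 501 13203 -> fairly even spread
```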


regards
Marc


Re: cpu load balancing problem on smp

2007-02-06 Thread Pablo Sebastian Greco

Arjan van de Ven wrote:

Pablo Sebastian Greco wrote:

2296:427426436  134563009   PCI-MSI-edge  
eth1
2297:252252  135926471257   PCI-MSI-edge  
eth0


this suggests that  cores would be busy rather than only one
-
Yes, but you are looking at the -mm kernel's statistics; if you look at the 
standard kernel, you'll see that the eth interrupts are on the same core, 
according to the attached /proc/cpuinfo.

OTOH, take a look at timer interrupt distribution
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 15
model   : 6
model name  :   Intel(R) Xeon(TM) CPU 2.66GHz
stepping: 4
cpu MHz : 2656.000
cache size  : 2048 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 2
fpu : yes
fpu_exception   : yes
cpuid level : 6
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm 
constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips: 5324.82
clflush size: 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 15
model   : 6
model name  :   Intel(R) Xeon(TM) CPU 2.66GHz
stepping: 4
cpu MHz : 2656.000
cache size  : 2048 KB
physical id : 0
siblings: 4
core id : 1
cpu cores   : 2
fpu : yes
fpu_exception   : yes
cpuid level : 6
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm 
constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips: 5320.06
clflush size: 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 15
model   : 6
model name  :   Intel(R) Xeon(TM) CPU 2.66GHz
stepping: 4
cpu MHz : 2656.000
cache size  : 2048 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 2
fpu : yes
fpu_exception   : yes
cpuid level : 6
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm 
constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips: 5320.20
clflush size: 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 15
model   : 6
model name  :   Intel(R) Xeon(TM) CPU 2.66GHz
stepping: 4
cpu MHz : 2656.000
cache size  : 2048 KB
physical id : 0
siblings: 4
core id : 1
cpu cores   : 2
fpu : yes
fpu_exception   : yes
cpuid level : 6
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm 
constant_tsc pni monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips: 5320.16
clflush size: 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:


Re: cpu load balancing problem on smp

2007-02-06 Thread Arjan van de Ven

Pablo Sebastian Greco wrote:


2296:427426436  134563009   PCI-MSI-edge  eth1
2297:252252  135926471257   PCI-MSI-edge  eth0


this suggests that  cores would be busy rather than only one


Re: cpu load balancing problem on smp

2007-02-06 Thread Pablo Sebastian Greco

Arjan van de Ven wrote:

Marc Donner wrote:


see http://www.irqbalance.org to get irqbalance


I have now tried irqbalance, but the problem is the same.



can you send me the output of

cat /proc/interrupts

(taken when you are or have been loading the network)

maybe there's something fishy going on

Please take a look at this, taken from the same machine running 
different vanilla kernels on FC6. The current 2.6.19 Fedora kernel looks 
like 2.6.20-rc3 (non-mm) in the attachment.


2.6.20-rc3
[EMAIL PROTECTED] ~]# rpm -q irqbalance
irqbalance-0.55-2.fc6
[EMAIL PROTECTED] ~]# uptime
 11:51:50 up 6 days, 30 min,  3 users,  load average: 5.31, 5.08, 4.02
[EMAIL PROTECTED] ~]# service irqbalance status
irqbalance (pid 2310) is running...
[EMAIL PROTECTED] ~]# cat /proc/interrupts
   CPU0   CPU1   CPU2   CPU3
  0:  520209517  0  0  0   IO-APIC-edge  timer
  1: 12  0  0  0   IO-APIC-edge  i8042
  8:  1  0  0  0   IO-APIC-edge  rtc
  9:  0  0  0  0   IO-APIC-fasteoi   acpi
 12:103  0  0  0   IO-APIC-edge  i8042
 14:  0  0  0  0   IO-APIC-edge  libata
 15:  0  0  0  0   IO-APIC-edge  libata
 20: 138736  188194096  06797630   IO-APIC-fasteoi   libata
 22:  0  0  0  0   IO-APIC-fasteoi   
uhci_hcd:usb2, uhci_hcd:usb4
 23:  0  0  0  0   IO-APIC-fasteoi   
uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb5
2296:   1367  0  0  849270653   PCI-MSI-edge  eth1
2297:   1022  835083968  0  0   PCI-MSI-edge  eth0
NMI:  47756 146249  47617 146186
LOC:  516828752  517331906  516828611  517331771
ERR:  0
2.6.20-rc3-mm1
[EMAIL PROTECTED] kernel]# uptime
 12:17:54 up 1 day, 21:58,  2 users,  load average: 9.47, 9.79, 10.28
[EMAIL PROTECTED] kernel]# cat /proc/interrupts
   CPU0   CPU1   CPU2   CPU3
  0:   60031592   61350247   22273772   21780215   IO-APIC-edge  timer
  1:  0  6  1  1   IO-APIC-edge  i8042
  8:  0  0  1  0   IO-APIC-edge  rtc
  9:  0  0  0  0   IO-APIC-fasteoi   acpi
 12:148283104136   IO-APIC-edge  i8042
 14:  0  0  0  0   IO-APIC-edge  libata
 15:  0  0  0  0   IO-APIC-edge  libata
 20:   104827951477821  93306 641628   IO-APIC-fasteoi   libata
 22:  0  0  0  0   IO-APIC-fasteoi   
uhci_hcd:usb2, uhci_hcd:usb4
 23:  0  0  0  0   IO-APIC-fasteoi   
uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb5
2296:427426436  134563009   PCI-MSI-edge  eth1
2297:252252  135926471257   PCI-MSI-edge  eth0
NMI:  0  0  0  0
LOC:  164661140  165163503  164660992  165163305
ERR:  0


Re: cpu load balancing problem on smp

2007-02-06 Thread Arjan van de Ven

Marc Donner wrote:


see http://www.irqbalance.org to get irqbalance


I have now tried irqbalance, but the problem is the same.



can you send me the output of

cat /proc/interrupts

(taken when you are or have been loading the network)

maybe there's something fishy going on


Re: cpu load balancing problem on smp

2007-02-06 Thread Marc Donner
On Tuesday 06 February 2007 19:09, you wrote:
> On Tue, 2007-02-06 at 18:32 +0100, Marc Donner wrote:
> > Hi all,
> >
> > we have detected some problems on our live systems, so I have built a
> > test setup in our lab as follows:
> >
> > 3 Core 2 Duo servers, each with 2 CPUs, with GE interfaces. 2 of them
> > are only for generating network traffic. The 3rd server is the one I want
> > to test; it is connected over two GE links to the other servers. The
> > test server is configured as an IP router, running kernel 2.6.20.
> >
> > Now if I let traffic flow over the box, about 600Mbit/s and about 120k
> > packets/s, all seems to be OK: the load is balanced over all CPUs. If I now
> > insert some iptables rules, about 500, the softirq load increases, but
> > all still seems to be OK. Now I insert some more rules, and suddenly 1 CPU is
> > 100% loaded and the other ones are 99% idle. The load now toggles between
> > the CPUs in intervals.
>
> I wonder if you are using irqbalance.. if not you probably want to...
> (this should at least split it over 2 cpus)
>
> see http://www.irqbalance.org to get irqbalance

I have now tried irqbalance, but the problem is the same.


Re: cpu load balancing problem on smp

2007-02-06 Thread Arjan van de Ven
On Tue, 2007-02-06 at 18:32 +0100, Marc Donner wrote:
> Hi all,
> 
> we have detected some problems on our live systems, so I have built a test 
> setup in our lab as follows:
> 
> 3 Core 2 Duo servers, each with 2 CPUs, with GE interfaces. 2 of them are 
> only for generating network traffic. The 3rd server is the one I want to 
> test; it is connected over two GE links to the other servers. The test server 
> is configured as an IP router, running kernel 2.6.20. 
> 
> Now if I let traffic flow over the box, about 600Mbit/s and about 120k 
> packets/s, all seems to be OK: the load is balanced over all CPUs. If I now 
> insert some iptables rules, about 500, the softirq load increases, but all 
> still seems to be OK. Now I insert some more rules, and suddenly 1 CPU is 100% 
> loaded and the other ones are 99% idle. The load now toggles between the CPUs 
> in intervals.

I wonder if you are using irqbalance... if not, you probably want to
(this should at least split it over 2 CPUs)

see http://www.irqbalance.org to get irqbalance
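For completeness: when irqbalance does not spread an interrupt, its affinity can also be set by hand through /proc/irq/&lt;N&gt;/smp_affinity, which takes a hexadecimal CPU bitmask. A sketch of building such a mask (the IRQ number in the comment is illustrative, and the write itself needs root):

```python
# Sketch: building the hex CPU bitmask accepted by /proc/irq/<N>/smp_affinity.
# Bit i set means the IRQ may be delivered to CPU i.

def affinity_mask(cpus):
    """Hex bitmask string for a list of CPU ids, e.g. [0, 1] -> '3'."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, 'x')

print(affinity_mask([0, 1]))   # 3 -> allow the IRQ on CPU0 and CPU1

# On a live system (as root, IRQ number hypothetical):
#   with open("/proc/irq/501/smp_affinity", "w") as f:
#       f.write(affinity_mask([0, 1]))
```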