Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Aaron Porter
On Tue, Jul 24, 2007 at 09:13:47PM +0200, Andi Kleen wrote:

> You have slab debugging enabled. That makes everything slow.
> It costs you ~66% of your CPU time.  Disable it.

Past peak for the day, but this does look like the culprit. Thanks
to all who replied.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread David Miller
From: Aaron Porter <[EMAIL PROTECTED]>
Date: Tue, 24 Jul 2007 11:49:09 -0700

> samples  %        app name          symbol name
> 914379   48.8404  vmlinux-2.6.21.5  check_poison_obj
> 341920   18.2632  vmlinux-2.6.21.5  poison_obj
> 37355     1.9953  nf_conntrack      (no symbols)

You have SLAB debugging enabled, turn it off if you want
decent performance.


Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Chris Snook

Aaron Porter wrote:
> On Tue, Jul 24, 2007 at 08:48:00PM +0200, Andi Kleen wrote:
> > Aaron Porter <[EMAIL PROTECTED]> writes:
> >
> > > I'm in the process of upgrading a pool of apache servers from
> > > 2.6.17.8 to 2.6.21.5, and we're seeing a pretty major change in behavior.
> > > Under identical network load, 2.6.21 has a load average more than 3 times
> > > higher, cpu 0 spends well over 90% of its time in interrupts (vs ~30%
> > > under 2.6.17). When we hit 3k apache sessions, ksoftirqd eats 100% of cpu0
> > > and our network traffic drops off rapidly. The end result is that 2.6.17
> > > performs twice as well under this load.
> >
> > Can you oprofile it?
>
> # opreport -l
> CPU: AMD64 processors, speed 1994.52 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
> mask of 0x00 (No unit mask) count 10
> samples  %        app name          symbol name
> 914379   48.8404  vmlinux-2.6.21.5  check_poison_obj
> 341920   18.2632  vmlinux-2.6.21.5  poison_obj


I bet you have CONFIG_DEBUG_SLAB turned off in your 2.6.17 kernel, and turned on 
in your 2.6.21 kernel.


-- Chris
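
The comparison Chris suggests can be sketched as a small check over two kernel config files. This is only an illustration: the two config snippets below are made up for the example (they are not Aaron's actual configs), and the helper names are hypothetical. It assumes the usual .config conventions, where an enabled option appears as `CONFIG_FOO=y` and a disabled one as `# CONFIG_FOO is not set`.

```python
def option_state(config_text, option):
    """Return the value assigned to a kernel config option, or 'not set'."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# " + option + " is not set":
            return "not set"
    # An option that is absent from the file is treated as not set.
    return "not set"


def compare(old_text, new_text, options):
    """Map each option to its (old kernel, new kernel) state."""
    return {
        opt: (option_state(old_text, opt), option_state(new_text, opt))
        for opt in options
    }


# Hypothetical config fragments standing in for the 2.6.17 and 2.6.21 builds.
OLD_CONFIG = "CONFIG_SLAB=y\n# CONFIG_DEBUG_SLAB is not set\n"
NEW_CONFIG = "CONFIG_SLAB=y\nCONFIG_DEBUG_SLAB=y\n"

result = compare(OLD_CONFIG, NEW_CONFIG, ["CONFIG_DEBUG_SLAB"])
for opt, (old, new) in result.items():
    print(f"{opt}: old={old} new={new}")
```

On a real machine the text would come from wherever the distribution keeps the kernel config, often a file under /boot, or /proc/config.gz when the kernel was built with that feature.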


Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Jeff Garzik

Turn off slab debugging and preempt, and see if that helps.

Jeff





Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Andi Kleen
> # opreport -l
> CPU: AMD64 processors, speed 1994.52 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 10
> samples  %        app name          symbol name
> 914379   48.8404  vmlinux-2.6.21.5  check_poison_obj
> 341920   18.2632  vmlinux-2.6.21.5  poison_obj

You have slab debugging enabled. That makes everything slow.
It costs you ~66% of your CPU time.  Disable it.

-Andi
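
The ~66% estimate is roughly the sum of the two poison-check rows in the profile. A minimal sketch of that arithmetic follows, assuming the whitespace-separated opreport layout quoted in this thread (samples, percent, app name, symbol name). The function and constant names are illustrative and not part of oprofile.

```python
# Rows as quoted in this thread (the nf_conntrack row is from David Miller's reply).
REPORT = """\
samples  %        app name          symbol name
914379   48.8404  vmlinux-2.6.21.5  check_poison_obj
341920   18.2632  vmlinux-2.6.21.5  poison_obj
37355     1.9953  nf_conntrack      (no symbols)
"""

# Symbols attributed to slab debugging in the replies above.
SLAB_DEBUG_SYMBOLS = {"check_poison_obj", "poison_obj"}


def slab_debug_share(report):
    """Sum the percent column over rows whose symbol is a slab-debug function."""
    total = 0.0
    for line in report.splitlines():
        fields = line.split()
        # Data rows start with a numeric sample count; the header does not.
        if len(fields) >= 4 and fields[0].isdigit() and fields[3] in SLAB_DEBUG_SYMBOLS:
            total += float(fields[1])
    return total


share = slab_debug_share(REPORT)
print(f"slab debugging share of samples: {share:.2f}%")
```

The two rows add up to about 67% of all samples, which is consistent with the rounded figure in the reply.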



Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Chris Snook

Aaron Porter wrote:
> I'm in the process of upgrading a pool of apache servers from
> 2.6.17.8 to 2.6.21.5, and we're seeing a pretty major change in behavior.
> Under identical network load, 2.6.21 has a load average more than 3 times
> higher, cpu 0 spends well over 90% of its time in interrupts (vs ~30%
> under 2.6.17). When we hit 3k apache sessions, ksoftirqd eats 100% of cpu0
> and our network traffic drops off rapidly. The end result is that 2.6.17
> performs twice as well under this load.


Is it always CPU 0, or does it move?  Are you running irqbalance?  If you're 
running irqbalance, you can run a script that alternates between 'cat 
/proc/interrupts' and 'mpstat -P ALL 5 10' and watch the offending interrupt 
jump around between processors.  It's not as informative as oprofile, as Andi 
suggested, but it's really easy to set up.


-- Chris
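
One way to read the /proc/interrupts half of that suggestion is to take two snapshots and compare the per-CPU counters for the interrupt in question. The sketch below does this on two made-up snapshots; the IRQ numbers, counts, and device names are invented for illustration, and the helper names are hypothetical. It assumes the usual layout: a header row of CPU names, then one row per interrupt with one counter per CPU.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: [count_per_cpu, ...]}."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        per_cpu = rest.split()[:ncpus]
        # Skip summary rows (such as ERR) that lack one counter per CPU.
        if len(per_cpu) == ncpus and all(f.isdigit() for f in per_cpu):
            counts[label.strip()] = [int(f) for f in per_cpu]
    return counts


def busiest_cpu(before, after, irq):
    """Return (cpu_index, deltas) for the CPU that handled most of irq."""
    deltas = [b - a for a, b in zip(before[irq], after[irq])]
    cpu = max(range(len(deltas)), key=deltas.__getitem__)
    return cpu, deltas


# Hypothetical snapshots taken a few seconds apart.
BEFORE = """\
           CPU0       CPU1
 16:      10000        200   IO-APIC-fasteoi   eth0
 17:        500        500   IO-APIC-fasteoi   libata
"""
AFTER = """\
           CPU0       CPU1
 16:      90000        250   IO-APIC-fasteoi   eth0
 17:        520        530   IO-APIC-fasteoi   libata
"""

cpu, deltas = busiest_cpu(parse_interrupts(BEFORE), parse_interrupts(AFTER), "16")
print(f"IRQ 16 deltas per CPU: {deltas}, busiest CPU: {cpu}")
```

On a live system the two texts would be read from /proc/interrupts a few seconds apart, alongside the mpstat output, to see whether the busy CPU stays fixed or moves.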


Re: miserable performance of 2.6.21 under network load

2007-07-24 Thread Andi Kleen
Aaron Porter <[EMAIL PROTECTED]> writes:

>   I'm in the process of upgrading a pool of apache servers from
> 2.6.17.8 to 2.6.21.5, and we're seeing a pretty major change in behavior.
> Under identical network load, 2.6.21 has a load average more than 3 times
> higher, cpu 0 spends well over 90% of its time in interrupts (vs ~30%
> under 2.6.17). When we hit 3k apache sessions, ksoftirqd eats 100% of cpu0
> and our network traffic drops off rapidly. The end result is that 2.6.17
> performs twice as well under this load.

Can you oprofile it?

-Andi


miserable performance of 2.6.21 under network load

2007-07-24 Thread Aaron Porter

I'm in the process of upgrading a pool of apache servers from
2.6.17.8 to 2.6.21.5, and we're seeing a pretty major change in behavior.
Under identical network load, 2.6.21 has a load average more than 3 times
higher, cpu 0 spends well over 90% of its time in interrupts (vs ~30%
under 2.6.17). When we hit 3k apache sessions, ksoftirqd eats 100% of cpu0
and our network traffic drops off rapidly. The end result is that 2.6.17
performs twice as well under this load.

