Re: Processing time spent in IRQ handling and what to do about it

2007-12-20 Thread Oded Arbel

On Wed, 2007-12-19 at 09:58 +0200, Dotan Shavit wrote:
 On Tuesday 18 December 2007, Oded Arbel wrote:
  I can see that a lot of time is spent in the hard-IRQ region - sometimes
 more than all other regions together.
 
 Let's look for more hints...
 
 - Anything interesting in the logs (during boot and after)?
 - Let's unplug all the hardware you can: network, USB, disks...
 - rmmod all the modules you can.
 - Boot with a different kernel version.
 - Nothing yet? Let's play with the BIOS...

The logs do not show anything that I don't understand or that I can
relate to this problem, and none of the other options are possible as
this is a production machine.

On a duplicate machine that runs MySQL replicated from the first and
doesn't carry any load, stopping mysqld caused the load to fall to
almost 0. There were very few hardware interrupts after that (as shown
by /proc/interrupts), but since there is no load on that machine either,
I can't draw any conclusions.

-- 

Oded


=
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]



Re: Processing time spent in IRQ handling and what to do about it

2007-12-20 Thread Oded Arbel

On Wed, 2007-12-19 at 10:34 +0200, Aviv Greenberg wrote:
 Can you send the output of cat /proc/interrupts? Is there any device
 sharing the IRQ line with the network interface?

On Tue, 2007-12-18 at 22:14 +0200, Oron Peled wrote: 
 6. Why guess?
   watch -n10 -d cat /proc/interrupts


/proc/interrupts looks like this:

  CPU0   CPU1   CPU2   CPU3
  0: 2818676796 3045096095 2597715597 3039460137   IO-APIC-edge timer
  1:  0  2  0  0   IO-APIC-edge i8042
  9:  0  0  0  0   IO-APIC-fasteoi acpi
 12:  0  1  1  2   IO-APIC-edge  i8042
 14:6144547  861135937042  85048   IO-APIC-edge  libata
 15:  0  0  0  0   IO-APIC-edge  libata
 16:  1  0  0  1   IO-APIC-fasteoi  uhci_hcd:usb1, ehci_hcd:usb6
 17:234 13197 11   IO-APIC-fasteoi  uhci_hcd:usb2
 18:  0  0  0  0   IO-APIC-fasteoi  uhci_hcd:usb3
 19:  0  0  0  0   IO-APIC-fasteoi  uhci_hcd:usb4
 22: 24 24 25 23   IO-APIC-fasteoi  uhci_hcd:usb5
2289:  426764360 12  153890890   25567190   PCI-MSI-edge eth1
2290:  184062475   14352363 1146094937   36605794   PCI-MSI-edge eth0
2292:  253368176   26799612  221976501   20082294   PCI-MSI-edge cciss0
NMI:  0  0  0  0
LOC: 2910906978 2910907454 2910906845 2910907935

I haven't calculated the diffs exactly yet, but at first glance it looks
like eth0 interrupts are happening at about 150 per second while cciss0
interrupts are happening at about 20 per second. Also, eth0 interrupts
happen almost exclusively on one CPU (currently CPU2) and cciss0
interrupts happen on two CPUs (0 and 2). I'm not sure what's up with
CPU1 and CPU3 - is it possible that they don't get as many interrupts
because they are the 2nd cores on each chip? Isn't 'irqbalance' supposed
to do something about it?

-- 

Oded





Re: Processing time spent in IRQ handling and what to do about it

2007-12-20 Thread Oron Peled
On Thursday, 20 December 2007, Oded Arbel wrote:
 I haven't calculated the diffs exactly yet, but at first glance it looks
 like eth0 interrupts are happening at about 150 per second while cciss0
 interrupts are happening at about 20 per second.

Well, ~150 interrupts/second is a very low interrupt rate and
should not cause a significant load *unless* heavy work is done
in each interrupt.

[as a reference, on a specific device family I work on, we use
 a *minimum* of 1000 interrupts/second even on very low-end
 hosts. When we connect several devices to slightly stronger hosts
 (single CPU) we normally get around ~4000 interrupts/second]

I still tend to suspect the disk controller, although its
interrupt rate is really low. Maybe you can test this (run
some I/O-bound process like 'find /' and see if it affects
the hardware interrupt load in top).

If all else fails, then you may want to start using oprofile.

Hope it helps,

-- 
Oron Peled Voice/Fax: +972-4-8228492
[EMAIL PROTECTED]  http://www.actcom.co.il/~oron
ICQ UIN: 16527398

Free software: each person contributes a brick, but ultimately each
person receives a house in return.
   -- Brendan Scott




Re: Processing time spent in IRQ handling and what to do about it

2007-12-20 Thread Constantine Shulyupin
And my two cents on this never-ending story:

You may use get_cycles()
(http://lxr.linux.no/linux/include/asm-i386/tsc.h#L19) to measure the
time (in cycles) spent in interrupt handlers.
Read more here: http://www.linuxdriver.co.il/ldd3/linuxdrive3-CHP-7-SECT-1.html


-- 
Constantine Shulyupin
Freelance Embedded Linux Engineer
054-4234440
http://www.linuxdriver.co.il/




Re: Processing time spent in IRQ handling and what to do about it

2007-12-19 Thread Dotan Shavit
On Tuesday 18 December 2007, Oded Arbel wrote:
 I can see that a lot of time is spent in the hard-IRQ region - sometimes
 more than all other regions together.

Let's look for more hints...

- Anything interesting in the logs (during boot and after)?
- Let's unplug all the hardware you can: network, USB, disks...
- rmmod all the modules you can.
- Boot with a different kernel version.
- Nothing yet? Let's play with the BIOS...

What stops the IRQs?
How far will you go to catch an IRQ?

#




Re: Processing time spent in IRQ handling and what to do about it

2007-12-19 Thread Aviv Greenberg
Can you send the output of cat /proc/interrupts? Is there any device
sharing the IRQ line with the network interface?

bnx2 has NAPI support. The changes you saw recently are not related;
they are improvements to the NAPI mechanism (to support multiple device
queues, not specific to bnx2).

Aviv Greenberg

On 12/19/07, Dotan Shavit [EMAIL PROTECTED] wrote:

 On Tuesday 18 December 2007, Oded Arbel wrote:
  I can see that a lot of time is spent in the hard-IRQ region - sometimes
  more than all other regions together.

 Let's look for more hints...

 - Anything interesting in the logs (during boot and after)?
 - Let's unplug all the hardware you can: network, USB, disks...
 - rmmod all the modules you can.
 - Boot with a different kernel version.
 - Nothing yet? Let's play with the BIOS...

 What stops the IRQs?
 How far will you go to catch an IRQ?

 #





Re: Processing time spent in IRQ handling and what to do about it

2007-12-19 Thread Rami Rosen
Hi,
You cannot turn it on/off. The driver may support this optional API
  or not. If it supports it, it's the driver's sole decision when it's
  better to use polling/interrupt-per-packet according to its hardware
  specifics.

I doubt whether this is exactly so for all NICs, as one might
understand from your answer.
For example, with e1000 NICs, you can
choose to build the driver with or without polling support.

See, while configuring the kernel:

Device Drivers -> Network Device Support -> Ethernet (1000 Mbit) ->
Intel(R) PRO/1000 Gigabit Ethernet support -> Use Rx Polling (NAPI)

Selecting it sets CONFIG_E1000_NAPI to y.

In newer kernels, e1000 comes with NAPI support by default,
but you can also build it without this support.

And if you look at the driver's code, you will find the following in
the e1000_main.c module:

#ifdef CONFIG_E1000_NAPI
	netdev->poll = e1000_clean;
	netdev->weight = 64;
#endif

This means that, when building without CONFIG_E1000_NAPI set, you will
not have the poll method and therefore no polling/NAPI.


You also have the ability to choose NAPI for other NICs; for example,
the Tulip family; see
Device Drivers -> Network Device Support -> Ethernet (10 or 100 Mbit) ->
Tulip family network device support -> Use NAPI RX polling.

It could be that on other NICs you cannot turn it on/off.
Broadcom was the first to release a driver (tg3) with NAPI support
for Linux, so they probably have a lot of experience with it, and
it could be that their NAPI support is built in and you cannot avoid it.


BTW, with OpenSolaris this is exactly the situation: the NAPI-style
support is in the core automatically; the driver starts as an
interrupt-driven driver and changes to polling when there is a high
interrupt load. Drivers need not be built with any special NAPI
support; the driver binary is the same whether working with or without
it. There is, however, a way to configure kernel-wide polling
parameters.

Regards,
Rami Rosen






On Dec 18, 2007 10:14 PM, Oron Peled [EMAIL PROTECTED] wrote:
 On Tuesday, 18 December 2007, Yedidyah Bar-David wrote:
  I am not an expert on this, but what you want might be NAPI - a new
  network driver infrastructure designed to solve just that. Google a bit
  - I do not know exactly when it entered 2.6 (and you did not state your
  kernel version) and which drivers use it already.

 1. NAPI was new at kernel 2.3.x when it was developed towards 2.4

 2. It gives the *driver* the option to toggle between interrupt driven
and polling mode at runtime. E.g:
- A GB ethernet at full speed may better poll the hardware every once
  in a while.
- The same card is better off using interrupt driven mode if the
   traffic is low.

 3. You cannot turn it on/off. The driver may support this optional API
or not. If it supports it, it's the driver's sole decision when it's
better to use polling/interrupt-per-packet according to its hardware
specifics.

 4. I don't think a single fast ethernet card can severely affect your
hardware interrupt load. So either:
- You have a GB (or maybe 2GB?) ethernet with high load.
- You have several fast-ethernet cards working at full speed.

 5. A far better suspect would be the disk controller (e.g: working
without DMA etc.)

 6. Why guess?
 watch -n10 -d cat /proc/interrupts
And calculate how many interrupts per second occurred for various devices.
That would give you a rough idea who are the possible suspects.


 --
 Oron Peled Voice/Fax: +972-4-8228492
 [EMAIL PROTECTED]  http://www.actcom.co.il/~oron
 ICQ UIN: 16527398

 Linux lasts longer!
 -- Kim J. Brand [EMAIL PROTECTED]






Re: Processing time spent in IRQ handling and what to do about it

2007-12-18 Thread Oded Arbel

On Tue, 2007-12-18 at 15:21 +0200, Dotan Shavit wrote:
  I don't think that swapping has anything to do with the IRQ behavior I'm
  seeing, 
 In that case, it probably is network related...
 Can you provide more details regarding this?
 
 Is the Apache server you mentioned located on the same machine?

Indeed.

 Are you connected to a private vlan (or seeing non relevant traffic)?

It's infrastructure I don't really have access to, so I wouldn't know,
but I'm on a good switch (maybe with a VLAN) and I don't see traffic
that isn't meant for me.

 Do you get this (a lot of time is spent in the hard-IRQ region) all the 
 time 
 or just when the server is accessed by its clients?

I'm always seeing some traffic, so it's hard to say whether I would
still see hard-IRQ time when there aren't any clients. But interestingly
enough, a second identical machine which is currently doing nothing
except maintaining a replica of the MySQL database on the first is also
seeing high hard-IRQ counts. A third, completely different computer on a
different network with different workloads that also maintains a
replica of the first MySQL database is also seeing high IRQ usage.

 What is the difference between this machine and the other (I understand the 
 other machine works OK) ?

Hardware-wise and OS-wise - nothing. Software-wise there are many
different things, but most prominently:
* It doesn't see the same kind of traffic (which I currently don't think
is the issue, as the second server above doesn't see any traffic).
* It doesn't replicate its databases.

-- 

Oded





Re: Processing time spent in IRQ handling and what to do about it

2007-12-18 Thread Oded Arbel

On Tue, 2007-12-18 at 07:48 +0200, Yedidyah Bar-David wrote:
 On Tue, Dec 18, 2007 at 02:49:29AM +0200, Oded Arbel wrote:
  Running some static benchmarks that should mimic the behavior on real
  load, on identical hardware at the office, I see very little hard-IRQ
  time if at all. The main difference between the static benchmark and
  real usage is that the static benchmark only tests the application logic
  and IO, while real usage also fetches some files served by Apache over
  HTTP with each request - maybe ~50Kbytes worth of responses are served
  by Apache for each request to the application. I was thinking that the
  high IRQ usage is due to high network traffic - could that be the case
  and could that be affecting the server's performance ?
 
 I am not an expert on this, but what you want might be NAPI - a new
 network driver infrastructure designed to solve just that. Google a bit
 - I do not know exactly when it entered 2.6 (and you did not state your
 kernel version) and which drivers use it already.

Searching for NAPI, I see some discussion of it entering 2.4 or 2.5, so
I'm assuming 2.6 had it from the start. I also see some patches for the
bnx2 NIC module which talk about NAPI-related fixes for 2.6 - but only
quite recently: October this year.

I'm using Fedora 7 with kernel 2.6.22.1, which is fairly recent, so I'm
assuming I have this NAPI. Can it possibly be turned off currently, such
that I need to turn it on?

-- 

Oded





Re: Processing time spent in IRQ handling and what to do about it

2007-12-18 Thread Oron Peled
On Tuesday, 18 December 2007, Yedidyah Bar-David wrote:
 I am not an expert on this, but what you want might be NAPI - a new
 network driver infrastructure designed to solve just that. Google a bit
 - I do not know exactly when it entered 2.6 (and you did not state your
 kernel version) and which drivers use it already.

1. NAPI was new at kernel 2.3.x, when it was developed towards 2.4.

2. It gives the *driver* the option to toggle between interrupt-driven
   and polling mode at runtime. E.g.:
   - A GB ethernet at full speed may be better off polling the hardware
     every once in a while.
   - The same card is better off using interrupt-driven mode if the
     traffic is low.

3. You cannot turn it on/off. The driver may support this optional API
   or not. If it supports it, it's the driver's sole decision when it's
   better to use polling/interrupt-per-packet according to its hardware
   specifics.

4. I don't think a single fast ethernet card can severely affect your
   hardware interrupt load. So either:
   - You have a GB (or maybe 2GB?) ethernet with high load.
   - You have several fast-ethernet cards working at full speed.

5. A far better suspect would be the disk controller (e.g: working
   without DMA etc.)

6. Why guess?
watch -n10 -d cat /proc/interrupts
   And calculate how many interrupts per second occurred for various devices.
   That would give you a rough idea who are the possible suspects.


-- 
Oron Peled Voice/Fax: +972-4-8228492
[EMAIL PROTECTED]  http://www.actcom.co.il/~oron
ICQ UIN: 16527398

Linux lasts longer!
-- Kim J. Brand [EMAIL PROTECTED]




Processing time spent in IRQ handling and what to do about it

2007-12-17 Thread Oded Arbel
Hi List

I have somewhat of a problem, but I don't know how serious it is or how
to handle it:

I manage several servers - quite nice beasts: HP ML360G5 with 2 x dual
Xeons and 4GB RAM each. Now one of the production servers is not
behaving all that well - it doesn't handle the load as well as I would
like it to, and its responses are slower than what I would expect
according to previous benchmarks (on identical hardware, not on the
specific machine).

After doing some application testing and optimization, I still do not
rule out sub-optimal application behavior, but I noticed something
disturbing and I would appreciate some input on that - 

I use htop to monitor the server's load; the load average is quite
low when the server suffers under load, and the CPU time bars rarely
reach over 50%. Splitting the CPU time display in htop according to
system/IO-wait/hard-IRQ/soft-IRQ, I can see that a lot of time is spent
in the hard-IRQ region - sometimes more than all other regions
together.

Running some static benchmarks that should mimic the behavior on real
load, on identical hardware at the office, I see very little hard-IRQ
time if at all. The main difference between the static benchmark and
real usage is that the static benchmark only tests the application logic
and IO, while real usage also fetches some files served by Apache over
HTTP with each request - maybe ~50Kbytes worth of responses are served
by Apache for each request to the application. I was thinking that the
high IRQ usage is due to high network traffic - could that be the case
and could that be affecting the server's performance ?

I'd appreciate any references that you can provide - searching the web
for irq bnx2 (the NIC module used by the machine) yields nothing that
I could decipher.

Thanks in advance

--
Oded





Re: Processing time spent in IRQ handling and what to do about it

2007-12-17 Thread Yedidyah Bar-David
On Tue, Dec 18, 2007 at 02:49:29AM +0200, Oded Arbel wrote:
 Running some static benchmarks that should mimic the behavior on real
 load, on identical hardware at the office, I see very little hard-IRQ
 time if at all. The main difference between the static benchmark and
 real usage is that the static benchmark only tests the application logic
 and IO, while real usage also fetches some files served by Apache over
 HTTP with each request - maybe ~50Kbytes worth of responses are served
 by Apache for each request to the application. I was thinking that the
 high IRQ usage is due to high network traffic - could that be the case
 and could that be affecting the server's performance ?

I am not an expert on this, but what you want might be NAPI - a new
network driver infrastructure designed to solve just that. Google a bit
- I do not know exactly when it entered 2.6 (and you did not state your
kernel version) and which drivers use it already.
-- 
Didi

