RE: e1000: Question about polling

2008-02-20 Thread Brandeburg, Jesse
Badalian Vyacheslav wrote:
 Hello all.
 
 Interesting thing:
 
 Have a PC that does NAT. Bandwidth is about 600 Mbit/s.
 
 Have 4 CPUs (2x Core 2 Duo, HT off, 3.2 GHz).
 
 irqbalance in kernel is off.
 
 nat2 ~ # cat /proc/irq/217/smp_affinity
 0001
this binds all interrupts for irq 217 to cpu 0

 nat2 ~ # cat /proc/irq/218/smp_affinity
 0003

do you mean to be balancing interrupts between cpu 0 and cpu 1 here?
1 = cpu 0
2 = cpu 1
4 = cpu 2
8 = cpu 3

so 1 + 2 = 3 for irq 218, i.e. balancing between the two.

sometimes the cpus will share a cache in pairs; depending on your BIOS it
will be organized like cpu 0/2 = shared cache and cpu 1/3 = shared
cache.
you can find this out by looking at the physical id and core id fields
in /proc/cpuinfo

 Load SI on CPU0 and CPU1 is about 90%
 
 Good... try do
 echo   /proc/irq/217/smp_affinity
 echo   /proc/irq/218/smp_affinity
 
 Get 100% SI at CPU0
 
 Question Why?

because as each adapter generating interrupts gets rotated through cpu0,
it gets stuck on cpu0: the napi scheduling can only run one poll at a
time, so each adapter is always waiting in line behind the other to run
its napi poll, always fills its quota (work_done is always != 0), and
keeps its interrupts disabled forever.

 I heard that if you use the IRQ from 1 netdevice on 1 CPU you can get 30%
 performance... but I have 4 CPUs... I must get more performance if I cat
   to smp_affinity.

only if your performance is not cache limited but cpu horsepower
limited.  you're sacrificing cache coherency for cpu power, but if that
works for you then great.
 
 picture looks like this:
 CPU 0-3 get over 50% SI... bandwidth up... 55% SI... bandwidth up...
 100% SI on CPU0
 
 I remember a patch to fix a problem like this... it patched the function
 e1000_clean...  the kernel on this pc has that patch (2.6.24-rc7-git2)...
 the e1000 driver works much better (I get up to 1.5-2x the bandwidth
 before I hit 100% SI), but I think it is still not getting 100% of what it can =)

the patch helps a little because it decreases the amount of time the
driver spends in napi mode, basically relaxing the exit condition
(which reenables interrupts, and therefore balancing) to work_done <
budget, instead of work_done == 0.

 Thanks for answers and sorry for my English

you basically can't get much more than one cpu's worth of throughput for
each nic.  it's possible to get a little more, but my guess is you won't
get much.  The best thing you can do is make sure as much traffic as
possible stays in the same cache, on two different cores.

you can try turning off NAPI mode, either in the .config or by building
the sourceforge driver with CFLAGS_EXTRA=-DE1000_NO_NAPI.  it seems
counterintuitive, but with the non-napi e1000 pushing packets to the
per-cpu backlog queue, you may actually get better performance due to
the balancing.

some day soon (maybe) we'll have some coherent way to have one tx and rx
interrupt per core, and enough queues for each port to be able to handle
1 queue per core.

good luck,
  Jesse  
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: e1000: Question about polling

2008-02-20 Thread Badalian Vyacheslav
Very big thanks for this answer. You answered all my questions, and all 
my future questions too. Thanks again!



Re: e1000: Question about polling

2008-02-20 Thread Badalian Vyacheslav
Sorry for the little information and the mistakes in my letter. Jesse 
Brandeburg answered all my questions. In future I will try to be more 
accurate when writing letters, and to post more info.

Please don't think this is disrespect for you. It's simply the language barrier =(



Re: e1000: Question about polling

2008-02-20 Thread Jarek Poplawski
On Wed, Feb 20, 2008 at 12:25:32PM +0300, Badalian Vyacheslav wrote:
...
 Please not think that it disrespect for you. Its simple language barrier =(

OK! Don't disrespect for me -  I'll try fix my English next time!)

Jarek P.


Re: e1000: Question about polling

2008-02-20 Thread Badalian Vyacheslav
Khrm... I tried to say that I have a language barrier and sometimes may 
compose clauses wrongly. An example is below =)

I'll try fix my English next time!

Vyacheslav



Re: e1000: Question about polling

2008-02-20 Thread Jarek Poplawski
On Wed, Feb 20, 2008 at 02:54:27PM +0300, Badalian Vyacheslav wrote:
 Khrm i try to say that i have language barrier and some time may  
 wrong compose clauses. Example below =)

No, only a bit joking...

 I'll try fix my English next time!

Don't worry, Vyacheslav: I think your message was understandable enough,
since you got a good answer from Jesse. (And I've learned something BTW,
too; thanks Jesse!) And after all, this is not a language group: we care
here about serious problems!

Jarek P.


Re: e1000: Question about polling

2008-02-19 Thread Jarek Poplawski
On 18-02-2008 10:18, Badalian Vyacheslav wrote:
 Hello all.

Hi,

 Interesting thing:
 
 Have a PC that does NAT. Bandwidth is about 600 Mbit/s.
 
 Have 4 CPUs (2x Core 2 Duo, HT off, 3.2 GHz).
 
 irqbalance in kernel is off.
 
 nat2 ~ # cat /proc/irq/217/smp_affinity
 0001
 nat2 ~ # cat /proc/irq/218/smp_affinity
 0003
 
 Load SI on CPU0 and CPU1 is about 90%
 
 Good... try do
 echo   /proc/irq/217/smp_affinity
 echo   /proc/irq/218/smp_affinity
 
 Get 100% SI at CPU0
 
 Question Why?

I think you should show here /proc/interrupts in all these cases.

 
 I heard that if you use the IRQ from 1 netdevice on 1 CPU you can get 30% 
 performance... but I have 4 CPUs... I must get more performance if I cat 
   to smp_affinity.
 
 picture looks like this:
 CPU 0-3 get over 50% SI... bandwidth up... 55% SI... bandwidth up... 
 100% SI on CPU0
 
 I remember a patch to fix a problem like this... it patched the function 
 e1000_clean...  the kernel on this pc has that patch (2.6.24-rc7-git2)... 
 the e1000 driver works much better (I get up to 1.5-2x the bandwidth 
 before I hit 100% SI), but I think it is still not getting 100% of what it can =)

If some patch works for you, and you can show its advantages here,
you should probably post a link and request merging.

BTW, I wonder whether you tried to check if changing CONFIG_HZ makes any
difference here?

Regards,
Jarek P.