Re: Non-consistent CPU usage in IP forwarding test

2014-04-03 Thread Oleg A. Arkhangelsky

04.04.2014, 08:43, "Abu Rasheda" :

> Is it a NUMA system? This happens when a node tries to access memory
> connected to the other CPU.

Yes, it's a NUMA system. Of course inter-node traffic is involved. What is
strange is that the CPU usage pattern looks like this:

http://software.intel.com/sites/default/files/managed/1b/57/oimg.png

This is despite the test traffic pattern staying the same.

--
wbr, Oleg.

"Anarchy is about taking complete responsibility for yourself."
  Alan Moore.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Non-consistent CPU usage in IP forwarding test

2014-04-03 Thread Abu Rasheda
On Thursday, April 3, 2014, Oleg A. Arkhangelsky  wrote:

> Hello all,
>
> We've got very strange behavior when testing IP packet forwarding performance
> on a Sandy Bridge platform (Supermicro X9DRH with the latest BIOS). This is a
> two-socket E5-2690 CPU system. Using a different PC we're generating
> DDoS-like traffic at a rate of about 4.5 million packets per second. Traffic
> is received by two Intel 82599 NICs and forwarded using the second port of
> one of these NICs. All load is evenly distributed across the two nodes, so SI
> usage is virtually equal on each of the 32 CPUs.
>
> Now the strangest part. A few moments after pktgen starts on the traffic
> generator PC, average CPU usage on the SB system goes to 30-35%. No packet
> drops, no rx_missed_errors, no rx_no_dma_resources. Very nice. But SI usage
> starts to decrease gradually. After about 10 seconds we see ~15% SI on
> average across all CPUs. Still no packet drops, the same RX rate as in the
> beginning, and the RX packet count is equal to the TX packet count. After
> some time the average SI usage starts to go up again. Having peaked at the
> initial 30-35%, it drops back to 15%. This pattern repeats every 80 seconds,
> and the interval is very stable. It is undoubtedly tied to the test start
> time, because if we start the test, interrupt it after 10 seconds and start
> it again, we see the same 30% SI peak a few moments later, and all
> subsequent timings stay the same.
>
> During the high-load periods we see this in "perf top -e cache-misses":
>
> 14017.00 24.9% __netdev_alloc_skb   [kernel.kallsyms]
>  5172.00  9.2% _raw_spin_lock       [kernel.kallsyms]
>  4722.00  8.4% build_skb            [kernel.kallsyms]
>  3603.00  6.4% fib_table_lookup     [kernel.kallsyms]
>
> During the "15% load time" top is different:
>
> 11090.00 20.9% build_skb            [kernel.kallsyms]
>  4879.00  9.2% fib_table_lookup     [kernel.kallsyms]
>  4756.00  9.0% ipt_do_table
> /lib/modules/3.12.15-BUILD-g2e94e30-dirty/kernel/net/ipv4/netfilter/ip_tables.ko
>  3042.00  5.7% nf_iterate           [kernel.kallsyms]
>
> And __netdev_alloc_skb is at the end of the list:
>
>   911.00  0.5% __netdev_alloc_skb [kernel.kallsyms]
>
> Some info from "perf stat -a sleep 2":
>
> 15% SI:
>   28640006291 cycles         # 0.447 GHz   [83.23%]
>   38764605205 instructions   # 1.35  insns per cycle
>
> 30% SI:
>   56225552442 cycles         # 0.877 GHz   [83.23%]
>   39718182298 instructions   # 0.71  insns per cycle
>
> CPUs never go above the C1 state, and the core clock reported in
> /proc/cpuinfo is constant at 2899.942 MHz for all cores. ASPM is disabled.
>
> All non-essential userspace applications were explicitly killed for the
> duration of the test, and there were no active cron jobs either, so we can
> assume no interference from userspace.
>
> The kernel version is 3.12.15 (ixgbe 3.21.2), but we see the same behavior
> with the ancient 2.6.35 (ixgbe 3.10.16), although on 2.6.35 we sometimes get
> a 160-170 second interval and different symbols in the "perf top" output
> (especially local_bh_enable(), which completely blows my mind).
>
> Does anybody have any thoughts on the reasons for this kind of behavior?
> The Sandy Bridge CPU has many uncore/offcore events that I can sample;
> maybe some of them can shed light on it?
>
>
Is it a NUMA system? This happens when a node tries to access memory connected
to the other CPU.
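
A quick way to double-check the layout is to ask sysfs which node each 82599
port is attached to and which CPUs are local to that node, then compare with
the IRQ affinities in /proc/irq/*/smp_affinity. A minimal userspace sketch
(the default interface name "eth2" is only a placeholder, not taken from the
thread; the sysfs paths are the standard PCI/NUMA attributes):

/* nic_node.c - hedged sketch: print the NUMA node of a NIC's PCI device and
 * the CPUs local to that node, using standard sysfs attributes.
 * Build: gcc -Wall -o nic_node nic_node.c */
#include <stdio.h>

static int read_int(const char *path)
{
	FILE *f = fopen(path, "r");
	int v = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}

int main(int argc, char **argv)
{
	const char *ifname = argc > 1 ? argv[1] : "eth2";  /* placeholder name */
	char path[256], cpulist[256];
	FILE *f;
	int node;

	snprintf(path, sizeof(path), "/sys/class/net/%s/device/numa_node", ifname);
	node = read_int(path);
	if (node < 0) {
		fprintf(stderr, "%s: no NUMA node reported (non-PCI device?)\n", ifname);
		return 1;
	}
	printf("%s is attached to NUMA node %d\n", ifname, node);

	snprintf(path, sizeof(path), "/sys/devices/system/node/node%d/cpulist", node);
	f = fopen(path, "r");
	if (f && fgets(cpulist, sizeof(cpulist), f))
		printf("CPUs local to node %d: %s", node, cpulist);
	if (f)
		fclose(f);
	return 0;
}

If a port's RX queues are serviced by CPUs on the other node, every skb fill
crosses the interconnect, which would fit the __netdev_alloc_skb cache-miss
profile, though it does not by itself explain the 80-second periodicity.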

Abu Rasheda
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Non-consistent CPU usage in IP forwarding test

2014-04-03 Thread Oleg A. Arkhangelsky
Hello all,

We've got very strange behavior when testing IP packet forwarding performance
on a Sandy Bridge platform (Supermicro X9DRH with the latest BIOS). This is a
two-socket E5-2690 CPU system. Using a different PC we're generating DDoS-like
traffic at a rate of about 4.5 million packets per second. Traffic is received
by two Intel 82599 NICs and forwarded using the second port of one of these
NICs. All load is evenly distributed across the two nodes, so SI usage is
virtually equal on each of the 32 CPUs.

Now the strangest part. A few moments after pktgen starts on the traffic
generator PC, average CPU usage on the SB system goes to 30-35%. No packet
drops, no rx_missed_errors, no rx_no_dma_resources. Very nice. But SI usage
starts to decrease gradually. After about 10 seconds we see ~15% SI on average
across all CPUs. Still no packet drops, the same RX rate as in the beginning,
and the RX packet count is equal to the TX packet count. After some time the
average SI usage starts to go up again. Having peaked at the initial 30-35%,
it drops back to 15%. This pattern repeats every 80 seconds, and the interval
is very stable. It is undoubtedly tied to the test start time, because if we
start the test, interrupt it after 10 seconds and start it again, we see the
same 30% SI peak a few moments later, and all subsequent timings stay the same.

During the high-load periods we see this in "perf top -e cache-misses":

14017.00 24.9% __netdev_alloc_skb   [kernel.kallsyms]
 5172.00  9.2% _raw_spin_lock       [kernel.kallsyms]
 4722.00  8.4% build_skb            [kernel.kallsyms]
 3603.00  6.4% fib_table_lookup     [kernel.kallsyms]

During the "15% load time" top is different:

11090.00 20.9% build_skb            [kernel.kallsyms]
 4879.00  9.2% fib_table_lookup     [kernel.kallsyms]
 4756.00  9.0% ipt_do_table
/lib/modules/3.12.15-BUILD-g2e94e30-dirty/kernel/net/ipv4/netfilter/ip_tables.ko
 3042.00  5.7% nf_iterate           [kernel.kallsyms]

And __netdev_alloc_skb is at the end of the list:

  911.00  0.5% __netdev_alloc_skb [kernel.kallsyms] 

Some info from "perf stat -a sleep 2":

15% SI:
   28640006291 cycles         # 0.447 GHz   [83.23%]
   38764605205 instructions   # 1.35  insns per cycle

30% SI:
   56225552442 cycles         # 0.877 GHz   [83.23%]
   39718182298 instructions   # 0.71  insns per cycle

CPUs never go above the C1 state, and the core clock reported in /proc/cpuinfo
is constant at 2899.942 MHz for all cores. ASPM is disabled.
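
As a rough cross-check (assuming the cycle counts above are unhalted core
cycles summed over all 32 CPUs for the 2-second window), the reported GHz
figures line up with the SI percentages, and the instruction counts are nearly
identical in both phases; the same forwarding work simply takes about twice as
many cycles during the 30% phase:

/* si_check.c - back-of-the-envelope check of the perf stat numbers above.
 * Assumes "cycles" is unhalted cycles aggregated over 32 CPUs for 2 s. */
#include <stdio.h>

int main(void)
{
	const double nominal_mhz = 2899.942;        /* /proc/cpuinfo, from the report */
	const double interval_s  = 2.0;             /* perf stat -a sleep 2 */
	const int    ncpus       = 32;
	const double cycles_15   = 28640006291.0;   /* "15% SI" sample */
	const double cycles_30   = 56225552442.0;   /* "30% SI" sample */

	double mhz_15 = cycles_15 / interval_s / ncpus / 1e6;
	double mhz_30 = cycles_30 / interval_s / ncpus / 1e6;

	/* unhalted cycles / nominal clock ~ fraction of time spent busy,
	 * which should track the SI percentage if nothing else runs */
	printf("15%% SI sample: %.1f MHz avg -> %.1f%% busy\n",
	       mhz_15, 100.0 * mhz_15 / nominal_mhz);
	printf("30%% SI sample: %.1f MHz avg -> %.1f%% busy\n",
	       mhz_30, 100.0 * mhz_30 / nominal_mhz);
	return 0;
}

That comes out to roughly 15.4% and 30.3% busy, matching the observed SI, so
the counters look self-consistent; the open question is why the same
instruction stream periodically needs about twice the cycles.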

All non-essential userspace applications were explicitly killed for the
duration of the test, and there were no active cron jobs either, so we can
assume no interference from userspace.

The kernel version is 3.12.15 (ixgbe 3.21.2), but we see the same behavior
with the ancient 2.6.35 (ixgbe 3.10.16), although on 2.6.35 we sometimes get a
160-170 second interval and different symbols in the "perf top" output
(especially local_bh_enable(), which completely blows my mind).

Does anybody have any thoughts on the reasons for this kind of behavior?
The Sandy Bridge CPU has many uncore/offcore events that I can sample; maybe
some of them can shed light on it?
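
Before digging into raw offcore/uncore event codes, one possible starting
point is the kernel's generic NUMA cache events (exposed by perf as
node-loads / node-load-misses where the mapping exists). Below is a hedged
sketch that uses perf_event_open() to count node load misses system-wide for
two seconds; whether this generic event maps onto a useful counter on Sandy
Bridge depends on the kernel's event mapping, so treat the numbers as
indicative only:

/* node_miss.c - hedged sketch: count NUMA node load misses system-wide for
 * 2 seconds via the generic PERF_COUNT_HW_CACHE_NODE event, one counter per
 * CPU. Needs root (or a permissive perf_event_paranoid).
 * Build: gcc -Wall -o node_miss node_miss.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	int *fds = calloc(ncpus, sizeof(*fds));
	struct perf_event_attr attr;
	long long total = 0;
	int cpu;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HW_CACHE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CACHE_NODE |
		      (PERF_COUNT_HW_CACHE_OP_READ << 8) |
		      (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);

	for (cpu = 0; cpu < ncpus; cpu++) {
		/* pid = -1 plus an explicit cpu = system-wide counting on that CPU */
		fds[cpu] = perf_event_open(&attr, -1, cpu, -1, 0);
		if (fds[cpu] < 0) {
			perror("perf_event_open");
			return 1;
		}
	}

	sleep(2);	/* same window as "perf stat -a sleep 2" */

	for (cpu = 0; cpu < ncpus; cpu++) {
		long long count = 0;

		if (read(fds[cpu], &count, sizeof(count)) == sizeof(count))
			total += count;
		close(fds[cpu]);
	}
	printf("node load misses over 2 s: %lld\n", total);
	free(fds);
	return 0;
}

Running this (or the equivalent perf stat invocation, if node-load-misses
shows up in "perf list" on this kernel) separately inside the 30% and 15%
windows would show whether the extra cycles in the 30% phase really are
remote-node misses or whether something else is stalling the cores.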

Thank you!

--
wbr, Oleg.

"Anarchy is about taking complete responsibility for yourself."
  Alan Moore.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Memory profiling tools for Linux Kernel

2014-04-03 Thread Kumar Amit Mehta
I am looking for memory-profiling tools for the Linux kernel. I wish to
analyze memory usage statistics by comparing the results, with and without the
use of lookaside caches, for a given consumer (say, a certain driver). I found
some tools such as kmemcheck [1] and KEDR [2], but before I go further and
explore these tools, I was wondering whether somebody has already used them to
gather similar statistics, or whether I should try some other tool.

[1] https://www.kernel.org/doc/Documentation/kmemcheck.txt
[2] http://kedr.berlios.de/
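
In case a concrete picture of the comparison helps: the main observable
difference is how the two styles are accounted in /proc/slabinfo. A hedged
kernel-module sketch follows (the struct and names are made up for
illustration; whether the generic cache appears as size-256 or kmalloc-256
depends on whether the kernel uses SLAB or SLUB):

/* lookaside_demo.c - hedged sketch of the two allocation styles being
 * compared: a dedicated lookaside cache (kmem_cache) versus plain kmalloc(). */
#include <linux/module.h>
#include <linux/slab.h>

struct demo_obj {
	unsigned long id;
	char payload[200];
};

static struct kmem_cache *demo_cache;
static struct demo_obj *from_cache, *from_kmalloc;

static int __init demo_init(void)
{
	/* With a lookaside cache the objects live in a dedicated slab that
	 * shows up under its own name ("demo_obj") in /proc/slabinfo. */
	demo_cache = kmem_cache_create("demo_obj", sizeof(struct demo_obj),
				       0, SLAB_HWCACHE_ALIGN, NULL);
	if (!demo_cache)
		return -ENOMEM;

	from_cache = kmem_cache_alloc(demo_cache, GFP_KERNEL);

	/* Without one, the object is rounded up to a generic size class and
	 * accounted there, mixed in with every other kmalloc() user. */
	from_kmalloc = kmalloc(sizeof(struct demo_obj), GFP_KERNEL);

	if (!from_cache || !from_kmalloc) {
		kfree(from_kmalloc);
		if (from_cache)
			kmem_cache_free(demo_cache, from_cache);
		kmem_cache_destroy(demo_cache);
		return -ENOMEM;
	}
	return 0;
}

static void __exit demo_exit(void)
{
	kfree(from_kmalloc);
	kmem_cache_free(demo_cache, from_cache);
	kmem_cache_destroy(demo_cache);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

With the dedicated cache the consumer's footprint is its own slabinfo row;
with plain kmalloc() it is hidden inside the shared size class, which is why
the usual approach is to diff /proc/slabinfo (or watch slabtop) before and
after exercising the consumer. Note that kmemcheck is aimed at catching reads
of uninitialized memory rather than producing usage statistics.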

Thanks,
Kumar

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


#lsfmmsummit

2014-04-03 Thread parinay
http://lwn.net/Articles/LSFMM2014/

passing on.

cheers

-- 
easy is right
begin right and you're easy
continue easy and you're right
the right way to go easy is to forget the right way
and forget that the going is easy

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


/proc/slabinfo

2014-04-03 Thread Pietro Paolini
Hello everyone,

I am experiencing a problem where the system reboots due to low-memory
conditions. Looking at /proc/slabinfo, I can see that a lot of memory falls
under the size-1024 line, which reports a high number of active objects. To
debug that, I tried to compile with:

CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

Now I am able to cat /proc/slab_allocators, but neither that cache name nor
any of the size-* caches are present in its output. Do you know why?
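
Not directly about /proc/slab_allocators, but while tracking which caches grow
it can help to rank /proc/slabinfo by the memory its active objects hold,
which is roughly what slabtop(1) reports. A minimal sketch, assuming the usual
"name active_objs num_objs objsize ..." column layout:

/* slabrank.c - hedged sketch: rank /proc/slabinfo caches by the memory held
 * by their active objects (roughly what slabtop(1) shows).
 * Build: gcc -Wall -o slabrank slabrank.c   (usually needs root to read) */
#include <stdio.h>
#include <stdlib.h>

struct entry {
	char name[64];
	unsigned long bytes;
};

static int cmp(const void *a, const void *b)
{
	const struct entry *x = a, *y = b;

	return (y->bytes > x->bytes) - (y->bytes < x->bytes);	/* descending */
}

int main(void)
{
	FILE *f = fopen("/proc/slabinfo", "r");
	struct entry e[1024];
	char line[512];
	int n = 0, i;

	if (!f) {
		perror("/proc/slabinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f) && n < 1024) {
		char name[64];
		unsigned long active, total, objsize;

		/* data lines: <name> <active_objs> <num_objs> <objsize> ... ;
		 * the two header lines fail this sscanf and are skipped */
		if (sscanf(line, "%63s %lu %lu %lu",
			   name, &active, &total, &objsize) != 4)
			continue;
		snprintf(e[n].name, sizeof(e[n].name), "%s", name);
		e[n].bytes = active * objsize;
		n++;
	}
	fclose(f);

	qsort(e, n, sizeof(e[0]), cmp);
	for (i = 0; i < 10 && i < n; i++)
		printf("%-30s %10lu KiB\n", e[i].name, e[i].bytes / 1024);
	return 0;
}

This will not answer why slab_allocators is missing entries, but it narrows
down which cache to watch while reproducing the low-memory condition.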

Best Regards,
Pietro Paolini
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies