Greetings,
Stefan Lambrev wrote:
Greetings,
In my desire to increase network throughput, and to be able to handle
more than ~250-270 kpps, I started experimenting with lagg and the
Link Aggregation Control Protocol (LACP).
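For reference, the setup follows the usual lagg(4) recipe, with em0 and em2
as the ports; this is from memory rather than a verbatim copy of the exact
commands, and the address on lagg0 is assumed:

# kldload if_lagg
# ifconfig lagg0 create
# ifconfig lagg0 laggproto lacp laggport em0 laggport em2
# ifconfig lagg0 inet 10.3.3.1 netmask 255.255.255.0 up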
To my surprise, this does not increase the number of packets my server
can handle.
Here is what netstat reports:
netstat -w1 -I lagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes  colls
    267180     0   16030806     254056     0   14735542      0
    266875     0   16012506     253829     0   14722260      0

netstat -w1 -I em0
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes  colls
    124789 72976    7487340     115329     0    6690468      0
    126860 67350    7611600     114769     0    6658002      0

netstat -w1 -I em2
            input          (em2)           output
   packets  errs      bytes    packets  errs      bytes  colls
    123695 65533    7421700     113575     0    6584856      0
    130277 62646    7816626     113648     0    6592280      0
    123545 64171    7412706     113714     0    6596174      0
Using lagg does not improve the situation at all, and no errors are
reported on lagg0, even though the member interfaces show them.
Using lagg also increased context switches:
 procs      memory      page                     disk  faults        cpu
 r b w     avm     fre   flt  re  pi  po  fr  sr ad4    in    sy    cs us sy id
 1 0 0   81048 1914640    52   0   0   0  50   0   0  3036 37902 13512  1 20 79
 0 0 0   81048 1914640    13   0   0   0   0   0   0  9582    83 22166  0 56 44
 0 0 0   81048 1914640    13   0   0   0   0   0   0  9594    80 22028  0 55 45
 0 0 0   81048 1914640    13   0   0   0   0   0   0  9593    82 22095  0 56 44
top showed about 55% system time in the CPU states, which seems quite high.
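To break down where that system time goes (interrupt handling vs. the
stack itself), something like the following should help; this is just the
usual tooling, not output I have collected yet:

# top -SH          (include kernel threads, e.g. interrupt and driver threads)
# vmstat -i        (cumulative interrupt counts per device)
# systat -vmstat 1 (live interrupt rates)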
I'll use hwpmc and LOCK_PROFILING to see where the kernel spends its
time.
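Roughly the plan is the standard hwpmc(4)/pmcstat(8) and LOCK_PROFILING
procedure; the event name, file paths and sysctl names below are taken from
the man pages, so treat this as a sketch rather than the exact command lines:

# kldload hwpmc
# pmcstat -S instructions -O /tmp/samples.out    (sample while the test traffic runs)
# pmcstat -R /tmp/samples.out -k /boot/kernel/kernel -g
# gprof /boot/kernel/kernel <event-dir>/kernel.gmon > kernel-profile.txt

For LOCK_PROFILING, rebuild the kernel with "options LOCK_PROFILING", then:

# sysctl debug.lock.prof.enable=1
  ... run the test ...
# sysctl debug.lock.prof.enable=0
# sysctl debug.lock.prof.stats > lock_profiling.txt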
Here is what hwpmc shows (without using lagg):
  %   cumulative      self               self     total
 time    seconds     seconds    calls  ms/call   ms/call  name
 14.7   325801.00   325801.00       0  100.00%            MD5Transform [1]
  8.4   512008.00   186207.00       0  100.00%            _mtx_unlock_flags [2]
  6.1   646787.00   134779.00       0  100.00%            _mtx_lock_flags [3]
  5.6   769909.00   123122.00       0  100.00%            uma_zalloc_arg [4]
  5.0   879853.00   109944.00       0  100.00%            rn_match [5]
  3.5   957294.00    77441.00       0  100.00%            memcpy [6]
  3.1  1025989.00    68695.00       0  100.00%            bzero [7]
  2.8  1087273.00    61284.00       0  100.00%            em_encap [8]
  2.6  1145231.00    57958.00       0  100.00%            ip_output [9]
  2.5  1200105.00    54874.00       0  100.00%            bus_dmamap_load_mbuf_sg [10]
  2.3  1251626.00    51521.00       0  100.00%            syncache_add [11]
  2.1  1297826.50    46200.50       0  100.00%            syncache_lookup [12]
  2.1  1343661.50    45835.00       0  100.00%            tcp_input [13]
  1.8  1383912.00    40250.50       0  100.00%            ip_input [14]
  1.5  1417997.00    34085.00       0  100.00%            syncache_respond [15]
  1.5  1451114.50    33117.50       0  100.00%            uma_zfree_internal [16]
  1.5  1484046.00    32931.50       0  100.00%            critical_exit [17]
  1.5  1516899.00    32853.00       0  100.00%            MD5Update [18]
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:58:11:a5
        inet 10.3.3.1 netmask 0xffffff00 broadcast 10.3.3.255
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
Is it normal for so much time to be spent in MD5Transform with tx/rx enabled?
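A quick way to check whether the RXCSUM/TXCSUM/TSO4 offloads in the em0
options line above play a role would be to switch them off for a comparison
run; the ifconfig flags are from ifconfig(8) and I have not tried this yet:

# ifconfig em0 -tso4 -txcsum -rxcsum
  ... repeat the benchmark ...
# ifconfig em0 tso4 txcsum rxcsum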
LOCK_PROFILING results here - http://89.186.204.158/lock_profiling2.txt
--
Best Wishes,
Stefan Lambrev
ICQ# 24134177