Richard, thanks for weighing in. Nick had also suggested iperf. I agree that cat into netcat is unnecessarily stupid.
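For completeness, the other end of these tests is just a stock iperf server; that invocation (and the -w variant, which I haven't actually run) would look like:

# ./iperf -s                          (on 192.168.168.5, the receiving host)
# ./iperf -c 192.168.168.5 -w 400K    (client side, explicitly requesting a larger socket buffer)

Only the plain client runs below are from my actual output.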
I have applied the following as recommended at <http://www.solarisinternals.com/wiki/index.php/Networks> (thanks Nick):

# ndd -set /dev/tcp tcp_recv_hiwat 400000
# ndd -set /dev/tcp tcp_xmit_hiwat 400000
# ndd -set /dev/tcp tcp_max_buf 2097152
# ndd -set /dev/tcp tcp_cwnd_max 2097152

which improved the situation. Now iperf tells me:

# ./iperf -c 192.168.168.5
------------------------------------------------------------
Client connecting to 192.168.168.5, TCP port 5001
TCP window size:  391 KByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.42 GBytes  2.93 Gbits/sec

# ./iperf -c 192.168.168.5 -P10
------------------------------------------------------------
Client connecting to 192.168.168.5, TCP port 5001
TCP window size:  391 KByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  4]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  6]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[ 10]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  8]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  3]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[  5]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  7]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  9]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[ 11]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[SUM]  0.0-10.0 sec  2.93 GBytes  2.51 Gbits/sec

Observing CPU utilization during the test with mpstat, I see that all cores but one are mostly idle, while one core goes to 100%, even when running iperf with a single thread.

Based on this, Nick suggested I try increasing rx_queue_number and tx_queue_number for the ixgbe driver. AFAICS I would need to do that in /kernel/drv/ixgbe.conf, which on SmartOS lives in the read-only platform image, so it seems I would have to do something like <http://dtrace.org/blogs/wesolows/2013/12/28/anonymous-tracing-on-smartos/> to get the change into the boot archive. Or is there a simpler way?
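For concreteness, this is the kind of stanza I have in mind; rx_queue_number and tx_queue_number are the property names Nick pointed me at in the illumos ixgbe driver, but the value of 8 is just a guess on my part and untested:

# /kernel/drv/ixgbe.conf (sketch, untested)
# Spread rx/tx processing across more rings so the load
# isn't serialized onto a single core.
rx_queue_number = 8;
tx_queue_number = 8;

(The usual advice for multi-queue NICs seems to be to match the queue count to the number of cores you want the interrupt/fanout load spread across, which is why 8 is only a placeholder.)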
Thanks,
Chris

On 21.07.2014, at 00:23, Richard Elling via smartos-discuss <[email protected]> wrote:

> On Jul 19, 2014, at 6:42 AM, Chris Ferebee via smartos-discuss
> <[email protected]> wrote:
> 
>> I'm trying to debug a network performance issue.
>> 
>> I have two servers running SmartOS (20140613T024634Z and 20140501T225642Z),
>> one is a Supermicro dual Xeon E5649 (64 GB RAM) and the other is a dual Xeon
>> E5-2620v2 (128 GB RAM). Each has an Intel X520-DA1 10GbE card, and they are
>> both connected to 10GbE ports on a NetGear GS752TXS switch.
>> 
>> The switch reports 10GbE links:
>> 
>> 1/xg49  Enable  10G Full  10G Full  Link Up  Enable  1518  20:0C:C8:46:C8:3E  49  49
>> 1/xg50  Enable  10G Full  10G Full  Link Up  Enable  1518  20:0C:C8:46:C8:3E  50  50
>> 
>> as do both hosts:
>> 
>> [root@90-e2-ba-00-2a-e2 ~]# dladm show-phys
>> LINK     MEDIA      STATE  SPEED  DUPLEX  DEVICE
>> igb0     Ethernet   down   0      half    igb0
>> igb1     Ethernet   down   0      half    igb1
>> ixgbe0   Ethernet   up     10000  full    ixgbe0
>> 
>> [root@00-1b-21-bf-e1-b4 ~]# dladm show-phys
>> LINK     MEDIA      STATE  SPEED  DUPLEX  DEVICE
>> igb0     Ethernet   down   0      half    igb0
>> ixgbe0   Ethernet   up     10000  full    ixgbe0
>> igb1     Ethernet   down   0      half    igb1
>> 
>> Per dladm show-linkprop, maxbw is not set on either of the net0 vnic
>> interfaces.
>> 
>> And yet, as measured via netcat, throughput is just below 1 Gbit/s:
>> 
>> [root@90-e2-ba-00-2a-e2 ~]# time cat /zones/test/10gb | nc -v -v -n 192.168.168.5 8888
> 
> It's called "netcat" for a reason, why are you cat'ing into it?
> 
>   time nc -v -v -n 192.168.168.5 8888 </zones/test/10gb
> 
>> Connection to 192.168.168.5 8888 port [tcp/*] succeeded!
>> 
>> real    1m34.662s
>> user    0m11.422s
>> sys     1m53.957s
>> 
>> (In this test, 10gb is a test file that is warm in RAM and transfers via dd
>> to /dev/null at approx. 2.4 GByte/s.)
>> 
>> What could be causing the slowdown, and how might I go about debugging this?
> 
> nc doesn't buffer, so a pipeline of data flowing through cat <-> nc <-> network <-> nc <-> ??
> is susceptible to delays at any stage rippling their latency back to the far end. You're better
> off testing performance with proper network performance testing tools like iperf, where such
> things are not in the design.
> 
>  -- richard
> 
>> FTR, disk throughput, while not an issue here, appears to be perfectly
>> reasonable, approx. 900 MB/s read performance.
>> 
>> Thanks for any pointers!
>> 
>> Chris
> 
> --
> [email protected]
> +1-760-896-4422
