Richard,

thanks for weighing in. Nick had also suggested iperf, and I agree that piping 
cat into netcat was unnecessary.

I have applied the following as recommended at 
<http://www.solarisinternals.com/wiki/index.php/Networks> (thanks Nick):

# ndd -set /dev/tcp tcp_recv_hiwat 400000                                
# ndd -set /dev/tcp tcp_xmit_hiwat 400000
# ndd -set /dev/tcp tcp_max_buf 2097152
# ndd -set /dev/tcp tcp_cwnd_max 2097152
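
(For the record, the values can be read back with ndd -get to confirm they
took effect, e.g.

# ndd -get /dev/tcp tcp_recv_hiwat
# ndd -get /dev/tcp tcp_max_buf

each of which just prints the value currently in effect.)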

which improved the situation. Now iperf tells me:

# ./iperf -c 192.168.168.5
------------------------------------------------------------
Client connecting to 192.168.168.5, TCP port 5001
TCP window size:  391 KByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.42 GBytes  2.93 Gbits/sec

# ./iperf -c 192.168.168.5 -P10
------------------------------------------------------------
Client connecting to 192.168.168.5, TCP port 5001
TCP window size:  391 KByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  4]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  6]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[ 10]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  8]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  3]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[  5]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  7]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[  9]  0.0-10.0 sec   300 MBytes   252 Mbits/sec
[ 11]  0.0-10.0 sec   300 MBytes   251 Mbits/sec
[SUM]  0.0-10.0 sec  2.93 GBytes  2.51 Gbits/sec

Observing CPU utilization during the test using mpstat, I see that all cores 
but one are mostly idle, and one core goes to 100% utilization, even when 
running iperf with a single thread.
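
(Concretely, that is just watching

# mpstat 1

in a second terminal while iperf runs: the idl column sits near 100 on every
CPU except one, which drops to 0. I expect intrstat 1 would also show whether
that busy CPU is the one servicing the ixgbe interrupts.)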

Nick suggested that based on this, I should try increasing rx_queue_number and 
tx_queue_number for the ixgbe driver. AFAICS, I would need to do that in 
/kernel/drv/ixgbe.conf, which in turn means I need to do something like

        <http://dtrace.org/blogs/wesolows/2013/12/28/anonymous-tracing-on-smartos/>

or is there a simpler way?
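
For concreteness, what I have in mind for ixgbe.conf is something along these
lines (property names as Nick suggested; the queue counts are just a guess on
my part):

        # /kernel/drv/ixgbe.conf -- queue counts below are only an example
        rx_queue_number = 8;
        tx_queue_number = 8;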

Thanks,
Chris


On 21.07.2014, at 00:23, Richard Elling via smartos-discuss 
<[email protected]> wrote:

> On Jul 19, 2014, at 6:42 AM, Chris Ferebee via smartos-discuss 
> <[email protected]> wrote:
> 
>> 
>> I'm trying to debug a network performance issue.
>> 
>> I have two servers running SmartOS (20140613T024634Z and 20140501T225642Z), 
>> one is a Supermicro dual Xeon E5649 (64 GB RAM) and the other is a dual Xeon 
>> E5-2620v2 (128 GB RAM). Each has an Intel X520-DA1 10GbE card, and they are 
>> both connected to 10GbE ports on a NetGear GS752TXS switch.
>> 
>> The switch reports 10GbE links:
>> 
>> 1/xg49   Enable  10G Full  10G Full  Link Up  Enable  1518  20:0C:C8:46:C8:3E  49  49
>> 1/xg50   Enable  10G Full  10G Full  Link Up  Enable  1518  20:0C:C8:46:C8:3E  50  50
>> 
>> as do both hosts:
>> 
>> [root@90-e2-ba-00-2a-e2 ~]# dladm show-phys
>> LINK         MEDIA         STATE      SPEED  DUPLEX    DEVICE
>> igb0         Ethernet      down       0      half      igb0
>> igb1         Ethernet      down       0      half      igb1
>> ixgbe0       Ethernet      up         10000  full      ixgbe0
>> 
>> [root@00-1b-21-bf-e1-b4 ~]# dladm show-phys
>> LINK         MEDIA         STATE      SPEED  DUPLEX    DEVICE
>> igb0         Ethernet      down       0      half      igb0
>> ixgbe0       Ethernet      up         10000  full      ixgbe0
>> igb1         Ethernet      down       0      half      igb1
>> 
>> Per dladm show-linkprop, maxbw is not set on either of the net0 vnic 
>> interfaces.
>> 
>> And yet, as measured via netcat, throughput is just below 1 Gbit/s:
>> 
>> [root@90-e2-ba-00-2a-e2 ~]# time cat /zones/test/10gb | nc -v -v -n 192.168.168.5 8888
> 
> It's called "netcat" for a reason, why are you cat'ing into it?
>       time nc -v -v -n 192.168.168.5 8888  </zones/test/10gb
> 
>> Connection to 192.168.168.5 8888 port [tcp/*] succeeded!
>> 
>> real         1m34.662s
>> user         0m11.422s
>> sys          1m53.957s
>> 
>> (In this test, 10gb is a test file that is warm in RAM and transfers via dd 
>> to /dev/null at approx. 2.4 GByte/s.)
>> 
>> What could be causing the slowdown, and how might I go about debugging this?
> 
> nc doesn't buffer, so a pipeline of data flowing through cat <-> nc <-> network <-> nc <-> ??
> is susceptible to delays at any stage rippling their latency back to the far end. You're
> better off testing performance with proper network performance testing tools like iperf,
> where such things are not in the design.
> 
>  -- richard
> 
> 
>> 
>> FTR, disk throughput, while not an issue here, appears to be perfectly 
>> reasonable, approx. 900 MB/s read performance.
>> 
>> Thanks for any pointers!
>> 
>> Chris
>> 
>> 
>> 
>> 
> 
> --
> 
> [email protected]
> +1-760-896-4422
> 


