On 08/06/09 02:37, Henrik Johansen wrote:
> Piyush Shivam wrote:
>> On 08/05/09 15:53, Henrik Johansen wrote:
>>> Hi list,
>>>
>>> I have 2 servers which are directly connected via ixgbe based NICs,
>>> both running OpenSolaris 2009.06.
>>>
>>> The actual network connection seems fine: iperf reports ~6.3 Gbits/sec
>>> in terms of throughput, and nicstat seems to agree that the NICs are
>>> ~63% utilized.
>>>
>>> Iperf:
>>> henrik at opensolaris:~# ./iperf-2.0.4/src/iperf -c 10.10.10.2 -N -t 40
>>> ------------------------------------------------------------
>>> Client connecting to 10.10.10.2, TCP port 5001
>>> TCP window size: 391 KByte (default)
>>> ------------------------------------------------------------
>>> [  3] local 10.10.10.3 port 56583 connected with 10.10.10.2 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  3]  0.0-40.0 sec  29.3 GBytes  6.29 Gbits/sec
>>>
>>> Nicstat:
>>> henrik at naz01:/tmpfs# /export/home/henrik/nicstat -i ixgbe0 2
>>>     Time    Int    rKB/s   wKB/s    rPk/s    wPk/s    rAvs   wAvs  %Util      Sat
>>> 21:13:02 ixgbe0   776175  1222.1  96592.9  18961.7  8228.4  66.00   63.7  83018.3
>>> 21:13:04 ixgbe0   773081  1217.2  96221.2  18885.3  8227.2  66.00   63.4  82717.5
>>>
>>> To measure the NFS throughput over this link I have created a tmpfs
>>> filesystem on the server to avoid the synchronous-writes issue as much
>>> as possible.
>>>
>>> Client:
>>> henrik at opensolaris:~# mount | grep /nfs
>>> /nfs on 10.10.10.2:/tmpfs
>>> remote/read/write/setuid/devices/forcedirectio/xattr/dev=4dc0007 on
>>> Wed Aug  5 20:06:25 2009
>>>
>>> Server:
>>> henrik at naz01:/tmpfs# share | grep tmpfs
>>> -               /tmpfs   sec=sys,root=10.10.10.3   ""
>>> henrik at naz01:/tmpfs# mount | grep tmpfs
>>> /tmpfs on swap read/write/setuid/devices/xattr/dev=4b80006 on
>>> Wed Aug  5 21:59:31 2009
>>>
>>> I have set the 'forcedirectio' option on the client mount to ensure
>>> that the client's cache gets circumvented.
>>>
>>> Using the randomwrite microbenchmark in filebench ($filesize set to
>>> 1gb) I get:
>>>
>>> Local on tmpfs:
>>> IO Summary: 5013937 ops, 82738.5 ops/s, (0/82738 r/w) 646.4mb/s,
>>>     71us cpu/op, 0.0ms latency
>>>
>>> Tmpfs over NFS:
>>> IO Summary: 383488 ops, 6328.2 ops/s, (0/6328 r/w) 49.4mb/s,
>>>     65us cpu/op, 0.2ms latency
>>>
>>> These are 2 fully populated 4-socket machines - why the extremely low
>>> transfer speed?
>>
>> randomwrite.f is a single-threaded workload (assuming you are using the
>> randomwrite.f filebench workload), which may not be sending enough work
>> to the server to begin with. If you drive the number of threads in the
>> workload higher (modify the $nthreads variable in randomwrite.f), you
>> should see better numbers, unless some other limit in the system kicks
>> in. You can examine the CPU utilization of the client (and the server)
>> machine to make sure that the client is busy sending work to the server.
>
> It indeed is the randomwrite.f workload.
>
> Now, using 256 threads I can actually push the numbers:
> IO Summary: 2429950 ops, 40099.1 ops/s, (0/40099 r/w) 313.2mb/s,
>     75us cpu/op, 5.9ms latency
>
> CPU utilisation on the client is about 25 percent - the server hovers
> around 50%.
>
> Sadly this is not what I wanted to do - I need to test and measure the
> maximum randomwrite / randomread throughput over very few NFS
> connections, since this will be the production workload for these
> machines.
>
> If I understand you correctly then filebench is the culprit and simply
> not pushing the server hard enough?

I would not say that filebench is the culprit. Filebench is a
parameterizable workload generator, and its behavior will be consistent
with the parameters you set.

> Any ideas about how I can measure a light-threads scenario?

If the goal is to drive the server to its limits for a given workload
(in your case random read / random write), there needs to be enough
requests/work in the pipeline for the server to process.
One common way to do that is to have a large number of threads per
client and/or multiple clients issuing load to the server. If you need
to generate a significant load on the server while keeping the
utilization of each client low, you can do so by using multiple clients
with a small number of threads per client.
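To make the $nthreads change concrete, here is a minimal sketch. The
workload path and the 256-thread figure are examples, and the snippet
writes a one-line stand-in for randomwrite.f so it is self-contained; in
practice you would copy the real randomwrite.f into place first:

```shell
# Sketch, not a verified recipe: the stock randomwrite.f workload ships
# with "set $nthreads=1"; rewrite that to 256 threads in a local copy.
# WORKLOAD is a stand-in path -- point it at a copy of the real file.
WORKLOAD=/tmp/randomwrite-256.f

# For self-containment this sketch writes the one relevant line itself;
# in practice: cp .../workloads/randomwrite.f "$WORKLOAD"
printf 'set $nthreads=1\n' > "$WORKLOAD"

# Solaris sed has no -i flag, so edit via a temporary file.
sed 's/^set \$nthreads=.*/set $nthreads=256/' "$WORKLOAD" > "$WORKLOAD.new" &&
    mv "$WORKLOAD.new" "$WORKLOAD"

grep '^set' "$WORKLOAD"    # now shows: set $nthreads=256
# Then run the modified workload, e.g.:  filebench -f "$WORKLOAD"
```

The last step is left as a comment because the filebench invocation
syntax differs between versions.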
You can get the multi-connection scenario working by leveraging the
multi-client capability of filebench:

http://www.solarisinternals.com/wiki/images/f/f8/FileBench_Multi_Client.pdf

-Piyush
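If the built-in multi-client mode turns out to be awkward, a crude
alternative is to fan the same low-thread workload out over ssh from a
control host. This is a generic sketch, not filebench's documented
multi-client mechanism; the hostnames and workload path are placeholders:

```shell
# Generic fan-out sketch (NOT filebench's built-in multi-client mode):
# run the same low-thread workload on several clients concurrently.
# Hostnames and the workload path below are placeholders.
CLIENTS="client1 client2 client3 client4"

# Dry run: print the command each client would execute. To actually
# launch them, drop the leading "echo", background each ssh with "&",
# and follow the loop with "wait".
for host in $CLIENTS; do
    echo "ssh $host filebench -f /tmp/randomwrite-8.f"
done
```

Aggregate throughput is then the sum of each client's IO Summary, with
per-client utilization staying low because each client runs few threads.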