On 08/06/09 02:37, Henrik Johansen wrote:
> Piyush Shivam wrote:
>> On 08/05/09 15:53, Henrik Johansen wrote:
>>> Hi list,
>>>
>>> I have 2 servers which are directly connected via ixgbe based NICs,
>>> both running OpenSolaris 2009.06.
>>>
>>> The actual network connection seems fine: iperf reports ~6.3 Gbits/sec
>>> in terms of throughput, and nicstat seems to agree that the NICs are
>>> ~63% utilized.
>>>
>>> Iperf:
>>> henrik at opensolaris:~# ./iperf-2.0.4/src/iperf -c 10.10.10.2 -N -t 40
>>> ------------------------------------------------------------
>>> Client connecting to 10.10.10.2, TCP port 5001
>>> TCP window size: 391 KByte (default)
>>> ------------------------------------------------------------
>>> [  3] local 10.10.10.3 port 56583 connected with 10.10.10.2 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  3]  0.0-40.0 sec  29.3 GBytes  6.29 Gbits/sec
>>>
>>> Nicstat:
>>> henrik at naz01:/tmpfs# /export/home/henrik/nicstat -i ixgbe0 2
>>>     Time    Int    rKB/s   wKB/s    rPk/s    wPk/s    rAvs   wAvs  %Util      Sat
>>> 21:13:02 ixgbe0   776175  1222.1  96592.9  18961.7  8228.4  66.00   63.7  83018.3
>>> 21:13:04 ixgbe0   773081  1217.2  96221.2  18885.3  8227.2  66.00   63.4  82717.5
>>>
>>> To measure the NFS throughput over this link I have created a tmpfs
>>> filesystem on the server to avoid the synchronous-writes issue as much
>>> as possible.
>>>
>>> Client:
>>> henrik at opensolaris:~# mount | grep /nfs
>>> /nfs on 10.10.10.2:/tmpfs
>>> remote/read/write/setuid/devices/forcedirectio/xattr/dev=4dc0007 on
>>> Wed Aug  5 20:06:25 2009
>>>
>>> Server:
>>> henrik at naz01:/tmpfs# share | grep tmpfs
>>> -               /tmpfs   sec=sys,root=10.10.10.3   ""
>>> henrik at naz01:/tmpfs# mount | grep tmpfs
>>> /tmpfs on swap read/write/setuid/devices/xattr/dev=4b80006 on
>>> Wed Aug  5 21:59:31 2009
>>>
>>> I have set the 'forcedirectio' option on the client mount to ensure
>>> that the client's cache gets circumvented.
>>>
>>> Using the randomwrite microbenchmark in filebench ($filesize set to
>>> 1gb) I get:
>>>
>>> Local on tmpfs:
>>> IO Summary: 5013937 ops, 82738.5 ops/s, (0/82738 r/w) 646.4mb/s,
>>>     71us cpu/op, 0.0ms latency
>>>
>>> Tmpfs over NFS:
>>> IO Summary: 383488 ops, 6328.2 ops/s, (0/6328 r/w) 49.4mb/s,
>>>     65us cpu/op, 0.2ms latency
>>>
>>> These are 2 fully populated 4-socket machines - why the extremely low
>>> transfer speed?
>>
>> randomwrite.f is a single-threaded workload (assuming you are using the
>> randomwrite.f filebench workload), which may not be sending enough work
>> to the server to begin with. If you drive the number of threads in the
>> workload higher (modify the $nthreads variable in randomwrite.f), you
>> should see better numbers, unless some other limit in the system kicks
>> in. You can examine the CPU utilization of the client (and the server)
>> machine to make sure that the client is busy sending work to the server.
>
> It indeed is the randomwrite.f workload.
>
> Now, using 256 threads I can actually push the numbers:
> IO Summary: 2429950 ops, 40099.1 ops/s, (0/40099 r/w) 313.2mb/s,
>     75us cpu/op, 5.9ms latency
>
> CPU utilisation on the client is about 25 percent - the server hovers
> around 50%.
>
> Sadly this is not what I wanted to do - I need to test and measure the
> maximum randomwrite / randomread throughput over very few NFS
> connections, since this will be the production workload for these
> machines.
>
> If I understand you correctly then filebench is the culprit and simply
> not pushing the server hard enough?

I would not say that filebench is the culprit. Filebench is a
parameterizable workload generator, and its behavior will be consistent
with the parameters you set.

> Any ideas about how I can measure a light-threads scenario?

If the goal is to drive the server to its limits for a given workload
(in your case random read / random write), there needs to be enough
requests/work in the pipeline for the server to process.
One common way to do that is to have a large number of threads per
client and/or multiple clients issuing load to the server. If you need
to generate a significant load on the server while keeping the
utilization of each client low, you can do so by using multiple clients
with a small number of threads per client.
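To make the $nthreads change concrete, here is a minimal sketch. The
workload path and the 256-thread figure are examples, and the snippet
writes a one-line stand-in for randomwrite.f so it is self-contained; in
practice you would copy the real randomwrite.f into place first:

```shell
# Sketch, not a verified recipe: the stock randomwrite.f workload ships
# with "set $nthreads=1"; rewrite that to 256 threads in a local copy.
# WORKLOAD is a stand-in path -- point it at a copy of the real file.
WORKLOAD=/tmp/randomwrite-256.f

# For self-containment this sketch writes the one relevant line itself;
# in practice: cp .../workloads/randomwrite.f "$WORKLOAD"
printf 'set $nthreads=1\n' > "$WORKLOAD"

# Solaris sed has no -i flag, so edit via a temporary file.
sed 's/^set \$nthreads=.*/set $nthreads=256/' "$WORKLOAD" > "$WORKLOAD.new" &&
    mv "$WORKLOAD.new" "$WORKLOAD"

grep '^set' "$WORKLOAD"    # now shows: set $nthreads=256
# Then run the modified workload, e.g.:  filebench -f "$WORKLOAD"
```

The last step is left as a comment because the filebench invocation
syntax differs between versions.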
You can get the multi-connection scenario working by leveraging the
multi-client capability of filebench:

http://www.solarisinternals.com/wiki/images/f/f8/FileBench_Multi_Client.pdf

-Piyush
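If the built-in multi-client mode turns out to be awkward, a crude
alternative is to fan the same low-thread workload out over ssh from a
control host. This is a generic sketch, not filebench's documented
multi-client mechanism; the hostnames and workload path are placeholders:

```shell
# Generic fan-out sketch (NOT filebench's built-in multi-client mode):
# run the same low-thread workload on several clients concurrently.
# Hostnames and the workload path below are placeholders.
CLIENTS="client1 client2 client3 client4"

# Dry run: print the command each client would execute. To actually
# launch them, drop the leading "echo", background each ssh with "&",
# and follow the loop with "wait".
for host in $CLIENTS; do
    echo "ssh $host filebench -f /tmp/randomwrite-8.f"
done
```

Aggregate throughput is then the sum of each client's IO Summary, with
per-client utilization staying low because each client runs few threads.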