On 4/30/2013 10:23 AM, Yan Burman wrote:
-----Original Message-----
From: Tom Talpey [mailto:t...@talpey.com]
On Sun, Apr 28, 2013 at 06:28:16AM +0000, Yan Burman wrote:
I finally got up to 4.1GB/sec bandwidth with RDMA (IPoIB-CM bandwidth is
also way higher now).
For some reason, when I had the Intel IOMMU enabled, the performance
dropped significantly.
I now get up to ~95K IOPS and 4.1GB/sec bandwidth.

Excellent, but is that 95K IOPS a typo? At 4KB, that's less than 400MBps.


That is not a typo. I get 95K IOPS with a randrw test at 4K block size,
and 4.1GB/sec with a randread test at 256K block size.
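
As a cross-check of those two figures, here is a minimal C sketch of the
IOPS-versus-bandwidth arithmetic behind Tom's "less than 400MBps" remark.
The 95K/4K and 4.1GB/256K numbers are the ones quoted above; everything
else is plain arithmetic, not measured data.

#include <stdio.h>

/* bandwidth = IOPS * block size, using the figures quoted in the thread */
int main(void)
{
        double iops_4k = 95000.0;          /* randrw, 4K blocks */
        double bs_4k   = 4.0 * 1024;       /* bytes */
        double bw_4k   = iops_4k * bs_4k;  /* bytes/sec */

        double bw_256k = 4.1e9;            /* randread, 256K blocks, bytes/sec */
        double bs_256k = 256.0 * 1024;     /* bytes */
        double iops_256k = bw_256k / bs_256k;

        printf("4K randrw:     %.0f MB/s at 95K IOPS\n", bw_4k / 1e6);
        printf("256K randread: ~%.0f IOPS at 4.1 GB/s\n", iops_256k);
        return 0;
}

That works out to roughly 389 MB/s for the 4K case versus about 15.6K IOPS
for the 256K case, which is why the two workloads need to be judged by
different metrics.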

Well, then I suggest you decide whether your goal is high bandwidth or
high IOPS. They are two very different things, and clearly there are
still significant issues to track down in the server.

What is the client CPU percentage you see under this workload, and how
different are the NFS/RDMA and NFS/IPoIB overheads?

NFS/RDMA uses about 20-30% more CPU than NFS/IPoIB, but RDMA delivers
almost twice the bandwidth of IPoIB.

So, for 125% of the CPU, RDMA is delivering 200% of the bandwidth.
A common reporting approach is to calculate cycles per Byte (roughly,
CPU/MB/sec), and you'll find this can be a great tool for comparison
when overhead is a consideration.
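
As an illustration of that cycles-per-byte comparison, here is a minimal
C sketch. Only the ratios (roughly 125% of the CPU for 200% of the
bandwidth) come from the thread; the core count, clock speed, and the
absolute utilization and bandwidth values are assumed purely for the
example.

#include <stdio.h>

/* cycles/byte = (CPU cycles consumed per second) / (bytes moved per second) */
static double cycles_per_byte(double cpu_util, int cores, double ghz,
                              double mb_per_sec)
{
        double cycles_used = cpu_util * cores * ghz * 1e9; /* cycles/sec */
        return cycles_used / (mb_per_sec * 1e6);           /* cycles/byte */
}

int main(void)
{
        int cores = 8;          /* assumed host configuration */
        double ghz = 2.5;       /* assumed clock speed */

        /* Hypothetical numbers consistent with the ratios above:
         * IPoIB at 20% CPU for ~2.0 GB/s, RDMA at 25% CPU for ~4.1 GB/s. */
        double cpb_ipoib = cycles_per_byte(0.20, cores, ghz, 2000.0);
        double cpb_rdma  = cycles_per_byte(0.25, cores, ghz, 4100.0);

        printf("IPoIB: %.2f cycles/byte\n", cpb_ipoib);
        printf("RDMA:  %.2f cycles/byte\n", cpb_rdma);
        return 0;
}

With those assumed inputs the RDMA path comes out around 1.2 cycles/byte
versus 2.0 for IPoIB, i.e. lower per-byte overhead even though its
absolute CPU usage is higher.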

Overall, CPU usage gets up to about 20% for randread and 50% for randwrite.

This is *client* CPU? Writes require the server to take additional
overhead to make RDMA Read requests, but the client side is doing
practically the same thing for the read vs write path. Again, you
may want to profile more deeply to track that difference down.
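
For context on why the write path costs the server extra work: the server
pulls the client's write data by issuing RDMA Read requests. The snippet
below is only a user-space libibverbs sketch of that operation, not the
actual kernel svcrdma code path; the qp, mr, buffer, remote_addr and rkey
are assumed to have been set up during connection and memory registration.

#include <stdint.h>
#include <infiniband/verbs.h>

/* Sketch only: post one RDMA Read to pull 'len' bytes of client write
 * data into a locally registered buffer. */
static int post_rdma_read(struct ibv_qp *qp, struct ibv_mr *mr, void *buf,
                          uint32_t len, uint64_t remote_addr, uint32_t rkey)
{
        struct ibv_sge sge = {
                .addr   = (uintptr_t)buf,
                .length = len,
                .lkey   = mr->lkey,
        };
        struct ibv_send_wr wr = {
                .wr_id      = 1,
                .sg_list    = &sge,
                .num_sge    = 1,
                .opcode     = IBV_WR_RDMA_READ,   /* server pulls client data */
                .send_flags = IBV_SEND_SIGNALED,  /* completion once data arrives */
        };
        struct ibv_send_wr *bad_wr = NULL;

        wr.wr.rdma.remote_addr = remote_addr;     /* advertised by the client */
        wr.wr.rdma.rkey        = rkey;

        return ibv_post_send(qp, &wr, &bad_wr);
}

The read path has no equivalent extra round of work on the server side,
which is one place the randread/randwrite CPU difference could come from.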
