Steve,

Thanks for the confirmation.

--Anuj

On Sun, Nov 24, 2013 at 12:57 PM, Steve Wise <sw...@opengridcomputing.com> wrote:
> On 11/22/2013 10:13 PM, Anuj Kalia wrote:
>>
>> Update: I found ways to improve active side performance from 10
>> million RDMA writes per second to 20 million (which I believe is the
>> PCIe bottleneck):
>>
>> 1. Use inline payload - I think this reduces PCIe traffic.
>
>
> Yes, without inline, each IO requires 2 PCIe transactions: 1 to fetch (or
> push) the work request, and one to fetch the payload/data. If you use
> inline, the data is included in the work request. So you cut the required
> transactions in half.
>
>
>> 2. Use non-signalled RDMA writes + don't poll for completion for every
>> write - I don't know if ibv_poll_cq() uses the PCIe much.
>
>
> Each signaled work request generates a completion entry (CQE) which is
> pushed from the adapter into the CQ in host memory. So reducing the number
> of these required also reduces the PCIe transactions.
>
> Steve.
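
The accounting described above can be sketched as a small back-of-the-envelope model. It is only an illustration of the counting in this thread, not a measurement and not a description of any particular adapter. It assumes one PCIe transaction per work request, one more for the payload when the write is not inline, and one per CQE for each signaled write. It ignores doorbells, batching, and anything else the hardware may do. The function name `pcie_transactions` and the parameter `signal_every` are made up for this sketch.

```python
def pcie_transactions(num_writes, inline, signal_every):
    """Rough PCIe transaction count for a run of RDMA writes.

    Simplified model taken from the discussion in this thread (an
    assumption, not a hardware spec):
      - 1 transaction per write for the work request
      - 1 more per write for the payload, unless the payload is inline
      - 1 per signaled write for the CQE pushed into host memory
    signal_every is hypothetical: every signal_every-th write is signaled.
    """
    if num_writes < 0 or signal_every < 1:
        raise ValueError("num_writes must be >= 0 and signal_every >= 1")
    per_write = 1 if inline else 2
    cqes = num_writes // signal_every
    return num_writes * per_write + cqes


if __name__ == "__main__":
    n = 1000
    for inline in (False, True):
        for k in (1, 64):
            total = pcie_transactions(n, inline, k)
            print(f"inline={inline!s:5} signal_every={k:2} -> {total} transactions")
```

Under this model, signaling every write without inline costs three transactions per write, while inline payloads with only occasional signaling approach one per write. That is at least in the same direction as the 10 million to 20 million improvement reported above.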