Hi Sagi-

> On Nov 11, 2015, at 4:28 AM, Sagi Grimberg <sa...@dev.mellanox.co.il> wrote:
> 
>> I’d like to see our NFS server use the local DMA lkey where it
>> makes sense, to avoid the cost of registering and invalidating
>> memory.
>> 
>> I have to agree with Tom that once the device’s s/g limit is
>> exceeded, the server has to post an RDMA Read WR every few
>> pages, which appears to get expensive for large NFS requests.
>> 
>> The current mechanism of statically choosing either FRMR or
>> local DMA depending on the device is probably not adequate.
>> Maybe we could post all the Read WRs via a single chain? Or
>> stick with FRMR when the amount of data to read is significant.
>> 
>> I’ve also tested Christoph’s patch. The logic currently in
>> rdma_read_chunk_lcl does not seem up to the task. Once the
>> device’s s/g limit is exceeded, the server starts throwing
>> local length exceptions, and the client workload hangs.
> 
> This is probably because this code path hasn't been reached/tested
> in a long time: it has been hard to find a device that doesn't
> support FRWR for quite a while now…

An alternate explanation is that the provider is not setting
device->max_sge_rd properly. rdma_read_chunk_lcl() seems to
be the only thing in my copy of the kernel tree that relies on
that value.

I’ve reproduced the local length errors with CX-2 and CX-3
Pro, which both set that field to 32. If I artificially
set that field to 30, I don’t see any issue.
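
To make the arithmetic concrete, here is a rough sketch of how the
per-WR s/g limit translates into the number of RDMA Read WRs the
server has to post. This is not the svcrdma code. The helper names
and the 4 KiB page size are assumptions for illustration, and it
assumes one SGE per page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; the real value is architecture-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: number of pages touched by a payload of
 * 'len' bytes that starts 'page_offset' bytes into its first page.
 */
static size_t sketch_pages_spanned(size_t page_offset, size_t len)
{
	assert(page_offset < SKETCH_PAGE_SIZE);
	if (len == 0)
		return 0;
	return (page_offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: number of RDMA Read WRs needed when each WR
 * may carry at most 'sge_limit' SGEs (for example, the value the
 * provider reports in max_sge_rd) and each SGE covers one page.
 */
static size_t sketch_read_wrs_needed(size_t pages, unsigned int sge_limit)
{
	assert(sge_limit > 0);
	return (pages + sge_limit - 1) / sge_limit;
}
```

With these assumptions, a page-aligned 1 MiB payload spans 256
pages, which is 8 Read WRs at a limit of 32 and 9 at a limit of 30.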

Is commit 18ebd40773bf correct?


--
Chuck Lever



