Re: Why is Infiniband a Lossless medium?

2014-04-21 Thread Nicolas Carlier
On Sat, Apr 19, 2014 at 7:25 PM, Christoph Lameter c...@linux.com wrote: On Fri, 18 Apr 2014, Nicolas Carlier wrote: If the receiving QP does not have buffers available then the HCA will silently drop UD packets. This is something that tripped us up initially. So it's lossless only from HCA

[PATCH] RDMA/cxgb4: Fix memory leaks in c4iw_alloc() error paths

2014-04-21 Thread Christoph Jaeger
c4iw_alloc() bails out without freeing the storage that 'devp' points to. Picked up by Coverity - CID 1204241. Fixes: fa658a98a2 (RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices) Signed-off-by: Christoph Jaeger christophjae...@linux.com --- drivers/infiniband/hw/cxgb4/device.c |
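
The kind of leak described above can be sketched in isolation. This is a minimal illustration, not the cxgb4 code: `struct toy_dev`, `toy_alloc()`, `toy_free()` and the `fail_step` parameter are all hypothetical names invented for the example.

```c
#include <stdlib.h>

/* Hypothetical device object; stands in for the storage 'devp' points to. */
struct toy_dev {
	void *regs;
};

/*
 * fail_step != 0 simulates a later setup step failing.
 * The error path frees devp before returning, which is the step
 * an early "return" on failure would skip.
 */
struct toy_dev *toy_alloc(int fail_step)
{
	struct toy_dev *devp = calloc(1, sizeof(*devp));

	if (!devp)
		return NULL;

	devp->regs = fail_step ? NULL : malloc(64);
	if (!devp->regs) {
		free(devp);	/* without this, devp leaks on the error path */
		return NULL;
	}
	return devp;
}

void toy_free(struct toy_dev *devp)
{
	if (devp) {
		free(devp->regs);
		free(devp);
	}
}
```

A caller only ever sees NULL or a fully set-up object, so it never has to clean up a half-built one.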

RE: [PATCH] RDMA/cxgb4: Fix memory leaks in c4iw_alloc() error paths

2014-04-21 Thread Steve Wise
-Original Message- From: Christoph Jaeger [mailto:christophjae...@linux.com] Sent: Monday, April 21, 2014 10:03 AM To: sw...@chelsio.com; rol...@kernel.org; sean.he...@intel.com; hal.rosenst...@gmail.com Cc: linux-rdma@vger.kernel.org; linux-ker...@vger.kernel.org; Christoph Jaeger

[PATCH librdmacm] rstream: fix -T resolve detection

2014-04-21 Thread Patrick MacArthur
The effect of the check for -T resolve was reversed, so that -T with any invalid value would result in the -T resolve behavior, and -T resolve would result in an error. --- examples/rstream.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/rstream.c
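
A reversed option check of this kind is often caused by treating the result of `strcmp()` as a boolean. The sketch below assumes that cause; the snippet above does not show the actual one-line change, and both helper names are hypothetical.

```c
#include <string.h>

/*
 * Hypothetical helpers, not the actual rstream.c code.
 * Each is meant to return 1 when arg selects "resolve".
 */

/* Reversed: strcmp() returns 0 on a match, so this is 1 for any other value. */
int is_resolve_reversed(const char *arg)
{
	return strcmp(arg, "resolve") != 0;
}

/* Intended: 1 only when arg is exactly "resolve". */
int is_resolve(const char *arg)
{
	return strcmp(arg, "resolve") == 0;
}
```

With the reversed form, an invalid value is accepted as "resolve" and "resolve" itself is rejected, which matches the symptom described.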

Re: [PATCH librdmacm] rstream: fix -T resolve detection

2014-04-21 Thread Patrick MacArthur
Hi, Sean, Add: Signed-off-by: Patrick MacArthur pmaca...@iol.unh.edu I can resend the patch if you would like. Thanks, Patrick -- Patrick MacArthur pmaca...@iol.unh.edu AIM: PmacarthAtIOL Research and Development, High Performance Networking and Storage UNH InterOperability Laboratory On

RE: [PATCH librdmacm] rstream: fix -T resolve detection

2014-04-21 Thread Hefty, Sean
Signed-off-by: Patrick MacArthur pmaca...@iol.unh.edu I can resend the patch if you would like. thanks - no need to resend -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at

Re: Why is Infiniband a Lossless medium?

2014-04-21 Thread Christoph Lameter
On Mon, 21 Apr 2014, Nicolas Carlier wrote: Interesting, I don't see this counter either in the fw/sw release notes or on the systems. What is the name of this counter? On which HCA do you see this counter? The mlx4 driver supports this counter and it's in /sys/class/infiniband/diag_counters/

Re: [PATCH for-next] IB/mlx5: Add block multicast loopback support

2014-04-21 Thread Christoph Lameter
On Wed, 2 Apr 2014, Or Gerlitz wrote: From: Eli Cohen e...@dev.mellanox.co.il Add support for the block multicast loopback QP creation flag along the proper firmware API for that. Could we get this merged? We need this on a lot of systems.

[PATCH] IB/mlx4: Allow to always block UD multicast loopback

2014-04-21 Thread Christoph Lameter
We need this option for many hosts to avoid backflow of multicast packets. Could we get that merged? From 18ceae090b02b3055382e11c305dcb334d938122 Mon Sep 17 00:00:00 2001 From: Or Gerlitz ogerl...@mellanox.com Date: Tue, 4 Mar 2014 17:20:00 +0200 Subject: [PATCH] IB/mlx4: Allow to always

Re: [PATCH for-next] IB/mlx5: Add block multicast loopback support

2014-04-21 Thread Roland Dreier
commit f360d88a2efd upstream. On Mon, Apr 21, 2014 at 10:56 AM, Christoph Lameter c...@linux.com wrote: On Wed, 2 Apr 2014, Or Gerlitz wrote: From: Eli Cohen e...@dev.mellanox.co.il Add support for the block multicast loopback QP creation flag along the proper firmware API for that.

Re: [PATCH for-next] IB/mlx5: Add block multicast loopback support

2014-04-21 Thread Christoph Lameter
On Mon, 21 Apr 2014, Roland Dreier wrote: commit f360d88a2efd upstream. Great. Is there a corresponding patch for mlx4? Don't see that in the kernel and we have mostly mlx4.

Re: [PATCH for-next] IB/mlx5: Add block multicast loopback support

2014-04-21 Thread Roland Dreier
Is something more than commit 521e575b9a73 (from 2008) required? On Mon, Apr 21, 2014 at 11:59 AM, Christoph Lameter c...@linux.com wrote: On Mon, 21 Apr 2014, Roland Dreier wrote: commit f360d88a2efd upstream. Great. Is there a corresponding patch for mlx4? Don't see that in the kernel and

Re: [PATCH] IB/mlx4: Allow to always block UD multicast loopback

2014-04-21 Thread Or Gerlitz
On Mon, Apr 21, 2014 at 9:09 PM, Christoph Lameter c...@linux.com wrote: We need this option for many hosts to avoid backflow of multicast packets. Could we get that merged? [...] Currently, there's no way for user-space applications to specify the IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK QP

Re: [PATCH for-next] IB/mlx5: Add block multicast loopback support

2014-04-21 Thread Christoph Lameter
On Mon, 21 Apr 2014, Roland Dreier wrote: Is something more than commit 521e575b9a73 (from 2008) required? Reviewing that right now. If it does what it seems to do then we are fine.

Re: [PATCH] IB/mlx4: Allow to always block UD multicast loopback

2014-04-21 Thread Christoph Lameter
On Mon, 21 Apr 2014, Or Gerlitz wrote: Currently, there's no way for user-space applications to specify the IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK QP creation flags defined by commit 47ee1b9 IB/core: Add support for multicast loopback blocking. As a result, applications who send and

Re: [PATCH] IB/mlx4: Allow to always block UD multicast loopback

2014-04-21 Thread Or Gerlitz
On Tue, Apr 22, 2014 at 12:06 AM, Christoph Lameter c...@linux.com wrote: On Mon, 21 Apr 2014, Or Gerlitz wrote: Currently, there's no way for user-space applications to specify the IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK QP creation flags defined by commit 47ee1b9 IB/core: Add support for

[PATCH V2 02/17] nfs-rdma: Fix for FMR leaks

2014-04-21 Thread Chuck Lever
From: Allen Andrews allen.andr...@emulex.com Two memory region leaks were found during testing: 1. rpcrdma_buffer_create: While allocating RPCRDMA_FRMR's ib_alloc_fast_reg_mr is called and then ib_alloc_fast_reg_page_list is called. If ib_alloc_fast_reg_page_list returns an error it bails out

[PATCH V2 01/17] xprtrdma: mind the device's max fast register page list depth

2014-04-21 Thread Chuck Lever
From: Steve Wise sw...@opengridcomputing.com Some rdma devices don't support a fast register page list depth of at least RPCRDMA_MAX_DATA_SEGS. So xprtrdma needs to chunk its fast register regions according to the minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS.
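
The "minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS" rule can be sketched as follows. The value given to `RPCRDMA_MAX_DATA_SEGS` here is a placeholder for illustration, and `frmr_depth()` and `frmr_regions_needed()` are hypothetical helpers, not xprtrdma functions.

```c
/* Placeholder value for illustration only; the snippet does not state it. */
#define RPCRDMA_MAX_DATA_SEGS 64u

/* Page list depth usable for one fast register region (hypothetical helper). */
unsigned int frmr_depth(unsigned int dev_max_depth)
{
	return dev_max_depth < RPCRDMA_MAX_DATA_SEGS ?
	       dev_max_depth : RPCRDMA_MAX_DATA_SEGS;
}

/* Number of regions needed to cover nsegs segments (ceiling division). */
unsigned int frmr_regions_needed(unsigned int nsegs, unsigned int dev_max_depth)
{
	unsigned int depth = frmr_depth(dev_max_depth);

	if (depth == 0)
		return 0;	/* device reports no fast register support */
	return (nsegs + depth - 1) / depth;
}
```

For example, a device limited to a depth of 16 would need four regions to cover 64 segments under these assumptions.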

[PATCH V2 00/17] NFS/RDMA patches for review

2014-04-21 Thread Chuck Lever
After folks tried out RPCRDMA_REGISTER support as I requested in the cover letter of the last version of this series, existing problems were discovered already in the upstream kernel, starting with the problem addressed by Steve's LOCAL_WRITE patch from last week. Rather than address them, this

[PATCH V2 04/17] xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context

2014-04-21 Thread Chuck Lever
An IB provider can invoke rpcrdma_conn_func() in an IRQ context, thus rpcrdma_conn_func() cannot be allowed to directly invoke generic RPC functions like xprt_wake_pending_tasks(). Signed-off-by: Chuck Lever chuck.le...@oracle.com Tested-by: Steve Wise sw...@opengridcomputing.com ---

[PATCH V2 03/17] xprtrdma: Enable RDMA pad optimization by default

2014-04-21 Thread Chuck Lever
Section 4 of RFC 5667 (NFS/RDMA) says: The server MUST ignore any Read list for other NFS procedures, as well as additional Read list entries beyond the first in the list. Our XDR code adds a zero pad at the end of NFS WRITEs and SYMLINKs whose content is not a multiple of 4 octets long.
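
The zero pad mentioned above brings content up to the next multiple of 4 octets. A minimal sketch of that arithmetic, with `xdr_pad_len()` as a hypothetical helper name rather than a kernel function:

```c
#include <stddef.h>

/* Number of zero octets needed to round len up to a multiple of 4. */
size_t xdr_pad_len(size_t len)
{
	return (4 - (len & 3)) & 3;
}
```

Content that is already a multiple of 4 octets long gets no pad; a 5-octet payload gets 3 pad octets.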

[PATCH V2 05/17] xprtrdma: Remove BOUNCEBUFFERS memory registration mode

2014-04-21 Thread Chuck Lever
Clean up: This memory registration mode is slow and was never meant for use in production environments. Remove it to reduce implementation complexity. Signed-off-by: Chuck Lever chuck.le...@oracle.com Tested-by: Steve Wise sw...@opengridcomputing.com --- net/sunrpc/xprtrdma/rpc_rdma.c |8

[PATCH V2 06/17] xprtrdma: Remove MEMWINDOWS registration modes

2014-04-21 Thread Chuck Lever
The MEMWINDOWS and MEMWINDOWS_ASYNC memory registration modes were intended as stop-gap modes before the introduction of FRMR. They are now considered obsolete. MEMWINDOWS_ASYNC is also considered unsafe because it can leave client memory registered and exposed for an indeterminate time after

[PATCH V2 10/17] xprtrdma: Add CONFIG setting that can disable ALLPHYSICAL

2014-04-21 Thread Chuck Lever
ALLPHYSICAL is not a safe memory registration mode because it permits NFS servers to write anywhere in a client's memory. NFS server bugs could result in client memory being overwritten. This can be useful for embedded systems which do not support more surgical RDMA memory registration and

[PATCH V2 08/17] xprtrdma: Fall back to MTHCAFMR when FRMR is not supported

2014-04-21 Thread Chuck Lever
An audit of in-kernel RDMA providers that do not support the FRMR memory registration shows that several of them support MTHCAFMR. Prefer MTHCAFMR when FRMR is not supported. If MTHCAFMR is not supported, only then choose ALLPHYSICAL. Signed-off-by: Chuck Lever chuck.le...@oracle.com ---
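
The preference order described above (FRMR, then MTHCAFMR, then ALLPHYSICAL) can be sketched as a small selection function. The enum, the function name, and the two capability flags are hypothetical; the real code queries the device rather than taking booleans.

```c
/* Hypothetical names for the three modes mentioned in the snippet. */
enum toy_memreg_mode {
	TOY_MEMREG_FRMR,
	TOY_MEMREG_MTHCAFMR,
	TOY_MEMREG_ALLPHYSICAL
};

/* Pick the most preferred mode the provider supports. */
enum toy_memreg_mode toy_choose_memreg(int has_frmr, int has_mthcafmr)
{
	if (has_frmr)
		return TOY_MEMREG_FRMR;
	if (has_mthcafmr)
		return TOY_MEMREG_MTHCAFMR;	/* preferred fallback */
	return TOY_MEMREG_ALLPHYSICAL;		/* last resort only */
}
```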

[PATCH V2 12/17] xprtrdma: Make rpcrdma_ep_destroy() return void

2014-04-21 Thread Chuck Lever
Clean up: rpcrdma_ep_destroy() returns a value that is used only to print a debugging message. rpcrdma_ep_destroy() already prints debugging messages in all error cases. Make rpcrdma_ep_destroy() return void instead. Signed-off-by: Chuck Lever chuck.le...@oracle.com Tested-by: Steve Wise

[PATCH V2 11/17] xprtrdma: Simplify rpcrdma_deregister_external() synopsis

2014-04-21 Thread Chuck Lever
Clean up: All remaining callers of rpcrdma_deregister_external() pass NULL as the last argument, so remove that argument. Signed-off-by: Chuck Lever chuck.le...@oracle.com Tested-by: Steve Wise sw...@opengridcomputing.com --- net/sunrpc/xprtrdma/rpc_rdma.c |2 +-

[PATCH V2 13/17] xprtrdma: Split the completion queue

2014-04-21 Thread Chuck Lever
The current CQ handler uses the ib_wc.opcode field to distinguish between event types. However, the contents of that field are not reliable if the completion status is not IB_WC_SUCCESS. When an error completion occurs on a send event, the CQ handler schedules a tasklet with something that is not

[PATCH V2 14/17] xprtrmda: Reduce lock contention in completion handlers

2014-04-21 Thread Chuck Lever
Skip the ib_poll_cq() after re-arming, if the provider knows there are no additional items waiting. (Have a look at commit ed23a727 for more details). Signed-off-by: Chuck Lever chuck.le...@oracle.com --- net/sunrpc/xprtrdma/verbs.c | 14 ++ 1 files changed, 10 insertions(+), 4

[PATCH V2 17/17] xprtrdma: Reduce the number of hardway buffer allocations

2014-04-21 Thread Chuck Lever
While marshaling an RPC/RDMA request, the inline_{rsize,wsize} settings determine whether an inline request is used, or whether read or write chunks lists are built. The current default value of these settings is 1024. Any RPC request smaller than 1024 bytes is sent to the NFS server completely

[PATCH V2 16/17] xprtrdma: Limit work done by completion handler

2014-04-21 Thread Chuck Lever
Sagi Grimberg sa...@dev.mellanox.co.il points out that a steady stream of CQ events could starve other work because of the unbounded polling loop in rpcrdma_{send,recv}_poll(). Instead of a (potentially infinite) while loop, return after collecting a budgeted number of completions. Note that the
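
The idea of returning after a budgeted number of completions can be sketched without any verbs calls. Everything here is hypothetical: `poll_one_fn` stands in for a completion source, and `toy_poll_one()` is a toy source that just counts down pending items.

```c
/* Hypothetical completion source: returns 1 if an item was fetched, 0 if empty. */
typedef int (*poll_one_fn)(void *ctx);

/* Collect at most 'budget' completions, then return so other work can run. */
int poll_with_budget(poll_one_fn poll_one, void *ctx, int budget)
{
	int done = 0;

	while (done < budget && poll_one(ctx))
		done++;
	return done;
}

/* Toy source for demonstration: ctx points to a count of pending items. */
int toy_poll_one(void *ctx)
{
	int *pending = ctx;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Even if the source never runs dry, each call does a bounded amount of work; leftover items are picked up on a later invocation.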

[PATCH V2 07/17] xprtrdma: Remove REGISTER memory registration mode

2014-04-21 Thread Chuck Lever
All kernel RDMA providers except amso1100 support either MTHCAFMR or FRMR, both of which are faster than REGISTER. amso1100 can continue to use ALLPHYSICAL. The only other ULP consumer in the kernel that uses the reg_phys_mr verb is Lustre. Signed-off-by: Chuck Lever chuck.le...@oracle.com ---

[PATCH V2 09/17] xprtrdma: mount reports Invalid mount option if memreg mode not supported

2014-04-21 Thread Chuck Lever
If the selected memory registration mode is not supported by the underlying provider/HCA, the NFS mount command reports that there was an invalid mount option, and fails. This is misleading. Reporting a problem allocating memory is a lot closer to the truth. Signed-off-by: Chuck Lever

[PATCH V2 15/17] xprtrmda: Reduce calls to ib_poll_cq() in completion handlers

2014-04-21 Thread Chuck Lever
Change the completion handlers to grab up to 16 items per ib_poll_cq() call. No extra ib_poll_cq() is needed if fewer than 16 items are returned. Signed-off-by: Chuck Lever chuck.le...@oracle.com --- net/sunrpc/xprtrdma/verbs.c | 56 ++-
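
The batching rule above (ask for up to 16 items per call, and stop as soon as a call returns fewer than 16) can be sketched with a stand-in for the poll call. `poll_batch_fn`, `drain_in_batches()` and `toy_poll_batch()` are hypothetical names; no real `ib_poll_cq()` call is made.

```c
#define POLL_BATCH 16

/* Hypothetical batch source: fetches up to 'max' items, returns how many. */
typedef int (*poll_batch_fn)(void *ctx, int max);

/* Drain the source in batches; *calls reports how many poll calls were made. */
int drain_in_batches(poll_batch_fn poll, void *ctx, int *calls)
{
	int total = 0;
	int n;

	*calls = 0;
	do {
		n = poll(ctx, POLL_BATCH);
		(*calls)++;
		if (n <= 0)
			break;
		total += n;
	} while (n == POLL_BATCH);	/* a full batch means more may be waiting */

	return total;
}

/* Toy source for demonstration: ctx points to a count of pending items. */
int toy_poll_batch(void *ctx, int max)
{
	int *pending = ctx;
	int n = *pending < max ? *pending : max;

	*pending -= n;
	return n;
}
```

With 5 pending items a single call suffices; only a completely full batch triggers another call.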

Re: [PATCH] IB/mlx4: Allow to always block UD multicast loopback

2014-04-21 Thread Christoph Lameter
On Tue, 22 Apr 2014, Or Gerlitz wrote: Urgh. So we have all these flags and we cannot use them? Christoph, the support (kernel IB core API and HW drivers, e.g. mlx4 and now also mlx5) is there for **kernel** consumers (e.g. IPoIB), and missing for **user-space** consumers whose calls are