Re: ipoib race in multicast flow (was: IPoIB oops)

2012-07-25 Thread Or Gerlitz
On 24/07/2012 18:14, Yishai Hadas wrote: Just encountered a kernel oops in IPoIB on upstream kernel 3.5 [...] oops happened in ipoib_mcast_join_task. Roland, I made a review now on the issue Yishai raised, and took a look on few related commits to that area, as you wrote in a77a57a1a IPoIB:

Re: memory region limit at 32 GB?

2012-08-07 Thread Or Gerlitz
On Mon, Aug 6, 2012 at 6:51 PM, Yishai Hadas yish...@dev.mellanox.co.il wrote: In the meanwhile I have found the root cause of the limit of log_num_mtt of 28. Plan to send in coming days an extra patch that enables value of 31 which match to 8TB. nice doing. Just an ordering comment, for
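
The 8TB figure quoted above can be checked with simple arithmetic, assuming each MTT entry maps one 4 KB page (that page size is an assumption here, not something stated in the thread): 2^31 entries x 4 KB = 8 TB. A minimal sketch follows; the helper name is hypothetical and is not a driver function. It also shows why such a computation needs 64-bit arithmetic once log_num_mtt reaches 31.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of mlx4_core: number of bytes that can be
 * mapped by 2^log_num_mtt MTT entries, assuming one page per entry.
 */
static uint64_t max_registrable_bytes(unsigned log_num_mtt, unsigned page_shift)
{
    assert(log_num_mtt + page_shift < 64);
    /* Shift a 64-bit one: (1 << 31) on a 32-bit signed int would overflow. */
    return (uint64_t)1 << (log_num_mtt + page_shift);
}
```

With a page shift of 12, log_num_mtt = 28 gives 1 TB and log_num_mtt = 31 gives 8 TB.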

Re: [PATCH for-next V2 00/22] Add SRIOV support for IB interfaces

2012-08-12 Thread Or Gerlitz
On 03/08/2012 11:40, Jack Morgenstein wrote: This patch set adds SRIOV support for IB interfaces. Patches 1-4 are precondition patches. Patches 5-22 actually implement the feature. NOTE: Patch 18 depends on patch IB/mlx4: fix possible deadlock with sm_lock spinlock (a

Re: [PATCH V2] net/mlx4_core: enable 8TB of memory registration

2012-08-12 Thread Or Gerlitz
On 07/08/2012 15:34, yish...@dev.mellanox.co.il wrote: From: Yishai Hadas yish...@mellanox.com This patch solves below issue: Fix the mlx4 core limitation of log num mtt higher than 28. - There were some int overflows which were fixed to enable using 31. - When we auto scaling number of MTTs

[PATCH fix V2 1/2] net/mlx4_core: Allow large mlx4_buddy objects

2012-08-13 Thread Or Gerlitz
num mtt is 26 or higher, and is a step in the direction of allowing to register large amounts of memory. Signed-off-by: Yishai Hadas yish...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/net/ethernet/mellanox/mlx4/mr.c | 21 +++-- 1 files changed, 15

[PATCH fix V2 2/2] net/mlx4_core: Fix scaling issues related to memory registration

2012-08-13 Thread Or Gerlitz
-by: Or Gerlitz ogerl...@mellanox.com --- drivers/net/ethernet/mellanox/mlx4/icm.c |9 ++--- drivers/net/ethernet/mellanox/mlx4/icm.h |2 +- drivers/net/ethernet/mellanox/mlx4/mlx4.h|4 ++-- drivers/net/ethernet/mellanox/mlx4/mr.c |4 ++-- drivers/net/ethernet

[PATCH fix V2 0/2] mlx4_core: Fix scaling issues related to memory registration

2012-08-13 Thread Or Gerlitz
Hi Roland, Here are Yishai's fixes, which address a few issues that come into play for systems with large RAM. With these fixes, your commit db5a7a65 mlx4_core: Scale size of MTT table with system RAM can really play; without them, things don't work. These are fixes, should be ok to 3.6 and to

[PATCH fixes 1/2] IB/ipoib: Add missing locking when CM object is deleted

2012-08-13 Thread Or Gerlitz
shlo...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/ulp/ipoib/ipoib_cm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 95ecf4e..24683fd

[PATCH fixes 2/2] IB/ipoib: Fix RCU pointer dereference to wrong object

2012-08-13 Thread Or Gerlitz
in a crash. Signed-off-by: Shlomo Pongratz shlo...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/ulp/ipoib/ipoib_main.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib

Re: Patchwork back online

2012-08-14 Thread Or Gerlitz
On Tue, Aug 14, 2012 at 7:20 PM, Roland Dreier rol...@purestorage.com wrote: It's not in patchwork because I already applied it. I don't see it in any of the branches of your kernel.org tree, is that on a local clone? Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma

Re: Trust model for raw QPs

2012-08-15 Thread Or Gerlitz
On 15/08/2012 17:06, Christoph Lameter wrote: On Wed, 15 Aug 2012, Or Gerlitz wrote: Currently, for an app to open a raw QP from user space, we (verbs) require admin permission, for which we (Mellanox) got customer feedback saying this is problematic on some of the environments. Well yes

Re: Trust model for raw QPs

2012-08-15 Thread Or Gerlitz
Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Can you fix this by elevating the process with SELinux? Christoph, do you think this would be a valid option from the users' standpoint? Or.

Re: linux-next: build failure after merge of the infiniband tree

2012-08-16 Thread Or Gerlitz
On 16/08/2012 07:07, Roland Dreier wrote: [...] should be fixed in tomorrow's for-next (added include of linux/vmalloc.h). thanks for taking care of that. See Rule 1 in Documentation/SubmitChecklist. Heh, but it compiles fine on x86! I wonder if there is a way to easily catch such errors

[PATCH for-next 0/4] batch of maintainance patches for 3.7

2012-08-23 Thread Or Gerlitz
Hi Roland, Here's a batch with few simple patches, please apply for 3.7 Or. Dotan Barak (3): IB/core: Remove unused variables in ucm/ucma IB/mlx4: Fill in sq_sig_type in query QP net/mlx4_core: Fix wrong offset in query device caps Or Gerlitz (1): net/mlx4_core: Remove annoying debug

[PATCH for-next 4/4] net/mlx4_core: Remove annoying debug message in the resource tracker

2012-08-23 Thread Or Gerlitz
This innocent print makes it very hard to actually use the mlx4 core debug messages -- for example, the module load sequence of a device with two VFs yielded 3200 debug prints, with 2800 of them being this exact one, remove it. Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- .../net/ethernet

[PATCH for-next 3/4] net/mlx4_core: Fix wrong offset in query device caps

2012-08-23 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il The wrong offset was used when parsing the number of XRCs in mlx4_QUERY_DEV_CAP(), fix that. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/net/ethernet/mellanox/mlx4/fw.c |2 +- 1

[PATCH for-next 1/4] IB/core: Remove unused variables in ucm/ucma

2012-08-23 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il Remove unused wait objects from ucm/ucma events flow. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/core/ucm.c |1 - drivers/infiniband/core/ucma.c |1 - 2 files

[PATCH for-next 2/4] IB/mlx4: Fill in sq_sig_type in query QP

2012-08-23 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il The query QP code didn't fill that attribute, do that. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/hw/mlx4/qp.c |4 1 files changed, 4 insertions(+), 0

[PATCH RFC] RDMA/cma: Make IPoIB port space multicast joins consistent with IPoIB

2012-08-23 Thread Or Gerlitz
...@dev.mellanox.co.il Reviewed-by: Jack Morgenstein ja...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- Trying to actually reproduce the problem without this patch, I used an IPoIB partition for which the MTU is 4k, that is set by the following in partitions.conf # 5 = 4k 0x3

Re: [PATCH RFC] RDMA/cma: Make IPoIB port space multicast joins consistent with IPoIB

2012-08-23 Thread Or Gerlitz
On Thu, Aug 23, 2012 at 6:45 PM, Hefty, Sean sean.he...@intel.com wrote: diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 7172559..f7e4cb9 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -3056,9 +3056,16 @@ static int

Re: [PATCH RFC] RDMA/cma: Make IPoIB port space multicast joins consistent with IPoIB

2012-08-24 Thread Or Gerlitz
On Thu, Aug 23, 2012 at 10:18 PM, Hefty, Sean sean.he...@intel.com wrote: diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 7172559..f7e4cb9 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -3056,9 +3056,16 @@ static int

Re: [PATCH] libdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-25 Thread Or Gerlitz
On 23/08/2012 21:51, Hefty, Sean wrote: When an rsocket call is made before an application calls fork(), the forked applications can hang. This can be seen by running netserver and two netperf clients simultaneously. The second netperf client will eventually stop performing data transfers.

Re: [PATCH] libdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-26 Thread Or Gerlitz
On 23/08/2012 21:51, Hefty, Sean wrote: It's not clear what the specific problem is. The best guess is that libibverbs or the provider library (e.g. libmlx4) perform some initialization, such as mmap'ing device memory, which does not work when fork is called. Are you calling from rsockets to

Re: [PATCH] libdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-27 Thread Or Gerlitz
On Mon, Aug 27, 2012 at 8:54 PM, Hefty, Sean sean.he...@intel.com wrote: rsockets enables fork support only when RDMAV_FORK_SAFE has been set. I do not call ibv_fork_init(). understood, so you see the problem also when RDMAV_FORK_SAFE has been set? Or.

Re: [PATCH for-next V1 0/4] IB/IPoIB TSS and RSS support for datagram mode

2012-08-28 Thread Or Gerlitz
On Mon, Aug 13, 2012 at 5:27 PM, Tzahi Oved tza...@mellanox.com wrote: Sean – thanks for the feedback. Reg the XRC semantics and object model: Sean, Can you let us know your thoughts here? Or. - XRC domain object allows many to many mappings where multiple XRC TGT QPs and multiple XRC SRQs

Re: [PATCH for-next V1 0/4] IB/IPoIB TSS and RSS support for datagram mode

2012-08-28 Thread Or Gerlitz
On Tue, Aug 28, 2012 at 9:07 PM, Hefty, Sean sean.he...@intel.com wrote: Can you let us know your thoughts here? I understand the purpose behind TSS/RSS. I'm not fond of making verbs more complex, but I haven't come up with anything that's really simpler. Tzahi's response addressed my main

[PATCH fixes/for-3.6 1/3] IB/ipoib: Fix memory leak in the neigh table deletion flow

2012-08-29 Thread Or Gerlitz
Pongratz shlo...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/ulp/ipoib/ipoib.h |3 +++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 23 +-- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp

[PATCH fixes/for-3.6 3/3] net/mlx4_core: Enable 8TB memory registration

2012-08-29 Thread Or Gerlitz
-off-by: Yishai Hadas yish...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/net/ethernet/mellanox/mlx4/icm.c | 30 ++ drivers/net/ethernet/mellanox/mlx4/icm.h | 10 +- 2 files changed, 23 insertions(+), 17 deletions(-) diff --git

[PATCH fixes/for-3.6 0/3] few more IB fixes for 3.6

2012-08-29 Thread Or Gerlitz
Hi Roland, This short series include two more fixes from Shlomo to the newly introduced IPoIB neighbour table, and a fix for Yishai that completes the support for memory registration of up to 8TB. Or. Shlomo Pongratz (2): IB/ipoib: Fix memory leak in the neigh table deletion flow IB/ipoib:

[PATCH fixes/for-3.6 2/3] IB/ipoib: Fix AB-BA deadlock when deleting neighbours

2012-08-29 Thread Or Gerlitz
in parallel to error flow in two different CPUs. The solution was to drop the neigh table rwlock and use only priv-lock. Signed-off-by: Shlomo Pongratz shlo...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- drivers/infiniband/ulp/ipoib/ipoib.h |1 - drivers/infiniband

[PATCH for-next V1] RDMA/cma: Fix multicast joins of the IPoIB port space to be consistent

2012-08-30 Thread Or Gerlitz
the case, since some of the component mask fields set by ipoib weren't set by the CMA, fix that. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Jack Morgenstein ja...@dev.mellanox.co.il Acked-by: Sean Hefty sean.he...@intel.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com

[PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-30 Thread Or Gerlitz
after free, null pointer dereference, etc various crashes. Fix that by wrapping the radix tree lookup with taking the table lock, and increase the ref count for the time of the callback. Signed-off-by: Yishai Hadas yish...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- Hi Roland

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-30 Thread Or Gerlitz
Roland Dreier rol...@kernel.org wrote: Can you be explicit about the race you're worried about? few 1. on the time CQ A is deleted an interrupt that relates to CQ B takes place and a radix tree lookup is running while an element is being deleted from the tree, looking on the radix tree API,

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-31 Thread Or Gerlitz
On Fri, Aug 31, 2012 at 1:35 AM, Roland Dreier rol...@kernel.org wrote: On Thu, Aug 30, 2012 at 3:17 PM, Or Gerlitz or.gerl...@gmail.com wrote: 1. on the time CQ A is deleted an interrupt that relates to CQ B takes place and a radix tree lookup is running while an element is being deleted

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-31 Thread Or Gerlitz
Roland Dreier rol...@kernel.org wrote: Can you be explicit about the race you're worried about? Roland, This patch was made after we got the below report from the field. Dotan, can we get Roland access to the full vmcore image? Here's the report I got: [...] box panic'ed when trying to

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-31 Thread Or Gerlitz
On Fri, Aug 31, 2012 at 1:35 AM, Roland Dreier rol...@kernel.org wrote: I don't think this is a real problem; the radix tree code is [...] So maybe this patch wouldn't land in 3.7, we'll see, however, so far no other patch sits in the for-next branch of your tree for the next window... any plan

Re: [PATCH fixes/for-3.6 0/3] few more IB fixes for 3.6

2012-09-08 Thread Or Gerlitz
On Wed, Aug 29, 2012 at 6:14 PM, Or Gerlitz ogerl...@mellanox.com wrote: Hi Roland, This short series include two more fixes from Shlomo to the newly introduced IPoIB neighbour table, and a fix for Yishai that completes the support for memory registration of up to 8TB. Roland, AFAIK 3.6

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-09 Thread Or Gerlitz
On Tue, Sep 4, 2012 at 11:12 AM, Max Matveev m...@gmx.co.uk wrote: On Thu, Aug 30, 2012 at 22:35 PM Roland Dreier wrote: On Thu, Aug 30, 2012 at 3:17 PM, Or Gerlitz or.gerl...@gmail.com wrote: Roland Dreier rol...@kernel.org wrote: Can you be explicit about the race you're worried about

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-09 Thread Or Gerlitz
On Wed, Sep 5, 2012 at 4:25 PM, Max Matveev m...@gmx.co.uk wrote: On Tue, 4 Sep 2012 11:02:32 -0700, Roland Dreier wrote: roland On Tue, Sep 4, 2012 at 1:12 AM, Max Matveev m...@gmx.co.uk wrote: What about races between radix_tree_extend and radix_tree_lookup? As far as I can see

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-10 Thread Or Gerlitz
On 10/09/2012 09:57, Jack Morgenstein wrote: + * + * For API usage, in general, + * - any function _modifying_ the tree or tags (inserting or deleting + * items, setting or clearing tags) must exclude other modifications, and + * exclude any functions reading the tree. + * - any function

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-10 Thread Or Gerlitz
On 10/09/2012 16:17, Jack Morgenstein wrote: I don't know. I do notice (in file include/linux/rcupdate.h) that rcu_read_lock/unlock is meant to be used in the interrupt context. Would it be sufficient (besides rcu_read_lock/unlock calls) to add a call rcu_synchronize() in mlx4_cq_free (after

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-11 Thread Or Gerlitz
On Tue, Sep 11, 2012 at 9:03 AM, Jack Morgenstein ja...@dev.mellanox.co.il wrote: On Monday 10 September 2012 16:27, Or Gerlitz wrote: I took a look on the practice/wrapping used over the mm subsystem for radix_tree_lookup calls, whose maintainer, Andrew Morton is signed on the patch Roland

Re: [PATCH fixes/for-3.6 2/3] IB/ipoib: Fix AB-BA deadlock when deleting neighbours

2012-09-12 Thread Or Gerlitz
On 12/09/2012 19:23, Roland Dreier wrote: thanks, applied this and 1/3 OK, what about 3/3, any issue with merging it? also, I don't see them applied in your kernel.org tree fixes branch nor for-next, are you going to push them to 3.6 soon? On Wed, Aug 29, 2012 at 8:14 AM, Or Gerlitz

Re: [PATCH for-next V2 02/22] IB/core: change pkey table lookups to support full and partial membership for the same pkey

2012-09-13 Thread Or Gerlitz
On 13/09/2012 10:35, Jack Morgenstein wrote: I seem to recall that there were problems with IPoIB when partial membership pkeys are used. There are some issues in the overall solution, since ARPs sent over the broadcast group reach also nodes with partial membership their HCA generated pkey

Re: [PATCH for-next V2 02/22] IB/core: change pkey table lookups to support full and partial membership for the same pkey

2012-09-13 Thread Or Gerlitz
On 13/09/2012 10:35, Jack Morgenstein wrote: I seem to recall that there were problems with IPoIB when partial membership pkeys are used. There are some issues in the overall solution, since ARPs sent over the broadcast group reach also nodes with partial membership their HCA generated pkey

Re: [PATCH for-next V2 02/22] IB/core: change pkey table lookups to support full and partial membership for the same pkey

2012-09-13 Thread Or Gerlitz
On 11/09/2012 19:52, Doug Ledford wrote: On 8/3/2012 4:40 AM, Jack Morgenstein wrote: Enhance the cached and non-cached pkey table lookups to enable limited and full members of the same pkey to co-exist in the pkey table. This is necessary for SRIOV to allow for a scheme where some guests

Re: [PATCH for-next V2 02/22] IB/core: change pkey table lookups to support full and partial membership for the same pkey

2012-09-13 Thread Or Gerlitz
On 13/09/2012 18:53, Or Gerlitz wrote: The physical PKey table can contain both full and partial memberships of the same Pkey. This is needed to serve 2 VFs that are granted access to the same PKey, albeit with different membership types. Example use case -- RDMA or IPoIB network storage

how to preserve QP over HA events for librdmacm applications

2012-09-19 Thread Or Gerlitz
Hi Sean, We have a case here where an app which uses librdmacm wants to preserve its QP over HA events such IB link down/up, specifically the sequence of operations done by the app is the following: 1. rdma_create_id using the IPoIB port space 2. rdma_bind _addr 3. rdma_create_qp using UD

Re: how to preserve QP over HA events for librdmacm applications

2012-09-19 Thread Or Gerlitz
On 19/09/2012 18:48, Hefty, Sean wrote: Can this flushing be somehow done with the current librdmacm/libibverbs APIs or we need some enhancement? You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway.

[PATCH libibverbs 0/3] add raw packet QP, new helper and examples cleanups

2012-09-20 Thread Or Gerlitz
helpers to deal with new InfiniBand link speeds Fix resource leaks in the pingpong examples present in the failure/error flows. Or Gerlitz (1): Add raw packet QP type Makefile.am|6 +++- examples/rc_pingpong.c | 43 ++--- examples

[PATCH libibverbs 1/3] Add raw packet QP type

2012-09-20 Thread Or Gerlitz
with the NET_RAW capability are allowed to create raw packet QPs (the name raw packet QP is supposed to suggest an analogy to AF_PACKET / SOL_RAW sockets). Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- include/infiniband/verbs.h |3 ++- man/ibv_create_qp.3|2 +- man

[PATCH libibverbs 2/3] Add helpers to deal with new InfiniBand link speeds

2012-09-20 Thread Or Gerlitz
-by: Or Gerlitz ogerl...@mellanox.com --- Makefile.am|6 +++- include/infiniband/verbs.h | 23 - man/ibv_rate_to_mbps.3 | 45 + src/libibverbs.map |3 ++ src/verbs.c| 48

[PATCH libibverbs 3/3] Fix resource leaks in the pingpong examples present in the failure/error flows.

2012-09-20 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- examples/rc_pingpong.c | 43 --- examples/srq_pingpong.c | 51

[PATCH libmlx4 0/8] add raw packet QP, resource limitations, fixes/cleanups

2012-09-20 Thread Or Gerlitz
-logs and cleaned some checkpatch comments. Or. Dotan Barak (5): Replace sscanf() to strtol() Allow to use the whole BF buffer Use BlueFlame for RDMA_WRITE/WITH_IMM without data Change enumeration names for masked atomic opcodes When calling ibv_modify_qp() return right value Or Gerlitz

[PATCH libmlx4 1/8] Add raw packet QP support

2012-09-20 Thread Or Gerlitz
Implement raw packet QPs for Ethernet ports. Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- src/qp.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/src/qp.c b/src/qp.c index 40a6689..90c4e80 100644 --- a/src/qp.c +++ b/src/qp.c @@ -286,6 +286,10 @@ int

[PATCH libmlx4 3/8] Limit qp resources accepted for ibv_create_qp()

2012-09-20 Thread Or Gerlitz
...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- src/mlx4.h | 14 ++ src/qp.c|6 -- src/verbs.c | 18 +- 3 files changed, 31 insertions(+), 7 deletions(-) diff --git a/src/mlx4.h b/src/mlx4.h index efaa7e9..e1daaf7 100644 --- a/src/mlx4

[PATCH libmlx4 7/8] Change enumeration names for masked atomic opcodes

2012-09-20 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il Change the enumeration names of the masked atomic opcodes to be consistent with the ones used by the mlx4 kernel driver. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- src/mlx4.h |4 ++-- 1

[PATCH libmlx4 6/8] Use BlueFlame for RDMA_WRITE/WITH_IMM without data

2012-09-20 Thread Or Gerlitz
-by: Or Gerlitz ogerl...@mellanox.com --- src/qp.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/src/qp.c b/src/qp.c index 812e6ec..e770ec8 100644 --- a/src/qp.c +++ b/src/qp.c @@ -267,6 +267,8 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr

[PATCH libmlx4 5/8] Allow to use the whole BF buffer

2012-09-20 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il Increase the maximum size of messages (from 192 to 208) that will use the blue flame buffer. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Jack Morgenstein ja...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com

[PATCH libmlx4 4/8] Replace sscanf() to strtol()

2012-09-20 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il When converting a string to a numeric value, strtol() is safer to use. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- src/mlx4.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions
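
To illustrate the point of this patch, here is a minimal sketch of a checked strtol() conversion. It is not the code from the patch (the diff is not shown in this listing); parse_int is a hypothetical helper name. What it shows is that strtol() lets the caller detect empty input, trailing characters and out-of-range values explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from libmlx4: parse a base-10 int.
 * Returns 0 on success, -1 on empty input, trailing characters or overflow.
 */
static int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;  /* no digits consumed, or junk after the number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;  /* value does not fit in an int */

    *out = (int)v;
    return 0;
}
```

A caller would check the return value before using the result, for example treating -1 as "keep the default setting".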

[PATCH libmlx4 8/8] When calling ibv_modify_qp() return right value

2012-09-20 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il When the ibv_query_port() call made by mlx4_modify_qp() fails, the return value from the latter should indicate the error status of the former and not simply -1. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl
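
The behaviour described here is plain error propagation. A minimal sketch, assuming hypothetical stand-ins for the two calls (query_fn plays the role of ibv_query_port() and modify_with_query the role of mlx4_modify_qp(); neither is the real library code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a query call that returns 0 or an error code. */
typedef int (*query_fn)(void);

/* Hypothetical stand-in for the modify path that depends on the query. */
static int modify_with_query(query_fn query)
{
    int err;

    assert(query != NULL);

    err = query();
    if (err)
        return err;  /* pass the callee's status up instead of a bare -1 */

    /* ... the actual modify work would go here ... */
    return 0;
}

/* Two fake query implementations used to exercise both paths. */
static int query_ok(void)   { return 0; }
static int query_fail(void) { return EINVAL; }
```

Returning the callee's status lets the application distinguish, say, an invalid argument from other failures.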

Re: [PATCH for-next V2 04/22] IB/mlx4: SRIOV IB context objects and proxy/tunnel sqp support

2012-09-20 Thread Or Gerlitz
On Tue, Sep 11, 2012 at 8:10 PM, Doug Ledford dledf...@redhat.com wrote: On 8/3/2012 4:40 AM, Jack Morgenstein wrote: struct mlx4_ib_sriov{} is created by the master only. It is a container for the following: 1. All the info required by the PPF to multiplex and de-multiplex MADs

Re: [PATCH libmlx4 1/8] Add raw packet QP support

2012-09-21 Thread Or Gerlitz
On Fri, Sep 21, 2012 at 3:51 AM, Luick, Dean dean.lu...@intel.com wrote: @@ -286,6 +286,10 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, size += sizeof (struct mlx4_wqe_datagram_seg) / 16; break; + case

[PATCH libmlx4 FIXED] Add raw packet QP support

2012-09-23 Thread Or Gerlitz
Implement raw packet QPs for Ethernet ports. Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- changes from previous version: - addressed reviewer comment to add break on the post send flow for the new QP type src/qp.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions

[PATCH] IB/iser: add more RX CQs to scale out processing of SCSI responses

2012-09-23 Thread Or Gerlitz
. Once this is made, the RX flow processing of IO responses will now be distributed across multiple CPUs. QPs (-- iser sessions) are assigned to CQs in round robin manner using the current CQ with minimal number of sessions attached to it. Signed-off-by: Or Gerlitz ogerl...@mellanox.com Signed
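
The assignment policy described above (pick the CQ with the fewest sessions attached) can be sketched as follows. This is an illustration under assumptions, not the iser code: pick_cq and the active[] counters are hypothetical names, and ties are broken toward the lowest index, which yields round-robin behaviour when all counts start equal.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: active[i] holds the number of sessions currently
 * attached to CQ i. Returns the index of the least-loaded CQ and accounts
 * for the new session on it.
 */
static size_t pick_cq(unsigned *active, size_t n_cqs)
{
    size_t i, min = 0;

    assert(active != NULL && n_cqs > 0);

    for (i = 1; i < n_cqs; i++)
        if (active[i] < active[min])
            min = i;  /* strictly fewer sessions: new best candidate */

    active[min]++;
    return min;
}
```

When a session is torn down, the matching counter would be decremented so that the freed CQ becomes preferred again.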

Re: Quick mlx4 IB SR-IOV howto?

2012-09-26 Thread Or Gerlitz
On Wed, Sep 26, 2012 at 7:14 PM, Roland Dreier rol...@purestorage.com wrote: So I have SR-IOV enabled on a ConnectX-3 adapter, and I loaded the driver with num_vfs=1 probe_vf=1, so in the host I see: # The master device $ ibv_devinfo -d mlx4_1 hca_id: mlx4_1 transport:

Re: Quick mlx4 IB SR-IOV howto?

2012-09-26 Thread Or Gerlitz
On Wed, Sep 26, 2012 at 10:22 PM, Or Gerlitz or.gerl...@gmail.com wrote: On Wed, Sep 26, 2012 at 7:14 PM, Roland Dreier rol...@purestorage.com wrote: What do I need for the slave VF's port to become active? I'm running opensm 3.3.13 on a different box, is that new enough? (does SR-IOV require

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv-cmd.slave_sem

2012-09-26 Thread Or Gerlitz
On Wed, Sep 26, 2012 at 6:42 AM, Roland Dreier rol...@kernel.org wrote: From: Roland Dreier rol...@purestorage.com On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca() before mlx4_multi_func_init(). However, for unlucky configurations, mlx4_init_hca() might call

Re: Quick mlx4 IB SR-IOV howto?

2012-09-27 Thread Or Gerlitz
On 27/09/2012 08:47, Roland Dreier wrote: On Wed, Sep 26, 2012 at 2:30 PM, Or Gerlitz or.gerl...@gmail.com wrote: Roland, did this help? do you have IB link for the VF? IPoIB working on it? Sorry, replied to Hal only by accident. Yes, latest opensm makes things work fine for me. Good, so

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv-cmd.slave_sem

2012-09-27 Thread Or Gerlitz
On 27/09/2012 08:46, Roland Dreier wrote: On Wed, Sep 26, 2012 at 2:51 PM, Or Gerlitz or.gerl...@gmail.com wrote: What exactly did you mean by saying for unlucky configurations above? what value did you use for mlx4_core's port_array_type module param? I didn't set the parameter at all. What

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv-cmd.slave_sem

2012-09-27 Thread Or Gerlitz
On 27/09/2012 10:17, Roland Dreier wrote: I think I had it cabled up directly to another HCA, and that HCA was in a system that was either off or at least didn't have the driver loaded. So the port was in the physically DOWN state... However, I just tried it and even with that other HCA

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv-cmd.slave_sem

2012-09-27 Thread Or Gerlitz
On 27/09/2012 08:46, Roland Dreier wrote: On Wed, Sep 26, 2012 at 2:51 PM, Or Gerlitz or.gerl...@gmail.com wrote: What exactly did you mean by saying for unlucky configurations above? what value did you use for mlx4_core's port_array_type module param? I didn't set the parameter at all. What

Re: [PATCH 3/3] mlx4_core: Disable SENSE_PORT for multifunction devices

2012-09-27 Thread Or Gerlitz
-function devices. makes sense, nice doing! Acked-by: Or Gerlitz ogerl...@mellanox.com for patches 1-3 Roland, I see that these three patches are queued @ your for-next and also the initial patch which in a way is more lengthy and heavy. I wonder whether it wouldn't be fair to allow for Jack to review

Re: [PATCH] IB/iser: add more RX CQs to scale out processing of SCSI responses

2012-09-27 Thread Or Gerlitz
On Sun, Sep 23, 2012 at 5:17 PM, Or Gerlitz ogerl...@mellanox.com wrote: From: Alex Tabachnik al...@mellanox.com RX/TX CQs will now be selected from a per HCA pool, for the RX flow this has the effect of using different interrupt vectors over low level drivers (such as mlx4) who map

Re: linux-next: Tree for Oct 2 (ipoib_netlink.c)

2012-10-02 Thread Or Gerlitz
On Wed, Oct 3, 2012 at 6:33 AM, Roland Dreier rol...@kernel.org wrote: From: Roland Dreier rol...@purestorage.com I'll be sending the following to Linus shortly: [PATCH] IPoIB: Fix build with CONFIG_INFINIBAND_IPOIB_CM=n With the new netlink support in commit 862096a8bbf8 (IB/ipoib: Add

Re: Problem running rping over Intel adapters

2012-10-03 Thread Or Gerlitz
On 04/10/2012 03:47, Steve Wise wrote: Not used by iwarp drivers... Which one, the retry counter or the RNR retry counter? Or.

[PATCH V2] {NET,IB}/mlx4: 64 byte CQE/EQE support

2012-10-04 Thread Or Gerlitz
file at ~ogerlitz/tmp-patches/0001-NET-IB-mlx4-64-byte-CQE-EQE-support.patch Jack, I'd like you to review the part in this patch which relates to SRIOV, I've tested it now, applied on Roland's for-next, and it works OK on a system with a VF probed on the host and doing ipoib ping. Both VF and PF

Re: [PATCH V2] {NET,IB}/mlx4: 64 byte CQE/EQE support

2012-10-04 Thread Or Gerlitz
On 04/10/2012 15:05, Or Gerlitz wrote: I'd like to try and push this for 3.7 Indeed, but this is WIP so please ignore (actually if you have something send comments also now...) Or.

if/how to dictate IB device name per PCI BDF

2012-10-11 Thread Or Gerlitz
Hi Roland, We got a report that on a system with multiple (say two) ConnectX HCAs, it's possible for the order of device probing to be different across simple reboots, that is sometimes the device with PCI BDF X is probed 1st and gets to be IB device mlx4_0 and some other times the device with

Re: (R)DMA in userspace

2012-10-11 Thread Or Gerlitz
On Thu, Oct 11, 2012 at 10:44 PM, Roland Dreier rol...@purestorage.com wrote: No one has really ever tried to deal with the issue of userspace RDMA on a cache-incoherent architecture. Basically if you try the current stack, the in-kernel users (IPoIB etc) should be OK but libibverbs etc. will

[PATCH for-3.7 2/3] IB/mlx4: Synchronize cleanup of MCGs in mcg paravirtualization

2012-10-17 Thread Or Gerlitz
From: Eli Cohen e...@mellanox.com A client re-register event invokes cleanup of all MCGs. This is required to protect against misbehaved guests leading to corruption of join/leave database. However, since cleaning up the MCGs is a heavy operation, it is pushed to a work queue for further

[PATCH for-3.7 1/3] IB/mlx4: Fix QP1 pkey processing in the Primary Physical Function (PPF)

2012-10-17 Thread Or Gerlitz
From: Jack Morgenstein ja...@dev.mellanox.co.il In the MAD paravirtualization code, one of the checks performed when forwarding QP1 (GSI) packets from wire to slave was a pkey check: The pkey received in the MAD must be present in the guest's paravirtualized pkey table, and at least one of the

[PATCH libmlx4] Add support for 64B CQEs

2012-10-17 Thread Or Gerlitz
read from the device uverbs sysfs entry, and uses it as the key to realize the CQE size if/as advertized by the kernel mlx4_ib driver. Older kernel mlx4_ib ABI versions are still supported. Signed-off-by: Eli Cohen e...@mellanox.co.il Signed-off-by: Or Gerlitz ogerl...@mellanox.com --- src/cq.c

[PATCH for-3.7 0/4] mlx4 SRIOV fixes, 64B CQE/EQE patches re-spin

2012-10-17 Thread Or Gerlitz
Function (PPF) Or Gerlitz (1): {NET,IB}/mlx4: 64 byte CQE/EQE support drivers/infiniband/hw/mlx4/cq.c| 34 +++-- drivers/infiniband/hw/mlx4/mad.c | 89 +++- drivers/infiniband/hw/mlx4/main.c | 27 ++-- drivers

[PATCH for-3.7 3/3] {NET,IB}/mlx4: 64 byte CQE/EQE support

2012-10-17 Thread Or Gerlitz
does use 64B CQEs or future device capabilities which must be in sync by user space. This practice allows to work with unmodified libmlx4 on older devices (e.g A0, B0) which don't support 64 byte CQEs. Signed-off-by: Eli Cohen e...@mellanox.com Signed-off-by: Or Gerlitz ogerl...@mellanox.com

Re: [PATCH for-3.7 1/3] IB/mlx4: Fix QP1 pkey processing in the Primary Physical Function (PPF)

2012-10-18 Thread Or Gerlitz
On Thu, Oct 18, 2012 at 7:33 PM, Roland Dreier rol...@kernel.org wrote: thanks, applied thanks, any insight/s on patches 3/4? Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at

Re: [PATCH for-3.7 0/4] mlx4 SRIOV fixes, 64B CQE/EQE patches re-spin

2012-10-19 Thread Or Gerlitz
On Thu, Oct 18, 2012 at 4:58 PM, Roland Dreier rol...@kernel.org wrote: On Wed, Oct 17, 2012 at 9:42 AM, Or Gerlitz ogerl...@mellanox.com wrote: Also, a respin of the 64B CQE/EQE patches (kernel and user-space), over the V1 posting you were asking for max flexibility - e.g expose/force

Re: [PATCH for-3.7 0/4] mlx4 SRIOV fixes, 64B CQE/EQE patches re-spin

2012-10-20 Thread Or Gerlitz
On Fri, Oct 19, 2012 at 1:58 AM, Roland Dreier rol...@kernel.org wrote: [...] So I think we need some flag passed to the mlx4_core (that drives the PPF) that lets the user opt into 64B CQEs. I would suggest that we start with the default value be disabled and then flip that after a few kernel

[PATCH 1/3] net/mlx4_core: Remove more annoying debug messages from the SRIOV flow

2012-10-21 Thread Or Gerlitz
it pretty hard to actually use the mlx4_core debug messages when running in SRIOV/IB mode -- for example, the module load sequence of a device with one VF yielded 631 debug prints, with 408 of them being from this set. Let's just remove them. Signed-off-by: Or Gerlitz ogerl...@mellanox.com

[PATCH 2/3] net/mlx4: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails

2012-10-21 Thread Or Gerlitz
From: Dotan Barak dot...@dev.mellanox.co.il Fixed the resource cleanup to act correctly and prevent a kernel oops when mlx4_QUERY_ADAPTER() fails. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Jack Morgenstein ja...@dev.mellanox.co.il Signed-off-by: Or Gerlitz ogerl

[PATCH 0/3] mlx4 SRIOV fixes, 64B CQE/EQE V3

2012-10-21 Thread Or Gerlitz
to set it on, as was requested during the V2 review. Or Dotan Barak (1): net/mlx4: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails Or Gerlitz (2): net/mlx4_core: Remove more annoying debug messages from the SRIOV flow {NET,IB}/mlx4: 64 byte CQE/EQE support drivers/infiniband/hw

[PATCH V3 3/3] {NET,IB}/mlx4: 64 byte CQE/EQE support

2012-10-21 Thread Or Gerlitz
capabilities change towards VFs and ABI change towards libmlx4 -- a knob was left in the driver under which the new capabilities will take effect only under specific admin directive, of setting the enable_64b_cqe_eqe module param, whose default value is false. Signed-off-by: Or Gerlitz ogerl

Re: no-snoop flag in memory registration?

2012-10-25 Thread Or Gerlitz
On 23/10/2012 16:32, Klaus Wacker wrote: we are implementing a Linux/RDMA solution based on Mellanox/RoCE. Our memory registration is done via ib_get_dma_mr(). During a problem follow-up someone asked us about the no-snoop flag and how it is set during memory registration in Linux. Can you

some warnings seen while building librdmacm 1.0.16

2012-10-29 Thread Or Gerlitz
Hi Sean, FYI -- the below warnings seen while building librdmacm 1.0.16 with gcc 4.4.6 through rpmbuild Or. make[1]: Entering directory `/root/rpmbuild/BUILD/librdmacm-1.0.16' CC src_librdmacm_la-cma.lo CC src_librdmacm_la-addrinfo.lo CC src_librdmacm_la-acm.lo CC

Re: [PATCH libmlx4 0/8] add raw packet QP, resource limitations, fixes/cleanups

2012-10-31 Thread Or Gerlitz
On Thu, Sep 20, 2012 at 10:30 PM, Or Gerlitz ogerl...@mellanox.com wrote: Roland, This batch of libmlx4 patches contains the patch to support raw packet QP, two patches from Sagi that relate to resource limitations, and a few simple fixes/cleanups from Dotan. The first three were submitted pretty

Re: [PATCH] IB: fix task hanging on error recovery

2012-11-01 Thread Or Gerlitz
On 19/10/2012 23:58, Kleber Sacilotto de Souza wrote: During PCI error recovery, the calls to wait_for_completion() in the infiniband core path can hang waiting for some tasks that will never complete, since the hardware is nonfunctional. INFO: task eehd:16029 blocked for more than 120 seconds.

Re: [PATCH] IB: fix task hanging on error recovery

2012-11-05 Thread Or Gerlitz
On Mon, Nov 5, 2012 at 7:21 PM, Hefty, Sean sean.he...@intel.com wrote: drivers/infiniband/core/ucm.c |2 +- drivers/infiniband/core/ucma.c|2 +- On these files, as far as I understand this code from quick looking, I'm not sure on what exactly the

Re: [PATCH] IB: fix task hanging on error recovery

2012-11-05 Thread Or Gerlitz
On Mon, Nov 5, 2012 at 9:54 PM, Kleber Sacilotto de Souza kleb...@linux.vnet.ibm.com wrote: The driver is not returning the completions because during EEH (Extended Error Handling) recovery on powerpc systems the PCI slot is frozen, and we are not going to receive any interrupt from the

Re: [PATCH] IB: fix task hanging on error recovery

2012-11-06 Thread Or Gerlitz
On Tue, Nov 6, 2012 at 11:58 AM, Kleber Sacilotto de Souza kleb...@linux.vnet.ibm.com wrote: During my tests I've seen the wait_for_completion() call hanging on different parts of the code, but not on ucm/ucma. So would it be OK to change the other calls and leave the ucm/ucma as it is?

Re: [PATCH] IB: fix task hanging on error recovery

2012-11-06 Thread Or Gerlitz
On Tue, Nov 6, 2012 at 6:44 PM, Or Gerlitz or.gerl...@gmail.com wrote: the other parts are OK to me. I wanted to say that you stepped on a real problem and provided a solution, Roland's wondering is of course correct, how do we avoid use after free in very slow non pci hotunplug cases
