[PATCH] librdmacm/rsocket: Support IPV6_V6ONLY socket option

2012-06-14 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- Patch is dependent on proposed kernel changes. include/rdma/rdma_cma.h |1 + src/rsocket.c | 23 +++ 2 files changed, 24 insertions(+), 0 deletions(-) diff --git a/include/rdma/rdma_cma.h b/include/rdma/rdma_cma.h index c0f83b1..

[PATCH 3/3 v2] rdma/cm: Allow user to restrict listens to bound address family

2012-06-14 Thread Hefty, Sean
Provide an option for the user to specify that listens should only accept connections where the incoming address family matches that of the locally bound address. This is used to support the equivalent of IPV6_V6ONLY socket option, which allows an app to only accept connection requests directed to
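
For comparison, the standard socket behavior that this option mirrors can be sketched as below. This is a minimal sketch against the ordinary sockets API, not the rdma_cm or rsocket code from the patch; the helper name make_v6only_listener is hypothetical.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an IPv6 TCP listener that does not accept
 * IPv4 (IPv4-mapped) connections. Returns the fd, or -1 on error. */
int make_v6only_listener(uint16_t port)
{
	struct sockaddr_in6 addr;
	int on = 1;
	int fd;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Set the option before bind() so it applies to this binding. */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof on) < 0)
		goto err;

	memset(&addr, 0, sizeof addr);
	addr.sin6_family = AF_INET6;
	addr.sin6_addr = in6addr_any;
	addr.sin6_port = htons(port);

	if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
	    listen(fd, 16) < 0)
		goto err;

	return fd;
err:
	close(fd);
	return -1;
}
```

Passing port 0 lets the kernel pick a free port, which is convenient for a quick check with getsockopt().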

[PATCH 2/3 v2] rdma/cm: Listen on specific address family

2012-06-14 Thread Hefty, Sean
The rdma_cm maps IPv4 and IPv6 addresses to the same service ID. This prevents apps from listening only for IPv4 or IPv6 addresses. It also results in an app binding to an IPv4 address receiving connection requests for an IPv6 address. Match socket behavior. Restrict listens on IPv4 addresses t

[PATCH 1/3 v2] rdma/cm: Bind to a specific address family

2012-06-14 Thread Hefty, Sean
The rdma cm uses a single port space for all associated (tcp, udp, etc.) port bindings, regardless of the address family that the user binds to. The result is that if a user binds to AF_INET, but does not specify an IP address, the bind will occur for AF_INET6. This causes an attempt to bind to th

RE: rsockets and standard socket based TCP benchmarks

2012-06-14 Thread Hefty, Sean
> Yes, good point. If the other side does not have rsockets then it is not that > straightforward. > > Some thoughts: > > 1. The best option might be if we exchanged an option during > connection setup. This tells the peers if the other side is capable of RDMA. > If it is then one can switch to s

RE: rsockets and other performance

2012-06-14 Thread Hefty, Sean
> Traditional sockets based applications wanting high throughput could use > rsockets. Since it is layered on top of uverbs we expected to see good > throughput numbers. > So, we started to run netperf and iperf. We observed that it tops off at > about 20Gb/s with QDR adapters. A quick "perf top" re

RE: [PATCH 2/3] rdma/cm: Listen on specific address family

2012-06-13 Thread Hefty, Sean
> > Match socket behavior. Restrict listens on IPv4 addresses > > to only IPv4 addresses. If a listen is on an IPv6 address, > > allow it to receive either IPv4 or IPv6 addresses. > > Can you match the IP stack and incorporate /proc/sys/net/ipv6/bindv6only I'll check > and socket option IPV6_V

[PATCH 1/3] rdma/cm: Bind to a specific address family

2012-06-13 Thread Hefty, Sean
The rdma cm uses a single port space for all associated (tcp, udp, etc.) port bindings, regardless of the address family that the user binds to. The result is that if a user binds to AF_INET, but does not specify an IP address, the bind will occur for AF_INET6. This causes an attempt to bind to th

[PATCH 3/3] rdma/cm: Allow user to restrict listens to bound address family

2012-06-13 Thread Hefty, Sean
Provide an option for the user to specify that listens should only accept connections where the incoming address family matches that of the locally bound address. This is used to support the equivalent of IPV6_V6ONLY socket option, which allows an app to only accept connection requests directed to

[PATCH 2/3] rdma/cm: Listen on specific address family

2012-06-13 Thread Hefty, Sean
The rdma_cm maps IPv4 and IPv6 addresses to the same service ID. This prevents apps from listening only for IPv4 or IPv6 addresses. It also results in an app binding to an IPv4 address receiving connection requests for an IPv6 address. Match socket behavior. Restrict listens on IPv4 addresses t

RE: ibv_poll_cq() and wc->byte_len

2012-06-13 Thread Hefty, Sean
> In a parallel universe, struct ibv_wc would have a bitmap field indicating > which other fields are valid. In this part of the multiverse, more > complete documentation would be welcome. If libmlx4/libmthca behavior is > the compliant one, I can provide an updated man page. The best documenta

RE: [PATCH for-next V1 0/4] IB/IPoIB TSS and RSS support for datagram mode

2012-06-12 Thread Hefty, Sean
> Do you have something more specific re how to actually align (say) the > RSS QP group into the framework used by XRC? Not really. That was more of a conceptual statement regarding the design. The operation, particularly on the receive side, just seems similar. I don't think we should force a

RE: rsockets and standard socket based TCP benchmarks

2012-06-11 Thread Hefty, Sean
> Though one can consider the fall-back in reverse order i.e. if the > rdma connection fails continue with the already established connection (over > the normal inet socket). When I consider fallback, one of the issues is handling the case where one of the two sides is not using rsockets. This

RE: rsockets and standard socket based TCP benchmarks

2012-06-08 Thread Hefty, Sean
> But to map standard networking applications to rsockets we will run into the > above problem i.e. fork() will not work.  It would be very useful to > allow for the standard networking paradigm of: bind()->listen()->accept()- > ->fork(), and then the server goes back to accept(). That would allow

RE: [PATCH 1/4] librdamcm/rsocket: Handle SHUT_RD/WR shutdown flags

2012-06-08 Thread Hefty, Sean
> Unfortunately this introduces another issue. > This causes netperf to hang in recv() after shutdown(SHUT_WR) > on the data socket. Bah. Thanks for testing. I was hoping for a simple work-around, rather than expand rsocket states. Oh well. Let me figure out the proper way to handle partial

[PATCH 1/4] librdamcm/rsocket: Handle SHUT_RD/WR shutdown flags

2012-06-06 Thread Hefty, Sean
Sridhar Samudrala reported an error (EOPNOTSUPP) after calling select(). The issue is that rshutdown(SHUT_WR) was called before select(). As part of shutdown, rsockets switches the underlying fd from nonblocking to blocking to ensure that previously sent data has completed. shutdown(SHUT_WR) ind

[PATCH 4/4] librdamcm/rsocket: Use configuration files to specify default settings

2012-06-06 Thread Hefty, Sean
Give an administrator control over the default settings used by rsockets. Use files under %sysconfig%/rdma/rsocket as shown: mem_default - default size of receive buffer(s) wmem_default - default size of send buffer(s) sqsize_default - default size of send queue rqsize_default - default size of r

[PATCH 3/4] librdamcm/rsocket: Spin before blocking on an rsocket

2012-06-06 Thread Hefty, Sean
The latency cost of blocking is significant compared to round trip ping-pong time. Spin briefly on rsockets before calling into the kernel and blocking. The time to spin before blocking is read from an rsocket configuration file %sysconfig%/rdma/rsocket/polling_time. This is user adjustable. As

[PATCH 2/4] librdamcm/rsocket: Handle TCP_MAXSEG socket option

2012-06-06 Thread Hefty, Sean
netperf uses the TCP_MAXSEG socket option. Add support for it. Problem reported by Sridhar Samudrala. getsockopt returns the path MTU as the TCP_MAXSEG. setsockopt currently ignores the value. Signed-off-by: Sean Hefty --- src/rsocket.c |9 + 1 files changed, 9 insertions(+), 0 de

[PATCH 1/2] ibacm/acme: Eliminate segfault when SLID/SGID not given

2012-06-04 Thread Hefty, Sean
Problem and cause reported by Hal Rosenstock Signed-off-by: Sean Hefty --- src/acme.c | 14 +- 1 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/acme.c b/src/acme.c index e6ae188..0e1d4ed 100644 --- a/src/acme.c +++ b/src/acme.c @@ -495,7 +495,8 @@ static int reso

[PATCH 2/2] ibacm: Automatically select local port if not specified by path record

2012-06-04 Thread Hefty, Sean
If the user specifies a DLID or DGID as part of a path record lookup, automatically select a local port. This allows a user to query an SA without needing to specify the local SLID or SGID. Signed-off-by: Sean Hefty --- src/acm.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-)

RE: [PATCH for-next V1 0/4] IB/IPoIB TSS and RSS support for datagram mode

2012-06-04 Thread Hefty, Sean
> Still, the 1st and most thing to handle here is feedback on the QP > groups concept suggested by this patch set to support TSS/RSS over > verbs. The plan is for this concept to (with little help from a > framework for verbs extension) apply to user space RSS as well, > for both UD and RAW QPs. C

RE: HCA concurrency

2012-06-01 Thread Hefty, Sean
> In my test case to evaluate fetch-and-add, I spawn multiple threads, each > owning its own QP inside the same PD and context and sending fetch-and-add > requests without any inner contention (no lock, etc.). I quickly reach a > ceiling of about 900KOPS with 5/6 threads, and I have a hard time fig

RE: [PATCH] librdmacm/man/rdma_getaddrinfo.3: Add RDMA_PS_IB to supported port spaces

2012-06-01 Thread Hefty, Sean
thanks! - applied -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH] ibacm/acme.c: Eliminate seg fault when source not supplied

2012-05-31 Thread Hefty, Sean
> diff --git a/src/acme.c b/src/acme.c > index d3f8174..533588c 100644 > --- a/src/acme.c > +++ b/src/acme.c > @@ -618,12 +618,18 @@ static void resolve(char *svc) > ret = resolve_name(&path); > break; > case 'l': > -

RE: [PATCH] ibacm/acm.c: Make sure shift for subnet timeout is not negative

2012-05-31 Thread Hefty, Sean
thanks! - applied

RE: [PATCH 3/16] librdmacm/rsocket: Fix hang in rrecv/rsend after disconnecting

2012-05-30 Thread Hefty, Sean
Thanks for the feedback. > 1. netperf: get_transport_info: getsockopt: errno 95 > This failure is due to the missing TCP_MAXSEG socket option support. May > be this is OK as this option > doesn't make much sense when using RDMA. Or we could return a reasonable > value. Missing socket options are

[PATCH 9/16] librdmacm/rsockets: Allow user to specify the QP sizes

2012-05-30 Thread Hefty, Sean
Add setsockopt options that allow the user to specify the desired size of the underlying QP. The provided sizes are used as the maximum size when creating the QP. The actual sizes of the QP are the smaller of the user provided maximum and the maximum sizes supported by the underlying hardware. A

[PATCH 8/16] librdmacm/rsockets: Define options specific to rsockets

2012-05-30 Thread Hefty, Sean
Allow a user to control some of the RDMA related attributes of an rsocket through setsockopt/getsockopt. A user specifies that the rsocket should be modified through SOL_RDMA level. This patch provides the initial framework. Subsequent patches will add the configurable parameters. Signed-off-by

[PATCH 10/16] librdmacm/rsocket: Add option to specify size of inline data

2012-05-30 Thread Hefty, Sean
Allow the user to override the default inline data size. We still require a minimum size in order to transfer receive buffer update message. Signed-off-by: Sean Hefty --- We can eliminate the need for inline entirely by reserving some of the send buffer space for control messages, but that work i

[PATCH 11/16] librdmacm/rs-preload: Use environment variable to set QP size

2012-05-30 Thread Hefty, Sean
Allow the user to specify the size of the send/receive queues and inline data size through environment variables: RS_SQ_SIZE, RS_RQ_SIZE, and RS_INLINE. Signed-off-by: Sean Hefty --- src/preload.c | 39 +++ 1 files changed, 39 insertions(+), 0 deletions(-)

[PATCH 12/16] librdmacm/rsockets: Simplify state checks

2012-05-30 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- src/rsocket.c | 26 +++--- 1 files changed, 11 insertions(+), 15 deletions(-) diff --git a/src/rsocket.c b/src/rsocket.c index ef070a8..b89ef42 100644 --- a/src/rsocket.c +++ b/src/rsocket.c @@ -126,6 +126,9 @@ union rs_wr_id { }; };

[PATCH 15/16] librdmacm/rstream: Use snprintf in place of sprintf

2012-05-30 Thread Hefty, Sean
Avoid possible buffer overrun. Signed-off-by: Sean Hefty --- examples/rstream.c | 38 +- 1 files changed, 21 insertions(+), 17 deletions(-) diff --git a/examples/rstream.c b/examples/rstream.c index df36e34..054d11e 100644 --- a/examples/rstream.c +++ b/exa
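
The general pattern behind this kind of change can be sketched as follows. This is not the rstream code; the helper format_option and its "name=value" format are hypothetical and only illustrate bounded formatting with a truncation check.

```c
#include <stdio.h>

/* Hypothetical helper: format "<name>=<value>" into buf.
 * Returns 0 on success, -1 if the output was truncated or an
 * encoding error occurred. */
int format_option(char *buf, size_t len, const char *name, int value)
{
	int n = snprintf(buf, len, "%s=%d", name, value);

	/* snprintf returns the length it wanted to write, so a result
	 * of len or more means the string did not fit. */
	if (n < 0 || (size_t) n >= len)
		return -1;
	return 0;
}
```

Unlike sprintf, snprintf never writes past len bytes and always NUL-terminates when len is nonzero, so a too-small buffer yields a shortened string rather than an overrun.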

[PATCH 13/16] librdmacm/rsockets: Change the default QP size from 512 to 384

2012-05-30 Thread Hefty, Sean
Simple latency/bandwidth tests using rstream showed minimal difference in performance between using a QP sized to 384 entries versus 512. Reduce the overhead of a default rsocket by using 384 entries. A user can request a larger size by calling rsetsockopt. Signed-off-by: Sean Hefty --- src/rs

[PATCH 14/16] librdmacm/rstream: Add option to specify size of send/recv buffers

2012-05-30 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- examples/rstream.c | 34 +- man/rstream.1 |5 - 2 files changed, 17 insertions(+), 22 deletions(-) diff --git a/examples/rstream.c b/examples/rstream.c index c440f04..df36e34 100644 --- a/examples/rstream.c +++ b/exampl

[PATCH 6/16] librdmacm/rs-preload: Handle recursive socket() calls

2012-05-30 Thread Hefty, Sean
When ACM support is enabled in the librdmacm, it will attempt to establish a socket connection to the ACM daemon. When the rsocket preload library is in use, this can result in a recursive call to socket() that results in the library hanging. The resulting call stack is: socket() -> rsocket() ->
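
One common way to break this kind of re-entry is a per-thread flag, sketched below. Whether the patch uses exactly this approach is not visible in the truncated text above, so treat it as an assumption. All function names here are hypothetical stand-ins; nothing calls the real socket() or rsocket().

```c
/* Set while this thread is inside the interception layer. */
static _Thread_local int in_intercept;

int real_calls;	/* times the stand-in for libc socket() ran */

/* Stand-in for the real (libc) socket(). */
static int real_socket_stub(void)
{
	real_calls++;
	return 3;
}

int intercepted_socket(void);

/* Stand-in for rsocket(): internally opens a plain socket, as a
 * connection to a local daemon would, which re-enters the
 * interceptor. */
static int rsocket_stub(void)
{
	intercepted_socket();
	return 100;
}

/* Hypothetical interceptor: nested calls go straight to the real
 * socket() instead of recursing into rsocket_stub() again. */
int intercepted_socket(void)
{
	int fd;

	if (in_intercept)
		return real_socket_stub();

	in_intercept = 1;
	fd = rsocket_stub();
	in_intercept = 0;
	return fd;
}
```

The flag is thread-local so that one thread being inside the interceptor does not divert unrelated socket() calls made by other threads.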

[PATCH 7/16] librdmacm/rsockets: Reduce QP size if larger than hardware maximums

2012-05-30 Thread Hefty, Sean
When porting rsockets to iwarp, it was discovered that the default QP size (512) was larger than that supported by the hardware. Decrease the size of the QP if the default size is larger than the maximum supported by the hardware. Signed-off-by: Sean Hefty --- src/cma.c | 10 src/

[PATCH 4/16] librdmacm/acm: Use -1 to indicate an invalid socket rather than 0

2012-05-30 Thread Hefty, Sean
socket() can return 0 as a valid socket. This can happen when using a daemon that closes stdin/out/err. Signed-off-by: Sean Hefty --- src/acm.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/acm.c b/src/acm.c index bcf11da..9c65919 100755 --- a/src/acm.c +++
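
A minimal sketch of the sentinel convention, using a hypothetical struct conn rather than the actual acm.c state:

```c
#include <unistd.h>

/* Hypothetical connection state; fd is -1 while no socket is open,
 * because 0 is a valid descriptor once stdin has been closed. */
struct conn {
	int fd;
};

#define CONN_INIT { .fd = -1 }

static int conn_is_open(const struct conn *c)
{
	return c->fd >= 0;
}

static void conn_close(struct conn *c)
{
	if (conn_is_open(c)) {
		close(c->fd);
		c->fd = -1;	/* back to the "no socket" state */
	}
}
```

Testing fd >= 0 rather than fd != 0 keeps descriptor 0 usable, which matters for daemons that close stdin, stdout, and stderr at startup.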

[PATCH 5/16] librdmacm: Delay ACM connection until resolving an address

2012-05-30 Thread Hefty, Sean
Avoid creating a connection to the ACM service when it's not needed. For example, if the user of the librdmacm is a server application, it will not use ACM services. Signed-off-by: Sean Hefty --- src/acm.c | 22 +- src/cma.c |2 -- src/cma.h |2 -- 3 files changed,

[PATCH 3/16] librdmacm/rsocket: Fix hang in rrecv/rsend after disconnecting

2012-05-30 Thread Hefty, Sean
If a user calls rrecv() after a blocking rsocket has been disconnected, it will hang. This problem and its cause were reported by Sridhar Samudrala. It can be reproduced by running netserver -f -D using the rs-preload library. A similar issue exists with rsend(). Fix this by not blocking on a C

[PATCH 16/16] librdmacm/rstream: Use separate connections for latency/bw tests

2012-05-30 Thread Hefty, Sean
Optimize each connection for either latency or bandwidth results. This improves small message latency under 384 bytes by .5 - 1 us, while increasing bandwidth by 1 - 1.5 Gbps. Signed-off-by: Sean Hefty --- examples/rstream.c | 146 ++-- 1 files c

[PATCH 2/16] librdmacm/rstream: Check for connection error on async connect

2012-05-30 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- examples/rstream.c | 14 +- 1 files changed, 13 insertions(+), 1 deletions(-) diff --git a/examples/rstream.c b/examples/rstream.c index 104b318..c440f04 100644 --- a/examples/rstream.c +++ b/examples/rstream.c @@ -448,7 +448,8 @@ static int client_co

[PATCH 0/16] librdmacm: rsocket improvements

2012-05-30 Thread Hefty, Sean
The following patch set provides fixes, adds user configurability, and optimizes rsockets based on the needs and results of performance analysis and wider testing. Additional optimizations will follow, but I wanted to go ahead and push these changes out. Signed-off-by: Sean Hefty

[PATCH 1/16] librdmacm: Check that send and recv CQs are different before destroying

2012-05-30 Thread Hefty, Sean
ucma_destroy_cqs() destroys both the send and recv CQs if they are non-null. If the two CQs are actually the same one, this results in a crash when trying to destroy the second CQ. Check that the CQs are different before destroying the second CQ. This fixes a crash when using rsockets, which set
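
The check can be illustrated with stand-in types. This sketch does not use libibverbs; struct fake_cq, fake_destroy_cq, and destroy_cqs are hypothetical names chosen so the example is self-contained.

```c
#include <stdlib.h>

/* Stand-in for a completion queue; not the libibverbs type. */
struct fake_cq {
	int id;
};

int destroyed;	/* number of destroy calls, for illustration */

static void fake_destroy_cq(struct fake_cq *cq)
{
	destroyed++;
	free(cq);
}

/* Hypothetical teardown: destroy the recv CQ, then the send CQ only
 * if it is a distinct object, so a shared CQ is freed once. */
void destroy_cqs(struct fake_cq **send_cq, struct fake_cq **recv_cq)
{
	/* Compare before freeing anything. */
	int shared = (*send_cq == *recv_cq);

	if (*recv_cq)
		fake_destroy_cq(*recv_cq);
	if (*send_cq && !shared)
		fake_destroy_cq(*send_cq);

	*send_cq = NULL;
	*recv_cq = NULL;
}
```

Clearing both pointers afterwards makes a repeated teardown call harmless.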

RE: [PATCH 2/3] IB/mlx4: Fix max_wqe capacity report for query device

2012-05-29 Thread Hefty, Sean
> Did you try out the patches? was it helpful to address the problem > you're facing? I have not had time to test it yet

RE: Question about RDMA_CM_EVENT_ROUTE_ERROR

2012-05-24 Thread Hefty, Sean
> We are trying to figure out the cause for RDMA_CM_EVENT_ROUTE_ERROR > errors after a failover event of the bonding driver. > The event status returned is -EINVAL. To gather further information on > when this EINVAL is returned, > I added some debug which showed 3 for mad_hdr.status in the below >

RE: [PATCH 2/3] IB/mlx4: Fix max_wqe capacity report for query device

2012-05-24 Thread Hefty, Sean
> 1. Limit max number of wqes per QP reported when querying the device, > so that ib_create_qp will never fail due to any additional headroom WQEs > allocated. > > 2. Limit qp resources accepted for ib_create_qp() to the limits > reported in ib_query_device(). In kernel space, make sure that > the

RE: rsockets and standard socket based TCP benchmarks

2012-05-23 Thread Hefty, Sean
> netperf > - By default netserver forks a child process for each netperf client. As > rsockets >doesn't support fork() yet, this doesn't work. Btw, fork() is unlikely to work anytime soon, if ever... > - netserver has a -f option to disable forking a child and handle 1 > netperf client at a

[PATCH] librdmacm: Support older acm.h header files

2012-05-18 Thread Hefty, Sean
Older versions of acm.h do not include the resolve_data or perf_data fields in struct acm_msg. If we're using an older version of the acm.h header file, use an internal definition of struct acm_msg. Signed-off-by: Sean Hefty --- configure.in |5 + src/acm.c| 16 ++-- 2

[PATCH 1/4] librdmacm/rsocket: Succeed setsockopt REUSEADDR on connected sockets

2012-05-16 Thread Hefty, Sean
The RDMA CM fails calls to set REUSEADDR on an rdma_cm_id if it is not in the idle state. As a result, this causes a failure in NetPipe when run with socket calls intercepted by rsockets. Fix this by returning success when REUSEADDR is set on an rsocket that has already been connected. When runnin

[PATCH 4/4] librdmacm/rstream: Set rsocket nonblocking if set to async operation

2012-05-16 Thread Hefty, Sean
If asynchronous use is specified (use of poll/select), set the rsocket to nonblocking. This matches the common usage case for asynchronous sockets. When asynchronous support is enabled, the nonblocking/blocking test option determines whether the poll/select call will block, or if rstream will spi

[PATCH 2/4] librdmacm/rstream: Always set TCP_NODELAY on rsocket

2012-05-16 Thread Hefty, Sean
The NODELAY option is coupled with whether the socket is blocking or nonblocking. Remove this coupling and always set the NODELAY option. NODELAY currently has no effect on rsockets. Signed-off-by: Sean Hefty --- examples/rstream.c | 11 ++- 1 files changed, 2 insertions(+), 9 deleti

[PATCH 3/4] librdmacm/rstream: Set rsocket nonblocking for base tests

2012-05-16 Thread Hefty, Sean
The base set of rstream tests want nonblocking rsockets, but don't actually set the rsocket to nonblocking. It instead relies on the MSG_DONTWAIT flag. Make the code match the expected behavior and set the rsocket to nonblocking and make nonblocking the default. Provide a test option to switch i

[PATCH] librdmacm/rsocket: Succeed setsockopt REUSEADDR on connected sockets

2012-05-11 Thread Hefty, Sean
The RDMA CM fails calls to set REUSEADDR on an rdma_cm_id if it is not in the idle state. As a result, this causes a failure in NetPipe when run with socket calls intercepted by rsockets. Fix this by returning success when REUSEADDR is set on an rsocket that has already been connected. When runnin

RE: [PATCH] librdmacm/rsockets: Optimize synchronization to improve performance

2012-05-10 Thread Hefty, Sean
> > A test that acquired and released a lock 2 billion times reported that > > the custom lock was roughly 20% faster than using the mutex. > > 26.6 seconds versus 33.0 seconds. > > I think you are measuring the fact your call is inlined and pthreads > has an indirect jump - because internally pth

[PATCH] librdmacm/rsockets: Optimize synchronization to improve performance

2012-05-09 Thread Hefty, Sean
Hotspot performance analysis using VTune showed pthread_mutex_unlock() as the most significant hotspot when transferring small messages using rstream. To reduce the impact of using pthread mutexes, replace it with a custom lock built using an atomic variable and a semaphore. When there's no conten
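
A lock of this general shape can be sketched with C11 atomics and a POSIX semaphore. This is an illustration under stated assumptions, not the code from the patch; the type and function names (struct fast_lock, fast_lock_acquire, and so on) are hypothetical.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdatomic.h>

/* Hypothetical lock: cnt counts the holder plus any waiters; the
 * semaphore is only touched when there is contention. */
struct fast_lock {
	atomic_int cnt;
	sem_t sem;
};

int fast_lock_init(struct fast_lock *lock)
{
	atomic_init(&lock->cnt, 0);
	return sem_init(&lock->sem, 0, 0);
}

void fast_lock_acquire(struct fast_lock *lock)
{
	/* Previous value 0 means the lock was free: no system call. */
	if (atomic_fetch_add(&lock->cnt, 1) > 0) {
		/* Contended: sleep until a releaser posts. */
		while (sem_wait(&lock->sem) < 0 && errno == EINTR)
			;
	}
}

void fast_lock_release(struct fast_lock *lock)
{
	/* Previous value above 1 means someone is waiting: wake one. */
	if (atomic_fetch_sub(&lock->cnt, 1) > 1)
		sem_post(&lock->sem);
}

void fast_lock_destroy(struct fast_lock *lock)
{
	sem_destroy(&lock->sem);
}
```

In the uncontended case each acquire and release is a single atomic operation, which is where a saving over a general-purpose mutex would come from. The sketch offers no recursion, error checking, or priority handling.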

RE: ib_destroy_cm_id() versus cm callback race ?

2012-04-30 Thread Hefty, Sean
> Are you sure that only one thread at a time will invoke a CM callback ? As > far as I can see cm_recv_handler() queues work without checking whether > any other work is ongoing. From drivers/infiniband/core/cm.c: All callbacks for a single ID should be serialized. (I think the listen ID is an

RE: ib_destroy_cm_id() versus cm callback race ?

2012-04-30 Thread Hefty, Sean
> That makes me wonder how it is prevented that two CM callbacks for the > same CM ID run concurrently on different CPUs ? The callback code ends up looking like this: ret = atomic_inc_and_test(&cm_id_priv->work_count); if (!ret) list_add_tail(&work->list, &cm_id_p

RE: ib_destroy_cm_id() versus cm callback race ?

2012-04-27 Thread Hefty, Sean
> If I interpret the source code in drivers/infiniband/core/cm.c correctly > ib_destroy_cm_id() can return before an ongoing cm_id callback has > finished. Is this on purpose ? If not, isn't there a > flush_workqueue(cm.wq) call missing in cm_destroy_id() ? ib_destroy_cm_id() will block while ther

RE: ibstat does not recognize iWARP RNIC adapters

2012-04-26 Thread Hefty, Sean
> Hal/Sean, I defer to you on whether you think we should add this change to > ibstat. If you all recommend against it, > we'll advise customers to use ibv_devinfo, which is included with libibverbs > and is required for iwarp user apps. But > it seems a minimal change to add, and previously it d

RE: ibstat does not recognize iWARP RNIC adapters

2012-04-26 Thread Hefty, Sean
> Users seem to expect ibstat to show all rdma devices... Then why not change ibstat to use ibverbs or have it gather its data directly? What other functionality does umad provide for RNICs?

RE: ibstat does not recognize iWARP RNIC adapters

2012-04-25 Thread Hefty, Sean
> I noticed that doing ibstat, none of the iWARP RNIC adapters were showing > up.  I have attached a patch to address the issue (libibumad.patch). > > SYS_NODE_TYPE for iWARP RNIC is 4 and is_ib_type only checked to 3. I don't think libibumad should support RNICs. Users can use ibv_devinfo inst

[PATCH] rdma/cm: Fix false possible recursive locking

2012-04-25 Thread Hefty, Sean
The following lockdep problem was reported by Or Gerlitz. = [ INFO: possible recursive locking detected ] 3.3.0-32035-g1b2649e-dirty #4 Not tainted - kworker/5:1/418 is trying to acquire lock: (&id_priv->hand

RE: possible recursive locking detected in the cma?

2012-04-24 Thread Hefty, Sean
> The usual fix is to use mutex_lock_nested, cf "Exception: Nested data > dependencies leading to nested locking" in > Documentation/lockdep-design.txt. I haven't looked at how easy that > is to do in this case. I think in this case, it's easier to just delay the call to rdma_destroy_id() until

RE: possible recursive locking detected in the cma?

2012-04-24 Thread Hefty, Sean
> Specifically, I see that your commit 186834b5de69 "RDMA/ucma: Fix > AB-BA deadlock" deals only with the ucma and this trace doesn't > have ucma symbols, so is that a different race? or false alarm? Maybe I'm overlooking something, but this looks like a false match. > ==

RE: [PATCH] libibverbs-xrc: Rename the attribute private -> private_data

2012-04-24 Thread Hefty, Sean
thanks - applied

RE: [PATCH][TRIVIAL] rping: security fix: replace sprintf with snprintf

2012-04-24 Thread Hefty, Sean
thanks - applied

RE: How to use IB netlink infrastructure

2012-04-23 Thread Hefty, Sean
> It's been a few days since I posted this question, > but I've had no responses so far. > Can I use the IB netlink infrastructure to produce actual RDMA data transfers? no > If this is not possible, what is the actual entry point in the IB kernel code that results in a call to get_user_pages()? I

RE: [PATCH] [TRIVIAL] ibacm: security fix: replace sprintf with snprintf

2012-04-23 Thread Hefty, Sean
thanks - applied

RE: The libibverbs with the verbs extension framework uses reserved word in C++

2012-04-23 Thread Hefty, Sean
> We evaluated the verbs extension framework and we noticed that two new > attributes in the structures: > ibv_device > ibv_context > > contain an attribute called "private", which is a reserved word in C++. > Changing those attributes to private_data seems like a good idea. > > Do you wa

rsockets direct data placement

2012-04-18 Thread Hefty, Sean
I've committed the rsocket implementation in my librdmacm git tree, but I'll be fairly open about wire protocol changes until an actual release. I'd like to start a discussion on the best way to support direct placement of data into an application's buffer with rsockets. I spoke with many peopl

RE: [RFC] [PATCH 1/4] librdmacm: Define streaming over RDMA interface (rsockets)

2012-04-13 Thread Hefty, Sean
I'm a little slow writing this up, but for anyone interested, see below for details on the wire protocol. > +#define RS_QP_CTRL_SIZE 4 4 entries on the send queue are reserved for control messages. (At least 1 is needed to avoid deadlock.) User data is only transferred if there is an availabl

[ANNOUNCE] ibacm release 1.0.6

2012-04-13 Thread Hefty, Sean
ibacm release 1.0.6 is now available from: https://www.openfabrics.org/downloads/rdmacm/ibacm-1.0.6.tar.gz The git shortlog from 1.0.5 is: Dotan Barak (1): After allocation of dynamic memory blocks, check the allocation Hal Rosenstock (4): ib_acme: Fix typo ib_acme: Use IPv4 r

RE: [RFC] [PATCH 0/4] librdmacm: Rsockets API and implementation

2012-04-11 Thread Hefty, Sean
> Interesting work. Regarding direct data placement: have the io_submit() > and io_getevents() system calls been considered ? Those are the > foundation of libaio. I did look at aio and other interfaces, but I didn't try to move beyond the idea phase. With aio, I didn't see how the calls could e

[RFC] [PATCH 3/4] rsocket: Add sample application to copy files over rsockets

2012-04-11 Thread Hefty, Sean
rcopy will copy files from a source system to a specified remote server. It's essentially a really dumb FTP type program that can be used to quickly transfer files between systems, which can be useful to verify data integrity. (It was easier to create this program than modify an existing FTP clie

[RFC] [PATCH 0/4] librdmacm: Rsockets API and implementation

2012-04-11 Thread Hefty, Sean
The following patch set contains an initial implementation of rsockets as presented at the 2012 OpenFabrics Workshop. A copy of that presentation is available at: https://www.openfabrics.org/downloads/rdmacm/rsockets-ofa12.pptx and a video of the presentation can be found here:

[RFC] [PATCH 2/4] rsocket: Add example program that uses rsocket

2012-04-11 Thread Hefty, Sean
rstream provides an example that uses either rsocket or socket APIs. The latter allows rstream to be used to verify rsocket behavior compared to socket. Signed-off-by: Sean Hefty --- Makefile.am|5 examples/rstream.c | 570 man/

[RFC] [PATCH 4/4] rsocket: Allow use of LD_PRELOAD to intercept socket calls

2012-04-11 Thread Hefty, Sean
Intercept socket calls and convert TCP socket operation to streaming over RDMA. Allow falling back from rsockets to normal sockets on error or when trying to bind/connect to a reserved port. This is needed to handle MPI job startup, where MPI should use rsockets, but mpiexec needs to communicate

[PATCH v2] ibacm: Fixes to ACM package to support distros

2012-04-06 Thread Hefty, Sean
ibacm: Fixes to ACM package to support distros From: Sean Hefty Set of changes to fix up the ibacm package for inclusion into RedHat 6. Changes are based on feedback from Doug Ledford. These are primarily changes to the build files, along with name changes to the man pages and sample configurati

RE: [PATCH][TRIVIAL] ibacm/acm.c: LID format should be unsigned decimal

2012-04-06 Thread Hefty, Sean
Thanks - I combined this patch with your other LID format change patch.

RE: [PATCH] ibacm: Fixes to ACM package to support distros

2012-04-05 Thread Hefty, Sean
> > I create a .tar.gz package using 'make dist', copy it to another > > system, then install it using 'configure && make install'. When I do > > that, sysconfdir defaults to /usr/local/etc, sbindir /usr/local/sbin, > > and bindir to /usr/local/bin. I added /usr/local to PATH, so that > > the ini

RE: [PATCH] ibacm: Fixes to ACM package to support distros

2012-04-05 Thread Hefty, Sean
> In this instance you need to have configure variable substitute your > init script as well. You should not manipulate PATH from an init.d > script, and you should use an absolute path to the daemon to avoid > contamination from the invoking user's environment. This is exactly what I wanted to do

RE: [PATCH] ibacm/acme.c: acm/acme.c: Better error handling in resolve_gid

2012-04-05 Thread Hefty, Sean
applied Thanks for the changes!

RE: [PATCH] ibacm/libacm.c: Use IPv4 rather than IPv6 for ACM server communication

2012-04-05 Thread Hefty, Sean
thanks - applied I will add a separate patch to support IPv6, since librdmacm should be updated as well.

RE: [PATCH][TRIVIAL] ibacm/acme.C: Fix typo in show_path output

2012-04-05 Thread Hefty, Sean
Thanks - applied

RE: [PATCH] ibacm: Fixes to ACM package to support distros

2012-04-05 Thread Hefty, Sean
I've incorporated most of the changes, but see below. > > -EXTRA_DIST = src/acm_mad.h src/libacm.h \ > > -linux/osd.h linux/dlist.h ibacm.spec.in $(man_MANS) acm_opts.cfg \ > > -acm_addr.cfg > > +EXTRA_DIST = src/acm_mad.h src/libacm.h ibacm.init \ > > +linux/osd.h linux/dl

[PATCH] librdmacm: Automatically detect if ibacm is installed

2012-04-04 Thread Hefty, Sean
If the ibacm header file is available, automatically have the librdmacm configured to use it. This removes the --with-ib_acm configure option. Signed-off-by: Sean Hefty --- configure.in | 16 ++-- 1 files changed, 2 insertions(+), 14 deletions(-) diff --git a/configure.in b/confi

[PATCH] ibacm: Fixes to ACM package to support distros

2012-04-03 Thread Hefty, Sean
Set of changes to fixup the ibacm package for inclusion into RedHat 6. Changes are based on feedback from Doug Ledford. These are primarily changes to the build files, along with name changes to the man pages and sample configuration files. Rename the ib_acm service to match the package name, iba

RE: ibacm fixes/updates

2012-04-02 Thread Hefty, Sean
> By default ibacm expects to find its configuration files in /etc/ibacm. > This adds to the proliferation of directories in /etc/ needlessly. We > already have a number of RDMA related directories to choose from > depending on your install (OFED == /etc/ofed or /etc/openib in the old > days, RHE

RE: Does the CM know how to handle bad packets?

2012-04-02 Thread Hefty, Sean
> We noticed the following interesting scenario: > > One host sent a CM request to a remote host. > > The remote host, which doesn't have any CM support, performed the following > steps: > > > * Replaced the sLID with the dLID > * Added an indication that this is a response MAD > *

RE: [PATCH 1/2] IB/mlx4: fix the case of invalid speed value returned when the port is down

2012-04-02 Thread Hefty, Sean
> On 4/2/2012 7:35 PM, Hal Rosenstock wrote: > > Rather than always overwriting active_speed in this case, wouldn't it > > be better to only do that for invalid values? > > Yes, I have thought about that, however, spotting invalid values would > make the code a bit ugly, so I took this approach, R

RE: [PATCH v1 4/9] ocrdma: Driver for Emulex OneConnect RDMA adapter

2012-03-30 Thread Hefty, Sean
> - async event handling. Can you briefly explain how the code synchronizes between async event reporting and the user trying to free the same object?

RE: [PATCH 5/9] ocrdma: Driver for Emulex OneConnect RDMA adapter

2012-03-21 Thread Hefty, Sean
> +static int ocrdma_inet6addr_event(struct notifier_block *, > + unsigned long, void *); > + > +static struct notifier_block ocrdma_inet6addr_notifier = { > + .notifier_call = ocrdma_inet6addr_event > +}; > + > +int ocrdma_get_instance(void) > +{ > + int insta

RE: [PATCH 4/9] ocrdma: Driver for Emulex OneConnect RDMA adapter

2012-03-21 Thread Hefty, Sean
> +static inline void *ocrdma_get_eqe(struct ocrdma_eq *eq) > +{ > + return (u8 *)eq->q.va + (eq->q.tail * sizeof(struct ocrdma_eqe)); > +} casts from (void *) to (u8 *) are not needed. This occurs in multiple places. > +enum ib_qp_state get_ibqp_state(enum ocrdma_qp_state qps) > +{ > +

RE: [PATCH 2/9] ocrdma: Driver for Emulex OneConnect RDMA adapter

2012-03-21 Thread Hefty, Sean
> +struct ocrdma_alloc_ucontext_resp { > + u32 dev_id; > + u32 wqe_size; > + u32 max_inline_data; > + u32 dpp_wqe_size; > + u64 ah_tbl_page; > + u32 ah_tbl_len; > + u32 rsvd; > + u8 fw_ver[32]; > + u32 rqe_size; > + u64 rsvd1; > +} __packed; Is there some re

RE: [PATCH 1/9] ocrdma: Driver for Emulex OneConnect RDMA adapter

2012-03-21 Thread Hefty, Sean
> +struct ocrdma_cq { > + struct ib_cq ibcq; > + struct ocrdma_dev *dev; nit: There are several structures where you store ocrdma_dev *. You can remove these and use the struct ib_* to reach it as well.
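
The nit above can be illustrated with a minimal userspace sketch. It assumes the driver's device structure embeds the core struct ib_device, and that the core struct ib_cq carries a pointer to that ib_device. The types below are mocks holding only the fields needed here; mock_dev, mock_cq, get_mock_dev() and mock_cq_to_dev() are hypothetical names and are not taken from the ocrdma patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mocks of the core verbs structures; only the fields used here. */
struct ib_device { int node_type; };
struct ib_cq { struct ib_device *device; };

/* Hypothetical driver device: embeds the core ib_device (assumption). */
struct mock_dev {
	int id;
	struct ib_device ibdev;
};

/* Driver CQ with no private back-pointer to the driver device. */
struct mock_cq {
	struct ib_cq ibcq;
	int depth;
};

/* Map the embedded ib_device back to the driver device. */
static struct mock_dev *get_mock_dev(struct ib_device *ibdev)
{
	return container_of(ibdev, struct mock_dev, ibdev);
}

/* Reach the driver device through the core ib_cq instead of a stored
 * driver-private pointer. */
static struct mock_dev *mock_cq_to_dev(struct mock_cq *cq)
{
	assert(cq != NULL && cq->ibcq.device != NULL);
	return get_mock_dev(cq->ibcq.device);
}
```

The trade-off is one extra pointer dereference per lookup in exchange for one less field to keep consistent in every driver object.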

RE: [PATCH fixed] libibmad: Add MKey support to SMP requests via smp_mkey_get/set()

2012-03-09 Thread Hefty, Sean
> What mkey model is being proposed here ? It looks to me like it is a > single mkey for all ports in the subnet which is the simplest but least > flexible model. If so, I think we need something more flexible as IBA > allows each port to have it's own different mkey. Is something more needed than

RE: [PATCH 1/25 v2] rdma/cm: define native IB address

2012-03-07 Thread Hefty, Sean
> On Mon, Feb 27, 2012 at 2:22 PM, Hefty, Sean wrote: > >> > --- a/include/linux/socket.h > >> > +++ b/include/linux/socket.h > >> > @@ -184,6 +184,7 @@ struct ucred { > >> >  #define AF_PPPOX       24      /* PPPoX sockets                */

[PATCH] rdma/ucm: Fix circular locking resulting in deadlock

2012-03-01 Thread Hefty, Sean
When we destroy a cm_id, we must purge associated events from the event queue. If the cm_id is for a listen request, we also purge corresponding pending connect requests. This requires destroying the cm_id's associated with the connect requests by calling rdma_destroy_id(). rdma_destroy_id() blo

RE: possible circular locking dependency in ucma

2012-03-01 Thread Hefty, Sean
> I run into the below ucma related warnings from the kernel with 3.3-rc5 > when I stepped over crash of process as of wrong libs/etc (not the point > here...). Do you see here a real bug? basically the process was exiting > and the cleanup code in the kernel was running rdma_destroy_id when a > ca
