[PATCH 4/26 v3] rdma/cm: Update port reservation to support AF_IB

2012-09-24 Thread Hefty, Sean
The AF_IB uses a 64-bit service id (SID), which the user can control through the use of a mask. The rdma_cm will assign values to the unmasked portions of the SID based on the selected port space and port number. Because the IB spec divides the SID range into several regions, a SID/mask combinati
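As a rough illustration of the masking described above, the following sketch merges a user-supplied SID with values chosen on the user's behalf. The helper name and the example constants are hypothetical and are not taken from the patch; the sketch assumes that a set bit in the mask means the user controls that bit, and that the remaining bits are filled in from a value derived from the port space and port number.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: bits set in `mask` come from
 * the user's SID; the unmasked bits are taken from `assigned`, which
 * stands in for the port space / port number values. */
uint64_t sid_combine(uint64_t user_sid, uint64_t mask, uint64_t assigned)
{
	return (user_sid & mask) | (assigned & ~mask);
}

/* Nonzero if `sid` agrees with `user_sid` on every user-controlled bit. */
int sid_matches_user(uint64_t sid, uint64_t user_sid, uint64_t mask)
{
	return ((sid ^ user_sid) & mask) == 0;
}
```

With a mask of 0xFFFF000000000000, only the top 16 bits of the user's value survive; everything below comes from the assigned value.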

[PATCH 3/26 v3] ib/addr: Add AF_IB support to ip_addr_size

2012-09-24 Thread Hefty, Sean
Add support for AF_IB to ip_addr_size, and rename the function to account for the change. Give the compiler more control over whether the call should be inline or not by moving the definition into the .c file, removing the static inline, and exporting it. Signed-off-by: Sean Hefty --- drivers/i

[PATCH 2/26 v3] rdma/cm: Include AF_IB in loopback and any address checks

2012-09-24 Thread Hefty, Sean
Enhance checks for loopback and any address to support AF_IB in addition to AF_INET and AF_INET6. This will allow future patches to use AF_IB when binding and resolving addresses. Signed-off-by: Sean Hefty --- drivers/infiniband/core/cma.c | 40 1 file

[PATCH 0/26 v3] rdma/cm: Add support for native InfiniBand addressing

2012-09-24 Thread Hefty, Sean
I'm Sean Hefty, and I approve this message. This patch series adds the ability to handle native Infiniband addressing to the rdma_cm. In addition to supporting native addresses, this support allows us to offload name and/or address translation services to a user space daemon, providing the user g

RE: how to preserve QP over HA events for librdmacm applications

2012-09-20 Thread Hefty, Sean
> Fair enough, I understand one needs to use a different CM id. For the IB > case I was thinking of avoiding APM (since that is limited to a device > -isn't that so?). APM is limited to a single device, as is memory registration, CQs, PDs, SRQs, etc. Migration between devices requires entirely n

[PATCH 7/8] libibverbs: Add man page for ibv_open_qp

2012-09-20 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- man/ibv_open_qp.3 | 50 ++ 1 files changed, 50 insertions(+), 0 deletions(-) create mode 100644 man/ibv_open_qp.3 diff --git a/man/ibv_open_qp.3 b/man/ibv_open_qp.3 new file mode 100644 index 000..0bc5647 --- /d

[PATCH 2/2] libmlx4: Add support for XRC QPs

2012-09-20 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- Note that I have a hack in cq.c. Someone more familiar with the mlx4 HW needs to look at the change. src/buf.c |6 +- src/cq.c | 40 --- src/mlx4-abi.h |6 ++ src/mlx4.c | 19 +++-- src/mlx4.h | 59 src/qp.

[PATCH 1/2] libmlx4: Infra-structure changes to support verbs extensions

2012-09-20 Thread Hefty, Sean
From: Yishai Hadas Signed-off-by: Yishai Hadas Signed-off-by: Tzahi Oved --- src/mlx4.c | 83 +++- src/mlx4.h | 16 2 files changed, 70 insertions(+), 29 deletions(-) diff --git a/src/mlx4.c b/src/mlx4.c index 8cf249a..1

[PATCH 8/8] libibverbs: Add XRC sample application

2012-09-20 Thread Hefty, Sean
From: Jay Sternberg Signed-off-by: Jay Sternberg Signed-off-by: Sean Hefty --- Makefile.am |4 examples/xsrq_pingpong.c | 877 ++ 2 files changed, 880 insertions(+), 1 deletions(-) create mode 100644 examples/xsrq_pingpong.c diff

[PATCH 2/8] libibverbs: Support older providers that do not support extensions

2012-09-20 Thread Hefty, Sean
In order to support providers that do not handle extensions, including providers built against an older version of ibverbs, add a compatibility layer. This allows most of the core ibverbs code to assume that extensions are always available. The compatibility layer is responsible for converting be

[PATCH 5/8] libibverbs: Add support for XRC QPs

2012-09-20 Thread Hefty, Sean
XRC queue pairs: xrc defines two new types of QPs. The initiator, or send-side, xrc qp behaves similarly to a send-only RC qp. xrc send qp's are managed through the existing QP functions. The send_wr structure is extended in a backwards compatible way to support posting sends on a send xrc qp,

[PATCH 6/8] libibverbs: Add ibv_open_qp

2012-09-20 Thread Hefty, Sean
XRC receive QPs are shareable across multiple processes. Allow any process with access to the xrc domain to open an existing QP. After opening the QP, the process will receive events related to the QP and be able to modify the QP. Signed-off-by: Sean Hefty --- include/infiniband/driver.h |

[PATCH 3/8] libibverbs: Introduce XRC domains

2012-09-20 Thread Hefty, Sean
XRC introduces several new concepts and structures, one of which is the XRC domain. XRC domains: xrcd's are a type of protection domain used to associate shared receive queues with xrc queue pairs. Since xrcd are meant to be shared among multiple processes, we introduce new APIs to open/close xrc

[PATCH 4/8] libibverbs: Add support for XRC SRQs

2012-09-20 Thread Hefty, Sean
XRC support requires the use of a new type of SRQ. XRC shared receive queues: xrc srq's are similar to normal srq's, except that they are bound to an xrcd, rather than to a protection domain. Based on the current spec and implementation, they are only usable with xrc qps. To support xrc srq's, w

[PATCH 1/8] libibverbs: Infra-structure changes to support verbs extension

2012-09-20 Thread Hefty, Sean
From: Yishai Hadas Infrastructure to support extended verbs capabilities in a forward/backward manner. The general operation as shown in the following pseudo-code: ibv_open_device() { context = device->ops.alloc_context(); if (context == -1) { context_ex = malloc

RE: how to preserve QP over HA events for librdmacm applications

2012-09-20 Thread Hefty, Sean
> What if you say pre-created a second (fail over) QP for HA purposes all > under the covers of a single socket? And both QPs were connected before > the failure. Not sure if that would work with the same CM id though. If > not, we will need to rdma_connect() the second QP after failure. CM IDs ar

RE: how to preserve QP over HA events for librdmacm applications

2012-09-19 Thread Hefty, Sean
> I don't know if it matters to the conversation or not, but I use an SRQ. I am > unclear how to remove a QP from the SRQ. Is ibv_destroy_qp() sufficient? Or do > I need to use rdma_destroy_qp()? rdma_destroy_qp() is a wrapper around ibv_destroy_qp(), plus destroys any internally allocated resour

RE: how to preserve QP over HA events for librdmacm applications

2012-09-19 Thread Hefty, Sean
> I too would be interested in bringing a QP from error back to a usable state. > I > have been debating whether to reconnect using the current RDMA calls versus > trying to transition the existing RC QP. > > I assumed to transition the existing QP that I would need to open a socket to > coordina

RE: how to preserve QP over HA events for librdmacm applications

2012-09-19 Thread Hefty, Sean
> Can this flushing be somehow done with the current librdmacm/libibverbs APIs > or we need some enhancement? You can call verbs directly to transition the QP state. That leaves the CM state unchanged, which doesn't really matter for UD QPs anyway. - Sean -- To unsubscribe from this list: send

[PATCH] librdmacm/rsockets: Document rsocket protocol and design

2012-09-11 Thread Hefty, Sean
Include a brief overview of the rsocket protocol and underlying design with the source code to make it easier for someone trying to decipher the actual code. Signed-off-by: Sean Hefty --- docs/rsocket | 144 ++ 1 files changed, 144 inserti

RE: [PATCH] IB: new module params. cm_response_timeout, max_cm_retries

2012-09-10 Thread Hefty, Sean
> Create two kernel parameters, in order to make variables configurable. > i.e. cma_cm_response_timeout for CM response timeout, > and cma_max_cm_retries for the number of retries. > > They can now be configured via command line for the kernel modules. > For example: > # modprobe ib_srp c

RE: rsocket library and dup2()

2012-09-05 Thread Hefty, Sean
> I found the following code in dup2(): > > oldfdi = idm_lookup(&idm, oldfd); > if (oldfdi && oldfdi->type == fd_fork) > fork_passive(oldfd); > > In that code the file descriptor type ("type") is compared with a fork > state enum value ("fd_fork"). Is that on purpose ??

RE: Writing RDMA applications on Linux

2012-08-28 Thread Hefty, Sean
> $ ./examples/rstream -s 10.30.3.2 -S all
> name      bytes   xfers   iters   total   time     Gb/sec   usec/xfer
> 16k_lat   16k     1       10k     312m    0.52s    5.06     25.93
> 24k_lat   24k     1       10k     468m    0.82s    4.79     41.08
> 32k_lat   32k     1       1

RE: ibv_modify_qp to IBV_QPS_ERR returns EAGAIN

2012-08-28 Thread Hefty, Sean
> I was wondering if anyone could shed some light about what kind > conditions might cause ibv_modify_qp to IBV_QPS_ERR to return EAGAIN? > > The error occurred on a QP that probably already had some work > completions for requests that failed. I've only seen it happen once in > about 3 months, so

RE: [PATCH for-next V1 0/4] IB/IPoIB TSS and RSS support for datagram mode

2012-08-28 Thread Hefty, Sean
> Can you let us know your thoughts here? I understand the purpose behind TSS/RSS. I'm not fond of making verbs more complex, but I haven't come up with anything that's really simpler. Tzahi's response addressed my main concerns. Is there a compelling reason for ever exposing this feature to

RE: [PATCH 1/3] librdmacm: Report error messages on stderr

2012-08-27 Thread Hefty, Sean
thanks - all 3 applied

RE: [PATCH] librdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-27 Thread Hefty, Sean
> > rsockets enables fork support only when RDMAV_FORK_SAFE has been set. > > I do not call ibv_fork_init(). > > understood, so you see the problem also when RDMAV_FORK_SAFE has been set? yes

RE: completions for unsignalled WRs following rdma_disconnect

2012-08-27 Thread Hefty, Sean
> Could someone confirm that rdma_disconnect is supposed to generate > completion event(s) for all posted work requests, whether signaled or not? rdma_disconnect transitions the QP into the error state, which should flush all posted work requests.

RE: [PATCH] librdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-27 Thread Hefty, Sean
> Are you calling from rsockets to ibv_fork_init or setting one of > libibverb's yyy_FORK_SAFE env vars? rsockets enables fork support only when RDMAV_FORK_SAFE has been set. I do not call ibv_fork_init(). - Sean

RE: [PATCH 2/8] opensm/complib: define "if" statements with branch prediction hints

2012-08-27 Thread Hefty, Sean
> On 8/15/2012 12:54 AM, Jason Gunthorpe wrote: > > On Tue, Aug 14, 2012 at 09:39:23PM +0000, Hefty, Sean wrote: > >>> +#define if_PF(cond) if(CL_PREDICT_FALSE(cond)) > >>> +#define if_PT(cond) if(CL_PREDICT_TRUE(cond)) > >> >

RE: rsockets and fork

2012-08-24 Thread Hefty, Sean
> I don't think those mmap()s should be an issue with fork they are > mapping adapter PCI space into userspace, but it should work across > fork. makes sense Do you have any ideas on ways to identify what in the initialization paths might cause the problems? (Assuming that is where the prob

RE: Writing RDMA applications on Linux

2012-08-24 Thread Hefty, Sean
> post a message receive > rdma connection > wait for rdma connection event > <> > start: >register memory containing bytes to transfer >wait remote memory region addr/key ( I wait for a ibv_wc) >send data with ibv_post_send (IBV_WR_RDMA_WRITE) >post a message receive >wait for

RE: [PATCH RFC] RDMA/cma: Make IPoIB port space multicast joins consistent with IPoIB

2012-08-23 Thread Hefty, Sean
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > index 7172559..f7e4cb9 100644 > --- a/drivers/infiniband/core/cma.c > +++ b/drivers/infiniband/core/cma.c > @@ -3056,9 +3056,16 @@ static int cma_join_ib_multicast(struct rdma_id_private > *id_priv, > I

RE: [PATCH for-next 1/4] IB/core: Remove unused variables in ucm/ucma

2012-08-23 Thread Hefty, Sean
> From: Dotan Barak > > Remove unused wait objects from ucm/ucma events flow. > > Signed-off-by: Dotan Barak > Signed-off-by: Or Gerlitz Acked-by: Sean Hefty > --- It looks like these were last used in 2.6.21. > drivers/infiniband/core/ucm.c |1 - > drivers/infiniband/core/ucma.c |

[PATCH] librdmacm/rspreload: Avoid rsocket calls until after fork

2012-08-23 Thread Hefty, Sean
When an rsocket call is made before an application calls fork(), the forked applications can hang. This can be seen by running netserver and two netperf clients simultaneously. The second netperf client will eventually stop performing data transfers. LD_PRELOAD=librspreload.so netserver -D LD_P

RE: [PATCH RFC] RDMA/cma: Make IPoIB port space multicast joins consistent with IPoIB

2012-08-23 Thread Hefty, Sean
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > index 7172559..f7e4cb9 100644 > --- a/drivers/infiniband/core/cma.c > +++ b/drivers/infiniband/core/cma.c > @@ -3056,9 +3056,16 @@ static int cma_join_ib_multicast(struct rdma_id_private > *id_priv, > I

RE: rsockets and fork

2012-08-22 Thread Hefty, Sean
> I saw this code in preload library and was wondering why rsocket() is called > and closed immediately if fork_support is enabled. I guess you are doing this > so that you can fallback to real socket at the initial socket() call instead > of waiting all the way until fork_active/fork_passive. This

RE: rsockets and fork

2012-08-22 Thread Hefty, Sean
I haven't identified the specific problem with fork support, but I did see this in libmlx4: mlx4_alloc_context() { ... context->uar = mmap(NULL, to_mdev(ibdev)->page_size, PROT_WRITE, MAP_SHARED, cmd_fd, 0); if (context->uar == MAP_FAILED)

RE: [PATCH 5/7] librdmacm/rspreload: Do not block connect when supporting fork

2012-08-21 Thread Hefty, Sean
> I am seeing another hang with 2 clients doing bi-directional traffic to a > server. > I am consistently seeing this hang with the attached test programs. > > You can start the server using > tcp_server -d -e > > and run 2 clients > tcp_client -h -n 1 -d -e > > Can you try it out with

RE: [PATCH 5/7] librdmacm/rspreload: Do not block connect when supporting fork

2012-08-20 Thread Hefty, Sean
> I just updated the repository and noticed that we are falling back to > normal sockets even with preload. > After some debugging, found a couple of issues. See inline. Thanks for the report. I've pushed a patch to the repository to address those issues, along with one other problem that I was

RE: rsockets and fork

2012-08-17 Thread Hefty, Sean
> > retries = 0, err = 0, index = 16, ctrl_avail = 3, sqe_avail = 1020, > > ^^ > This looks like part of the problem. There should be 4 control messages > available by default. The receiver has sent a control message to the sender, > but the co

RE: rsockets and fork

2012-08-17 Thread Hefty, Sean
thanks - this helps ... a little The sender is waiting for the receiver to publish additional receive buffer space. > (gdb) bt > #0 0x003286ed83f0 in __read_nocancel () from /lib64/libc.so.6 > #1 0x003b7220a1c4 in ibv_get_cq_event () from /usr/lib64/libibverbs.so.1 > #2 0x7fee0064

RE: [ANNOUNCE] Remote IB Statistics

2012-08-17 Thread Hefty, Sean
> rpms can be created but are not currently posted anywhere but only for the > linux components. where would you suggest they be stored for general > consumption? I guess source packages, RPMS, executables, or whatever else makes sense could be placed at www.openfabrics.org/downloads, maybe unde

RE: [ANNOUNCE] Remote IB Statistics

2012-08-17 Thread Hefty, Sean
> I was asked to develop a way for IB statistics to be accessed on a smart phone > and came up with Remote IB Statistics or rIBs for short and posted the repos > on openfabrics.org for review/comment. Are packaged alpha/beta releases available?

RE: rsockets and fork

2012-08-16 Thread Hefty, Sean
> This test is using Mellanox 10Gb RoCEE with MTU set to 9000 > > Server is started using > # ldr netserver -D > > 2 clients are started in 2 windows as follows. > > # ldr netperf -v2 -c -C -H 192.168.0.22 -l10 > MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 192.168.0.22

[PATCH 2/2] librdmacm/rstream: Use MSG_WAITALL for blocking test

2012-08-16 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- examples/rstream.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/examples/rstream.c b/examples/rstream.c index befb7c6..1d221d0 100644 --- a/examples/rstream.c +++ b/examples/rstream.c @@ -607,7 +607,7 @@ static int set_test_opt(char *op

[PATCH 1/2] librdmacm/rsockets: Add support for MSG_WAITALL rrecv() flag

2012-08-16 Thread Hefty, Sean
Adapted from patch by Sridhar Samudrala Signed-off-by: Sean Hefty --- The diff makes the changes look bigger than they are. I used a loop, kept my state check, and fixed how rbuf_bytes_avail gets updated. src/rsocket.c | 71 +++-- 1 files

RE: [PATCH] librdmacm/rsockets: Support MSG_WAITALL with rsockets recv()

2012-08-16 Thread Hefty, Sean
> Support MSG_WAITALL flag with recv() when using rsockets. > > Signed-off-by: Sridhar Samudrala The MSG_PEEK description that you pointed me to wasn't in the man page documentation that I was looking at. That simplifies things. I originally expected adding MSG_WAITALL support to be as trivia

RE: rdma_connect() timeout question

2012-08-16 Thread Hefty, Sean
> Is there a way to tune rdma_connect() timeout to a lower value ? I don't believe so. You could abort the connection by destroying the id after a specified amount of time had expired. > Can cma_response_timeout and CMA_MAX_CM_RETRIES be made tunables ? > Even better would have been if the call

[PATCH 5/7] librdmacm/rspreload: Do not block connect when supporting fork

2012-08-16 Thread Hefty, Sean
Many FTP servers require fork support. However, FTP clients, such as ncftp, will perform the following call sequence: send PASV request to server over connection 1 server will listen for connection 2 issue nonblocking connect to server send ACCEPT request to server over connection 1

[PATCH 7/7] librdmacm/rspreload: Add fstat support

2012-08-16 Thread Hefty, Sean
vsftpd calls fstat on a socket. Fake it out. Signed-off-by: Sean Hefty --- src/preload.c | 17 + 1 files changed, 17 insertions(+), 0 deletions(-) diff --git a/src/preload.c b/src/preload.c index c6cf176..8f19af5 100644 --- a/src/preload.c +++ b/src/preload.c @@ -87,6 +87,7 @

[PATCH 4/7] librdmacm/rspreload: Minor cleanup of fork_passive handling

2012-08-16 Thread Hefty, Sean
Minor code cleanup in passive side handling of fork support. Signed-off-by: Sean Hefty --- src/preload.c |7 ++- 1 files changed, 2 insertions(+), 5 deletions(-) diff --git a/src/preload.c b/src/preload.c index b18d310..bb8e3fb 100644 --- a/src/preload.c +++ b/src/preload.c @@ -492,7 +4

[PATCH 6/7] librdmacm/rspreload: Support sendfile

2012-08-16 Thread Hefty, Sean
Handle users calling sendfile with an rsocket. Signed-off-by: Sean Hefty --- src/preload.c | 24 1 files changed, 24 insertions(+), 0 deletions(-) diff --git a/src/preload.c b/src/preload.c index 8b86415..c6cf176 100644 --- a/src/preload.c +++ b/src/preload.c @@ -38,6

[PATCH 1/7] librdmacm/rspreload: Call real.close in fd_close

2012-08-16 Thread Hefty, Sean
The index into the preload lookup table is obtained by opening /dev/null and using the returned value. When closing the file, use the real close call and not the preload close call. This is a minor optimization, but clarifies the expected operation. Signed-off-by: Sean Hefty --- src/preload.c |

[PATCH 2/7] librdmacm/rspreload: Support dup2 calls

2012-08-16 Thread Hefty, Sean
vsftpd requires dup2() support. To handle dup2, we need to add reference count tracking to the preload fd's. Signed-off-by: Sean Hefty --- src/cma.h | 32 +++ src/preload.c | 79 - 2 files changed, 109 insertion
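The reference counting mentioned above can be pictured with a small sketch. This is not the code from the patch: the structure, field, and function names below are hypothetical, and the counter is a plain int with no locking or atomics, which a real preload library shared between threads would need.

```c
#include <stdlib.h>

/* Hypothetical per-descriptor record: several application-visible fds
 * (e.g. the old and new fd after dup2) may point at the same record. */
struct fd_info {
	int real_fd;	/* underlying socket or rsocket descriptor */
	int refcnt;	/* number of application fds referring to it */
};

struct fd_info *fd_info_new(int real_fd)
{
	struct fd_info *fdi = calloc(1, sizeof(*fdi));

	if (!fdi)
		return NULL;
	fdi->real_fd = real_fd;
	fdi->refcnt = 1;
	return fdi;
}

/* Called when dup2() makes a second fd refer to the same record. */
struct fd_info *fd_info_get(struct fd_info *fdi)
{
	fdi->refcnt++;
	return fdi;
}

/* Called on close(); returns 1 if this was the last reference and the
 * record was freed (the caller would close real_fd first), else 0. */
int fd_info_put(struct fd_info *fdi)
{
	if (--fdi->refcnt > 0)
		return 0;
	free(fdi);
	return 1;
}
```

The point of the count is that closing one of the duplicated descriptors must not tear down the connection while the other is still open.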

[PATCH 3/7] librdmacm/rsockets: Support SO_OOBINLINE

2012-08-16 Thread Hefty, Sean
We don't support urgent data, so just return success. Signed-off-by: Sean Hefty --- src/rsocket.c |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/src/rsocket.c b/src/rsocket.c index b9105a1..996cb2f 100644 --- a/src/rsocket.c +++ b/src/rsocket.c @@ -1820,6 +1820,10 @

RE: [PATCH] librdmacm/rsockets: Support MSG_WAITALL with rsockets recv()

2012-08-16 Thread Hefty, Sean
> Support MSG_WAITALL flag with recv() when using rsockets. > > Signed-off-by: Sridhar Samudrala I started working on a patch for MSG_WAITALL a few weeks ago, but set it aside. The problem I ran into was trying to handle MSG_PEEK at the same time. This complicated the patch quite a bit. - S

RE: handling rdma apps using chroot

2012-08-15 Thread Hefty, Sean
> There are lots of issues with using a dev/sysfs interface instead of > system calls and trying to support chroot... Not sure how rdmacm > works, but verbs returns event channel FDs directly 'in-band' which > avoids further use of dev.. The rdmacm calls open() in rdma_create_event_channel()... >

RE: handling rdma apps using chroot

2012-08-15 Thread Hefty, Sean
> Somehow you need to open the verbs device before doing the chroot.. I > think once verbs is open there is no further need for sysfs and dev.. I could hook chroot to force this. That should fix the /sys issue. However, the librdmacm accesses /dev when creating an event channel, which occurs af

RE: [PATCH 2/8] opensm/complib: define "if" statements with branch prediction hints

2012-08-14 Thread Hefty, Sean
> +#define if_PF(cond) if(CL_PREDICT_FALSE(cond)) > +#define if_PT(cond) if(CL_PREDICT_TRUE(cond)) If CL_PREDICT_TRUE/FALSE are too long, why not just shorten those, rather than abstract if statements behind a macro?
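To make the alternative concrete, here is a minimal sketch of shorter hint macros used inside an ordinary if statement. The macro names PT/PF and the example function are made up for illustration and are not from complib; the sketch assumes a GCC-compatible compiler for __builtin_expect and falls back to the plain expression elsewhere.

```c
#if defined(__GNUC__)
#define PT(e) __builtin_expect(!!(e), 1)	/* hypothetical short name */
#define PF(e) __builtin_expect(!!(e), 0)	/* hypothetical short name */
#else
#define PT(e) (e)
#define PF(e) (e)
#endif

/* Divide num by den; the zero-divisor path is hinted as unlikely.
 * Returns 0 on success and -1 if den is zero. */
int checked_div(int num, int den, int *out)
{
	if (PF(den == 0))
		return -1;
	*out = num / den;
	return 0;
}
```

The if keyword stays visible to readers and tools, and only the condition carries the hint.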

RE: rsockets and fork

2012-08-14 Thread Hefty, Sean
> Yes. it is also using rsockets. > The second session always hangs after sending a fixed number of bytes > (38469632). > rsend() blocks waiting for the CQ event. Can you send me the parameters that you use for testing?

RE: Setting service level for a QP with ibv_modify_qp and RDMA_CM

2012-08-14 Thread Hefty, Sean
> we are trying to set the service level for a QP with ibv_modify_qp, but > ibv_modify_qp() returns an error (errno = EINVAL). > > We are using RDMA CM and use rdma_create_qp() to allocate the queue > pair. After some searching in the net we found posts, which indicate > that ibv_modify_qp() canno

RE: rsockets and fork

2012-08-13 Thread Hefty, Sean
> I could not get fork enabled netperf to work with rsockets in the latest > librdmacm git repository. > After some debugging, i found that the child netserver process is blocked at > sem_wait() call in fork_passive(). > It is not clear to me how this call is supposed to unblock as sem_post() > is

RE: [ANNOUNCE] librdmacm-1.0.16

2012-08-13 Thread Hefty, Sean
> > > This release contains several bug fixes from 1.0.15, plus introduces the > rsocket API and protocol.

RE: [ANNOUNCE] librdmacm-1.0.16

2012-08-13 Thread Hefty, Sean
> > Are there any plans to include SOCK_DGRAM support? > > > > I could see that being potentially interesting along with mapping > > broadcast/multicast to IB physical layer multicast. > > I took a look at linux/net/rds to see if something similar could be > done in terms of transparently suppor

RE: [PATCH] RDMA/ucma.c: Different fix for ucma context uid=0, causing iWarp RDMA applications to fail in connection establishment

2012-08-10 Thread Hefty, Sean
> > Roland, there's a race here where ucma_set_event_context() copies ctx->uid > > to > the event structure outside of the mutex. Once the mutex is acquired, > ctx->uid > is checked. However, the uid could have changed between saving it off to the > event and checking it. > > OK. So then this

RE: [PATCH] [trivial] infiniband: Fix typo in infiniband driver

2012-08-09 Thread Hefty, Sean
> diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c > b/drivers/infiniband/hw/amso1100/c2_rnic.c > index 8c81992..b80867e 100644 > --- a/drivers/infiniband/hw/amso1100/c2_rnic.c > +++ b/drivers/infiniband/hw/amso1100/c2_rnic.c > @@ -439,7 +439,7 @@ static int c2_rnic_close(struct c2_dev *c2dev)

RE: [PATCH] RDMA/ucma.c: Different fix for ucma context uid=0, causing iWarp RDMA applications to fail in connection establishment

2012-08-04 Thread Hefty, Sean
> diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c > index 8002ae6..88c50d2 100644 > --- a/drivers/infiniband/core/ucma.c > +++ b/drivers/infiniband/core/ucma.c > @@ -267,6 +267,7 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id, > if (!uevent) >

RE: [PATCH] RDMA/ucma.c: Fix for ucma context uid=0, causing iWarp RDMA applications to fail in connection establishment

2012-08-02 Thread Hefty, Sean
> drivers/infiniband/core/ucma.c |3 +-- > 1 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c > index 8002ae6..6cc40de 100644 > --- a/drivers/infiniband/core/ucma.c > +++ b/drivers/infiniband/core/ucma.c > @@ -803,

RE: [PATCH 3/5] librspreload: Support server apps that call fork()

2012-08-01 Thread Hefty, Sean
> Have you tried this with netperf? Yes, I tested it with: netserver netserver -D netserver -D -f Example: # export RDMAV_FORK_SAFE=1 # LD_PRELOAD=/usr/local/lib/rsocket/librspreload.so netserver Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC # export RDMAV_FOR

RE: [RFC] zero-copy extensions for rsockets

2012-07-31 Thread Hefty, Sean
> I'm not sure that is so great, one of the benefits of the aio > interface is you have just one queue and one eventfd to manage, no > matter how many fd's you are AIOing against. Completions can happen > out of order. Requiring an app to juggle multiple ioq thingies split > on some arbitrary axis

RE: [RFC] zero-copy extensions for rsockets

2012-07-31 Thread Hefty, Sean
> libaio is designed to be used along with an eventfd that provides the > epoll like semantics you are talking about. Each time you call > io_submit you can call io_set_eventfd() on the iocb and the aio engine > will trigger that eventfd when the IO completes. poll or epoll on the > eventfd fd. A

RE: [RFC] zero-copy extensions for rsockets

2012-07-31 Thread Hefty, Sean
> This looks very similar to the libaio interface.. I did look at aio. It may be possible to use aio context in place of ioq, and I'm open to that. I was actually modeling ioq more after epoll than aio. It just seemed simpler to treat an ioq as a standard fd. For the get/put calls, there's

[RFC] zero-copy extensions for rsockets

2012-07-31 Thread Hefty, Sean
Before implementing this, I'm looking for feedback. The following proposal defines user-space APIs to support zero-copy. The intent is that the use of these extensions is fully compatible with existing calls, allowing applications to make selective use of them. Although I'm specifically looki

[PATCH 1/3] librdmacm/rsockets: Enable support for privileged ports

2012-07-30 Thread Hefty, Sean
Allow the preload library to use rsockets with privileged ports. Signed-off-by: Sean Hefty --- src/preload.c | 30 ++ 1 files changed, 6 insertions(+), 24 deletions(-) diff --git a/src/preload.c b/src/preload.c index c8ad747..52eaf1a 100644 --- a/src/preload.c +++

[PATCH 2/3] librdmacm/rsockets: Use wr_id to determine completion type

2012-07-30 Thread Hefty, Sean
If a work request has completed in error, the completion type field is undefined. Use the wr_id to determine if the failed completion was a send or receive. This fixes an issue where MPI can hang during finalize. With both sides of a connection shutting down simultaneously, one side may complete
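One way to picture the idea is to reserve a bit of the 64-bit wr_id to record whether a request was a send or a receive, so the direction can be recovered even when the other completion fields cannot be trusted. The sketch below is an assumption for illustration only; the flag position and helper names are hypothetical and are not the encoding used by rsockets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: the top bit of wr_id marks a receive request. */
#define WR_ID_RECV_FLAG (UINT64_C(1) << 63)

/* Build the wr_id for a send; the caller's cookie must leave the top
 * bit clear, so it is masked off here. */
uint64_t wr_id_for_send(uint64_t cookie)
{
	return cookie & ~WR_ID_RECV_FLAG;
}

/* Build the wr_id for a receive. */
uint64_t wr_id_for_recv(uint64_t cookie)
{
	return cookie | WR_ID_RECV_FLAG;
}

/* Classify a completion from its wr_id alone. */
bool wr_id_is_recv(uint64_t wr_id)
{
	return (wr_id & WR_ID_RECV_FLAG) != 0;
}

/* Recover the caller's cookie. */
uint64_t wr_id_cookie(uint64_t wr_id)
{
	return wr_id & ~WR_ID_RECV_FLAG;
}
```

A completion handler would then branch on wr_id_is_recv(wc.wr_id) for failed completions instead of inspecting the opcode.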

[PATCH 3/3] librdmacm/rsocket: Improve disconnect time

2012-07-30 Thread Hefty, Sean
When both sides of a connection attempt to close at the same time, one of the two sides can easily get an error when sending a disconnect message. This results in that side hanging during close until the send times out. (The time out is caused by the remote side destroying its QP.) We can reduce

RE: [PATCH] RDMA/ucma: Convert open-coded equivalent to memdup_user()

2012-07-27 Thread Hefty, Sean
> From: Roland Dreier > > Suggested by scripts/coccinelle/api/memdup_user.cocci. > > Reported-by: Fengguang Wu > Signed-off-by: Roland Dreier Acked-by: Sean Hefty > --- > drivers/infiniband/core/ucma.c | 19 +++ > 1 file changed, 7 insertions(+), 12 deletions(-) > > diff

[PATCH 1/5] librdmacm: Only allocate verbs resources when needed

2012-07-24 Thread Hefty, Sean
The librdmacm allocates a PD per device on initialization. Although we need to maintain the device list while the library is loaded (see rdma_get_devices), we can reduce the overhead by only allocating verbs resources when they are needed. This allows the rsocket preload library to support fork f

[PATCH 2/5] librspreload: Make socket_fallback() call more generic

2012-07-24 Thread Hefty, Sean
socket_fallback is used to switch from an rsocket to a normal socket in the case of failures. Rename the call and make it more generic, so that it can switch between an rsocket and a normal socket in either direction. This will be used to support fork(). As part of this change, we move the list

[PATCH 3/5] librspreload: Support server apps that call fork()

2012-07-24 Thread Hefty, Sean
Provide limited support for applications that call fork(). To handle fork(), we establish connections using normal sockets. The socket is later converted to an rsocket when the user makes the first call to a data transfer function (e.g. send, recv, read, write, etc.). Fork support is indicated by

[PATCH 4/5] librdmacm/rstream: Add option to test fork support

2012-07-24 Thread Hefty, Sean
If the user specifies '-T f', rstream will process connections in a child process. The server continues to run until all child processes have completed their tests. Fork support requires use of the librspreload library. Signed-off-by: Sean Hefty --- examples/rstream.c | 36 ++

[PATCH 5/5] librspreload: Call init from getsockname()

2012-07-24 Thread Hefty, Sean
netperf for some unknown reason calls getsockname() using a hard coded value of 0, without first allocating a socket. This causes the rsocket preload library to crash, since the library has not been properly initialized. Signed-off-by: Sean Hefty --- src/preload.c |1 + 1 files changed, 1 in

RE: uverbs message alignment

2012-07-20 Thread Hefty, Sean
> struct c4iw_create_raw_qp_req {
>         struct ibv_create_qp ibv_req;
>         __u32 port;
>         __u32 vlan_pri;
>         __u32 nfids;
> };

struct ibv_create_qp contains a u64, which will force the size of the structure to a multiple of 64 bits. You should add an additional 32 bits of padding.
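The padding point can be checked with a small stand-alone sketch. The base structure below is a hypothetical stand-in for struct ibv_create_qp (its field names are invented); all that matters is that it contains a 64-bit member. Three trailing 32-bit fields leave the structure 4 bytes short of a multiple of 8, so ABIs that align 64-bit members to 8 bytes add hidden tail padding, while an explicit reserved field gives the same layout everywhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ibv_create_qp: 16 bytes, contains a u64. */
struct base_req {
	uint64_t response;
	uint32_t handle;
	uint32_t flags;
};

/* As posted: three 32-bit fields follow the base structure. */
struct req_unpadded {
	struct base_req base;
	uint32_t port;
	uint32_t vlan_pri;
	uint32_t nfids;
};

/* With the suggested explicit 32 bits of padding. */
struct req_padded {
	struct base_req base;
	uint32_t port;
	uint32_t vlan_pri;
	uint32_t nfids;
	uint32_t reserved;
};

/* Bytes the compiler silently appends after the last declared member. */
size_t tail_padding_unpadded(void)
{
	return sizeof(struct req_unpadded) -
	       (offsetof(struct req_unpadded, nfids) + sizeof(uint32_t));
}

size_t tail_padding_padded(void)
{
	return sizeof(struct req_padded) -
	       (offsetof(struct req_padded, reserved) + sizeof(uint32_t));
}
```

On a typical 64-bit ABI tail_padding_unpadded() reports 4 hidden bytes, whereas the padded variant has none and is 32 bytes on both 32-bit and 64-bit builds.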

fork support, was RE: rsockets and standard socket based TCP benchmarks

2012-07-20 Thread Hefty, Sean
> > Have you had a chance to look more into the above for fork() support? Well, the good news is that fork support is possible under some limitations. I have a modified version of librspreload that hooks fork and transitions the last accepted connection from a normal socket to an rsocket. (Thi

RE: rdma_connect() "timeout"

2012-07-18 Thread Hefty, Sean
> According to the OpenSM default configuration (/usr/sbin/opensm > --create-config ) : > > # The subnet_timeout code that will be set for all the ports > # The actual timeout is 4.096usec * 2^ > subnet_timeout 18 > > # The code of maximal time a packet can live in a switch > # The actu
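The formula quoted from the OpenSM configuration comments can be worked through directly. A minimal sketch, assuming the timeout code is a small non-negative integer (the function name and the bound of 32 are choices made for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Timeout in nanoseconds for a given code: 4.096 usec * 2^code,
 * i.e. 4096 ns shifted left by `code`. */
uint64_t ib_timeout_ns(unsigned int code)
{
	assert(code < 32);	/* bound assumed for this sketch */
	return UINT64_C(4096) << code;
}
```

For the subnet_timeout value of 18 shown above, this gives 4096 ns * 2^18 = 2^30 ns, or roughly 1.07 seconds.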

RE: rdma_connect() "timeout"

2012-07-18 Thread Hefty, Sean
> Is there a way to setup the timeout in rdma_connect() ? For IB, the timeout is based on the packet lifetime in the path record returned by the SA. The rdma_cm will retry a CM REQ the maximum number of times (15). > Is there a way to change the CM parameters ? e.g. "Service Timeout" to > wait

RE: rsockets and standard socket based TCP benchmarks

2012-07-16 Thread Hefty, Sean
> Have you had a chance to look more into the above for fork() support? Actually, I just started working on it last Friday. I'll post a patch once I have at least something working. - Sean -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to major

[ANNOUNCE] librdmacm-1.0.16

2012-07-13 Thread Hefty, Sean
librdmacm release 1.0.16 is now available from www.openfabrics.org/downloads/rdmacm This release contains several bug fixes from 1.0.15, plus introduces the rsocket API and protocol.

RE: [PATCH] librdmacm/preload.c: Eliminate some compile warnings

2012-07-12 Thread Hefty, Sean
doh (to me) - thanks! applied

[PATCH] librdmacm/rsocket: Build librspreload library as part of build

2012-07-11 Thread Hefty, Sean
Build the rsocket preload library as part of the build. To reduce the risk of the preload library intercepting calls without the user's knowledge, the preload library is installed into {_libdir}/rsocket. Signed-off-by: Sean Hefty --- diff --git a/Makefile.am b/Makefile.am index 51b2f89..b27

RE: [PATCH librdmacm] rdma_resolve_addr: source address protocol family must be valid

2012-07-11 Thread Hefty, Sean
thanks - applied

RE: linux-next: build failure after merge of the infiniband tree

2012-07-05 Thread Hefty, Sean
> Thanks, fixed with a "#if IS_ENABLED(CONFIG_IPV6)" around the code > that touches ipv6... Sean, let me know if more is required. The fix-up looks to be complete. Thanks

RE: [Q] How to tranfer a file which is over 2GB(2^31) size in RDMA network?

2012-07-03 Thread Hefty, Sean
> Hello Parav.Pandit > > Thank you for your advice. > > I'll try it. You can also look at rsockets in the latest librdmacm library. You'd need to download and build the library yourself, since rsockets is not yet available in any release. But there's a sample program (rcopy) that will copy a

RE: rsockets with RoCE

2012-06-29 Thread Hefty, Sean
> > No objection. The rdma_cm shouldn't be considered speed path anyway. Btw, > > the IB CM exports some > counters which can sometimes be helpful in debugging, though, those only > report a count of which > messages have been sent/received. > > I have not used this before. How does one read t

RE: rsockets with RoCE

2012-06-27 Thread Hefty, Sean
> I attempted to use rsockets with ConnectX-EN adapters and the client > receives a "Connection refused" message. I debugged this a bit further > and see that the client is actually receiving an IB_CM_REJ_RECEIVED with > reason being 28 i.e. "Consumer Reject". Could this be because of the > differe

[PATCH] librdmacm/rsocket: Set readfds event if rsocket has been disconnected

2012-06-27 Thread Hefty, Sean
Signed-off-by: Sean Hefty --- src/rsocket.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/rsocket.c b/src/rsocket.c index 394fed4..c833d46 100644 --- a/src/rsocket.c +++ b/src/rsocket.c @@ -1631,7 +1631,7 @@ rs_poll_to_select(int nfds, struct pollfd *fds, fd_set *
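The idea named in the subject line can be sketched as follows: when poll() results are translated into select()-style fd sets, a hang-up is reported as readable so that the caller's next read returns end-of-file. This is not the code from src/rsocket.c; the function name and the exact event masks are assumptions made for illustration.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/* Translate poll() results into select()-style fd sets.
 * The caller must FD_ZERO() the sets first; any set may be NULL.
 * Returns the number of bits set across all three sets. */
int poll_results_to_select(int nfds, const struct pollfd *fds,
			   fd_set *readfds, fd_set *writefds,
			   fd_set *exceptfds)
{
	int i, cnt = 0;

	for (i = 0; i < nfds; i++) {
		/* A disconnected socket counts as readable (EOF). */
		if (readfds && (fds[i].revents & (POLLIN | POLLHUP))) {
			FD_SET(fds[i].fd, readfds);
			cnt++;
		}
		if (writefds && (fds[i].revents & POLLOUT)) {
			FD_SET(fds[i].fd, writefds);
			cnt++;
		}
		/* Everything other than plain read/write readiness. */
		if (exceptfds && (fds[i].revents & ~(POLLIN | POLLOUT))) {
			FD_SET(fds[i].fd, exceptfds);
			cnt++;
		}
	}
	return cnt;
}
```

Without the POLLHUP case, an application blocked in select() on the read set alone would never learn that its peer had gone away.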

[PATCH] librdmacm/rsocket: Handle other shutdown options

2012-06-27 Thread Hefty, Sean
Handle SHUT_RD and SHUT_WR shutdown options. In order to handle shutting down the send and receive sides separately, we break the connection state into multiple sub-states. This allows us to be partially connected (i.e. for either just reads or just writes). Support for SHUT_WR is needed to handl
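One way to picture the sub-states described above is as two independent bits, one per direction, with each shutdown option clearing the matching bit. The sketch below uses hypothetical names (`CONN_RD`, `CONN_WR`, `conn_shutdown`) and is not the state machine from the patch itself.

```c
#include <sys/socket.h>

/* Hypothetical connection sub-states: one bit per direction. */
enum conn_state {
	CONN_NONE = 0,
	CONN_RD   = 1 << 0,		/* receive side still open */
	CONN_WR   = 1 << 1,		/* send side still open */
	CONN_RDWR = CONN_RD | CONN_WR	/* fully connected */
};

/* Return the state that results from applying shutdown(how). */
int conn_shutdown(int state, int how)
{
	switch (how) {
	case SHUT_RD:
		return state & ~CONN_RD;
	case SHUT_WR:
		return state & ~CONN_WR;
	case SHUT_RDWR:
		return state & ~CONN_RDWR;
	default:
		return state;	/* unknown option: leave state unchanged */
	}
}
```

For example, `conn_shutdown(CONN_RDWR, SHUT_WR)` leaves only `CONN_RD` set, which corresponds to the partially connected case where the socket can still receive.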

RE: RDMA_CM_EVENT_REJECTED and ressources release

2012-06-21 Thread Hefty, Sean
> - call rdma_disconnect(): even if the connection is not established, > rdma_disconnect() can be called. > > In this case, all receive WR posted came back in error. On processing a reject event, the librdmacm should transition the QP into the error state. That should flush all posted work

[PATCH] rdma/cm: QP type check on received REQs should be AND not OR

2012-06-14 Thread Hefty, Sean
Change || check to && when checking the QP type in a received connection request against the listening endpoint. Signed-off-by: Sean Hefty --- Found by code inspection. drivers/infiniband/core/cma.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/cor
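The expression from cma.c is cut off in this excerpt, so the following is a simplified, hypothetical reduction of this class of bug rather than the kernel code. All type and function names are invented for illustration. With ||, the event-type test alone is enough to pass, so the QP type comparison is never enforced for a connection request.

```c
#include <stdbool.h>

/* Hypothetical, simplified types; not the kernel's definitions. */
enum qp_type { QPT_RC, QPT_UD };

struct listener {
	enum qp_type qp_type;
};

struct conn_req {
	bool is_req;		/* event is a connection request */
	enum qp_type qp_type;	/* QP type carried in the request */
};

/* OR form: any connection request matches, whatever its QP type. */
bool req_matches_or(const struct listener *l, const struct conn_req *r)
{
	return r->is_req || l->qp_type == r->qp_type;
}

/* AND form: a connection request matches only if the QP types agree. */
bool req_matches_and(const struct listener *l, const struct conn_req *r)
{
	return r->is_req && l->qp_type == r->qp_type;
}
```

A request carrying a UD QP type sent to an RC listener passes the OR form but is rejected by the AND form.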
