> Does going through timewait always hold? e.g., no matter what's the
> return status of rdma_disconnect and/or the status of the rdma_cm
> disconnected event?
It usually holds. It will fail if rdma_disconnect() is called from a bogus
state. But otherwise, I believe that it will enter timewait o
> I'm debugging some disconnect related race in iser - and wanted to check
> with you something re the CM/RDMA-CM state machine: I see that when a
> disconnected is initiated by the passive side (iser target) of a
> connection, such that the active side (iser initiator) gets
> RDMA_CM_EVENT_DISCONN
> > Yes, but the use of the cache is hidden from the user.
>
> user who?
ib_cm, rdma_cm, ib_mad, ib_ipoib, etc. You would need to trace the use up to
all ULPs.
> Since the drivers would be users of cache calls, the behavior would be
> as assumed for the query calls for the driver. Refer mthca
> This fix adds an x86_64 specific routine that 1) probes for
> X86_FEATURE_REP_GOOD
> and 2) uses an inline asm routine built on rep movsq that testing has shown is
> better than the builtin memcpy for all cases up to 4K. The probing routine is
> now called when the qib module is loaded to enable
> > > 1. Greater degree of control by individual drivers. Drivers have a
> > >choice to use it or not.
> >
> > I believe that some callers need to know that specific query calls will not
> sleep. That capability should either be required or exposed through the API.
>
> The new cache access fu
> The main motivations are:
>
> 1. Greater degree of control by individual drivers. Drivers have a
>choice to use it or not.
I believe that some callers need to know that specific query calls will not
sleep. That capability should either be required or exposed through the API.
> 2. The lib
Modify rdma_bind_addr to allow the user to specify AF_IB when
binding to a device. AF_IB indicates that the user is not
mapping an IP address to the native IB addressing. (The mapping
may have already been done, or is not needed.)
Signed-off-by: Sean Hefty
---
drivers/infiniband/core/cma.c |
The AF_IB uses a 64-bit service id (SID), which the
user can control through the use of a mask. The rdma_cm
will assign values to the unmasked portions of the SID
based on the selected port space and port number.
Because the IB spec divides the SID range into several regions,
a SID/mask combinati
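A minimal sketch of how a user-supplied SID and mask might be combined with the bits the rdma_cm assigns. It assumes that bits set in the mask are kept from the user's SID and all other bits come from the assigned value; the function name and the bit layout in the usage note are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical merge: bits set in user_mask come from the caller's SID,
 * every other bit comes from the value the rdma_cm would assign
 * (for example from the port space and port number).
 */
static uint64_t sketch_merge_sid(uint64_t user_sid, uint64_t user_mask,
                                 uint64_t assigned_sid)
{
    return (user_sid & user_mask) | (assigned_sid & ~user_mask);
}
```

With a mask of 0xFFFF000000000000, only the top 16 bits are taken from the user's value; a mask of 0 leaves the whole SID to the assigned value.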
Add support for AF_IB to ip_addr_size, and rename the function
to account for the change. Give the compiler more control over
whether the call should be inline or not by moving the definition
into the .c file, removing the static inline, and exporting it.
Signed-off-by: Sean Hefty
---
drivers/i
Enhance checks for loopback and any address to support AF_IB
in addition to AF_INET and AF_INET6. This will allow future
patches to use AF_IB when binding and resolving addresses.
Signed-off-by: Sean Hefty
---
drivers/infiniband/core/cma.c | 40
1 files
Define AF_IB and sockaddr_ib to allow the rdma_cm to use native IB
addressing.
Signed-off-by: Sean Hefty
---
The format of sockaddr_ib is the result of a lengthy discussion on
the linux-rdma list, mostly between myself and Jason.
include/linux/socket.h |2 +
include/rdma/ib.h | 89
The following patches are the first 5 in a series of 25 total that
add the ability to handle native InfiniBand addressing to the rdma_cm.
I'm hoping by submitting only a small subset of the patches at a time,
they will be easier to review.
Adding support for native IB addressing allows us to offl
In order to support OFED, vendor specific calls, or new ibverbs
operations, define a generic extension mechanism. This allows
OFED, an RDMA vendor, or another registered 3rd party (for
example, the librdmacm) to define RDMA extensions, plus provides
a backwards compatible way to add new features t
> Are there calls in rdma_cm that allow me to do this? For the life of
> me I can't figure out how, for instance, I would write an app that has
> a single QP that receives UD datagrams from anyone. How do potential
> clients find the QP number, qkey and ah without using listen/accept
> which will
> Do you want me to resend a new patch only with the cleanup code?
If you have time, please do.
Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> Is it possible to send/recv datagrams with no previous connection
> similar to UDP sendto/recvfrom? UD examples I've seen are connection
> oriented and listen/connect/accept, establishing a connection prior to
> conversing.
To send a datagram, you need the QP number, qkey, and addressing inform
> The function idr_pre_get may fail, so there is a need to check the status of
> it. Since this function allocates resources, we need to clean them during the
> resource cleaning in case of error.
We should add the cleanup code below, but I don't think we care if idr_pre_get
fails here. We'll jus
> Yes, a user can modify path lifetime via rdma_create_ep() but there
> is no way for the user to know how much that will be manipulated and
> increased in the IB CM driver.
Sure there is. It's an open source driver. :)
The ib_cm calculates the "correct" timeout based on the packet lifetime
pr
> For larger, more congested fabrics, a larger ACK timer is needed.
> Consumers can still change default with environment variable
> DAPL_ACK_TIMER if they need to increase or decrease.
>
> This applies to SCM and UCM providers only. The CMA provider, which
> uses rdma_cm, has no way to control ac
> Let's not get into fairness here... I'm trying to make progress on my backlog
> but there are patches that for better or worse have been around for a year
> or more.
Along these lines, is there any news on when patchwork might be available
again? I've been trying to help review some of the bac
> My question is: will the mc->next contain the right value?
> (since the compiler may cache the value of mc->next within that function and
> not read the updated value from memory,
> so if the mc_list will be changed by another thread, we may get bad results
> ...)
Access to the list is protected
> If a user tried to join a multicast group and the variable "mc" is being added
> to the list of the mc_list:
> 1) If the write() fails, the value of mc->next may be changed by another
> thread (for example, removing the mcast which is pointed by mc->next)
> will the "mc->next" still point to
thanks - applied
> Sean, wait, the rdmacm IPoIB port space allows librdmacm consumers to
> subscribe to multicast groups created by IPoIB and vice versa, so for
> udp/multicast the answer is yes, agree?
Good point - for multicast it should work.
> I try to use ibv_poll_cq to identify connectivity problems. The
> scenario is as follows, based on a modified rping example:
>
> 1) preliminary steps done and rdma connection established between
> Client and Server, retry_count in rdma_conn_param is set 1;
> 2) Server lost its link (corresponding s
> Is it possible using verbs to send and/or recv UDP data traveling over
> IPoIB? Or to associate a socket with a completion queue?
I didn't see a response to this, but the answers are no.
- Sean
thanks - applied
> Subject: Are there problems using the CMA with IPv6?
IPv6 should work fine.
> I noticed the following code in the cmatose/rping examples:
but, I can't say whether all of the examples support ipv6. I believe that they
do.
>
> static int get_addr(char *dst, struct sockaddr *addr)
> {
>
> again, and just to make sure I got it - for basic XRC testing which doesn't go
> to MPI nor to
> the OFED compatibility APIs, what env/test would you recommend - is that the
> xrc branch on the three libraries and rdma_x{client, server}?
Yes - please make sure you have pulled those branches rece
> So what else would you suggest for further testing, is that pulling
> the xrc branch of your ofa hosted librdmacm/libibverbs/libmlx4 trees
> and run librdmacm's rdma_{xclient,xserver} example? I was a bit confused
> since I see this example both in the master and the xrc branch.
I wanted to keep
> I wonder how important wire compatibility of the ibv_xx_pingpong
> examples is... should I worry about this?
I know that for some of the larger clusters, they may not be able to upgrade
the software on all systems during a scheduled downtime. This is about the
only reason I can think of why yo
> I pulled the xrc patches in your for-next branch and ran some simple
> tests against it. Between the last time I tested XRC and now, I'm now
> seeing mvapich2 hang during MPI finalize, which I'm debugging.
Just an update: the issues that I was seeing were caused by missing patches in
my librar
> Our 3.1-rc9 included Roland's for-next branch.
Actually, mine did too. I wonder if this OFED patch to libibverbs is causing
the issue:
http://git.openfabrics.org/git?p=ofed_1_5/libibverbs.git;a=blob;f=fixes/rocee_examples.patch;h=eda5a401a3424a104e8100848b5b6bf4e5b63bee;hb=HEAD
> Running ibv_ud_pingpong and ibv_uc_pingpong between two hosts. One with
> OFED 1.5.3.1 (Ubuntu LTS 10.04) and another on linux 3.1.0-rc9 (Same
> ubuntu version underlying) with the upstream libraries.
FWIW, I was able to run 3.1-rc9 in loopback and between 3.0 and 3.1-rc9
systems. I don't have
We need to add an entry into the uverbs and device command
tables to allow user space to actually call ib_open_qp.
Signed-off-by: Sean Hefty
---
If possible, this should just be merged with the last patch in the XRC
series.
In my previous tests, this was not getting called. I either did not
hav
> Wait, now I'm baffled by the patch (ie
> http://git.openfabrics.org/git?p=~shefty/rdma-dev.git;a=commitdiff;h=1ec4e62a6e967ddc258e7c4e674168debb727d39)
>
> I don't see anything that calls ib_uverbs_open_qp(). Am I missing something??
>
> Does the OFED API compatibility actually call this fu
> The result is pushed out to my github for-next branch, with the
> expectation that I'll ask Linus to pull for 3.2.
Thanks - I'll take a look and test again.
> However I do have one question: the last patch
> ("RDMA/uverbs: Export ib_open_qp() capability to user space" in
> my tree) adds IB_US
> Did we ever resolve the controversy about the rcv QPs with
> MPI users? The design seemed sane to me, but
Yes - I believe so.
> Also (I'm sure you already posted this once, but...) Sean,
> do you have a git tree with all the kernel patches included?
My latest patches are at:
git
> Why is the OFED libibverbs library binary incompatible with the
> non-OFED libibverbs library ? Why hasn't XRC support been implemented
> in the OFED libibverbs library such that applications built against
> the upstream libibverbs headers also work with the latest OFED version
> of that library
> The following extended speeds are introduced:
> FDR-10 - is a proprietary link speed which is 10.3125 Gbps at 64/66
> encoding rather than 8b10b encoding.
> FDR - represents the IBA extended speed: 14.0625 Gbps.
> EDR - represents the IBA extended speed: 25.78125 Gbps.
>
> Signe
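The effect of the line encoding on the usable data rate can be worked out with a little arithmetic. The helper below is a sketch (the function name is made up for illustration): the per-lane data rate is the signaling rate scaled by payload bits over coded bits.

```c
/*
 * Effective per-lane data rate in Gbps:
 * signaling rate * payload bits / coded bits.
 * For 64/66 encoding pass (64, 66); for 8b/10b pass (8, 10).
 */
static double effective_gbps(double signaling_gbps, int payload_bits,
                             int coded_bits)
{
    return signaling_gbps * payload_bits / coded_bits;
}
```

For example, effective_gbps(10.3125, 64, 66) gives 10.0 Gbps for FDR-10, and effective_gbps(25.78125, 64, 66) gives 25.0 Gbps for EDR. A 10 Gbps lane with 8b/10b encoding carries only 8 Gbps of data.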
> @@ -186,16 +187,30 @@ static ssize_t rate_show(struct ib_port
> return ret;
>
> switch (attr.active_speed) {
> - case 2: speed = " DDR"; break;
> - case 4: speed = " QDR"; break;
> + case 2:
> + speed = " DDR";
> + break;
> + case 4:
>
> Why not just define this function to return the rate in units where
> it's a whole number?
> ie just have the function return the value in Mbps instead of Gbps, so
> it returns
> 2500, 5000, etc. instead of 2+(*rounded=5), 5+(*rounded=0), etc.
even simpler!
> > Or if ib_rate_to_int() is only us
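A rough sketch of the Mbps suggestion, so that 2.5 Gbps needs no separate "rounded" output parameter. The enum and function names here are hypothetical and are not the kernel's enum ib_rate or the ib_rate_to_int() from the patch.

```c
/* Hypothetical rate codes for this sketch; not the kernel's enum ib_rate. */
enum sketch_rate {
    SKETCH_RATE_2_5_GBPS,
    SKETCH_RATE_5_GBPS,
    SKETCH_RATE_10_GBPS,
    SKETCH_RATE_20_GBPS
};

/* Return the rate in Mbps, or -1 for an unknown rate code. */
static int sketch_rate_to_mbps(enum sketch_rate rate)
{
    switch (rate) {
    case SKETCH_RATE_2_5_GBPS: return 2500;
    case SKETCH_RATE_5_GBPS:   return 5000;
    case SKETCH_RATE_10_GBPS:  return 10000;
    case SKETCH_RATE_20_GBPS:  return 20000;
    default:                   return -1;
    }
}
```

A caller that wants Gbps for display can divide by 1000 and print the remainder itself, which keeps the conversion function free of out-parameters.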
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -77,6 +77,35 @@ enum ib_rate mult_to_ib_rate(int mult)
> }
> EXPORT_SYMBOL(mult_to_ib_rate);
>
> +int ib_rate_to_int(enum ib_rate rate, int *rounded)
> +{
I like this approach better, but I wonder if it still c
> The rdma_cm uses the local qp_type to determine how to
> process an incoming request. This can result in an
> incoming REQ being treated as a SIDR REQ and vice versa.
> Fix this by switching off the event type instead, and for
> good measure verify that the listener supports the incoming
> conne
> I'm using RDMA CM. I am making the ibv_modify_qp() call immediately after
> rdma_create_qp(). Should I move this code after the rdma_connect() /
> rdma_accept()?
>
> Sample code I am looking at is from the OFA coding course.
If you're using the rdma cm, the QP transitions are handled for you
> Thanks for responding Roland. I wasn't setting these up. I've added a
> call to ibv_modify_qp to set these but it doesn't help. I even explicitly
> set the access rights on the QP
>
> qp_attr.max_rd_atomic = 32;
> qp_attr.max_dest_rd_atomic = 32;
> qp_attr.qp_access_fl
> * When I reduce the sge length of the first work request to 20 ibv_post_send
> works
> * When I remove IBV_SEND_INLINE from the send_flags, ibv_post_send works.
>
> Either case I am unable to achieve the desired functionality.
>
> Is there a maximum sge size limit in the case of IBV_SEND_INLINE
> code snips
> start
> --
> int rc;
> struct ibv_send_wr bad_send_wr[3];
This should just be:
struct ibv_send_wr *bad_send_wr;
On a post error, the pointer will be set to the work request that failed.
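To illustrate the out-pointer convention without needing a device, here is a small stand-in. The mock_wr type and mock_post_send() are hypothetical; they only mimic the shape of ibv_post_send(qp, wr, &bad_wr), where the caller passes the address of a single pointer rather than an array. The max_len limit is an invented failure condition for the demo.

```c
#include <stddef.h>

/* Stand-in for struct ibv_send_wr: only the chain link and a length. */
struct mock_wr {
    struct mock_wr *next;
    int length;
};

/*
 * Stand-in for a post-send call. Walks the chain and "posts" each
 * request; the first request longer than max_len fails. On failure,
 * *bad_wr is set to the failing request and a nonzero value is returned.
 */
static int mock_post_send(struct mock_wr *wr, int max_len,
                          struct mock_wr **bad_wr)
{
    for (; wr; wr = wr->next) {
        if (wr->length > max_len) {
            *bad_wr = wr;
            return -1;
        }
    }
    return 0;
}
```

The caller declares `struct mock_wr *bad;` and passes `&bad`. After a failure, `bad` points into the caller's own chain, so everything before it was posted and everything from it onward was not.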
>
> 1. Remove ibstat and use ibv_devinfo instead
> 2. Change ibstat to obtain the fdr10 information using ibverbs
> 3. Move the is_fdr10 functionality into OS specific files or code
> sections of ibstat
> 4. Change ibstat to obtain fdr10 data using MADs or some other OS
> inde
So far, it seems that the choices are:
1. Remove ibstat and use ibv_devinfo instead
2. Change ibstat to obtain the fdr10 information using ibverbs
3. Move the is_fdr10 functionality into OS specific files or code sections of
ibstat
4. Change ibstat to obtain fdr10 data using MADs or some other OS
> > The only way to determine whether fdr10 is active or not is via the
> > vendor proprietary MAD. That info may be reflected in some other API
> > (and/or file) so that MAD does not need to be reissued. In a separate
> > thread on linux-rdma, there was discussion on a couple of different
> > ways
> Even if ibv_devinfo is updated to include the additional information,
> do we want to require libibverbs, etc. on any IB management machine
> just for this ? That's not the case today on IB management nodes.
IMO, it's acceptable for ibverbs to be the basic requirement for any IB
userspace appli
> It suppresses a warning, warnings are not build failures.
Actually, on windows the build fails. No executables are built.
mad_[get|set]_field[8|16|32] should work fine. But those are exported from
ibmad, which would require ib-diags to either check for them as a requirement
or implement them
> Adding a cast is just pointless noise, doesn't fix or prove anything.
It fixes the build on windows. You can call that pointless, but I do not.
Signed-off-by: Sean Hefty
---
changes from v1: reset mode back to 644
libibnetdisc/src/ibnetdisc.c |2 +-
src/ibportstate.c|4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/libibnetdisc/src/ibnetdisc.c b/libibnetdisc/src/ibnetdisc.c
index 86210eb..c93e7a
> > libibnetdisc/src/ibnetdisc.c | 2 +-
> > src/ibportstate.c | 4 ++--
> > 2 files changed, 3 insertions(+), 3 deletions(-)
> > mode change 100644 => 100755 libibnetdisc/src/ibnetdisc.c
> > mode change 100644 => 100755 src/ibportstate.c
>
> A minor nit: is this mode change re
Signed-off-by: Sean Hefty
---
libibnetdisc/src/ibnetdisc.c |2 +-
src/ibportstate.c|4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
mode change 100644 => 100755 libibnetdisc/src/ibnetdisc.c
mode change 100644 => 100755 src/ibportstate.c
diff --git a/libibnetdisc/sr
> Does this mean "ibstatus" does not work on Windows?
We do not support any of the scripts on windows. As far as I could tell, the
scripts look like they just do post-processing of available output.
> How are you proposing the addition to ibverbs? It seems this would break ABI
> there.
On wi
commit 1344cb3feacafc462440dabfa5997c5205486d83 added support for FDR10 in a
way that is not compatible with Windows support. Windows does not use files to
read attribute information.
I will probably need to obtain the necessary information using ibverbs on
windows by reading port attributes.
> What is your end goal? To have one code base for OpenSM that would be able to
> be compiled on both Linux and Windows based on __WIN__ definition?
My end goal is to decrease the maintenance cost porting opensm to Windows.
Ideally, I'd like to have a common code base for opensm, similar to what
> Why test for __WIN__ instead of _WIN32 (defined both when building
> 32-bit and 64-bit code -- see also
> http://msdn.microsoft.com/en-us/library/b0084kay%28v=vs.80%29.aspx) ?
I have no idea. This is just what's currently in the code. I can change this
portion of the code if we want to use
It would be easier to maintain opensm on windows if it truly shared the same
code base. For now, I'd just like to start with a common ib_types.h file.
(There are currently hundreds, if not thousands, of lines that differ.)
ib_types.h uses #if defined(__WIN__) to separate linux from windows co
> It maps to what was done in the PortInfo attribute to add the new
> extended speeds. There was no room for expansion in the existing
> original link speed fields so a "parallel" set of fields had to be
> added there..
That was an issue with the wire protocol format, correct? Why carry that
s
> Index: b/drivers/infiniband/core/verbs.c
> ===
> --- a/drivers/infiniband/core/verbs.c 2011-09-13 13:34:19.660539000 +0300
> +++ b/drivers/infiniband/core/verbs.c 2011-09-13 16:42:39.713754400 +0300
> @@ -77,6 +77,23 @@ enum ib_rate
> Index: b/drivers/infiniband/core/sysfs.c
> ===
> --- a/drivers/infiniband/core/sysfs.c 2011-09-14 13:49:58.0 +0300
> +++ b/drivers/infiniband/core/sysfs.c 2011-09-14 13:50:43.731775900 +0300
> @@ -209,7 +209,7 @@ static ssize
> Ultimately I think the scalable/compatible answer is to move these
> RMPP work loads to a verbs QP and we need to have a user space RMPP
> implementation for that anyhow.
Many to one is never scalable. The applications simply cannot rely on every
node querying the SA at the same time, especial
> As I see it the problem flow would be this:
>
> poll
> ibv_req_notify_cq
> // HCA Write CQ Entry
> // Trigger EVENT
Since the trigger event must be separate from writing the cq entry, if it
doesn't happen until...
> poll -> return CQ
>
> poll
> ibv_req_notify_cq // does nothing, event is alr
Maybe someone at Mellanox can provide some guidance here. Here are the
arm/get_event calls from the libmlx4.
int mlx4_arm_cq(struct ibv_cq *ibvcq, int solicited)
{
...
sn = cq->arm_sn & 3;
ci = cq->cons_index & 0xff;
cmd = solicited ? MLX4_CQ_DB_REQ_NOT_SOL
> poll and cq_events are totally independent, I believe the implementation
> of CQ events is like:
>
> enum {DISABLED,ARMED,TRIGGERED} flag;
I was suggesting that there were only 2 states, not 3, with ibv_get_cq_event()
simply retrieving a queued event from the kernel without touching the CQ stat
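The two-state idea can be modelled in a few lines. This is a hypothetical model, not libibverbs or driver code: the CQ is either armed or not, the hardware queues one event only when armed, and fetching an event does not touch the CQ state. All names are invented for the sketch.

```c
/* Hypothetical two-state CQ model; not libibverbs. */
struct mock_cq {
    int armed;        /* 1 after a notify request, cleared when an event fires */
    int completions;  /* entries waiting to be polled */
    int events;       /* events queued for the consumer */
};

/* Consumer side: request notification for the next completion. */
static void mock_arm(struct mock_cq *cq)
{
    cq->armed = 1;
}

/* Hardware side: write an entry, queue one event only if armed. */
static void mock_hw_complete(struct mock_cq *cq)
{
    cq->completions++;
    if (cq->armed) {
        cq->armed = 0;
        cq->events++;
    }
}

/* Poll at most one entry; returns the number of entries consumed. */
static int mock_poll(struct mock_cq *cq)
{
    if (!cq->completions)
        return 0;
    cq->completions--;
    return 1;
}

/* Returns 1 if a queued event was retrieved, 0 if a blocking wait would hang. */
static int mock_get_event(struct mock_cq *cq)
{
    if (!cq->events)
        return 0;
    cq->events--;
    return 1;
}
```

Under this model, two things follow. An event can outlive the completion that caused it: arm, complete, poll, then get the event, and the next poll finds nothing. And a completion that arrives while the CQ is not armed queues no event, so the consumer has to poll again after arming before it blocks.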
> > Case 1:
> > ret = ibv_poll_cq(id->recv_cq, 1, wc);
> > ret = ibv_req_notify_cq(id->recv_cq, 0);
> >
> > while (!(ret = ibv_poll_cq(id->recv_cq, 1, wc))) {
> > ret = ibv_get_cq_event(id->recv_cq_channel, &cq, &context);
> > ibv_ack_cq_events(id->recv_cq, 1);
>
I have a ping-pong test application that loops doing: send, wait for send
completion, wait for receive completion. The test occasionally hangs in the
following code at ibv_get_cq_event() (error handling removed):
Case 1:
ret = ibv_poll_cq(id->recv_cq, 1, wc);
ret = ibv_req_notif
Roland,
Can I get a quick status update regarding this series?
Thanks,
Sean
> Couldn't the kernel just copy data as needed, and say that
> userspace needs to keep the buffer stable until the send
> completes? (With an opt-in from userspace maybe?)
I agree, and I'll add that the IBTA could also take up the task of coming up
with a far more efficient way of transferring e
cmd is unsigned, no need to check for < 0. Found by
code inspection.
Signed-off-by: Sean Hefty
---
drivers/infiniband/core/ucm.c |2 +-
drivers/infiniband/core/ucma.c|2 +-
drivers/infiniband/core/user_mad.c|5 ++---
drivers/infiniband/core/uverbs_main.c |3 +
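As a small illustration of the point about unsigned values, the sketch below validates a command index with a single upper-bound comparison. The table and handler are hypothetical and do not come from the files touched by the patch.

```c
#include <stddef.h>

typedef int (*cmd_handler)(void);

/* Hypothetical handler used to populate the table. */
static int handle_noop(void)
{
    return 0;
}

/* Hypothetical command table with three entries. */
static cmd_handler cmd_table[] = { handle_noop, handle_noop, handle_noop };

#define CMD_TABLE_SIZE (sizeof(cmd_table) / sizeof(cmd_table[0]))

/* Returns 1 if cmd indexes a valid handler, 0 otherwise. */
static int cmd_is_valid(unsigned int cmd)
{
    /*
     * A "cmd < 0" test can never be true for an unsigned type,
     * so the upper-bound comparison is the only range check needed.
     */
    return cmd < CMD_TABLE_SIZE && cmd_table[cmd] != NULL;
}
```

A value that was negative before conversion, such as (unsigned int)-1, becomes a very large number and is rejected by the same comparison.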
If a received MAD contains an invalid or reserved mgmt class,
we will attempt to access method_table outside of its range.
Add a check to ensure that mgmt class falls within the
handled range.
Found by code inspection.
Signed-off-by: Sean Hefty
---
drivers/infiniband/core/mad.c |3 +++
1 fi
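The kind of range check described above might look like the following sketch. The table size, type, and function names are hypothetical; the actual check in mad.c may be placed and written differently.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_MGMT_CLASS 80   /* hypothetical table size */

struct sketch_method_table {
    int placeholder;
};

/* One slot per handled management class; unregistered slots stay NULL. */
static struct sketch_method_table *method_table[SKETCH_MAX_MGMT_CLASS];

/*
 * Look up the table for a class value taken from a received packet.
 * Returns NULL for out-of-range or unregistered classes, so the
 * untrusted value is never used as an index before being checked.
 */
static struct sketch_method_table *find_method_table(uint8_t mgmt_class)
{
    if (mgmt_class >= SKETCH_MAX_MGMT_CLASS)
        return NULL;
    return method_table[mgmt_class];
}
```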
Check that conn_param is not null before dereferencing it
when processing rdma_accept(). This is necessary to prevent
a possible system crash, which can be invoked by user space.
Problem found by code inspection.
Signed-off-by: Sean Hefty
---
drivers/infiniband/core/cma.c | 38 ++
The following code in addr6_resolve can result in calling
neigh_event_send() with a NULL neighbour:
neigh = dst->neighbour;
if (!neigh || !(neigh->nud_state & NUD_VALID)) {
neigh_event_send(dst->neighbour, NULL);
ret = -ENODATA;
goto put;
}
Fix this. Found by code i
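One way to avoid the NULL case is to separate the two conditions, sketched here with stand-in types. The sketch_ names and the flag value are hypothetical, and the real patch may restructure the code differently.

```c
#include <stddef.h>

#define SKETCH_NUD_VALID 0x1   /* hypothetical flag value */

struct sketch_neigh {
    unsigned int nud_state;
};

struct sketch_dst {
    struct sketch_neigh *neighbour;
};

static int probes_sent;   /* counts calls, for demonstration only */

/* Stand-in for the probe call; a real one would dereference n. */
static void sketch_event_send(struct sketch_neigh *n)
{
    (void)n;
    probes_sent++;
}

/* Returns 0 when the neighbour entry is usable, -1 otherwise. */
static int sketch_resolve(struct sketch_dst *dst)
{
    struct sketch_neigh *neigh = dst->neighbour;

    if (!neigh)
        return -1;                  /* nothing to probe */
    if (!(neigh->nud_state & SKETCH_NUD_VALID)) {
        sketch_event_send(neigh);   /* only reached with a non-NULL neigh */
        return -1;
    }
    return 0;
}
```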
> Does the IPoIB module depend on Subnet Manager/Subnet Administration components
> for
> any service while resolving an IP address to a GID?
Yes - it relies on multicast groups having been setup and SA queries.
> Can one connect UC QPs using the CMA?
No, but the xrc patches should enable this.
An app may be able to create UC QPs and connect them, but the rdma_cm will
treat the connection as RC.
> The only issue from my viewpoint is whether the change referenced by patch 11
> (ib/cm: Update xrc support based on xrc annex errata) has been approved by the
> IBTA. I'm waiting on a response on that.
I received confirmation that the errata to the XRC Annex was approved. Both
patch 10 (ib/cm
> In the code, it looks like during resolving IP addr to GID, one of the kernel
> module (ib_addr) is using neigh_lookup(..) to get the hardware address of the
> given IP. I did not get the dependency on the module ib_ipoib to resolve
> remote
> IP addr to GID. Am I missing anything here? Please
> Prevent resource leak by destroying the event channel before returning from
> function in an error flow.
Thanks - I applied this patch after my patch "Fix resource leak when
CMA_CREATE_MSG_CMD_RESP fails"
- Sean
If resources are allocated before CMA_CREATE_MSG_CMD_RESP or
CMA_CREATE_MSG_CMD are called, and those calls fail, we need
to clean up the resources before returning.
Fix this by changing the CMA_CREATE_MSG macros to remove the
alloca and calling return. The request and response structures
are now
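The general pattern can be sketched without the macros: when the error return is written out in the function body rather than hidden inside a macro, the caller gets a chance to release what it already allocated. Everything below (names, the allocation counter, the send_ok flag that simulates the command succeeding) is hypothetical and only illustrates the shape of the fix.

```c
#include <stdlib.h>

static int open_resources;   /* live allocations, for demonstration only */

static void *sketch_alloc(void)
{
    void *p = malloc(16);

    if (p)
        open_resources++;
    return p;
}

static void sketch_free(void *p)
{
    if (p) {
        free(p);
        open_resources--;
    }
}

/*
 * Allocate a resource, then issue a command. send_ok simulates whether
 * the command succeeds. The failure path is explicit, so the resource
 * is released before returning.
 */
static int sketch_create(int send_ok, void **out)
{
    void *channel = sketch_alloc();

    if (!channel)
        return -1;
    if (!send_ok) {
        sketch_free(channel);   /* clean up on the error path */
        return -1;
    }
    *out = channel;
    return 0;
}
```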
> -#define CMA_CREATE_MSG_CMD_RESP(msg, cmd, resp, type, size) \
> +#define CMA_CREATE_MSG_CMD_RESP(msg, cmd, resp, type, size, clean_cmd) \
This starts to get ugly, especially with usage that ends up looking like this:
> - CMA_CREATE_MSG_CMD_RESP(msg, cmd, resp, UCMA_CMD_DESTROY_ID, size);
>
The rdma_cm uses the local qp_type to determine how to
process an incoming request. This can result in an
incoming REQ being treated as a SIDR REQ and vice versa.
Fix this by switching off the event type instead, and for
good measure verify that the listener supports the incoming
connection reques
> Did you have a chance for reviewing those patches?
I did see them and flagged them for follow up, but I have not reviewed them yet. I
should get to them sometime this week.
> I am a bit concerned here. In the current usage model, target QPs are
> destroyed when their reference
> count goes to zero
> (ib_reg_xrc_recv_qp and ibv_xrc_create_qp increment the reference count,
> while ib_unreg_xrc_recv_qp
> decrements it).
> In this model, the TGT QP user/consumer does n
> Great stuff! I'd really like to get this into kernel 3.2... do you have any
> open issues or is it all good from your POV?
The only issue from my viewpoint is whether the change referenced by patch 11
(ib/cm: Update xrc support based on xrc annex errata) has been approved by the
IBTA. I'm wa
Implement the libibverbs xrc support using the defined xrc
extension.
This patch is based on a patch by Jack Morgenstein
.
Signed-off-by: Sean Hefty
---
Changes from v1:
Add support for open_qp().
Avoid allocating unnecessary resources for XRC TGT QPs.
src/buf.c |6 +
src/cq.c |
Update the libmlx4 library to register extensions with
libibverbs, if it supports extensions.
By registering extension support, this indicates to ibverbs
that extended data structures are available.
Signed-off-by: Sean Hefty
---
Makefile.am|2 +-
configure.in |3 ++
src/mlx4-ext.c
The following patches add support for XRC to ibverbs. Support for XRC
requires additional function calls into the verbs driver library
(e.g. libmlx4): open_xrcd, close_xrcd, create xrc srqs, and open_qp.
Since no mechanism is currently defined to allow for
these calls, we first define a way to add
This patch allows libibverbs to support both the libibverbs API that shipped with
OFED 1.5 and the upstream libibverbs API. This supports existing apps
that are compiled against the upstream libibverbs (ibverbs). And in ideal
cases, an application coded to the OFED version of libibverbs (ofverbs)
wou
Signed-off-by: Sean Hefty
---
man/ibv_open_qp.3 | 48
1 files changed, 48 insertions(+), 0 deletions(-)
create mode 100644 man/ibv_open_qp.3
diff --git a/man/ibv_open_qp.3 b/man/ibv_open_qp.3
new file mode 100644
index 000..577a28d
--- /dev
XRC receive QPs are shareable across multiple processes. Allow
any process with access to the xrc domain to open an existing
QP. After opening the QP, the process will receive events
related to the QP and be able to modify the QP.
Signed-off-by: Sean Hefty
---
Submitting this change separate fr
From: Jay Sternberg
Signed-off-by: Jay Sternberg
Signed-off-by: Sean Hefty
---
Makefile.am |4
examples/xsrq_pingpong.c | 873 ++
2 files changed, 876 insertions(+), 1 deletions(-)
create mode 100644 examples/xsrq_pingpong.c
diff
Define a common libibverbs driver side extension to support XRC.
XRC introduces several new concepts and structures:
XRC domains: xrcd's are a type of protection domain used to
associate shared receive queues with xrc queue pairs. Since
xrcd are meant to be shared among multiple processes, we
in
Signed-off-by: Sean Hefty
---
Makefile.am |3 +-
man/ibv_create_qp.3 | 13 +++-
man/ibv_create_xsrq.3 | 80 +
man/ibv_open_xrcd.3 | 65
man/ibv_post_send.3 | 11 ++-
5 file
In order to support OFED, vendor specific calls, or new ibverbs
operations, define a generic extension mechanism. This allows
OFED, an RDMA vendor, or another registered 3rd party (for
example, the librdmacm) to define RDMA extensions, plus provides
a backwards compatible way to add new features t
XRC provides a scalability enhancement when a process on one node
must communicate with multiple processes on another node. This
is commonly the case when running MPI applications on multi-core
systems in a cluster.
An XRC connection consists of an initiator (XRC INI) qp and a target
(XRC TGT) qp
XRC TGT QPs are shared resources among multiple processes. Since the
creating process may exit, allow other processes which share the same
XRC domain to open the existing QP. This allows us to transfer
ownership of an xrc tgt qp to another process.
Conceptually, verbs treats an xrc tgt qp as a '