Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
On 4/24/2014 2:30 AM, Devesh Sharma wrote:
> Hi Chuck,
>
> Following is the complete call trace of a typical NFS-RDMA transaction
> while mounting a share. It is unavoidable to stop calling post-send once
> the QP is no longer there; therefore, checking the connection state while
> registering/deregistering FRMRs on the fly is a must. An unconnected QP
> implies that post_send/post_recv must not be called from any context.

Long thread... didn't follow it all.

If I understand correctly, this race comes only for *cleanup* (LINV) of an
FRMR registration while the teardown flow has destroyed the QP. I think it
might disappear if, for each registration, you post LINV+FRMR. This assumes
that a situation where we try to post a fastreg on a "bad" QP can never
happen (usually true, since the teardown flow typically suspends outgoing
commands).

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: IB/cma: Make timeout dependent on the subnet timeout
On 23/04/2014 16:44, Hefty, Sean wrote:
>> Regarding SubnetTimeout changes: the code in
>> drivers/infiniband/core/cache.c already queues a work request after each
>> port state change. Inside that work request, e.g., the P_Key cache is
>> updated. Would it be acceptable to modify ib_cache_update() such that it
>> also queries the port attributes and caches these? Cached port
>> attributes could, e.g., be stored in struct ib_port.
>
> Without looking at details, this at least sounds reasonable.

Sean, can't we have the CMA follow the same practice used in the CM, where
we derive the RC QP timeout from the packet lifetime retrieved in path
queries? E.g., base the CM response timeout on this value too?
Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
Hi Hugh,

Sorry for the late reply. First of all, to avoid confusion: I think the
patch is fine. When I saw this patch I decided that uprobes should be
updated accordingly, but I just realized that I do not understand what I
should write in the changelog.

On 04/04, Hugh Dickins wrote:
>
> +	if (gup_flags & FOLL_WRITE) {
> +		if (!(vm_flags & VM_WRITE)) {
> +			if (!(gup_flags & FOLL_FORCE))
> +				goto efault;
> +			/*
> +			 * We used to let the write,force case do COW
> +			 * in a VM_MAYWRITE VM_SHARED !VM_WRITE vma, so
> +			 * ptrace could set a breakpoint in a read-only
> +			 * mapping of an executable, without corrupting
> +			 * the file (yet only when that file had been
> +			 * opened for writing!). Anon pages in shared
> +			 * mappings are surprising: now just reject it.
> +			 */
> +			if (!is_cow_mapping(vm_flags)) {
> +				WARN_ON_ONCE(vm_flags & VM_MAYWRITE);
> +				goto efault;
> +			}

OK. But could you please clarify "Anon pages in shared mappings are
surprising"? I mean, does this only apply to the "VM_MAYWRITE VM_SHARED
!VM_WRITE vma" mentioned above, or is this bad even if a !FMODE_WRITE file
was mmaped as MAP_SHARED?

Yes, in this case the vma is not VM_SHARED and it is not VM_MAYWRITE; it is
only VM_MAYSHARE. This is in fact a private mapping, except that
mprotect(PROT_WRITE) will not work. But with or without this patch,
gup(FOLL_WRITE | FOLL_FORCE) won't work in this case (although perhaps it
could?); is_cow_mapping() == false because of !VM_MAYWRITE.

However, currently uprobes assumes that a COWed anon page is fine in this
case, and this differs from gup().

So, what do you think about the patch below? It is probably fine in any
case, but is there any "strong" reason to follow gup's behaviour and forbid
the anon page in a VM_MAYSHARE && !VM_MAYWRITE vma?

Oleg.
--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -127,12 +127,13 @@ struct xol_area {
  */
 static bool valid_vma(struct vm_area_struct *vma, bool is_register)
 {
-	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;
+	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC;
 
 	if (is_register)
 		flags |= VM_WRITE;
 
-	return vma->vm_file && (vma->vm_flags & flags) == VM_MAYEXEC;
+	return vma->vm_file && is_cow_mapping(vma->vm_flags) &&
+		(vma->vm_flags & flags) == VM_MAYEXEC;
 }
 
 static unsigned long offset_to_vaddr(struct vm_area_struct *vma, loff_t offset)
Re: [RFC 00/20] On demand paging
On 02/03/2014 12:49, Haggai Eran wrote:
> The following set of patches implements on-demand paging (ODP) support in
> the RDMA stack and in the mlx5_ib Infiniband driver.

I've placed the latest cut of the ODP patches on my public git tree @
git://beany.openfabrics.org/~ogerlitz/linux-2.6.git odp

This is actually V0.1 with the following changes:

- Rebased against v3.15-rc2
- Removed dependency on patches that were accepted upstream
- Changed use of compound_trans_head to compound_head, as the former was
  removed in 3.14

f5d7fc1 IB/mlx5: Implement on demand paging by adding support for MMU notifiers
09eae22 IB/mlx5: Add support for RDMA write responder page faults
ca84a78 IB/mlx5: Handle page faults
302a6ea IB/mlx5: Page faults handling infrastructure
df792a8 IB/mlx5: Add function to read WQE from user-space
1b4f69b IB/mlx5: Add mlx5_ib_update_mtt to update page tables after creation
bc5a6b0 IB/mlx5: Changes in memory region creation to support on-demand paging
335c8ef IB/mlx5: Implement the ODP capability query verb
5a67390 net/mlx5_core: Add support for page faults events and low level handling
11450c4 IB/mlx5: Refactor UMR to have its own context struct
51deb37 IB/mlx5: Enhance UMR support to allow partial page table update
107bc64 IB/mlx5: Set QP offsets and parameters for user QPs and not just for kernel QPs
e91a314 mlx5: Store MR attributes in mlx5_mr_core during creation and after UMR
894f946 IB/mlx5: Add MR to radix tree in reg_mr_callback
9283891 IB/mlx5: Fix error handling in reg_umr
467f4e7 IB/core: Implement support for MMU notifiers regarding on demand paging regions
c98a42e IB/core: Add support for on demand paging regions
8fb5241 IB/core: Add umem function to read data from user-space
16c9cf0 IB/core: Replace ib_umem's offset field with a full address
9f0d8b5 IB/core: Add flags for on demand paging support
Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
On Apr 24, 2014, at 3:12 AM, Sagi Grimberg wrote:

> On 4/24/2014 2:30 AM, Devesh Sharma wrote:
>> Hi Chuck
>>
>> Following is the complete call trace of a typical NFS-RDMA transaction
>> while mounting a share. It is unavoidable to stop calling post-send in
>> case it is not created. Therefore, applying checks to the connection
>> state is a must while registering/deregistering FRMRs on-the-fly. The
>> unconnected state of QP implies don't call post_send/post_recv from any
>> context.
>
> Long thread... didn't follow it all.

I think you got the gist of it.

> If I understand correctly this race comes only for *cleanup* (LINV) of
> FRMR registration while teardown flow destroyed the QP.
> I think this might disappear if for each registration you post LINV+FRMR.
> This is assuming that a situation where trying to post Fastreg on a "bad"
> QP can never happen (usually since teardown flow typically suspends
> outgoing commands).

That's typically true for "hard" NFS mounts. But "soft" NFS mounts wake
RPCs after a timeout while the transport is disconnected, in order to kill
them. At that point, deregistration still needs to succeed somehow.

IMO there are three related problems.

1. rpcrdma_ep_connect() is allowing RPC tasks to be awoken while there is
   no QP at all (->qp is NULL). The woken RPC tasks are trying to
   deregister buffers that may include page cache pages, and it's oopsing
   because ->qp is NULL.

   That's a logic bug in rpcrdma_ep_connect(), and I have an idea how to
   address it.

2. If a QP is present but disconnected, posting LOCAL_INV won't work.
   That leaves buffers (and page cache pages, potentially) registered.
   That could be addressed with LINV+FRMR. But...

3. The client should not leave page cache pages registered indefinitely.
   Both LINV+FRMR and our current approach depend on having a working QP
   _at_ _some_ _point_ ... but the client simply can't depend on that.
   What happens if an NFS server is, say, destroyed by fire while there
   are active client mount points? What if the HCA's firmware is
   permanently not allowing QP creation?

Here's a relevant comment in rpcrdma_ep_connect():

 815         /* TEMP TEMP TEMP - fail if new device:
 816          *   Deregister/remarshal *all* requests!
 817          *   Close and recreate adapter, pd, etc!
 818          *   Re-determine all attributes still sane!
 819          *   More stuff I haven't thought of!
 820          *   Rrrgh!
 821          */

xprtrdma does not do this today. When a new device is created, all
existing RPC requests could be deregistered and re-marshalled. As far as I
can tell, rpcrdma_ep_connect() is executing in a synchronous context (the
connect worker), and we can simply use dereg_mr, as long as later, when the
RPCs are re-driven, they know they need to re-marshal.

I'll try some things today.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
RE: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
Thanks Chuck for summarizing. One more issue is being added to the list
below.

> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Chuck Lever
> Sent: Thursday, April 24, 2014 8:31 PM
> To: Sagi Grimberg
> Cc: Devesh Sharma; Linux NFS Mailing List; linux-rdma@vger.kernel.org;
> Trond Myklebust
> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
>
> On Apr 24, 2014, at 3:12 AM, Sagi Grimberg wrote:
>
>> On 4/24/2014 2:30 AM, Devesh Sharma wrote:
>>> Hi Chuck
>>>
>>> Following is the complete call trace of a typical NFS-RDMA transaction
>>> while mounting a share. It is unavoidable to stop calling post-send in
>>> case it is not created. Therefore, applying checks to the connection
>>> state is a must while registering/deregistering FRMRs on-the-fly. The
>>> unconnected state of QP implies don't call post_send/post_recv from
>>> any context.
>>
>> Long thread... didn't follow it all.
>
> I think you got the gist of it.
>
>> If I understand correctly this race comes only for *cleanup* (LINV) of
>> FRMR registration while teardown flow destroyed the QP.
>> I think this might disappear if for each registration you post
>> LINV+FRMR. This is assuming that a situation where trying to post
>> Fastreg on a "bad" QP can never happen (usually since teardown flow
>> typically suspends outgoing commands).
>
> That's typically true for "hard" NFS mounts. But "soft" NFS mounts wake
> RPCs after a timeout while the transport is disconnected, in order to
> kill them. At that point, deregistration still needs to succeed somehow.
>
> IMO there are three related problems.
>
> 1. rpcrdma_ep_connect() is allowing RPC tasks to be awoken while there
>    is no QP at all (->qp is NULL). The woken RPC tasks are trying to
>    deregister buffers that may include page cache pages, and it's
>    oopsing because ->qp is NULL.
>
>    That's a logic bug in rpcrdma_ep_connect(), and I have an idea how to
>    address it.
>
> 2. If a QP is present but disconnected, posting LOCAL_INV won't work.
>    That leaves buffers (and page cache pages, potentially) registered.
>    That could be addressed with LINV+FRMR. But...
>
> 3. The client should not leave page cache pages registered indefinitely.
>    Both LINV+FRMR and our current approach depend on having a working QP
>    _at_ _some_ _point_ ... but the client simply can't depend on that.
>    What happens if an NFS server is, say, destroyed by fire while there
>    are active client mount points? What if the HCA's firmware is
>    permanently not allowing QP creation?

Addition to the list:

4. If RDMA traffic is in progress and the network link goes down and comes
   back up after some time (t > 10 secs), rpcrdma_ep_connect() does not
   destroy the existing QP, because rpcrdma_create_id() fails
   (rdma_resolve_addr() fails). Now, every time the connect worker thread
   gets rescheduled, the CM fails with an establishment error. Finally,
   after multiple tries, the CM fails with rdma_cm_event = 15, the entire
   recovery thread sits silently forever, and the kernel reports that the
   user app has been blocked for more than 120 secs.

> Here's a relevant comment in rpcrdma_ep_connect():
>
>  815         /* TEMP TEMP TEMP - fail if new device:
>  816          *   Deregister/remarshal *all* requests!
>  817          *   Close and recreate adapter, pd, etc!
>  818          *   Re-determine all attributes still sane!
>  819          *   More stuff I haven't thought of!
>  820          *   Rrrgh!
>  821          */
>
> xprtrdma does not do this today.
>
> When a new device is created, all existing RPC requests could be
> deregistered and re-marshalled. As far as I can tell,
> rpcrdma_ep_connect() is executing in a synchronous context (the connect
> worker) and we can simply use dereg_mr, as long as later, when the RPCs
> are re-driven, they know they need to re-marshal.
>
> I'll try some things today.
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
On Apr 24, 2014, at 11:48 AM, Devesh Sharma wrote:

> Thanks Chuck for summarizing.
> One more issue is being added to the list below.
>
>> -----Original Message-----
>> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
>> ow...@vger.kernel.org] On Behalf Of Chuck Lever
>> Sent: Thursday, April 24, 2014 8:31 PM
>> To: Sagi Grimberg
>> Cc: Devesh Sharma; Linux NFS Mailing List; linux-rdma@vger.kernel.org;
>> Trond Myklebust
>> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
>>
>> On Apr 24, 2014, at 3:12 AM, Sagi Grimberg wrote:
>>
>>> On 4/24/2014 2:30 AM, Devesh Sharma wrote:
>>>> Hi Chuck
>>>>
>>>> Following is the complete call trace of a typical NFS-RDMA
>>>> transaction while mounting a share. It is unavoidable to stop calling
>>>> post-send in case it is not created. Therefore, applying checks to
>>>> the connection state is a must while registering/deregistering FRMRs
>>>> on-the-fly. The unconnected state of QP implies don't call
>>>> post_send/post_recv from any context.
>>>
>>> Long thread... didn't follow it all.
>>
>> I think you got the gist of it.
>>
>>> If I understand correctly this race comes only for *cleanup* (LINV) of
>>> FRMR registration while teardown flow destroyed the QP.
>>> I think this might disappear if for each registration you post
>>> LINV+FRMR. This is assuming that a situation where trying to post
>>> Fastreg on a "bad" QP can never happen (usually since teardown flow
>>> typically suspends outgoing commands).
>>
>> That's typically true for "hard" NFS mounts. But "soft" NFS mounts wake
>> RPCs after a timeout while the transport is disconnected, in order to
>> kill them. At that point, deregistration still needs to succeed
>> somehow.
>>
>> IMO there are three related problems.
>>
>> 1. rpcrdma_ep_connect() is allowing RPC tasks to be awoken while there
>>    is no QP at all (->qp is NULL). The woken RPC tasks are trying to
>>    deregister buffers that may include page cache pages, and it's
>>    oopsing because ->qp is NULL.
>>
>>    That's a logic bug in rpcrdma_ep_connect(), and I have an idea how
>>    to address it.
>>
>> 2. If a QP is present but disconnected, posting LOCAL_INV won't work.
>>    That leaves buffers (and page cache pages, potentially) registered.
>>    That could be addressed with LINV+FRMR. But...
>>
>> 3. The client should not leave page cache pages registered
>>    indefinitely. Both LINV+FRMR and our current approach depend on
>>    having a working QP _at_ _some_ _point_ ... but the client simply
>>    can't depend on that. What happens if an NFS server is, say,
>>    destroyed by fire while there are active client mount points? What
>>    if the HCA's firmware is permanently not allowing QP creation?
>
> Addition to the list:
>
> 4. If RDMA traffic is in progress and the network link goes down and
>    comes back up after some time (t > 10 secs), rpcrdma_ep_connect()
>    does not destroy the existing QP, because rpcrdma_create_id() fails
>    (rdma_resolve_addr() fails). Now, every time the connect worker
>    thread gets rescheduled, the CM fails with an establishment error.
>    Finally, after multiple tries, the CM fails with rdma_cm_event = 15,
>    the entire recovery thread sits silently forever, and the kernel
>    reports that the user app has been blocked for more than 120 secs.

I think I see that now. I should be able to address it with the fixes
for 1.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
[PATCH 3.15-rc3 1/3] iw_cxgb4: Fix endpoint mutex deadlocks
In cases where the CM calls c4iw_modify_rc_qp() with the endpoint mutex
held, it must be called with internal == 1. rx_data() and
process_mpa_reply() are not doing this. This causes a deadlock because
c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some !internal
cases, and c4iw_ep_disconnect() acquires the endpoint mutex. The design
was intended to only do the disconnect for !internal calls.

Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with
internal == 1, and then disconnect only after releasing the mutex.

Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with
internal == 1 and set a new attr flag telling it to send a TERMINATE
message. Previously this was implied by !internal.

Change process_mpa_reply() to return whether the caller should disconnect
after releasing the endpoint mutex. Now rx_data() will do the disconnect
in the cases where process_mpa_reply() wants to disconnect after the
TERMINATE is sent.

Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal, and
to send a TERMINATE message if attrs->send_term is 1.

Change abort_connection() to not acquire the ep mutex for setting the
state, and make all calls to abort_connection() do so with the mutex held.
Signed-off-by: Steve Wise
---
 drivers/infiniband/hw/cxgb4/cm.c       |   31 ---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |    1 +
 drivers/infiniband/hw/cxgb4/qp.c       |    9 +
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index 185452a..f9b04bc 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -996,7 +996,7 @@ static void close_complete_upcall(struct c4iw_ep *ep, int status)
 static int abort_connection(struct c4iw_ep *ep, struct sk_buff *skb, gfp_t gfp)
 {
 	PDBG("%s ep %p tid %u\n", __func__, ep, ep->hwtid);
-	state_set(&ep->com, ABORTING);
+	__state_set(&ep->com, ABORTING);
 	set_bit(ABORT_CONN, &ep->com.history);
 	return send_abort(ep, skb, gfp);
 }
@@ -1154,7 +1154,7 @@ static int update_rx_credits(struct c4iw_ep *ep, u32 credits)
 	return credits;
 }
 
-static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
+static int process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 {
 	struct mpa_message *mpa;
 	struct mpa_v2_conn_params *mpa_v2_params;
@@ -1164,6 +1164,7 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 	struct c4iw_qp_attributes attrs;
 	enum c4iw_qp_attr_mask mask;
 	int err;
+	int disconnect = 0;
 
 	PDBG("%s ep %p tid %u\n", __func__, ep, ep->hwtid);
 
@@ -1173,7 +1174,7 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 	 * will abort the connection.
 	 */
 	if (stop_ep_timer(ep))
-		return;
+		return 0;
 
 	/*
 	 * If we get more than the supported amount of private data
@@ -1195,7 +1196,7 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 	 * if we don't even have the mpa message, then bail.
 	 */
 	if (ep->mpa_pkt_len < sizeof(*mpa))
-		return;
+		return 0;
 	mpa = (struct mpa_message *) ep->mpa_pkt;
 
 	/* Validate MPA header. */
@@ -1235,7 +1236,7 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 	 * We'll continue process when more data arrives.
 	 */
 	if (ep->mpa_pkt_len < (sizeof(*mpa) + plen))
-		return;
+		return 0;
 
 	if (mpa->flags & MPA_REJECT) {
 		err = -ECONNREFUSED;
@@ -1337,9 +1338,11 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 			attrs.layer_etype = LAYER_MPA | DDP_LLP;
 			attrs.ecode = MPA_NOMATCH_RTR;
 			attrs.next_state = C4IW_QP_STATE_TERMINATE;
+			attrs.send_term = 1;
 			err = c4iw_modify_qp(ep->com.qp->rhp, ep->com.qp,
-					C4IW_QP_ATTR_NEXT_STATE, &attrs, 0);
+					C4IW_QP_ATTR_NEXT_STATE, &attrs, 1);
 			err = -ENOMEM;
+			disconnect = 1;
 			goto out;
 		}
@@ -1355,9 +1358,11 @@ static void process_mpa_reply(struct c4iw_ep *ep, struct sk_buff *skb)
 			attrs.layer_etype = LAYER_MPA | DDP_LLP;
 			attrs.ecode = MPA_INSUFF_IRD;
 			attrs.next_state = C4IW_QP_STATE_TERMINATE;
+			attrs.send_term = 1;
 			err = c4iw_modify_qp(ep->com.qp->rhp, ep->com.qp,
-					C4IW_QP_ATTR_NEXT_STATE, &attrs, 0);
+					C4IW_QP_ATTR_NEXT_STATE, &attrs, 1);
 			err = -ENOMEM;
+			disconnect = 1;
 			goto out;
 		}
 		goto out;
@@ -1366,7 +1371,7 @@ err:
 	send_abort(ep, sk
[PATCH 3.15-rc3 2/3] iw_cxgb4: force T5 connections to use TAHOE cong control
This is required to work around a T5 HW issue.

Signed-off-by: Steve Wise
---
 drivers/infiniband/hw/cxgb4/cm.c          |    8 ++++++++
 drivers/infiniband/hw/cxgb4/t4fw_ri_api.h |   14 ++++++++++++++
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index f9b04bc..1f863a9 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -587,6 +587,10 @@ static int send_connect(struct c4iw_ep *ep)
 		opt2 |= SACK_EN(1);
 	if (wscale && enable_tcp_window_scaling)
 		opt2 |= WND_SCALE_EN(1);
+	if (is_t5(ep->com.dev->rdev.lldi.adapter_type)) {
+		opt2 |= T5_OPT_2_VALID;
+		opt2 |= V_CONG_CNTRL(CONG_ALG_TAHOE);
+	}
 	t4_set_arp_err_handler(skb, NULL, act_open_req_arp_failure);
 
 	if (is_t4(ep->com.dev->rdev.lldi.adapter_type)) {
@@ -2018,6 +2022,10 @@ static void accept_cr(struct c4iw_ep *ep, struct sk_buff *skb,
 		if (tcph->ece && tcph->cwr)
 			opt2 |= CCTRL_ECN(1);
 	}
+	if (is_t5(ep->com.dev->rdev.lldi.adapter_type)) {
+		opt2 |= T5_OPT_2_VALID;
+		opt2 |= V_CONG_CNTRL(CONG_ALG_TAHOE);
+	}
 
 	rpl = cplhdr(skb);
 	INIT_TP_WR(rpl, ep->hwtid);
diff --git a/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h b/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
index dc193c2..6121ca0 100644
--- a/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
+++ b/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
@@ -836,4 +836,18 @@ struct ulptx_idata {
 #define V_RX_DACK_CHANGE(x) ((x) << S_RX_DACK_CHANGE)
 #define F_RX_DACK_CHANGE    V_RX_DACK_CHANGE(1U)
 
+enum {			/* TCP congestion control algorithms */
+	CONG_ALG_RENO,
+	CONG_ALG_TAHOE,
+	CONG_ALG_NEWRENO,
+	CONG_ALG_HIGHSPEED
+};
+
+#define S_CONG_CNTRL	14
+#define M_CONG_CNTRL	0x3
+#define V_CONG_CNTRL(x)	((x) << S_CONG_CNTRL)
+#define G_CONG_CNTRL(x)	(((x) >> S_CONG_CNTRL) & M_CONG_CNTRL)
+
+#define T5_OPT_2_VALID	(1 << 31)
+
 #endif /* _T4FW_RI_API_H_ */
[PATCH 3.15-rc3 3/3] iw_cxgb4: only allow kernel db ringing for T4 devs
The whole db drop avoidance stuff is for T4 only. So we cannot allow that
to be enabled for T5 devices.

Signed-off-by: Steve Wise
---
 drivers/infiniband/hw/cxgb4/qp.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index f18ef34..086f62f 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -1777,11 +1777,15 @@ int c4iw_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	/*
 	 * Use SQ_PSN and RQ_PSN to pass in IDX_INC values for
 	 * ringing the queue db when we're in DB_FULL mode.
+	 * Only allow this on T4 devices.
 	 */
 	attrs.sq_db_inc = attr->sq_psn;
 	attrs.rq_db_inc = attr->rq_psn;
 	mask |= (attr_mask & IB_QP_SQ_PSN) ? C4IW_QP_ATTR_SQ_DB : 0;
 	mask |= (attr_mask & IB_QP_RQ_PSN) ? C4IW_QP_ATTR_RQ_DB : 0;
+	if (is_t5(to_c4iw_qp(ibqp)->rhp->rdev.lldi.adapter_type) &&
+	    (mask & (C4IW_QP_ATTR_SQ_DB|C4IW_QP_ATTR_RQ_DB)))
+		return -EINVAL;
 
 	return c4iw_modify_qp(rhp, qhp, mask, &attrs, 0);
 }
Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
On Thu, 24 Apr 2014, Oleg Nesterov wrote:

> Hi Hugh,
>
> Sorry for late reply. First of all, to avoid the confusion, I think the
> patch is fine.
>
> When I saw this patch I decided that uprobes should be updated
> accordingly, but I just realized that I do not understand what I should
> write in the changelog.

Thanks a lot for considering similar issues in uprobes, Oleg: I merely
checked that its uses of get_user_pages() would not be problematic, and
didn't look around to rediscover the worrying mm business that goes on
down there in kernel/events.

> On 04/04, Hugh Dickins wrote:
> >
> > +	if (gup_flags & FOLL_WRITE) {
> > +		if (!(vm_flags & VM_WRITE)) {
> > +			if (!(gup_flags & FOLL_FORCE))
> > +				goto efault;
> > +			/*
> > +			 * We used to let the write,force case do COW
> > +			 * in a VM_MAYWRITE VM_SHARED !VM_WRITE vma, so
> > +			 * ptrace could set a breakpoint in a read-only
> > +			 * mapping of an executable, without corrupting
> > +			 * the file (yet only when that file had been
> > +			 * opened for writing!). Anon pages in shared
> > +			 * mappings are surprising: now just reject it.
> > +			 */
> > +			if (!is_cow_mapping(vm_flags)) {
> > +				WARN_ON_ONCE(vm_flags & VM_MAYWRITE);
> > +				goto efault;
> > +			}
>
> OK. But could you please clarify "Anon pages in shared mappings are
> surprising"?
> I mean, does this only apply to the "VM_MAYWRITE VM_SHARED !VM_WRITE vma"
> mentioned above, or is this bad even if a !FMODE_WRITE file was mmapped
> as MAP_SHARED?

Good question. I simply didn't consider that - and (as you have
realized) didn't need to consider it, because I was just stopping the
problematic behaviour in gup(), and didn't need to consider whether
other behaviour prohibited by gup() was actually unproblematic.

> Yes, in this case this vma is not VM_SHARED and it is not VM_MAYWRITE,
> it is only VM_MAYSHARE. This is in fact a private mapping, except that
> mprotect(PROT_WRITE) will not work.
> But with or without this patch gup(FOLL_WRITE | FOLL_FORCE) won't work
> in this case,

"this" meaning my patch rather than yours below (although perhaps it
could?),

> is_cow_mapping() == F because of !VM_MAYWRITE.
>
> However, currently uprobes assumes that a cowed anon page is fine in
> this case, and this differs from gup().
>
> So, what do you think about the patch below? It is probably fine in any
> case, but is there any "strong" reason to follow the gup's behaviour and
> forbid the anon page in a VM_MAYSHARE && !VM_MAYWRITE vma?

I don't think there is a "strong" reason to forbid it. The strongest
reason is simply that it's much safer if uprobes follows the same
conventions as mm, and get_user_pages() happens to have forbidden that
all along.

The philosophical reason to forbid it is that the user mmapped with
MAP_SHARED, and it's merely a kernel-internal detail that we flip off
VM_SHARED and treat these read-only shared mappings very much like
private mappings. The user asked for MAP_SHARED, and we prefer to
respect that by not letting private COWs creep in.

We could treat those mappings even more like private mappings, and allow
the COWs; but better to be strict about it, so long as doing so doesn't
give you regressions.

> Oleg.
>
> --- x/kernel/events/uprobes.c
> +++ x/kernel/events/uprobes.c
> @@ -127,12 +127,13 @@ struct xol_area {
>   */
>  static bool valid_vma(struct vm_area_struct *vma, bool is_register)
>  {
> -	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;
> +	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC;

I think a one-line patch changing VM_SHARED to VM_MAYSHARE would do it,
wouldn't it? And save you from having to export is_cow_mapping() from
mm/memory.c. (I used is_cow_mapping() because I had to make the test
more complex anyway, just to exclude the case which had been oddly
handled before.)
Hugh

>
>  	if (is_register)
>  		flags |= VM_WRITE;
>
> -	return vma->vm_file && (vma->vm_flags & flags) == VM_MAYEXEC;
> +	return vma->vm_file && is_cow_mapping(vma->vm_flags) &&
> +		(vma->vm_flags & flags) == VM_MAYEXEC;
>  }
>
>  static unsigned long offset_to_vaddr(struct vm_area_struct *vma, loff_t offset)
[PATCH opensm] libvendor/osm_vendor_ibumad.c: Support GRH (for GS classes)
If a GRH is present in an incoming GS class management packet, convert the umad GRH information to OpenSM GRH information. On the outgoing side, convert OpenSM GRH information into umad GRH information.

Note that only base port 0 (GID index 0) is supported. This is mainly for SA although other GS classes could use it (but don't). Note also that SA reports with GRH are not handled by this patch.

Signed-off-by: Hal Rosenstock
---
diff --git a/libvendor/osm_vendor_ibumad.c b/libvendor/osm_vendor_ibumad.c
index f9d3036..e9651f6 100644
--- a/libvendor/osm_vendor_ibumad.c
+++ b/libvendor/osm_vendor_ibumad.c
@@ -261,9 +261,20 @@ ib_mad_addr_conv(ib_user_mad_t * umad, osm_mad_addr_t * osm_mad_addr,
 	osm_mad_addr->addr_type.gsi.remote_qkey = ib_mad_addr->qkey;
 	osm_mad_addr->addr_type.gsi.pkey_ix = umad_get_pkey(umad);
 	osm_mad_addr->addr_type.gsi.service_level = ib_mad_addr->sl;
-	osm_mad_addr->addr_type.gsi.global_route = 0;	/* FIXME: handle GRH */
-	memset(&osm_mad_addr->addr_type.gsi.grh_info, 0,
-	       sizeof osm_mad_addr->addr_type.gsi.grh_info);
+	if (ib_mad_addr->grh_present) {
+		osm_mad_addr->addr_type.gsi.global_route = 1;
+		osm_mad_addr->addr_type.gsi.grh_info.hop_limit = ib_mad_addr->hop_limit;
+		osm_mad_addr->addr_type.gsi.grh_info.ver_class_flow =
+		    ib_grh_set_ver_class_flow(6,	/* GRH version */
+					      ib_mad_addr->traffic_class,
+					      ib_mad_addr->flow_label);
+		memcpy(&osm_mad_addr->addr_type.gsi.grh_info.dest_gid,
+		       &ib_mad_addr->gid, 16);
+	} else {
+		osm_mad_addr->addr_type.gsi.global_route = 0;
+		memset(&osm_mad_addr->addr_type.gsi.grh_info, 0,
+		       sizeof osm_mad_addr->addr_type.gsi.grh_info);
+	}
 }

 static void *swap_mad_bufs(osm_madw_t * p_madw, void *umad)
@@ -290,6 +301,7 @@ static void *umad_receiver(void *p_ptr)
 	osm_mad_addr_t osm_addr;
 	osm_madw_t *p_madw, *p_req_madw;
 	ib_mad_t *p_mad, *p_req_mad;
+	ib_mad_addr_t *p_mad_addr;
 	void *umad = 0;
 	int mad_agent, length;

@@ -342,6 +354,14 @@ static void *umad_receiver(void *p_ptr)
 		}

 		p_mad = (ib_mad_t *) umad_get_mad(umad);
+		p_mad_addr = umad_get_mad_addr(umad);
+		/* Only support GID index 0 currently */
+		if (p_mad_addr->grh_present && p_mad_addr->gid_index) {
+			OSM_LOG(p_ur->p_log, OSM_LOG_ERROR, "ERR 5409: "
+				"GRH received on GID index %d for mgmt class 0x%x\n",
+				p_mad_addr->gid_index, p_mad->mgmt_class);
+			continue;
+		}

 		ib_mad_addr_conv(umad, &osm_addr,
 				 p_mad->mgmt_class == IB_MCLASS_SUBN_LID ||
@@ -1070,6 +1090,7 @@ osm_vendor_send(IN osm_bind_handle_t h_bind,
 	osm_mad_addr_t *const p_mad_addr = osm_madw_get_mad_addr_ptr(p_madw);
 	ib_mad_t *const p_mad = osm_madw_get_mad_ptr(p_madw);
 	ib_sa_mad_t *const p_sa = (ib_sa_mad_t *) p_mad;
+	ib_mad_addr_t mad_addr;
 	int ret = -1;
 	int __attribute__((__unused__)) is_rmpp = 0;
 	uint32_t sent_mad_size;
@@ -1098,7 +1119,17 @@ osm_vendor_send(IN osm_bind_handle_t h_bind,
 			 p_mad_addr->addr_type.gsi.remote_qp,
 			 p_mad_addr->addr_type.gsi.service_level,
 			 IB_QP1_WELL_KNOWN_Q_KEY);
-	umad_set_grh(p_vw->umad, NULL);	/* FIXME: GRH support */
+	if (p_mad_addr->addr_type.gsi.global_route) {
+		mad_addr.grh_present = 1;
+		mad_addr.gid_index = 0;
+		mad_addr.hop_limit = p_mad_addr->addr_type.gsi.grh_info.hop_limit;
+		ib_grh_get_ver_class_flow(p_mad_addr->addr_type.gsi.grh_info.ver_class_flow,
+					  NULL, &mad_addr.traffic_class,
+					  &mad_addr.flow_label);
+		memcpy(&mad_addr.gid, &p_mad_addr->addr_type.gsi.grh_info.dest_gid, 16);
+		umad_set_grh(p_vw->umad, &mad_addr);
+	} else
+		umad_set_grh(p_vw->umad, NULL);
 	umad_set_pkey(p_vw->umad, p_mad_addr->addr_type.gsi.pkey_ix);
 	if (ib_class_is_rmpp(p_mad->mgmt_class)) {	/* RMPP GS classes FIXME: no GRH */
 		if (!ib_rmpp_is_flag_set((ib_rmpp_mad_t *) p_sa,
RE: IB/cma: Make timeout dependent on the subnet timeout
> Sean, can't we have CMA to follow the same practice used in the CM where
> we derive the RC QP timeout based on the packet life time retrieved in
> path queries? e.g base the cm response time out on this value too?

We could. The timeout that's being modified by the patch is the time
needed by the remote peer to process the incoming message and send a
response. This time is in addition to the packet life time value that
gets used.

For a remote kernel agent, the time needed to respond to a CM message may
be fairly small. For a user space client, the time may be significant, on
the order of seconds to minutes. We can probably make do with a fairly
short timeout, provided that MRAs are used by the remote side.

There's no great solution that I can think of. Maybe the RDMA CM can
adjust the timeout based on the remote address, assuming that it can
determine if the remote address is a user space or kernel agent.

- Sean
Re: isert for mellanox drivers
On Thu, Apr 24, 2014 at 12:54:40PM +0300, sagi grimberg wrote:
> Well, I feel the same way (although less harsh about it), I would
> prefer to have it all inbox.
> As I see it, OFED is useful for customers who want to upgrade RDMA
> functionality (or get Tech previews) without upgrading their distro or
> waiting for it to land upstream.

For that we have the compat drivers project, which could easily handle
the rdma drivers as well.

The problem with OFED is (or was last time I looked) that it's a big
pile that includes backports, and new features not submitted or even
rejected upstream.
Re: isert for mellanox drivers
On Thu, Apr 24, 2014 at 10:38:25PM -0700, Christoph Hellwig wrote:
> On Thu, Apr 24, 2014 at 12:54:40PM +0300, sagi grimberg wrote:
> > Well, I feel the same way (although less harsh about it), I would
> > prefer to have it all inbox.
> > As I see it, OFED is useful for customers who want to upgrade RDMA
> > functionality (or get Tech previews) without upgrading their distro
> > or waiting for it to land upstream.
>
> For that we have the compat drivers project, which could easily handle
> the rdma drivers as well.
>
> The problem with OFED is (or was last time I looked) that it's a big
> pile that includes backports, and new features not submitted or even
> rejected upstream.

Official OFA OFED is now strictly backports from a given kernel version,
and TBH, is not widely used now that everything is included in the
modern distros.

The vendor 'OFEDs' remain a big pile. I'm not even sure source is
provided for them.. At least it isn't readily apparent.

IMHO, the vendors should not be co-opting the OFED branding, but that is
a whole other topic.

Jason
Re: isert for mellanox drivers
On Thu, Apr 24, 2014 at 11:55:08PM -0600, Jason Gunthorpe wrote:
> Official OFA OFED is now strictly backports from a given kernel
> version, and TBH, is not widely used now that everything is included
> in the modern distros.
>
> The vendor 'OFEDs' remain a big pile. I'm not even sure source is
> provided for them.. At least it isn't readily apparent.
>
> IMHO, the vendors should not be co-opting the OFED branding, but that
> is a whole other topic

Thanks for the clarification Jason! I'll take back my rant and will
apply it to the vendors instead :)