Re: [PATCHv2 TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
On 1/4/2016 10:44 PM, Hal Rosenstock wrote: ib_mad_snoop_handler uses send_buf rather than send_wr Signed-off-by: Hal Rosenstock Please use higher language in commit titles e.g IB/core: Documentation fix in the MAD header file --- Change since v1: Fixed typo in patch description diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index ec9b44d..2b3573d 100644 --- a/include/rdma/ib_mad.h +++ b/include/rdma/ib_mad.h @@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent, /** * ib_mad_snoop_handler - Callback handler for snooping sent MADs. * @mad_agent: MAD agent that snooped the MAD. - * @send_wr: Work request information on the sent MAD. + * @send_buf: send MAD data buffer. * @mad_send_wc: Work completion information on the sent MAD. Valid * only for snooping that occurs on a send completion. * - * Clients snooping MADs should not modify data referenced by the @send_wr + * Clients snooping MADs should not modify data referenced by the @send_buf * or @mad_send_wc. */ typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent, -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id
On 1/3/2016 10:44 PM, ira.we...@intel.com wrote: > From: Ira Weiny > > Attributed ID was declared as an int while the value should really be big > endian 16. > > Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c") > > Reported-by: Bart Van Assche > Signed-off-by: Ira Weiny Reviewed-by: Hal Rosenstock
Re: [PATCHv2 TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
On Mon, Jan 04, 2016 at 03:44:15PM -0500, Hal Rosenstock wrote: > ib_mad_snoop_handler uses send_buf rather than send_wr > > Signed-off-by: Hal Rosenstock Reviewed-by: Ira Weiny > --- > Change since v1: > Fixed typo in patch description > > diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h > index ec9b44d..2b3573d 100644 > --- a/include/rdma/ib_mad.h > +++ b/include/rdma/ib_mad.h > @@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent > *mad_agent, > /** > * ib_mad_snoop_handler - Callback handler for snooping sent MADs. > * @mad_agent: MAD agent that snooped the MAD. > - * @send_wr: Work request information on the sent MAD. > + * @send_buf: send MAD data buffer. > * @mad_send_wc: Work completion information on the sent MAD. Valid > * only for snooping that occurs on a send completion. > * > - * Clients snooping MADs should not modify data referenced by the @send_wr > + * Clients snooping MADs should not modify data referenced by the @send_buf > * or @mad_send_wc. > */ > typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent,
Re: [PATCH 2/2] IB/mad: use CQ abstraction
On Mon, Jan 04, 2016 at 02:15:59PM +0100, Christoph Hellwig wrote: > Remove the local workqueue to process mad completions and use the CQ API > instead. > > Signed-off-by: Christoph Hellwig One minor nit below... > --- > drivers/infiniband/core/mad.c | 159 > + > drivers/infiniband/core/mad_priv.h | 2 +- > 2 files changed, 58 insertions(+), 103 deletions(-) > > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c > index cbe232a..286d1a9 100644 > --- a/drivers/infiniband/core/mad.c > +++ b/drivers/infiniband/core/mad.c > @@ -61,18 +61,6 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in > number of work requests > module_param_named(recv_queue_size, mad_recvq_size, int, 0444); > MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work > requests"); > > -/* > - * Define a limit on the number of completions which will be processed by the > - * worker thread in a single work item. This ensures that other work items > - * (potentially from other users) are processed fairly. > - * > - * The number of completions was derived from the default queue sizes above. > - * We use a value which is double the larger of the 2 queues (receive @ 512) > - * but keep it fixed such that an increase in that value does not introduce > - * unfairness. 
> - */ > -#define MAD_COMPLETION_PROC_LIMIT 1024 > - > static struct list_head ib_mad_port_list; > static u32 ib_mad_client_id = 0; > > @@ -96,6 +84,9 @@ static int add_nonoui_reg_req(struct ib_mad_reg_req > *mad_reg_req, > u8 mgmt_class); > static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req, > struct ib_mad_agent_private *agent_priv); > +static bool ib_mad_send_error(struct ib_mad_port_private *port_priv, > + struct ib_wc *wc); > +static void ib_mad_send_done(struct ib_cq *cq, struct ib_wc *wc); > > /* > * Returns a ib_mad_port_private structure or NULL for a device/port > @@ -702,11 +693,11 @@ static void snoop_recv(struct ib_mad_qp_info *qp_info, > } > > static void build_smp_wc(struct ib_qp *qp, > - u64 wr_id, u16 slid, u16 pkey_index, u8 port_num, > + void *wr_cqe, u16 slid, u16 pkey_index, u8 port_num, Sorry I did not catch this before but rather than void * wouldn't it be better to use struct ib_cqe? Regardless: Reviewed-by: Ira Weiny >struct ib_wc *wc) > { > memset(wc, 0, sizeof *wc); > - wc->wr_id = wr_id; > + wc->wr_cqe = wr_cqe; > wc->status = IB_WC_SUCCESS; > wc->opcode = IB_WC_RECV; > wc->pkey_index = pkey_index; > @@ -844,7 +835,7 @@ static int handle_outgoing_dr_smp(struct > ib_mad_agent_private *mad_agent_priv, > } > > build_smp_wc(mad_agent_priv->agent.qp, > - send_wr->wr.wr_id, drslid, > + send_wr->wr.wr_cqe, drslid, >send_wr->pkey_index, >send_wr->port_num, &mad_wc); > > @@ -1051,7 +1042,9 @@ struct ib_mad_send_buf * ib_create_send_mad(struct > ib_mad_agent *mad_agent, > > mad_send_wr->sg_list[1].lkey = mad_agent->qp->pd->local_dma_lkey; > > - mad_send_wr->send_wr.wr.wr_id = (unsigned long) mad_send_wr; > + mad_send_wr->mad_list.cqe.done = ib_mad_send_done; > + > + mad_send_wr->send_wr.wr.wr_cqe = &mad_send_wr->mad_list.cqe; > mad_send_wr->send_wr.wr.sg_list = mad_send_wr->sg_list; > mad_send_wr->send_wr.wr.num_sge = 2; > mad_send_wr->send_wr.wr.opcode = IB_WR_SEND; > @@ -1163,8 +1156,9 @@ int ib_send_mad(struct ib_mad_send_wr_private 
> *mad_send_wr) > > /* Set WR ID to find mad_send_wr upon completion */ > qp_info = mad_send_wr->mad_agent_priv->qp_info; > - mad_send_wr->send_wr.wr.wr_id = (unsigned long)&mad_send_wr->mad_list; > mad_send_wr->mad_list.mad_queue = &qp_info->send_queue; > + mad_send_wr->mad_list.cqe.done = ib_mad_send_done; > + mad_send_wr->send_wr.wr.wr_cqe = &mad_send_wr->mad_list.cqe; > > mad_agent = mad_send_wr->send_buf.mad_agent; > sge = mad_send_wr->sg_list; > @@ -2185,13 +2179,14 @@ handle_smi(struct ib_mad_port_private *port_priv, > return handle_ib_smi(port_priv, qp_info, wc, port_num, recv, response); > } > > -static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv, > - struct ib_wc *wc) > +static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc) > { > + struct ib_mad_port_private *port_priv = cq->cq_context; > + struct ib_mad_list_head *mad_list = > + container_of(wc->wr_cqe, struct ib_mad_list_head, cqe); > struct ib_mad_qp_info *qp_info; > struct ib_mad_private_header *mad_priv_hdr; > struct ib_mad_private *recv, *response = NULL; > - struct ib_mad_list_head *mad_list; > struct ib_mad_agent_private *mad_agent; > int port_num; >
Re: [PATCH 1/2] IB/mad: pass ib_mad_send_buf explicitly to the recv_handler
On Mon, Jan 04, 2016 at 02:15:58PM +0100, Christoph Hellwig wrote: > Stop abusing wr_id and just pass the parameter explicitly. > > Signed-off-by: Christoph Hellwig Reviewed-by: Ira Weiny > --- > drivers/infiniband/core/cm.c | 1 + > drivers/infiniband/core/mad.c | 18 ++ > drivers/infiniband/core/sa_query.c| 7 --- > drivers/infiniband/core/user_mad.c| 1 + > drivers/infiniband/ulp/srpt/ib_srpt.c | 1 + > include/rdma/ib_mad.h | 2 ++ > 6 files changed, 19 insertions(+), 11 deletions(-) > > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c > index e3a95d1..ad3726d 100644 > --- a/drivers/infiniband/core/cm.c > +++ b/drivers/infiniband/core/cm.c > @@ -3503,6 +3503,7 @@ int ib_cm_notify(struct ib_cm_id *cm_id, enum > ib_event_type event) > EXPORT_SYMBOL(ib_cm_notify); > > static void cm_recv_handler(struct ib_mad_agent *mad_agent, > + struct ib_mad_send_buf *send_buf, > struct ib_mad_recv_wc *mad_recv_wc) > { > struct cm_port *port = mad_agent->context; > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c > index d4d2a61..cbe232a 100644 > --- a/drivers/infiniband/core/mad.c > +++ b/drivers/infiniband/core/mad.c > @@ -693,7 +693,7 @@ static void snoop_recv(struct ib_mad_qp_info *qp_info, > > atomic_inc(&mad_snoop_priv->refcount); > spin_unlock_irqrestore(&qp_info->snoop_lock, flags); > - mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, > + mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, NULL, > mad_recv_wc); > deref_snoop_agent(mad_snoop_priv); > spin_lock_irqsave(&qp_info->snoop_lock, flags); > @@ -1994,9 +1994,9 @@ static void ib_mad_complete_recv(struct > ib_mad_agent_private *mad_agent_priv, > /* user rmpp is in effect >* and this is an active RMPP MAD >*/ > - mad_recv_wc->wc->wr_id = 0; > - > mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, > -mad_recv_wc); > + mad_agent_priv->agent.recv_handler( > + &mad_agent_priv->agent, NULL, > + mad_recv_wc); > atomic_dec(&mad_agent_priv->refcount); > } 
else { > /* not user rmpp, revert to normal behavior and > @@ -2010,9 +2010,10 @@ static void ib_mad_complete_recv(struct > ib_mad_agent_private *mad_agent_priv, > spin_unlock_irqrestore(&mad_agent_priv->lock, flags); > > /* Defined behavior is to complete response before > request */ > - mad_recv_wc->wc->wr_id = (unsigned long) > &mad_send_wr->send_buf; > - > mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, > -mad_recv_wc); > + mad_agent_priv->agent.recv_handler( > + &mad_agent_priv->agent, > + &mad_send_wr->send_buf, > + mad_recv_wc); > atomic_dec(&mad_agent_priv->refcount); > > mad_send_wc.status = IB_WC_SUCCESS; > @@ -2021,7 +2022,7 @@ static void ib_mad_complete_recv(struct > ib_mad_agent_private *mad_agent_priv, > ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc); > } > } else { > - mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, > + mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, NULL, > mad_recv_wc); > deref_mad_agent(mad_agent_priv); > } > @@ -2762,6 +2763,7 @@ static void local_completions(struct work_struct *work) > IB_MAD_SNOOP_RECVS); > recv_mad_agent->agent.recv_handler( > &recv_mad_agent->agent, > + &local->mad_send_wr->send_buf, > > &local->mad_priv->header.recv_wc); > spin_lock_irqsave(&recv_mad_agent->lock, flags); > atomic_dec(&recv_mad_agent->refcount); > diff --git a/drivers/infiniband/core/sa_query.c > b/drivers/infiniband/core/sa_query.c > index e364a42..1f91b6e 100644 > --- a/drivers/infiniband/core/sa_query.c > +++ b/drivers/infiniband/core/sa_query.c > @@ -1669,14 +1669,15 @@ static void send_handler(struct ib_mad_agent *agent,
re: iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
Hello Nicholas Bellinger, The patch b8d26b3be8b3: "iser-target: Add iSCSI Extensions for RDMA (iSER) target driver" from Mar 7, 2013, leads to the following static checker warning: drivers/infiniband/ulp/isert/ib_isert.c:423 isert_device_get() error: passing non negative 1 to ERR_PTR drivers/infiniband/ulp/isert/ib_isert.c 417 418 device->ib_device = cma_id->device; 419 ret = isert_create_device_ib_res(device); 420 if (ret) { 421 kfree(device); 422 mutex_unlock(&device_list_mutex); 423 return ERR_PTR(ret); The warning here is because isert_create_device_ib_res() returns either a negative error code, zero or one. The documentation is not clear what that means. AHAHAHAHAHAHAHAH. I joke. There is no documentation. Anyway, it's definitely a bug and it leads to a NULL dereference in the caller. 424 } 425 426 device->refcount++; 427 list_add_tail(&device->dev_node, &device_list); 428 isert_info("Created a new iser device %p refcount %d\n", 429 device, device->refcount); 430 mutex_unlock(&device_list_mutex); 431 432 return device; 433 } regards, dan carpenter
[PATCHv2 TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
ib_mad_snoop_handler uses send_buf rather than send_wr Signed-off-by: Hal Rosenstock --- Change since v1: Fixed typo in patch description diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index ec9b44d..2b3573d 100644 --- a/include/rdma/ib_mad.h +++ b/include/rdma/ib_mad.h @@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent, /** * ib_mad_snoop_handler - Callback handler for snooping sent MADs. * @mad_agent: MAD agent that snooped the MAD. - * @send_wr: Work request information on the sent MAD. + * @send_buf: send MAD data buffer. * @mad_send_wc: Work completion information on the sent MAD. Valid * only for snooping that occurs on a send completion. * - * Clients snooping MADs should not modify data referenced by the @send_wr + * Clients snooping MADs should not modify data referenced by the @send_buf * or @mad_send_wc. */ typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent,
Re: [PATCH v2] staging/rdma/hfi1: check for ARMED->ACTIVE transition in receive interrupt
On Mon, Jan 04, 2016 at 11:21:19AM -0500, Jubin John wrote: > From: Jim Snow > > } else { > + /* Auto activate link on non-SC15 packet receive */ > + if (unlikely(rcd->ppd->host_link_state == > + HLS_UP_ARMED)) > + if (set_armed_to_active(rcd, packet, dd)) > + goto bail; What is the advantage of double "if" over one "if"? Something like that + if (unlikely(rcd->ppd->host_link_state == HLS_UP_ARMED) && (set_armed_to_active(rcd, packet, dd)) + goto bail; > last = process_rcv_packet(&packet, thread); > } > > @@ -984,6 +1020,42 @@ bail: > } >
Re: [PATCH TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
On Mon, Jan 04, 2016 at 11:04:53AM -0500, Hal Rosenstock wrote: > ib_mad_snoop_handler ues send_buf rather than send_wr ues --> uses
Re: start moving user space visible constants to uapi headers
On 12/24/2015 8:39 AM, Christoph Hellwig wrote: Currently very little of the uverbs user interface is actually exposed in uapi headers, and it's a constant struggle to figure out what's kernel internal and what is actually exposed in public. This series starts sorting this out by creating the infrastructure for a uapi header shared between uverbs and the core IB stack, and starts moving all WR and WC constants as well as the device capability flags there. A lot more work will have to follow, and I hope others will help out as well. Series looks ok to me. Reviewed-by: Steve Wise
Re: [PATCH] svc_rdma: use local_dma_lkey
On 12/22/2015 7:11 AM, Christoph Hellwig wrote: We now always have a per-PD local_dma_lkey available. Make use of that fact in svc_rdma and stop registering our own MR. Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Reviewed-by: Jason Gunthorpe Reviewed-by: Chuck Lever --- include/linux/sunrpc/svc_rdma.h| 2 -- net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 2 +- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c| 4 ++-- net/sunrpc/xprtrdma/svc_rdma_sendto.c | 6 ++--- net/sunrpc/xprtrdma/svc_rdma_transport.c | 36 -- 5 files changed, 10 insertions(+), 40 deletions(-) Reviewed-by: Steve Wise
[PATCH v2] staging/rdma/hfi1: check for ARMED->ACTIVE transition in receive interrupt
From: Jim Snow The link state will transition from ARMED to ACTIVE when a non-SC15 packet arrives, but the driver might not notice the change. With this fix, if the slowpath receive interrupt handler sees a non-SC15 packet while in the ARMED state, we queue work to call linkstate_active_work from process context to promote it to ACTIVE. Reviewed-by: Dean Luick Reviewed-by: Ira Weiny Reviewed-by: Mike Marciniszyn Signed-off-by: Jim Snow Signed-off-by: Brendan Cunningham Signed-off-by: Jubin John --- Changes in v2: - Fixed whitespace - Converted armed->active transition to inline function - Added comment to document reason for skipping HFI1_CTRL_CTXT in set_all_slowpath() drivers/staging/rdma/hfi1/chip.c | 5 +-- drivers/staging/rdma/hfi1/chip.h | 2 ++ drivers/staging/rdma/hfi1/driver.c | 72 ++ drivers/staging/rdma/hfi1/hfi.h| 11 ++ drivers/staging/rdma/hfi1/init.c | 1 + 5 files changed, 89 insertions(+), 2 deletions(-) diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c index f7bf902..63d5d71 100644 --- a/drivers/staging/rdma/hfi1/chip.c +++ b/drivers/staging/rdma/hfi1/chip.c @@ -7878,7 +7878,7 @@ static inline void clear_recv_intr(struct hfi1_ctxtdata *rcd) } /* force the receive interrupt */ -static inline void force_recv_intr(struct hfi1_ctxtdata *rcd) +void force_recv_intr(struct hfi1_ctxtdata *rcd) { write_csr(rcd->dd, CCE_INT_FORCE + (8 * rcd->ireg), rcd->imask); } @@ -7977,7 +7977,7 @@ u32 read_physical_state(struct hfi1_devdata *dd) & DC_DC8051_STS_CUR_STATE_PORT_MASK; } -static u32 read_logical_state(struct hfi1_devdata *dd) +u32 read_logical_state(struct hfi1_devdata *dd) { u64 reg; @@ -9952,6 +9952,7 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state) ppd->link_enabled = 1; } + set_all_slowpath(ppd->dd); ret = set_local_link_attributes(ppd); if (ret) break; diff --git a/drivers/staging/rdma/hfi1/chip.h b/drivers/staging/rdma/hfi1/chip.h index b46ef66..78ba425 100644 --- a/drivers/staging/rdma/hfi1/chip.h +++ 
b/drivers/staging/rdma/hfi1/chip.h @@ -690,6 +690,8 @@ u64 read_dev_cntr(struct hfi1_devdata *dd, int index, int vl); u64 write_dev_cntr(struct hfi1_devdata *dd, int index, int vl, u64 data); u64 read_port_cntr(struct hfi1_pportdata *ppd, int index, int vl); u64 write_port_cntr(struct hfi1_pportdata *ppd, int index, int vl, u64 data); +u32 read_logical_state(struct hfi1_devdata *dd); +void force_recv_intr(struct hfi1_ctxtdata *rcd); /* Per VL indexes */ enum { diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c index 3218520..dd8b2c5 100644 --- a/drivers/staging/rdma/hfi1/driver.c +++ b/drivers/staging/rdma/hfi1/driver.c @@ -862,6 +862,37 @@ static inline void set_all_dma_rtail(struct hfi1_devdata *dd) &handle_receive_interrupt_dma_rtail; } +void set_all_slowpath(struct hfi1_devdata *dd) +{ + int i; + + /* HFI1_CTRL_CTXT must always use the slow path interrupt handler */ + for (i = HFI1_CTRL_CTXT + 1; i < dd->first_user_ctxt; i++) + dd->rcd[i]->do_interrupt = &handle_receive_interrupt; +} + +static inline int set_armed_to_active(struct hfi1_ctxtdata *rcd, + struct hfi1_packet packet, + struct hfi1_devdata *dd) +{ + struct work_struct *lsaw = &rcd->ppd->linkstate_active_work; + struct hfi1_message_header *hdr = hfi1_get_msgheader(packet.rcd->dd, +packet.rhf_addr); + + if (hdr2sc(hdr, packet.rhf) != 0xf) { + int hwstate = read_logical_state(dd); + + if (hwstate != LSTATE_ACTIVE) { + dd_dev_info(dd, "Unexpected link state %d\n", hwstate); + return 0; + } + + queue_work(rcd->ppd->hfi1_wq, lsaw); + return 1; + } + return 0; +} + /* * handle_receive_interrupt - receive a packet * @rcd: the context @@ -929,6 +960,11 @@ int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread) last = skip_rcv_packet(&packet, thread); skip_pkt = 0; } else { + /* Auto activate link on non-SC15 packet receive */ + if (unlikely(rcd->ppd->host_link_state == +HLS_UP_ARMED)) + if (set_armed_to_active(rcd, packet, dd)) + goto bail; last = 
process_rcv_packet(&packet, thread); } @@ -984,6 +1020,42 @@ bail: } /* + * We may discover in the interrupt that the hardware link
Re: [PATCH 2/2] IB/mad: use CQ abstraction
On 1/4/2016 9:16 AM, Christoph Hellwig wrote: > Remove the local workqueue to process mad completions and use the CQ API > instead. > > Signed-off-by: Christoph Hellwig Reviewed-by: Hal Rosenstock
[PATCH TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
ib_mad_snoop_handler ues send_buf rather than send_wr Signed-off-by: Hal Rosenstock --- diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index ec9b44d..2b3573d 100644 --- a/include/rdma/ib_mad.h +++ b/include/rdma/ib_mad.h @@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent, /** * ib_mad_snoop_handler - Callback handler for snooping sent MADs. * @mad_agent: MAD agent that snooped the MAD. - * @send_wr: Work request information on the sent MAD. + * @send_buf: send MAD data buffer. * @mad_send_wc: Work completion information on the sent MAD. Valid * only for snooping that occurs on a send completion. * - * Clients snooping MADs should not modify data referenced by the @send_wr + * Clients snooping MADs should not modify data referenced by the @send_buf * or @mad_send_wc. */ typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent,
RE: [PATCH] staging: rdma: hfi1: diag: constify hfi1_filter_array structure
> From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of Julia Lawall > Subject: [PATCH] staging: rdma: hfi1: diag: constify hfi1_filter_array > structure > > The hfi1_filter_array structure is never modified, so declare it as const. > > Done with the help of Coccinelle. > > Signed-off-by: Julia Lawall > Thanks for the patch! Acked-by: Mike Marciniszyn
Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id
On Sun, 3 Jan 2016, ira.we...@intel.com wrote: > Attributed ID was declared as an int while the value should really be big > endian 16. Reviewed-by: Christoph Lameter
Re: [RFC contig pages support 1/2] IB: Supports contiguous memory operations
[Sorry for resending, forgot to CC Minchan] On 12/23/2015 05:30 PM, Shachar Raindel wrote: >>> >>> I completely agree, and this RFC was sent in order to start discussion >>> on this subject. >>> >>> Dear MM people, can you please advise on the subject? >>> >>> Multiple HW vendors, from different fields, ranging between embedded >> SoC >>> devices (TI) and HPC (Mellanox) are looking for a solution to allocate >>> blocks of contiguous memory to user space applications, without using >> huge >>> pages. >>> >>> What should be the API to expose such feature? >>> >>> Should we create a virtual FS that allows the user to create "files" >>> representing memory allocations, and define the contiguous level we >>> attempt to allocate using folders (similar to hugetlbfs)? >>> >>> Should we patch hugetlbfs to allow allocation of contiguous memory >> chunks, >>> without creating larger memory mapping in the CPU page tables? >>> >>> Should we create a special "allocator" virtual device, that will hand >> out >>> memory in contiguous chunks via a call to mmap with an FD connected to >> the >>> device? >> >> How much memory do you assume to be used like this? > > Depends on the use case. Most likely several MBs/core, used for interfacing > with the HW (packet rings, frame buffers, etc.). > > Some applications might want to perform calculations in such memory, to > optimize communication time, especially in the HPC market. OK. > >> Is this memory >> supposed to be swappable, migratable, etc? I.e. on LRU lists? > > Most likely not. In many of the relevant applications (embedded, HPC), > there is no swap and the application threads are pinned to specific cores > and NUMA nodes. > The biggest pain here is that these memory pages will not be eligible for > compaction, making it harder to handle fragmentations and CMA allocation > requests. 
There was a patch set to enable compaction on such pages, see https://lwn.net/Articles/650917/ Minchan was going to pick this after Gioh left, and then it should be possible. But it requires careful driver-specific cooperation, i.e. when a page can be isolated for the migration, see http://article.gmane.org/gmane.linux.kernel.mm/136457 >> Allocating a lot of memory (e.g. most of userspace memory) that's not >> LRU wouldn't be nice. But LRU operations are not prepared to work with >> such non-standard-sized allocations, regardless of what API you use. So >> I think that's the more fundamental question here. > > I agree that there are fundamental questions here. > > That being said, there is a clear need for an API allowing > allocation, to the user space, limited size of memory that > is composed of large contiguous blocks. > > What will be the best way to implement such solution? Given the likely driver-specific constraints/handling of the page migration, I'm not sure if some completely universal API is feasible. Maybe some reusable parts of the functionality in the patch in this thread could be provided by mm. > Thanks, > --Shachar
Re: [PATCH 1/2] IB/mad: pass ib_mad_send_buf explicitly to the recv_handler
On 1/4/2016 8:15 AM, Christoph Hellwig wrote:
> Stop abusing wr_id and just pass the parameter explicitly.
>
> Signed-off-by: Christoph Hellwig

Reviewed-by: Hal Rosenstock
[PATCH 2/2] IB/mad: use CQ abstraction
Remove the local workqueue to process mad completions and use the CQ API instead. Signed-off-by: Christoph Hellwig --- drivers/infiniband/core/mad.c | 159 + drivers/infiniband/core/mad_priv.h | 2 +- 2 files changed, 58 insertions(+), 103 deletions(-) diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index cbe232a..286d1a9 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -61,18 +61,6 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests module_param_named(recv_queue_size, mad_recvq_size, int, 0444); MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests"); -/* - * Define a limit on the number of completions which will be processed by the - * worker thread in a single work item. This ensures that other work items - * (potentially from other users) are processed fairly. - * - * The number of completions was derived from the default queue sizes above. - * We use a value which is double the larger of the 2 queues (receive @ 512) - * but keep it fixed such that an increase in that value does not introduce - * unfairness. 
- */ -#define MAD_COMPLETION_PROC_LIMIT 1024 - static struct list_head ib_mad_port_list; static u32 ib_mad_client_id = 0; @@ -96,6 +84,9 @@ static int add_nonoui_reg_req(struct ib_mad_reg_req *mad_reg_req, u8 mgmt_class); static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req, struct ib_mad_agent_private *agent_priv); +static bool ib_mad_send_error(struct ib_mad_port_private *port_priv, + struct ib_wc *wc); +static void ib_mad_send_done(struct ib_cq *cq, struct ib_wc *wc); /* * Returns a ib_mad_port_private structure or NULL for a device/port @@ -702,11 +693,11 @@ static void snoop_recv(struct ib_mad_qp_info *qp_info, } static void build_smp_wc(struct ib_qp *qp, -u64 wr_id, u16 slid, u16 pkey_index, u8 port_num, +void *wr_cqe, u16 slid, u16 pkey_index, u8 port_num, struct ib_wc *wc) { memset(wc, 0, sizeof *wc); - wc->wr_id = wr_id; + wc->wr_cqe = wr_cqe; wc->status = IB_WC_SUCCESS; wc->opcode = IB_WC_RECV; wc->pkey_index = pkey_index; @@ -844,7 +835,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv, } build_smp_wc(mad_agent_priv->agent.qp, -send_wr->wr.wr_id, drslid, +send_wr->wr.wr_cqe, drslid, send_wr->pkey_index, send_wr->port_num, &mad_wc); @@ -1051,7 +1042,9 @@ struct ib_mad_send_buf * ib_create_send_mad(struct ib_mad_agent *mad_agent, mad_send_wr->sg_list[1].lkey = mad_agent->qp->pd->local_dma_lkey; - mad_send_wr->send_wr.wr.wr_id = (unsigned long) mad_send_wr; + mad_send_wr->mad_list.cqe.done = ib_mad_send_done; + + mad_send_wr->send_wr.wr.wr_cqe = &mad_send_wr->mad_list.cqe; mad_send_wr->send_wr.wr.sg_list = mad_send_wr->sg_list; mad_send_wr->send_wr.wr.num_sge = 2; mad_send_wr->send_wr.wr.opcode = IB_WR_SEND; @@ -1163,8 +1156,9 @@ int ib_send_mad(struct ib_mad_send_wr_private *mad_send_wr) /* Set WR ID to find mad_send_wr upon completion */ qp_info = mad_send_wr->mad_agent_priv->qp_info; - mad_send_wr->send_wr.wr.wr_id = (unsigned long)&mad_send_wr->mad_list; mad_send_wr->mad_list.mad_queue = &qp_info->send_queue; + 
mad_send_wr->mad_list.cqe.done = ib_mad_send_done; + mad_send_wr->send_wr.wr.wr_cqe = &mad_send_wr->mad_list.cqe; mad_agent = mad_send_wr->send_buf.mad_agent; sge = mad_send_wr->sg_list; @@ -2185,13 +2179,14 @@ handle_smi(struct ib_mad_port_private *port_priv, return handle_ib_smi(port_priv, qp_info, wc, port_num, recv, response); } -static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv, -struct ib_wc *wc) +static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc) { + struct ib_mad_port_private *port_priv = cq->cq_context; + struct ib_mad_list_head *mad_list = + container_of(wc->wr_cqe, struct ib_mad_list_head, cqe); struct ib_mad_qp_info *qp_info; struct ib_mad_private_header *mad_priv_hdr; struct ib_mad_private *recv, *response = NULL; - struct ib_mad_list_head *mad_list; struct ib_mad_agent_private *mad_agent; int port_num; int ret = IB_MAD_RESULT_SUCCESS; @@ -2199,7 +2194,17 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv, u16 resp_mad_pkey_index = 0; bool opa; - mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id; + if (list_empty_careful(&port_priv->port_list)) + return; + + if (wc->sta
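The core of this conversion is the `wr_cqe`/`done` pattern: instead of decoding an integer `wr_id`, the poller calls a completion callback embedded in the posted request and recovers the containing structure with `container_of()`. The following is a minimal userspace sketch of that dispatch pattern, not the kernel API; the structure and function names only mirror the ones in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Each posted work request embeds a cqe carrying its completion callback. */
struct cqe {
    void (*done)(struct cqe *cqe);
};

struct mad_list_head {
    struct cqe cqe;
    int completed; /* stands in for the real completion work */
};

static void mad_send_done(struct cqe *cqe)
{
    /* Recover the enclosing structure from the embedded cqe. */
    struct mad_list_head *mad_list =
        container_of(cqe, struct mad_list_head, cqe);
    mad_list->completed = 1;
}

/* The CQ core only ever sees the embedded cqe and invokes its callback;
 * no cast from an opaque wr_id is needed anywhere. */
static void cq_dispatch(struct cqe *wr_cqe)
{
    wr_cqe->done(wr_cqe);
}
```

Because the callback is typed and the container is recovered structurally, the per-consumer completion loop (and the fairness limit it needed) disappears from mad.c.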
[PATCH 1/2] IB/mad: pass ib_mad_send_buf explicitly to the recv_handler
Stop abusing wr_id and just pass the parameter explicitly. Signed-off-by: Christoph Hellwig --- drivers/infiniband/core/cm.c | 1 + drivers/infiniband/core/mad.c | 18 ++ drivers/infiniband/core/sa_query.c| 7 --- drivers/infiniband/core/user_mad.c| 1 + drivers/infiniband/ulp/srpt/ib_srpt.c | 1 + include/rdma/ib_mad.h | 2 ++ 6 files changed, 19 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index e3a95d1..ad3726d 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -3503,6 +3503,7 @@ int ib_cm_notify(struct ib_cm_id *cm_id, enum ib_event_type event) EXPORT_SYMBOL(ib_cm_notify); static void cm_recv_handler(struct ib_mad_agent *mad_agent, + struct ib_mad_send_buf *send_buf, struct ib_mad_recv_wc *mad_recv_wc) { struct cm_port *port = mad_agent->context; diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index d4d2a61..cbe232a 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -693,7 +693,7 @@ static void snoop_recv(struct ib_mad_qp_info *qp_info, atomic_inc(&mad_snoop_priv->refcount); spin_unlock_irqrestore(&qp_info->snoop_lock, flags); - mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, + mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, NULL, mad_recv_wc); deref_snoop_agent(mad_snoop_priv); spin_lock_irqsave(&qp_info->snoop_lock, flags); @@ -1994,9 +1994,9 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv, /* user rmpp is in effect * and this is an active RMPP MAD */ - mad_recv_wc->wc->wr_id = 0; - mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, - mad_recv_wc); + mad_agent_priv->agent.recv_handler( + &mad_agent_priv->agent, NULL, + mad_recv_wc); atomic_dec(&mad_agent_priv->refcount); } else { /* not user rmpp, revert to normal behavior and @@ -2010,9 +2010,10 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv, 
spin_unlock_irqrestore(&mad_agent_priv->lock, flags); /* Defined behavior is to complete response before request */ - mad_recv_wc->wc->wr_id = (unsigned long) &mad_send_wr->send_buf; - mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, - mad_recv_wc); + mad_agent_priv->agent.recv_handler( + &mad_agent_priv->agent, + &mad_send_wr->send_buf, + mad_recv_wc); atomic_dec(&mad_agent_priv->refcount); mad_send_wc.status = IB_WC_SUCCESS; @@ -2021,7 +2022,7 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv, ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc); } } else { - mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, + mad_agent_priv->agent.recv_handler(&mad_agent_priv->agent, NULL, mad_recv_wc); deref_mad_agent(mad_agent_priv); } @@ -2762,6 +2763,7 @@ static void local_completions(struct work_struct *work) IB_MAD_SNOOP_RECVS); recv_mad_agent->agent.recv_handler( &recv_mad_agent->agent, + &local->mad_send_wr->send_buf, &local->mad_priv->header.recv_wc); spin_lock_irqsave(&recv_mad_agent->lock, flags); atomic_dec(&recv_mad_agent->refcount); diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c index e364a42..1f91b6e 100644 --- a/drivers/infiniband/core/sa_query.c +++ b/drivers/infiniband/core/sa_query.c @@ -1669,14 +1669,15 @@ static void send_handler(struct ib_mad_agent *agent, } static void recv_handler(struct ib_mad_agent *mad_agent, +struct ib_mad_send_buf *send_buf, struct ib_mad_recv_wc *mad_recv_wc) {
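The change above can be reduced to one idea: a pointer smuggled through the integer `wr_id` field becomes an explicit, typed parameter of the handler. Below is a hedged userspace sketch of the before/after contract; the structures are illustrative miniatures, not the kernel's `ib_mad` types.

```c
#include <assert.h>
#include <stdint.h>

struct send_buf { int id; };

/* Old style: the producer stores a pointer in an integer field and the
 * handler casts it back, with no type checking anywhere. */
struct wc_old { uint64_t wr_id; };

static struct send_buf *handler_old(struct wc_old *wc)
{
    return (struct send_buf *)(uintptr_t)wc->wr_id; /* fragile cast */
}

/* New style: the completion path passes the buffer explicitly, so the
 * handler signature documents the contract; NULL means "no matching
 * send", replacing the old wr_id = 0 convention. */
static struct send_buf *handler_new(struct send_buf *send_buf)
{
    return send_buf;
}
```

This is why the patch can delete both `mad_recv_wc->wc->wr_id = 0;` and the `(unsigned long)` cast: the information now travels through the extra `send_buf` argument.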
convert mad to the new CQ API
This series converts the MAD handler to the new CQ API, ensuring fairness in completion processing instead of starving other processes, while also greatly simplifying the code.
[PATCH V1 for-next 1/2] IB/core: Rename rdma_addr_find_dmac_by_grh
rdma_addr_find_dmac_by_grh resolves dmac, vlan_id and if_index, and a downstream patch will also add hop_limit as an output parameter; thus we rename it to rdma_addr_find_l2_eth_by_grh. Signed-off-by: Matan Barak --- drivers/infiniband/core/addr.c | 7 --- drivers/infiniband/core/verbs.c | 18 +- drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 6 +++--- include/rdma/ib_addr.h | 5 +++-- 4 files changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 0b5f245..ce3c68e 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -540,8 +540,9 @@ static void resolve_cb(int status, struct sockaddr *src_addr, complete(&((struct resolve_cb_context *)context)->comp); } -int rdma_addr_find_dmac_by_grh(const union ib_gid *sgid, const union ib_gid *dgid, - u8 *dmac, u16 *vlan_id, int *if_index) +int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, +const union ib_gid *dgid, +u8 *dmac, u16 *vlan_id, int *if_index) { int ret = 0; struct rdma_dev_addr dev_addr; @@ -583,7 +584,7 @@ int rdma_addr_find_dmac_by_grh(const union ib_gid *sgid, const union ib_gid *dgi dev_put(dev); return ret; } -EXPORT_SYMBOL(rdma_addr_find_dmac_by_grh); +EXPORT_SYMBOL(rdma_addr_find_l2_eth_by_grh); int rdma_addr_find_smac_by_sgid(union ib_gid *sgid, u8 *smac, u16 *vlan_id) { diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 072b94d..66eb498 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -467,11 +467,11 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, if (!idev) return -ENODEV; - ret = rdma_addr_find_dmac_by_grh(&dgid, &sgid, -ah_attr->dmac, -wc->wc_flags & IB_WC_WITH_VLAN ? -NULL : &vlan_id, -&if_index); + ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid, + ah_attr->dmac, + wc->wc_flags & IB_WC_WITH_VLAN ?
+ NULL : &vlan_id, + &if_index); if (ret) { dev_put(idev); return ret; @@ -1158,10 +1158,10 @@ int ib_resolve_eth_dmac(struct ib_qp *qp, ifindex = sgid_attr.ndev->ifindex; - ret = rdma_addr_find_dmac_by_grh(&sgid, - &qp_attr->ah_attr.grh.dgid, -qp_attr->ah_attr.dmac, -NULL, &ifindex); + ret = rdma_addr_find_l2_eth_by_grh(&sgid, + &qp_attr->ah_attr.grh.dgid, + qp_attr->ah_attr.dmac, + NULL, &ifindex); dev_put(sgid_attr.ndev); } diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c index a343e03..850e0d1 100644 --- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c +++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c @@ -152,9 +152,9 @@ struct ib_ah *ocrdma_create_ah(struct ib_pd *ibpd, struct ib_ah_attr *attr) if ((pd->uctx) && (!rdma_is_multicast_addr((struct in6_addr *)attr->grh.dgid.raw)) && (!rdma_link_local_addr((struct in6_addr *)attr->grh.dgid.raw))) { - status = rdma_addr_find_dmac_by_grh(&sgid, &attr->grh.dgid, - attr->dmac, &vlan_tag, - &sgid_attr.ndev->ifindex); + status = rdma_addr_find_l2_eth_by_grh(&sgid, &attr->grh.dgid, + attr->dmac, &vlan_tag, + &sgid_attr.ndev->ifindex); if (status) { pr_err("%s(): Failed to resolve dmac from gid." "status = %d\n", __func__, status); diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h index 87156dc..73fd088 100644 --- a/include/rdma/ib_addr.h +++ b/include/rdma/ib_addr.h @@ -130,8 +130,9 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev, int rdma_addr_size(struct sockaddr *addr); int rdma_addr_find_smac_by_sgid(union ib_gid *sgid, u8 *smac, u16 *v
[PATCH V1 for-next 0/2] Fix hop-limit for RoCE
Hi Doug, Previously, the hop limit of RoCE packets was set to IPV6_DEFAULT_HOPLIMIT. This generally works, but the RoCE stack needs to follow the IP stack rules. Therefore, this patch series uses ip4_dst_hoplimit and ip6_dst_hoplimit in order to set the correct hop limit for RoCE traffic. The first patch renames rdma_addr_find_dmac_by_grh to rdma_addr_find_l2_eth_by_grh while the second one does the actual change. Regards, Matan Changes from V0: - Hop limit in IB when using reversible path should be 0xff. Matan Barak (2): IB/core: Rename rdma_addr_find_dmac_by_grh IB/core: Use hop-limit from IP stack for RoCE drivers/infiniband/core/addr.c | 14 +++--- drivers/infiniband/core/cm.c | 1 + drivers/infiniband/core/cma.c| 12 +--- drivers/infiniband/core/verbs.c | 30 ++ drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 7 --- include/rdma/ib_addr.h | 7 +-- 6 files changed, 40 insertions(+), 31 deletions(-) -- 2.1.0
[PATCH V1 for-next 2/2] IB/core: Use hop-limit from IP stack for RoCE
Previously, IPV6_DEFAULT_HOPLIMIT was used as the hop limit value for RoCE. Fixing that by taking ip4_dst_hoplimit and ip6_dst_hoplimit as hop limit values. Signed-off-by: Matan Barak --- drivers/infiniband/core/addr.c | 9 - drivers/infiniband/core/cm.c | 1 + drivers/infiniband/core/cma.c| 12 +--- drivers/infiniband/core/verbs.c | 16 +++- drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 3 ++- include/rdma/ib_addr.h | 4 +++- 6 files changed, 26 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index ce3c68e..f924d90 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -252,6 +252,8 @@ static int addr4_resolve(struct sockaddr_in *src_in, if (rt->rt_uses_gateway) addr->network = RDMA_NETWORK_IPV4; + addr->hoplimit = ip4_dst_hoplimit(&rt->dst); + *prt = rt; return 0; out: @@ -295,6 +297,8 @@ static int addr6_resolve(struct sockaddr_in6 *src_in, if (rt->rt6i_flags & RTF_GATEWAY) addr->network = RDMA_NETWORK_IPV6; + addr->hoplimit = ip6_dst_hoplimit(dst); + *pdst = dst; return 0; put: @@ -542,7 +546,8 @@ static void resolve_cb(int status, struct sockaddr *src_addr, int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, const union ib_gid *dgid, -u8 *dmac, u16 *vlan_id, int *if_index) +u8 *dmac, u16 *vlan_id, int *if_index, +int *hoplimit) { int ret = 0; struct rdma_dev_addr dev_addr; @@ -581,6 +586,8 @@ int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, *if_index = dev_addr.bound_dev_if; if (vlan_id) *vlan_id = rdma_vlan_dev_vlan_id(dev); + if (hoplimit) + *hoplimit = dev_addr.hoplimit; dev_put(dev); return ret; } diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index e3a95d1..cd3d345 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -1641,6 +1641,7 @@ static int cm_req_handler(struct cm_work *work) cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]); memcpy(work->path[0].dmac, cm_id_priv->av.ah_attr.dmac, 
ETH_ALEN); + work->path[0].hop_limit = cm_id_priv->av.ah_attr.grh.hop_limit; ret = ib_get_cached_gid(work->port->cm_dev->ib_device, work->port->port_num, cm_id_priv->av.ah_attr.grh.sgid_index, diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 559ee3d..66983da 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2424,7 +2424,6 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) { struct rdma_route *route = &id_priv->id.route; struct rdma_addr *addr = &route->addr; - enum ib_gid_type network_gid_type; struct cma_work *work; int ret; struct net_device *ndev = NULL; @@ -2478,14 +2477,13 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) &route->path_rec->dgid); /* Use the hint from IP Stack to select GID Type */ - network_gid_type = ib_network_to_gid_type(addr->dev_addr.network); - if (addr->dev_addr.network != RDMA_NETWORK_IB) { - route->path_rec->gid_type = network_gid_type; + if (route->path_rec->gid_type < ib_network_to_gid_type(addr->dev_addr.network)) + route->path_rec->gid_type = ib_network_to_gid_type(addr->dev_addr.network); + if (((struct sockaddr *)&id_priv->id.route.addr.dst_addr)->sa_family != AF_IB) /* TODO: get the hoplimit from the inet/inet6 device */ - route->path_rec->hop_limit = IPV6_DEFAULT_HOPLIMIT; - } else { + route->path_rec->hop_limit = addr->dev_addr.hoplimit; + else route->path_rec->hop_limit = 1; - } route->path_rec->reversible = 1; route->path_rec->pkey = cpu_to_be16(0x); route->path_rec->mtu_selector = IB_SA_EQ; diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 66eb498..b1998bc 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -434,6 +434,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, int ret; enum rdma_network_type net_type = RDMA_NETWORK_IB; enum ib_gid_type gid_type = IB_GID_TYPE_IB; + int hoplimit = 0xff; union ib_gid dgid; union ib_gid sgid; @@ 
-471,7 +472,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, ah_attr->dmac,
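The cma.c hunk in this patch boils down to a single selection rule: IP-based transports (RoCE) take the hop limit resolved from the routing table, while native IB address families keep the fixed value of 1 used for path records. A small sketch of that decision, under the assumption that `AF_IB_SKETCH` stands in for the real `AF_IB` constant:

```c
#include <assert.h>

/* Hypothetical stand-in; the real code compares sa_family against AF_IB. */
#define AF_IB_SKETCH 27

static int choose_hop_limit(int sa_family, int stack_hoplimit)
{
    if (sa_family != AF_IB_SKETCH)
        /* RoCE: value resolved by ip4_dst_hoplimit()/ip6_dst_hoplimit() */
        return stack_hoplimit;
    /* Native IB path records keep hop_limit 1, as before this series. */
    return 1;
}
```

This replaces the old behavior where RoCE unconditionally used IPV6_DEFAULT_HOPLIMIT regardless of what the IP stack would have chosen for the same destination.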
Re: [PATCH for-next 2/2] IB/core: Use hop-limit from IP stack for RoCE
On Sun, Jan 3, 2016 at 9:03 PM, Jason Gunthorpe wrote:
> On Sun, Jan 03, 2016 at 03:59:11PM +0200, Matan Barak wrote:
>> @@ -434,6 +434,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
>>  int ret;
>>  enum rdma_network_type net_type = RDMA_NETWORK_IB;
>>  enum ib_gid_type gid_type = IB_GID_TYPE_IB;
>> + int hoplimit = grh->hop_limit;
>
>>  ah_attr->grh.flow_label = flow_class & 0xF;
>> - ah_attr->grh.hop_limit = 0xFF;
>> + ah_attr->grh.hop_limit = hoplimit;
>
> No, this is wrong for IB. Please be careful to follow the IB
> specification language for computing a hop limit on a reversible path.

You're right, this should be 0xff. Thanks.

> No idea about rocee, but I can't believe using grh->hop_limit is right
> there either.

Regarding RoCE, the hop limit is set from the routing table (in the same function):

ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid, ah_attr->dmac,
                                   wc->wc_flags & IB_WC_WITH_VLAN ?
                                   NULL : &vlan_id,
                                   &if_index, &hoplimit);

> Jason

Regards,
Matan
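The outcome of this review thread is a two-way rule in ib_init_ah_from_wc(): native IB keeps the spec-mandated 0xff hop limit for reversible paths, and only RoCE overrides it with the routing-table value. A hedged userspace sketch of that agreed behavior (the enum is illustrative, not a kernel type):

```c
#include <assert.h>

enum transport_sketch { NET_IB, NET_ROCE };

static int ah_hop_limit(enum transport_sketch transport, int resolved_hoplimit)
{
    if (transport == NET_IB)
        return 0xff; /* IB reversible path: maximum hop limit per the spec */
    /* RoCE: value filled in by rdma_addr_find_l2_eth_by_grh() from the
     * IP routing table, per the discussion above. */
    return resolved_hoplimit;
}
```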
Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id
On 01/04/2016 04:44 AM, ira.we...@intel.com wrote: From: Ira Weiny Attribute ID was declared as an int while it should really be a big-endian 16-bit value. Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c") Reported-by: Bart Van Assche Signed-off-by: Ira Weiny Reviewed-by: Bart Van Assche