Re: iser target fixes for kernel 4.1

2015-03-29 Thread Nicholas A. Bellinger
On Sun, 2015-03-29 at 15:52 +0300, Sagi Grimberg wrote:
> Hi Nic,
> 
> This set consists of:
> - Bug fixes (1-2)
> - Performance Optimization (3)
> - Code refactoring (3-10,13-16)
> - Renaming (11-12,17)
> - Version bump (18)
> 
> Sagi Grimberg (18):
>   iser-target: Fix session hang in case of an rdma read DIF error
>   iser-target: Fix possible deadlock in RDMA_CM connection error
>   iser-target: Use a single DMA MR and PD per device
>   iser-target: Remove redundant check on recv completion
>   iser-target: Remove dead code
>   iser-target: Remove redundant local variable
>   iser-target: Remove redundant casting on void pointers
>   iser-target: Split isert_setup_qp
>   iser-target: Introduce isert_[alloc|free]_comps
>   iser-target: Remove redundant assignment to local variable
>   iser-target: Rename rend/recv completion routines
>   iser-target: Rename device find/release routines
>   iser-target: Split some logic in isert_connect_request to routines
>   iser-target: Get rid of redundant max_accept
>   iser-target: Remove redundant check on the device
>   iser-target: Remove un-needed rdma_listen backlog
>   iser-target: Remove conn_ prefix from struct isert_conn members
>   iser-target: Bump version to 1.0
> 
>  drivers/infiniband/ulp/isert/ib_isert.c |  691 
> +--
>  drivers/infiniband/ulp/isert/ib_isert.h |   37 +-
>  2 files changed, 398 insertions(+), 330 deletions(-)
> 

Applied to target-pending/for-next.

Also added a v3.14+ tag for patch #1 and v3.10+ tag for patch #2.
Let me know if that is not correct.

Thanks Sagi!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH for-next 5/9] IB/mlx4: Change init flow to request alias GUIDs for active VFs

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Change the init flow to request GUIDs only for active VFs. This is done for
both SM and HOST modes, so there is no longer any need to maintain the
ownership record type.

In SM mode the initial value is 0, which asks the SM to assign a GUID. In
HOST mode the initial value is the HOST-generated GUID.

This enables an out-of-the-box experience for both probed and attached VFs.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/alias_GUID.c |  104 ++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h|8 +--
 drivers/infiniband/hw/mlx4/sysfs.c  |   17 -
 3 files changed, 50 insertions(+), 79 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index c20ce09..cbadab0 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -303,19 +303,17 @@ static void aliasguid_query_handler(int status,
 */
if (sm_response == MLX4_NOT_SET_GUID) {
if (rec->guids_retry_schedule[i] == 0)
-   mlx4_ib_warn(&dev->ib_dev, "%s:Record num %d in "
-"block_num: %d was declined by SM, "
-"ownership by %d (0 = driver, 1=sysAdmin,"
-" 2=None)\n", __func__, i,
-guid_rec->block_num,
-rec->ownership);
+   mlx4_ib_warn(&dev->ib_dev,
+"%s:Record num %d in  block_num: %d was declined by SM\n",
+__func__, i,
+guid_rec->block_num);
goto entry_declined;
} else {
   /* properly assigned record. */
   /* We save the GUID we just got from the SM in the
* admin_guid in order to be persistent, and in the
* request from the sm the process will ask for the same GUID */
-   if (rec->ownership == MLX4_GUID_SYSADMIN_ASSIGN &&
+   if (required_val &&
sm_response != required_val) {
/* Warn only on first retry */
if (rec->guids_retry_schedule[i] == 0)
@@ -421,9 +419,7 @@ static void invalidate_guid_record(struct mlx4_ib_dev *dev, u8 port, int index)
need to assign GUIDs, then don't put it up for assignment.
*/
if (MLX4_GUID_FOR_DELETE_VAL == cur_admin_val ||
-   (!index && !i) ||
-   MLX4_GUID_NONE_ASSIGN == dev->sriov.alias_guid.
-   ports_guid[port - 1].all_rec_per_port[index].ownership)
+   (!index && !i))
continue;
comp_mask |= mlx4_ib_get_aguid_comp_mask_from_ix(i);
}
@@ -531,6 +527,30 @@ out:
return err;
 }
 
+static void mlx4_ib_guid_port_init(struct mlx4_ib_dev *dev, int port)
+{
+   int j, k, entry;
+   __be64 guid;
+
+   /*Check if the SM doesn't need to assign the GUIDs*/
+   for (j = 0; j < NUM_ALIAS_GUID_REC_IN_PORT; j++) {
+   for (k = 0; k < NUM_ALIAS_GUID_IN_REC; k++) {
+   entry = j * NUM_ALIAS_GUID_IN_REC + k;
+   /* no request for the 0 entry (hw guid) */
+   if (!entry || entry > dev->dev->persist->num_vfs ||
+   !mlx4_is_slave_active(dev->dev, entry))
+   continue;
+   guid = mlx4_get_admin_guid(dev->dev, entry, port);
+   *(__be64 *)&dev->sriov.alias_guid.ports_guid[port - 1].
+   all_rec_per_port[j].all_recs
+   [GUID_REC_SIZE * k] = guid;
+   pr_debug("guid was set, entry=%d, val=0x%llx, port=%d\n",
+entry,
+be64_to_cpu(guid),
+port);
+   }
+   }
+}
 void mlx4_ib_invalidate_all_guid_record(struct mlx4_ib_dev *dev, int port)
 {
int i;
@@ -540,6 +560,13 @@ void mlx4_ib_invalidate_all_guid_record(struct mlx4_ib_dev *dev, int port)
 
spin_lock_irqsave(&dev->sriov.going_down_lock, flags);
spin_lock_irqsave(&dev->sriov.alias_guid.ag_work_lock, flags1);
+
+   if (dev->sriov.alias_guid.ports_guid[port - 1].state_flags &
+   GUID_STATE_NEED_PORT_INIT) {
+   mlx4_ib_guid_port_init(dev, port);
+   dev->sriov.alias_guid.ports_guid[port - 1].state_flags &=
+   (~GUID_STATE_NEED_PORT_INIT);
+ 

[PATCH for-next 0/9] mlx4 changes in virtual GID management

2015-03-29 Thread Or Gerlitz
Under the existing implementation for virtual GIDs, if the SM is not
reachable or responds late, or if the VF is probed into a VM before its
GUID is registered with the SM, there is a window of time in which the VF
sees an incorrect GID, i.e., not the GID that was intended by the admin.
This exposes a temporary identity to the VF.

Moreover, a subsequent change in the alias GID causes a spec-noncompliant
change to the VF identity. Some guest operating systems, such as Windows,
cannot tolerate such changes.

This series solves the above problem by exposing the admin-desired value
instead of the value that was approved by the SM. As long as the SM doesn't
approve the GID, the VF sees its link as down.

In addition, we request GIDs from the SM on demand, i.e., when a VF actually
needs them, and release them when the GIDs are no longer in use. In cloud
environments, this is useful for GID migrations, in which a GID is assigned to
a VF on the destination HCA, while the VF on the source HCA is shut down (but
the GID was not administratively released).

For reasons of compatibility, an explicit admin request to set/change a GUID
entry is done immediately, regardless of whether the VF is active or not. This
allows administrators to change the GUID without the need to unbind/bind the VF.

In addition, the existing implementation has no persistency mechanism for
retrying a GUID request when the SM has rejected it for any reason. The PF
driver shall keep trying to acquire the specified GUID indefinitely, using
an exponential backoff scheme. This should be managed per GUID and be
aligned with other incoming admin requests.

This ability is needed especially for the on-demand GUID feature. In this
case, we must manage the GUID's status per entry and handle cases in which
some entries are temporarily rejected.

The first patch adds the persistency support and is a prerequisite for the
series. Further patches make the change to use the admin VF behavior as
described above.

Finally, the default mode is changed to be HOST assigned instead of SM
assigned. This is the expected operational mode, because it doesn't depend on
SM availability as described above.

Yishai and Or.

Yishai Hadas (9):
  IB/mlx4: Alias GUID adding persistency support
  net/mlx4_core: Manage alias GUID per VF
  net/mlx4_core: Set initial admin GUIDs for VFs
  IB/mlx4: Manage admin alias GUID upon admin request
  IB/mlx4: Change init flow to request alias GUIDs for active VFs
  IB/mlx4: Request alias GUID on demand
  net/mlx4_core: Raise slave shutdown event upon FLR
  net/mlx4_core: Return the admin alias GUID upon host view request
  IB/mlx4: Change alias guids default to be host assigned

 drivers/infiniband/hw/mlx4/alias_GUID.c   |  468 +
 drivers/infiniband/hw/mlx4/main.c |   26 ++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |   14 +-
 drivers/infiniband/hw/mlx4/sysfs.c|   44 +--
 drivers/net/ethernet/mellanox/mlx4/cmd.c  |   42 ++-
 drivers/net/ethernet/mellanox/mlx4/eq.c   |2 +
 drivers/net/ethernet/mellanox/mlx4/main.c |   39 +++
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |1 +
 include/linux/mlx4/device.h   |4 +
 9 files changed, 459 insertions(+), 181 deletions(-)



[PATCH for-next 7/9] net/mlx4_core: Raise slave shutdown event upon FLR

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

There might be cases in which the PF doesn't get a "reset" command upon
slave down (e.g. virsh destroy). In these cases, however, an FLR event is
issued.

Therefore, when the PF receives an FLR event for a slave, it should also
generate a shutdown event on the PF for that slave, to let the PF upper
layers (mlx4_ib, eth) perform any required cleanup/actions associated
with slave shutdown.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/eq.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 264bc15..901a0c6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -706,6 +706,8 @@ static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq)
priv->mfunc.master.slave_state[flr_slave].is_slave_going_down = 1;
}
spin_unlock_irqrestore(&priv->mfunc.master.slave_state_lock, flags);
+   mlx4_dispatch_event(dev, MLX4_DEV_EVENT_SLAVE_SHUTDOWN,
+   flr_slave);
queue_work(priv->mfunc.master.comm_wq,
   &priv->mfunc.master.slave_flr_event_work);
break;
-- 
1.7.1



[PATCH for-next 8/9] net/mlx4_core: Return the admin alias GUID upon host view request

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Return the admin alias GUID value upon a GET request via HOST. We do this so
that the GUID value requested by the admin is returned even if the SM has not
yet approved this GUID (e.g. the SM is down).

Note that this does not create a problem, since the virtual port will remain
down until the SM does ACK the requested GUID value.
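
As a rough illustration of the response layout described above, the sketch
below fills a 64-byte GUIDInfo payload so that the admin GUID sits in slot 0
and the remaining seven slots are cleared. The helper name fill_guid_info()
and the two size macros are hypothetical and exist only for this example;
the GUID is passed as raw bytes so that byte order does not matter here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GUID_SIZE        8	/* bytes per GUID */
#define GUIDS_PER_BLOCK  8	/* GUIDs carried in one GUIDInfo block */

/*
 * Hypothetical helper, not part of the patch: place the admin GUID in
 * slot 0 of the payload and zero the other seven slots, so the caller
 * sees only its own entry.
 */
static void fill_guid_info(uint8_t data[GUID_SIZE * GUIDS_PER_BLOCK],
			   const uint8_t admin_guid[GUID_SIZE])
{
	memcpy(data, admin_guid, GUID_SIZE);
	memset(data + GUID_SIZE, 0, GUID_SIZE * (GUIDS_PER_BLOCK - 1));
}
```

The 8 and 56 byte counts match the memcpy()/memset() sizes used in the diff
below.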

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c |   41 +++--
 1 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 778de74..bcc6b2d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -936,21 +936,34 @@ static int mlx4_MAD_IFC_wrapper(struct mlx4_dev *dev, int slave,
return err;
}
if (smp->attr_id == IB_SMP_ATTR_GUID_INFO) {
-   /* compute slave's gid block */
-   smp->attr_mod = cpu_to_be32(slave / 8);
-   /* execute cmd */
-   err = mlx4_cmd_box(dev, inbox->dma, outbox->dma,
-vhcr->in_modifier, opcode_modifier,
-vhcr->op, MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
-   if (!err) {
-   /* if needed, move slave gid to index 0 */
-   if (slave % 8)
-   memcpy(outsmp->data,
-  outsmp->data + (slave % 8) * 8, 8);
-   /* delete all other gids */
-   memset(outsmp->data + 8, 0, 56);
+   __be64 guid = mlx4_get_admin_guid(dev, slave,
+ port);
+
+   /* set the PF admin guid to the FW/HW burned
+* GUID, if it wasn't yet set
+*/
+   if (slave == 0 && guid == 0) {
+   smp->attr_mod = 0;
+   err = mlx4_cmd_box(dev,
+  inbox->dma,
+  outbox->dma,
+  vhcr->in_modifier,
+  opcode_modifier,
+  vhcr->op,
+  MLX4_CMD_TIME_CLASS_C,
+  MLX4_CMD_NATIVE);
+   if (err)
+   return err;
+   mlx4_set_admin_guid(dev,
+   *(__be64 *)outsmp->
+   data, slave, port);
+   } else {
+   memcpy(outsmp->data, &guid, 8);
}
-   return err;
+
+   /* clean all other gids */
+   memset(outsmp->data + 8, 0, 56);
+   return 0;
}
if (smp->attr_id == IB_SMP_ATTR_NODE_INFO) {
err = mlx4_cmd_box(dev, inbox->dma, outbox->dma,
-- 
1.7.1



[PATCH for-next 3/9] net/mlx4_core: Set initial admin GUIDs for VFs

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

To provide an out-of-the-box experience, the PF generates random GUIDs, which
serve as the initial admin values.
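
For readers who want to try the byte layout outside the kernel, here is a
minimal userspace sketch of the mapping the patch applies to the random MAC:
flip bit 1 of the first octet and insert 0xff, 0xfe in the middle. The
function name mac_to_guid() is hypothetical and used only for illustration;
the kernel code obtains its MAC from eth_random_addr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper, not part of the patch: expand a 6-byte
 * MAC address into an 8-byte GUID using the same byte layout as
 * mlx4_set_random_admin_guid() in the diff below.
 */
static void mac_to_guid(const uint8_t mac[6], uint8_t guid[8])
{
	assert(mac != NULL && guid != NULL);

	guid[0] = mac[0] ^ 2;	/* flip the universal/local bit */
	guid[1] = mac[1];
	guid[2] = mac[2];
	guid[3] = 0xff;		/* fixed filler bytes */
	guid[4] = 0xfe;
	guid[5] = mac[3];
	guid[6] = mac[4];
	guid[7] = mac[5];
}
```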

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c  |1 +
 drivers/net/ethernet/mellanox/mlx4/main.c |   23 +++
 include/linux/mlx4/device.h   |1 +
 3 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 20b3c7b..778de74 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2255,6 +2255,7 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
priv->mfunc.master.vf_oper[i].vport[port].state.default_vlan = MLX4_VGT;
priv->mfunc.master.vf_oper[i].vport[port].vlan_idx = NO_INDX;
priv->mfunc.master.vf_oper[i].vport[port].mac_idx = NO_INDX;
+   mlx4_set_random_admin_guid(dev, i, port);
}
spin_lock_init(&s_state->lock);
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 6d1f10e..67e57e5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -2245,6 +2246,28 @@ __be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port)
 }
 EXPORT_SYMBOL_GPL(mlx4_get_admin_guid);
 
+void mlx4_set_random_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+   struct mlx4_priv *priv = mlx4_priv(dev);
+   u8 random_mac[6];
+   char *raw_gid;
+
+   /* hw GUID */
+   if (entry == 0)
+   return;
+
+   eth_random_addr(random_mac);
+   raw_gid = (char *)&priv->mfunc.master.vf_admin[entry].vport[port].guid;
+   raw_gid[0] = random_mac[0] ^ 2;
+   raw_gid[1] = random_mac[1];
+   raw_gid[2] = random_mac[2];
+   raw_gid[3] = 0xff;
+   raw_gid[4] = 0xfe;
+   raw_gid[5] = random_mac[3];
+   raw_gid[6] = random_mac[4];
+   raw_gid[7] = random_mac[5];
+}
+
 static int mlx4_setup_hca(struct mlx4_dev *dev)
 {
struct mlx4_priv *priv = mlx4_priv(dev);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 5c67bf0..f867d25 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1340,6 +1340,7 @@ void mlx4_counter_free(struct mlx4_dev *dev, u32 idx);
 void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry,
 int port);
 __be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port);
+void mlx4_set_random_admin_guid(struct mlx4_dev *dev, int entry, int port);
 int mlx4_flow_attach(struct mlx4_dev *dev,
 struct mlx4_net_trans_rule *rule, u64 *reg_id);
 int mlx4_flow_detach(struct mlx4_dev *dev, u64 reg_id);
-- 
1.7.1



[PATCH for-next 4/9] IB/mlx4: Manage admin alias GUID upon admin request

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Set the admin alias GUID per the administrator's request via the sysfs
mechanism into the core layer.

The "get" request returns the current value. However, if the administrator
requests the SM to assign a new value by requesting 0, the SM assigned
GUID is returned.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/alias_GUID.c |6 ++
 drivers/infiniband/hw/mlx4/sysfs.c  |   18 +-
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index 2ca8984..c20ce09 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -333,6 +333,12 @@ static void aliasguid_query_handler(int status,
} else {
*(__be64 *)&rec->all_recs[i * GUID_REC_SIZE] =
sm_response;
+   if (required_val == 0)
+   mlx4_set_admin_guid(dev->dev,
+   sm_response,
+   (guid_rec->block_num
+   * NUM_ALIAS_GUID_IN_REC) + i,
+   cb_ctx->port);
goto next_entry;
}
}
diff --git a/drivers/infiniband/hw/mlx4/sysfs.c b/drivers/infiniband/hw/mlx4/sysfs.c
index 7423d7e..bb1c34a 100644
--- a/drivers/infiniband/hw/mlx4/sysfs.c
+++ b/drivers/infiniband/hw/mlx4/sysfs.c
@@ -46,21 +46,17 @@
 static ssize_t show_admin_alias_guid(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
-   int record_num;/*0-15*/
-   int guid_index_in_rec; /*0 - 7*/
struct mlx4_ib_iov_sysfs_attr *mlx4_ib_iov_dentry =
container_of(attr, struct mlx4_ib_iov_sysfs_attr, dentry);
struct mlx4_ib_iov_port *port = mlx4_ib_iov_dentry->ctx;
struct mlx4_ib_dev *mdev = port->dev;
+   __be64 sysadmin_ag_val;
 
-   record_num = mlx4_ib_iov_dentry->entry_num / 8 ;
-   guid_index_in_rec = mlx4_ib_iov_dentry->entry_num % 8 ;
+   sysadmin_ag_val = mlx4_get_admin_guid(mdev->dev,
+ mlx4_ib_iov_dentry->entry_num,
+ port->num);
 
-   return sprintf(buf, "%llx\n",
-  be64_to_cpu(*(__be64 *)&mdev->sriov.alias_guid.
-  ports_guid[port->num - 1].
-  all_rec_per_port[record_num].
-  all_recs[8 * guid_index_in_rec]));
+   return sprintf(buf, "%llx\n", be64_to_cpu(sysadmin_ag_val));
 }
 
 /* store_admin_alias_guid stores the (new) administratively assigned value of that GUID.
@@ -98,6 +94,10 @@ static ssize_t store_admin_alias_guid(struct device *dev,
/* Change the state to be pending for update */
mdev->sriov.alias_guid.ports_guid[port->num - 1].all_rec_per_port[record_num].status
= MLX4_GUID_INFO_STATUS_IDLE ;
+   mlx4_set_admin_guid(mdev->dev, cpu_to_be64(sysadmin_ag_val),
+   mlx4_ib_iov_dentry->entry_num,
+   port->num);
+
switch (sysadmin_ag_val) {
case MLX4_GUID_FOR_DELETE_VAL:
mdev->sriov.alias_guid.ports_guid[port->num - 1].all_rec_per_port[record_num].ownership
-- 
1.7.1



[PATCH for-next 6/9] IB/mlx4: Request alias GUID on demand

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Request GIDs from the SM on demand, i.e., when a VF actually needs them,
and release them when the GIDs are no longer in use.

In cloud environments, this is useful for GID migrations, in which a
GID is assigned to a VF on the destination HCA, while the VF on the
source HCA is shut down (but the GID was not administratively released).
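
The request/release decision can be sketched in plain C as follows. This is
only an illustration of the logic in mlx4_ib_slave_alias_guid_event() from
the diff below, without locking or driver state. The names slave_to_slot()
and guid_request_for_event() are hypothetical, and GUID_DELETE is an assumed
stand-in for the driver's MLX4_GUID_FOR_DELETE_VAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel meaning "delete this GUID"; the driver has its own. */
#define GUID_DELETE UINT64_MAX

struct guid_slot {
	int record;	/* which 8-GUID record holds the entry */
	int index;	/* position of the entry inside that record */
};

/* Map a slave (VF) number to its record and index, 8 GUIDs per record. */
static struct guid_slot slave_to_slot(int slave)
{
	struct guid_slot s = { slave / 8, slave % 8 };

	return s;
}

/*
 * Decide what to ask the SM for when a VF starts (slave_init == true)
 * or stops. Returns true and stores the value in *required when a
 * request should be issued, false when there is nothing to do.
 */
static bool guid_request_for_event(bool slave_init, uint64_t admin_guid,
				   uint64_t current_guid, uint64_t *required)
{
	if (slave_init) {
		if (admin_guid == GUID_DELETE)
			return false;	/* admin asked for no GUID */
		*required = admin_guid;	/* 0 asks the SM to assign one */
		return true;
	}
	if (current_guid == GUID_DELETE || current_guid == 0)
		return false;	/* nothing registered, nothing to release */
	*required = GUID_DELETE;
	return true;
}
```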

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/alias_GUID.c |   51 +++
 drivers/infiniband/hw/mlx4/main.c   |   22 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h|2 +
 3 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index cbadab0..f5935c5 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -123,6 +123,57 @@ ib_sa_comp_mask mlx4_ib_get_aguid_comp_mask_from_ix(int index)
return IB_SA_COMP_MASK(4 + index);
 }
 
+void mlx4_ib_slave_alias_guid_event(struct mlx4_ib_dev *dev, int slave,
+   int port,  int slave_init)
+{
+   __be64 curr_guid, required_guid;
+   int record_num = slave / 8;
+   int index = slave % 8;
+   int port_index = port - 1;
+   unsigned long flags;
+   int do_work = 0;
+
+   spin_lock_irqsave(&dev->sriov.alias_guid.ag_work_lock, flags);
+   if (dev->sriov.alias_guid.ports_guid[port_index].state_flags &
+   GUID_STATE_NEED_PORT_INIT)
+   goto unlock;
+   if (!slave_init) {
+   curr_guid = *(__be64 *)&dev->sriov.
+   alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].
+   all_recs[GUID_REC_SIZE * index];
+   if (curr_guid == cpu_to_be64(MLX4_GUID_FOR_DELETE_VAL) ||
+   !curr_guid)
+   goto unlock;
+   required_guid = cpu_to_be64(MLX4_GUID_FOR_DELETE_VAL);
+   } else {
+   required_guid = mlx4_get_admin_guid(dev->dev, slave, port);
+   if (required_guid == cpu_to_be64(MLX4_GUID_FOR_DELETE_VAL))
+   goto unlock;
+   }
+   *(__be64 *)&dev->sriov.alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].
+   all_recs[GUID_REC_SIZE * index] = required_guid;
+   dev->sriov.alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].guid_indexes
+   |= mlx4_ib_get_aguid_comp_mask_from_ix(index);
+   dev->sriov.alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].status
+   = MLX4_GUID_INFO_STATUS_IDLE;
+   /* set to run immediately */
+   dev->sriov.alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].time_to_run = 0;
+   dev->sriov.alias_guid.ports_guid[port_index].
+   all_rec_per_port[record_num].
+   guids_retry_schedule[index] = 0;
+   do_work = 1;
+unlock:
+   spin_unlock_irqrestore(&dev->sriov.alias_guid.ag_work_lock, flags);
+
+   if (do_work)
+   mlx4_ib_init_alias_guid_work(dev, port_index);
+}
+
 /*
  * Whenever new GUID is set/unset (guid table change) create event and
  * notify the relevant slave (master also should be notified).
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index b972c0b..35f00ae 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2790,9 +2790,31 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
case MLX4_DEV_EVENT_SLAVE_INIT:
/* here, p is the slave id */
do_slave_init(ibdev, p, 1);
+   if (mlx4_is_master(dev)) {
+   int i;
+
+   for (i = 1; i <= ibdev->num_ports; i++) {
+   if (rdma_port_get_link_layer(&ibdev->ib_dev, i)
+   == IB_LINK_LAYER_INFINIBAND)
+   mlx4_ib_slave_alias_guid_event(ibdev,
+  p, i,
+  1);
+   }
+   }
return;
 
case MLX4_DEV_EVENT_SLAVE_SHUTDOWN:
+   if (mlx4_is_master(dev)) {
+   int i;
+
+   for (i = 1; i <= ibdev->num_ports; i++) {
+   if (rdma_port_get_link_layer(&ibdev->ib_dev, i)
+   == IB_LINK_LAYER_INFINIBAND)
+   mlx4_ib_slave_alias_guid_event(ibdev,
+  p, i,
+  0);
+ 

[PATCH for-next 9/9] IB/mlx4: Change alias guids default to be host assigned

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Change the default mode to be HOST assigned instead of SM assigned. This is
the expected operational mode, because it doesn't depend on SM availability.

Since the PF generates random GUIDs as the initial admin values, this gives
an out-of-the-box experience.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/main.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 35f00ae..7a55fee 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -66,9 +66,9 @@ MODULE_DESCRIPTION("Mellanox ConnectX HCA InfiniBand driver");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(DRV_VERSION);
 
-int mlx4_ib_sm_guid_assign = 1;
+int mlx4_ib_sm_guid_assign = 0;
 module_param_named(sm_guid_assign, mlx4_ib_sm_guid_assign, int, 0444);
-MODULE_PARM_DESC(sm_guid_assign, "Enable SM alias_GUID assignment if sm_guid_assign > 0 (Default: 1)");
+MODULE_PARM_DESC(sm_guid_assign, "Enable SM alias_GUID assignment if sm_guid_assign > 0 (Default: 0)");
 
 static const char mlx4_ib_version[] =
DRV_NAME ": Mellanox ConnectX InfiniBand driver v"
-- 
1.7.1



[PATCH for-next 1/9] IB/mlx4: Alias GUID adding persistency support

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

If the SM rejects an alias GUID request, the PF driver keeps trying to acquire
the specified GUID indefinitely, utilizing an exponential backoff scheme.

Retrying is managed per GUID entry. Each entry that wasn't applied holds its
next retry information. Retry requests to the SM consist of records of 8
consecutive GUIDs. Each record that contains GUIDs requiring retries holds its
next time-to-run based on the retry information of all its GUID entries. The
record having the lowest retry time will run first when that retry time
arrives.
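
A simplified sketch of per-entry backoff with a per-record time-to-run is
shown below. It is not the driver code: the struct and function names are
hypothetical, the 60-second cap is an assumption made for the example, and
taking the smallest pending delay as the record's delay is one plausible
reading of "based on the retry information of all its GUID entries".

```c
#include <assert.h>
#include <stdint.h>

#define GUIDS_PER_RECORD 8
#define MAX_RETRY_SEC    60	/* assumed cap, not taken from the patch text */

struct guid_record {
	unsigned int retry_sec[GUIDS_PER_RECORD]; /* current delay, 0 = never retried */
	uint8_t pending;	/* bit i set: entry i still needs a retry */
};

/* Double the delay after each rejection: 1, 2, 4, ... up to the cap. */
static unsigned int next_retry_delay(unsigned int cur)
{
	unsigned int next = cur ? cur * 2 : 1;

	return next > MAX_RETRY_SEC ? MAX_RETRY_SEC : next;
}

/* The SM declined entry i: back off and keep the entry pending. */
static void entry_rejected(struct guid_record *rec, int i)
{
	assert(i >= 0 && i < GUIDS_PER_RECORD);
	rec->retry_sec[i] = next_retry_delay(rec->retry_sec[i]);
	rec->pending |= (uint8_t)(1u << i);
}

/* The SM approved entry i: clear its retry state. */
static void entry_approved(struct guid_record *rec, int i)
{
	assert(i >= 0 && i < GUIDS_PER_RECORD);
	rec->retry_sec[i] = 0;
	rec->pending &= (uint8_t)~(1u << i);
}

/* Smallest delay among pending entries, or -1 if nothing is pending. */
static int record_delay(const struct guid_record *rec)
{
	int best = -1;

	for (int i = 0; i < GUIDS_PER_RECORD; i++) {
		if (!(rec->pending & (1u << i)))
			continue;
		if (best < 0 || (int)rec->retry_sec[i] < best)
			best = (int)rec->retry_sec[i];
	}
	return best;
}
```

A scheduler would then pick, among all records, the one whose delay expires
first.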

Since the method (SET or DELETE) as sent to the SM applies to all the GUIDs in
the record, we must handle SET requests and DELETE requests in separate SM
messages (one for SETs and the other for DELETEs).
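
One way to picture the SET/DELETE separation is to split the pending entries
of a record into two index masks, one per method, and send one message for
each non-empty mask. The sketch below does that in userspace C. The name
split_by_method() is hypothetical, and GUID_DELETE is an assumed stand-in for
the driver's delete sentinel.

```c
#include <assert.h>
#include <stdint.h>

#define GUIDS_PER_RECORD 8
#define GUID_DELETE UINT64_MAX	/* assumed sentinel for "delete this entry" */

/*
 * Split the pending entries of one record into two index masks, one per
 * SM method. Bit i of pending marks entry i as needing a request.
 */
static void split_by_method(const uint64_t guids[GUIDS_PER_RECORD],
			    uint8_t pending,
			    uint8_t *set_mask, uint8_t *delete_mask)
{
	*set_mask = 0;
	*delete_mask = 0;
	for (int i = 0; i < GUIDS_PER_RECORD; i++) {
		if (!(pending & (1u << i)))
			continue;	/* entry i has no outstanding request */
		if (guids[i] == GUID_DELETE)
			*delete_mask |= (uint8_t)(1u << i);
		else
			*set_mask |= (uint8_t)(1u << i);
	}
}
```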

To avoid race conditions where a GUID entry request (set or delete) was
modified after the SM request was sent, we save the method and the requested
indexes as part of the callback's context. Thus, only the requested indexes
are evaluated when the response is received.

When a GUID entry is approved, we turn off its retry-required bit. This
prevents redundant SM retries from occurring on that record.

The port down event should be sent only when the port was previously up.
Likewise, the port up event should be sent only if the port was previously
down.

Synchronization was added around the flows that change entries and record state
to prevent race conditions.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/alias_GUID.c |  325 +++
 drivers/infiniband/hw/mlx4/mlx4_ib.h|4 +-
 drivers/infiniband/hw/mlx4/sysfs.c  |   11 +-
 3 files changed, 253 insertions(+), 87 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index a31e031..2ca8984 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -58,14 +58,19 @@ struct mlx4_alias_guid_work_context {
int query_id;
struct list_head list;
int block_num;
+   ib_sa_comp_mask guid_indexes;
+   u8  method;
 };
 
 struct mlx4_next_alias_guid_work {
u8 port;
u8 block_num;
+   u8 method;
struct mlx4_sriov_alias_guid_info_rec_det rec_det;
 };
 
+static int get_low_record_time_index(struct mlx4_ib_dev *dev, u8 port,
+int *resched_delay_sec);
 
 void mlx4_ib_update_cache_on_guid_change(struct mlx4_ib_dev *dev, int block_num,
 u8 port_num, u8 *p_data)
@@ -138,10 +143,15 @@ void mlx4_ib_notify_slaves_on_guid_change(struct mlx4_ib_dev *dev,
enum slave_port_state prev_state;
__be64 tmp_cur_ag, form_cache_ag;
enum slave_port_gen_event gen_event;
+   struct mlx4_sriov_alias_guid_info_rec_det *rec;
+   unsigned long flags;
+   __be64 required_value;
 
if (!mlx4_is_master(dev->dev))
return;
 
+   rec = &dev->sriov.alias_guid.ports_guid[port_num - 1].
+   all_rec_per_port[block_num];
guid_indexes = be64_to_cpu((__force __be64) dev->sriov.alias_guid.
   ports_guid[port_num - 1].
   all_rec_per_port[block_num].guid_indexes);
@@ -166,8 +176,27 @@ void mlx4_ib_notify_slaves_on_guid_change(struct mlx4_ib_dev *dev,
 */
if (tmp_cur_ag != form_cache_ag)
continue;
-   mlx4_gen_guid_change_eqe(dev->dev, slave_id, port_num);
 
+   spin_lock_irqsave(&dev->sriov.alias_guid.ag_work_lock, flags);
+   required_value = *(__be64 *)&rec->all_recs[i * GUID_REC_SIZE];
+
+   if (required_value == cpu_to_be64(MLX4_GUID_FOR_DELETE_VAL))
+   required_value = 0;
+
+   if (tmp_cur_ag == required_value) {
+   rec->guid_indexes = rec->guid_indexes &
+  ~mlx4_ib_get_aguid_comp_mask_from_ix(i);
+   } else {
+   /* may notify port down if value is 0 */
+   if (tmp_cur_ag != MLX4_NOT_SET_GUID) {
+   spin_unlock_irqrestore(&dev->sriov.
+   alias_guid.ag_work_lock, flags);
+   continue;
+   }
+   }
+   spin_unlock_irqrestore(&dev->sriov.alias_guid.ag_work_lock,
+  flags);
+   mlx4_gen_guid_change_eqe(dev->dev, slave_id, port_num);
/*2 cases: Valid GUID, and Invalid Guid*/
 
if (tmp_cur_ag != MLX4_NOT_SET_GUID) { /*valid GUID*/
@@ -185,13 +214,22 @@ void mlx4_ib_notify_slaves_on_guid_change(struct mlx4_ib_dev *dev,
  

[PATCH for-next 2/9] net/mlx4_core: Manage alias GUID per VF

2015-03-29 Thread Or Gerlitz
From: Yishai Hadas 

Manage alias GUIDs per VF per port in the core layer.

This is a preparatory step for managing alias GUIDs in a mode in which the
admin GUID is returned via ib_query_gid() regardless of whether the SM
has approved it or not.

Signed-off-by: Yishai Hadas 
Signed-off-by: Jack Morgenstein 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/main.c |   16 
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |1 +
 include/linux/mlx4/device.h   |3 +++
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 43aa767..6d1f10e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2229,6 +2229,22 @@ void mlx4_counter_free(struct mlx4_dev *dev, u32 idx)
 }
 EXPORT_SYMBOL_GPL(mlx4_counter_free);
 
+void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry, int port)
+{
+   struct mlx4_priv *priv = mlx4_priv(dev);
+
+   priv->mfunc.master.vf_admin[entry].vport[port].guid = guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_admin_guid);
+
+__be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+   struct mlx4_priv *priv = mlx4_priv(dev);
+
+   return priv->mfunc.master.vf_admin[entry].vport[port].guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_admin_guid);
+
 static int mlx4_setup_hca(struct mlx4_dev *dev)
 {
struct mlx4_priv *priv = mlx4_priv(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 0b16db0..20165fb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -512,6 +512,7 @@ struct mlx4_vport_state {
u32 tx_rate;
bool spoofchk;
u32 link_state;
+   __be64 guid;
 };
 
 struct mlx4_vf_admin_state {
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 4550c67..5c67bf0 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1337,6 +1337,9 @@ int mlx4_wol_write(struct mlx4_dev *dev, u64 config, int port);
 int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx);
 void mlx4_counter_free(struct mlx4_dev *dev, u32 idx);
 
+void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry,
+int port);
+__be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port);
 int mlx4_flow_attach(struct mlx4_dev *dev,
 struct mlx4_net_trans_rule *rule, u64 *reg_id);
 int mlx4_flow_detach(struct mlx4_dev *dev, u64 reg_id);
-- 
1.7.1



[PATCH 4/4] cxgb4: Drop unneeded cast on netdev_priv

2015-03-29 Thread Julia Lawall
From: Julia Lawall 

The result of netdev_priv is already implicitly cast to the type of the
left side of the assignment.
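
A small standalone example of the language rule involved: in C, a void *
converts implicitly to any object pointer type on assignment, so the cast
adds nothing. The function get_priv() below is a hypothetical stand-in for
netdev_priv(), and struct port_info here is a one-field placeholder rather
than the driver's real structure.

```c
#include <assert.h>

struct port_info {
	int port_id;	/* placeholder field for the example */
};

/* Hypothetical stand-in for netdev_priv(): hands back private data as void *. */
static void *get_priv(void *base)
{
	return base;
}

static int read_port_id(void *dev)
{
	/* No cast needed: void * converts implicitly to struct port_info *. */
	struct port_info *pi = get_priv(dev);

	return pi->port_id;
}
```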

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// 
@@
type T;
T *x;
@@

x = 
- (T *)
  netdev_priv(...)
// 

Signed-off-by: Julia Lawall 

---
 drivers/infiniband/hw/cxgb4/cm.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -u -p a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -3677,14 +3677,14 @@ static int rx_pkt(struct c4iw_dev *dev,
pdev = ip_dev_find(&init_net, iph->daddr);
e = cxgb4_l2t_get(dev->rdev.lldi.l2t, neigh,
pdev, 0);
-   pi = (struct port_info *)netdev_priv(pdev);
+   pi = netdev_priv(pdev);
tx_chan = cxgb4_port_chan(pdev);
dev_put(pdev);
} else {
pdev = get_real_dev(neigh->dev);
e = cxgb4_l2t_get(dev->rdev.lldi.l2t, neigh,
pdev, 0);
-   pi = (struct port_info *)netdev_priv(pdev);
+   pi = netdev_priv(pdev);
tx_chan = cxgb4_port_chan(pdev);
}
neigh_release(neigh);



[PATCH 10/18] iser-target: Remove redundant assignment to local variable

2015-03-29 Thread Sagi Grimberg
No need to keep a local ib_dev as a device pointer.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index db5460a..0bab74b 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -346,12 +346,11 @@ out_cq:
 static int
 isert_create_device_ib_res(struct isert_device *device)
 {
-   struct ib_device *ib_dev = device->ib_device;
struct ib_device_attr *dev_attr;
-   int ret = 0;
+   int ret;
 
dev_attr = &device->dev_attr;
-   ret = isert_query_device(ib_dev, dev_attr);
+   ret = isert_query_device(device->ib_device, dev_attr);
if (ret)
return ret;
 
-- 
1.7.1



[PATCH 11/18] iser-target: Rename rend/recv completion routines

2015-03-29 Thread Sagi Grimberg
Make receive/send completion handling routines symmetrical.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 0bab74b..f015023 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1543,8 +1543,9 @@ isert_rx_do_work(struct iser_rx_desc *rx_desc, struct isert_conn *isert_conn)
 }
 
 static void
-isert_rx_completion(struct iser_rx_desc *desc, struct isert_conn *isert_conn,
-   u32 xfer_len)
+isert_rcv_completion(struct iser_rx_desc *desc,
+struct isert_conn *isert_conn,
+u32 xfer_len)
 {
struct ib_device *ib_dev = isert_conn->conn_cm_id->device;
struct iscsi_hdr *hdr;
@@ -1969,7 +1970,7 @@ isert_response_completion(struct iser_tx_desc *tx_desc,
 }
 
 static void
-isert_send_completion(struct iser_tx_desc *tx_desc,
+isert_snd_completion(struct iser_tx_desc *tx_desc,
  struct isert_conn *isert_conn)
 {
struct ib_device *ib_dev = isert_conn->conn_cm_id->device;
@@ -2061,10 +2062,10 @@ isert_handle_wc(struct ib_wc *wc)
if (likely(wc->status == IB_WC_SUCCESS)) {
if (wc->opcode == IB_WC_RECV) {
rx_desc = (struct iser_rx_desc *)(uintptr_t)wc->wr_id;
-   isert_rx_completion(rx_desc, isert_conn, wc->byte_len);
+   isert_rcv_completion(rx_desc, isert_conn, wc->byte_len);
} else {
tx_desc = (struct iser_tx_desc *)(uintptr_t)wc->wr_id;
-   isert_send_completion(tx_desc, isert_conn);
+   isert_snd_completion(tx_desc, isert_conn);
}
} else {
if (wc->status != IB_WC_WR_FLUSH_ERR)
-- 
1.7.1



[PATCH 12/18] iser-target: Rename device find/release routines

2015-03-29 Thread Sagi Grimberg
isert_device_find_by_ib_dev and isert_device_try_release
can have a better, more common name like isert_device_[get|put].
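
For readers less familiar with the get/put naming convention, here is a
minimal userspace sketch of the idea. It assumes that "get" looks up an
existing entry and takes a reference, creating the entry on a miss, and
that "put" drops a reference and frees on the last one. All names
(demo_device, demo_device_get, demo_device_put) are hypothetical, and the
locking that the driver does around the list is omitted for brevity.

```c
#include <stdlib.h>

/* Hypothetical refcounted object kept on a singly linked list. */
struct demo_device {
	int id;
	int refcount;
	struct demo_device *next;
};

static struct demo_device *demo_list;

/* Find a device by id and take a reference; create it on a miss.
 * Returns NULL only if allocation fails. */
static struct demo_device *demo_device_get(int id)
{
	struct demo_device *dev;

	for (dev = demo_list; dev; dev = dev->next) {
		if (dev->id == id) {
			dev->refcount++;
			return dev;
		}
	}

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;
	dev->id = id;
	dev->refcount = 1;
	dev->next = demo_list;
	demo_list = dev;
	return dev;
}

/* Drop a reference; unlink and free on the last one.
 * Returns 1 if the device was freed, 0 otherwise. */
static int demo_device_put(struct demo_device *dev)
{
	struct demo_device **pp;

	if (--dev->refcount > 0)
		return 0;

	for (pp = &demo_list; *pp; pp = &(*pp)->next) {
		if (*pp == dev) {
			*pp = dev->next;
			break;
		}
	}
	free(dev);
	return 1;
}
```

Every successful get is expected to be paired with exactly one put, which
is what makes the short symmetric names easy to audit.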

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |   13 ++---
 1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index f015023..d19271b 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -410,7 +410,7 @@ isert_free_device_ib_res(struct isert_device *device)
 }
 
 static void
-isert_device_try_release(struct isert_device *device)
+isert_device_put(struct isert_device *device)
 {
mutex_lock(&device_list_mutex);
device->refcount--;
@@ -424,7 +424,7 @@ isert_device_try_release(struct isert_device *device)
 }
 
 static struct isert_device *
-isert_device_find_by_ib_dev(struct rdma_cm_id *cma_id)
+isert_device_get(struct rdma_cm_id *cma_id)
 {
struct isert_device *device;
int ret;
@@ -713,11 +713,12 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
goto out_req_dma_map;
}
 
-   device = isert_device_find_by_ib_dev(cma_id);
+   device = isert_device_get(cma_id);
if (IS_ERR(device)) {
ret = PTR_ERR(device);
goto out_rsp_dma_map;
}
+   isert_conn->conn_device = device;
 
/* Set max inflight RDMA READ requests */
isert_conn->initiator_depth = min_t(u8,
@@ -725,8 +726,6 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
device->dev_attr.max_qp_init_rd_atom);
isert_dbg("Using initiator_depth: %u\n", isert_conn->initiator_depth);
 
-   isert_conn->conn_device = device;
-
ret = isert_conn_setup_qp(isert_conn, cma_id);
if (ret)
goto out_conn_dev;
@@ -748,7 +747,7 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
return 0;
 
 out_conn_dev:
-   isert_device_try_release(device);
+   isert_device_put(device);
 out_rsp_dma_map:
ib_dma_unmap_single(ib_dev, isert_conn->login_rsp_dma,
ISER_RX_LOGIN_SIZE, DMA_TO_DEVICE);
@@ -796,7 +795,7 @@ isert_connect_release(struct isert_conn *isert_conn)
kfree(isert_conn);
 
if (device)
-   isert_device_try_release(device);
+   isert_device_put(device);
 }
 
 static void
-- 
1.7.1



[PATCH 16/18] iser-target: Remove un-needed rdma_listen backlog

2015-03-29 Thread Sagi Grimberg
iser target can handle as many connect requests as
the fabric sends to it. This backlog should not be set as
a back-pressure mechanism (which is not very useful).

isert does need a back-pressure mechanism, but it should
be added in isert by monitoring the number of pending
established connections (will be added in a later stage).

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |2 +-
 drivers/infiniband/ulp/isert/ib_isert.h |1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 357d481..4a2800a 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -3058,7 +3058,7 @@ isert_setup_id(struct isert_np *isert_np)
goto out_id;
}
 
-   ret = rdma_listen(id, ISERT_RDMA_LISTEN_BACKLOG);
+   ret = rdma_listen(id, 0);
if (ret) {
isert_err("rdma_listen() failed: %d\n", ret);
goto out_id;
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index e386092..651c58e 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -31,7 +31,6 @@
 #define isert_err(fmt, arg...) \
pr_err(PFX "%s: " fmt, __func__ , ## arg)
 
-#define ISERT_RDMA_LISTEN_BACKLOG  10
 #define ISCSI_ISER_SG_TABLESIZE    256
 #define ISER_FASTREG_LI_WRID   0xffffffffffffffffULL
 #define ISER_BEACON_WRID   0xfffffffffffffffeULL
-- 
1.7.1



[PATCH 14/18] iser-target: Get rid of redundant max_accept

2015-03-29 Thread Sagi Grimberg
Not sure what it was used for, but there is
no real need for it now as I see it. Go ahead
and get rid of it.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 4fddc08..97cee96 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -3209,11 +3209,11 @@ isert_accept_np(struct iscsi_np *np, struct iscsi_conn *conn)
 {
struct isert_np *isert_np = np->np_context;
struct isert_conn *isert_conn;
-   int max_accept = 0, ret;
+   int ret;
 
 accept_wait:
ret = down_interruptible(&isert_np->np_sem);
-   if (ret || max_accept > 5)
+   if (ret)
return -ENODEV;
 
spin_lock_bh(&np->np_thread_lock);
@@ -3232,7 +3232,6 @@ accept_wait:
mutex_lock(&isert_np->np_accept_mutex);
if (list_empty(&isert_np->np_accept_list)) {
mutex_unlock(&isert_np->np_accept_mutex);
-   max_accept++;
goto accept_wait;
}
isert_conn = list_first_entry(&isert_np->np_accept_list,
@@ -3242,7 +3241,6 @@ accept_wait:
 
conn->context = isert_conn;
isert_conn->conn = conn;
-   max_accept = 0;
 
isert_set_conn_info(np, conn, isert_conn);
 
-- 
1.7.1



[PATCH 07/18] iser-target: Remove redundant casting on void pointers

2015-03-29 Thread Sagi Grimberg
No need to cast void pointers.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |   32 +++---
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index ae09561..1b13337 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -76,7 +76,7 @@ isert_prot_cmd(struct isert_conn *conn, struct se_cmd *cmd)
 static void
 isert_qp_event_callback(struct ib_event *e, void *context)
 {
-   struct isert_conn *isert_conn = (struct isert_conn *)context;
+   struct isert_conn *isert_conn = context;
 
isert_err("conn %p event: %d\n", isert_conn, e->event);
switch (e->event) {
@@ -1200,7 +1200,7 @@ isert_rx_login_req(struct isert_conn *isert_conn)
 static struct iscsi_cmd
 *isert_allocate_cmd(struct iscsi_conn *conn)
 {
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct isert_cmd *isert_cmd;
struct iscsi_cmd *cmd;
 
@@ -2085,7 +2085,7 @@ static int
 isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *)
&isert_cmd->tx_desc.iscsi_header;
@@ -2134,7 +2134,7 @@ static void
 isert_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct isert_device *device = isert_conn->conn_device;
 
spin_lock_bh(&conn->cmd_lock);
@@ -2151,7 +2151,7 @@ isert_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 static enum target_prot_op
 isert_get_sup_prot_ops(struct iscsi_conn *conn)
 {
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct isert_device *device = isert_conn->conn_device;
 
if (conn->tpg->tpg_attrib.t10_pi) {
@@ -2173,7 +2173,7 @@ isert_put_nopin(struct iscsi_cmd *cmd, struct iscsi_conn *conn,
bool nopout_response)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
 
isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
@@ -2192,7 +2192,7 @@ static int
 isert_put_logout_rsp(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
 
isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
@@ -2210,7 +2210,7 @@ static int
 isert_put_tm_rsp(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
 
isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
@@ -2228,7 +2228,7 @@ static int
 isert_put_reject(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
struct isert_device *device = isert_conn->conn_device;
struct ib_device *ib_dev = device->ib_device;
@@ -2261,7 +2261,7 @@ static int
 isert_put_text_rsp(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
 {
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   struct isert_conn *isert_conn = conn->context;
struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
struct iscsi_text_rsp *hdr =
(struct iscsi_text_rsp *)&isert_cmd->tx_desc.iscsi_header;
@@ -2352,7 +2352,7 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
 {
struct se_cmd *se_cmd = &cmd->se_cmd;
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
-   struct isert_conn *isert_conn = (struct isert_conn *)conn->context;
+   

[PATCH 17/18] iser-target: Remove conn_ prefix from struct isert_conn members

2015-03-29 Thread Sagi Grimberg
These variables are always accessed via struct isert_conn so
no need to have a "conn_" prefix for them.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |  264 +++---
 drivers/infiniband/ulp/isert/ib_isert.h |   32 ++--
 2 files changed, 148 insertions(+), 148 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 4a2800a..8f452f6 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -81,7 +81,7 @@ isert_qp_event_callback(struct ib_event *e, void *context)
isert_err("conn %p event: %d\n", isert_conn, e->event);
switch (e->event) {
case IB_EVENT_COMM_EST:
-   rdma_notify(isert_conn->conn_cm_id, IB_EVENT_COMM_EST);
+   rdma_notify(isert_conn->cm_id, IB_EVENT_COMM_EST);
break;
case IB_EVENT_QP_LAST_WQE_REACHED:
isert_warn("Reached TX IB_EVENT_QP_LAST_WQE_REACHED\n");
@@ -110,7 +110,7 @@ isert_query_device(struct ib_device *ib_dev, struct ib_device_attr *devattr)
 static struct isert_comp *
 isert_comp_get(struct isert_conn *isert_conn)
 {
-   struct isert_device *device = isert_conn->conn_device;
+   struct isert_device *device = isert_conn->device;
struct isert_comp *comp;
int i, min = 0;
 
@@ -142,7 +142,7 @@ isert_create_qp(struct isert_conn *isert_conn,
struct isert_comp *comp,
struct rdma_cm_id *cma_id)
 {
-   struct isert_device *device = isert_conn->conn_device;
+   struct isert_device *device = isert_conn->device;
struct ib_qp_init_attr attr;
int ret;
 
@@ -185,9 +185,9 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
int ret;
 
comp = isert_comp_get(isert_conn);
-   isert_conn->conn_qp = isert_create_qp(isert_conn, comp, cma_id);
-   if (IS_ERR(isert_conn->conn_qp)) {
-   ret = PTR_ERR(isert_conn->conn_qp);
+   isert_conn->qp = isert_create_qp(isert_conn, comp, cma_id);
+   if (IS_ERR(isert_conn->qp)) {
+   ret = PTR_ERR(isert_conn->qp);
goto err;
}
 
@@ -206,19 +206,19 @@ isert_cq_event_callback(struct ib_event *e, void *context)
 static int
 isert_alloc_rx_descriptors(struct isert_conn *isert_conn)
 {
-   struct isert_device *device = isert_conn->conn_device;
+   struct isert_device *device = isert_conn->device;
struct ib_device *ib_dev = device->ib_device;
struct iser_rx_desc *rx_desc;
struct ib_sge *rx_sg;
u64 dma_addr;
int i, j;
 
-   isert_conn->conn_rx_descs = kzalloc(ISERT_QP_MAX_RECV_DTOS *
+   isert_conn->rx_descs = kzalloc(ISERT_QP_MAX_RECV_DTOS *
sizeof(struct iser_rx_desc), GFP_KERNEL);
-   if (!isert_conn->conn_rx_descs)
+   if (!isert_conn->rx_descs)
goto fail;
 
-   rx_desc = isert_conn->conn_rx_descs;
+   rx_desc = isert_conn->rx_descs;
 
for (i = 0; i < ISERT_QP_MAX_RECV_DTOS; i++, rx_desc++)  {
dma_addr = ib_dma_map_single(ib_dev, (void *)rx_desc,
@@ -234,18 +234,18 @@ isert_alloc_rx_descriptors(struct isert_conn *isert_conn)
rx_sg->lkey = device->mr->lkey;
}
 
-   isert_conn->conn_rx_desc_head = 0;
+   isert_conn->rx_desc_head = 0;
 
return 0;
 
 dma_map_fail:
-   rx_desc = isert_conn->conn_rx_descs;
+   rx_desc = isert_conn->rx_descs;
for (j = 0; j < i; j++, rx_desc++) {
ib_dma_unmap_single(ib_dev, rx_desc->dma_addr,
ISER_RX_PAYLOAD_SIZE, DMA_FROM_DEVICE);
}
-   kfree(isert_conn->conn_rx_descs);
-   isert_conn->conn_rx_descs = NULL;
+   kfree(isert_conn->rx_descs);
+   isert_conn->rx_descs = NULL;
 fail:
isert_err("conn %p failed to allocate rx descriptors\n", isert_conn);
 
@@ -255,21 +255,21 @@ fail:
 static void
 isert_free_rx_descriptors(struct isert_conn *isert_conn)
 {
-   struct ib_device *ib_dev = isert_conn->conn_device->ib_device;
+   struct ib_device *ib_dev = isert_conn->device->ib_device;
struct iser_rx_desc *rx_desc;
int i;
 
-   if (!isert_conn->conn_rx_descs)
+   if (!isert_conn->rx_descs)
return;
 
-   rx_desc = isert_conn->conn_rx_descs;
+   rx_desc = isert_conn->rx_descs;
for (i = 0; i < ISERT_QP_MAX_RECV_DTOS; i++, rx_desc++)  {
ib_dma_unmap_single(ib_dev, rx_desc->dma_addr,
ISER_RX_PAYLOAD_SIZE, DMA_FROM_DEVICE);
}
 
-   kfree(isert_conn->conn_rx_descs);
-   isert_conn->conn_rx_descs = NULL;
+   kfree(isert_conn->rx_descs);
+   isert_conn->rx_descs = NULL;
 }
 
 static void isert_cq_work(struct work_struct *);
@@ -471,13 +471,13 @@ isert_conn_free_fastreg_pool(struct isert_conn *isert_conn)
struct fas

[PATCH 08/18] iser-target: Split isert_setup_qp

2015-03-29 Thread Sagi Grimberg
Simplify iser QP creation by splitting some unrelated
logic into separate routines.
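
To illustrate the shape of the split, here is a small userspace sketch.
It assumes the completion-context selection picks the entry with the
fewest active QPs (the comparison itself is not visible in the hunk
context below, so treat that as an assumption), and it drops the
mutex the driver holds around the counters. The names demo_dev,
demo_comp_get, demo_comp_put and demo_setup_qp are hypothetical, and the
'fail' flag stands in for a QP creation failure.

```c
#include <stddef.h>

#define DEMO_MAX_COMPS 4

/* Hypothetical completion context with a load counter. */
struct demo_comp {
	int active_qps;
};

struct demo_dev {
	struct demo_comp comps[DEMO_MAX_COMPS];
	int comps_used;
};

/* Pick the least-loaded completion context and account for one more QP. */
static struct demo_comp *demo_comp_get(struct demo_dev *dev)
{
	int i, min = 0;

	for (i = 0; i < dev->comps_used; i++)
		if (dev->comps[i].active_qps < dev->comps[min].active_qps)
			min = i;

	dev->comps[min].active_qps++;
	return &dev->comps[min];
}

/* Undo the accounting done by demo_comp_get(). */
static void demo_comp_put(struct demo_comp *comp)
{
	comp->active_qps--;
}

/* Setup routine: get a context, attempt creation, put it back on failure. */
static int demo_setup_qp(struct demo_dev *dev, int fail, struct demo_comp **out)
{
	struct demo_comp *comp = demo_comp_get(dev);

	if (fail) {
		demo_comp_put(comp);
		return -1;
	}

	*out = comp;
	return 0;
}
```

With the accounting isolated in the get/put pair, both the setup error
path and the release path can share one decrement routine.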

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |   56 +++
 1 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 1b13337..2a4a435 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -107,13 +107,12 @@ isert_query_device(struct ib_device *ib_dev, struct ib_device_attr *devattr)
return 0;
 }
 
-static int
-isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
+static struct isert_comp *
+isert_comp_get(struct isert_conn *isert_conn)
 {
struct isert_device *device = isert_conn->conn_device;
-   struct ib_qp_init_attr attr;
struct isert_comp *comp;
-   int ret, i, min = 0;
+   int i, min = 0;
 
mutex_lock(&device_list_mutex);
for (i = 0; i < device->comps_used; i++)
@@ -122,9 +121,30 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
min = i;
comp = &device->comps[min];
comp->active_qps++;
+   mutex_unlock(&device_list_mutex);
+
isert_info("conn %p, using comp %p min_index: %d\n",
   isert_conn, comp, min);
+
+   return comp;
+}
+
+static void
+isert_comp_put(struct isert_comp *comp)
+{
+   mutex_lock(&device_list_mutex);
+   comp->active_qps--;
mutex_unlock(&device_list_mutex);
+}
+
+static struct ib_qp *
+isert_create_qp(struct isert_conn *isert_conn,
+   struct isert_comp *comp,
+   struct rdma_cm_id *cma_id)
+{
+   struct isert_device *device = isert_conn->conn_device;
+   struct ib_qp_init_attr attr;
+   int ret;
 
memset(&attr, 0, sizeof(struct ib_qp_init_attr));
attr.event_handler = isert_qp_event_callback;
@@ -152,16 +172,28 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
ret = rdma_create_qp(cma_id, device->pd, &attr);
if (ret) {
isert_err("rdma_create_qp failed for cma_id %d\n", ret);
+   return ERR_PTR(ret);
+   }
+
+   return cma_id->qp;
+}
+
+static int
+isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
+{
+   struct isert_comp *comp;
+   int ret;
+
+   comp = isert_comp_get(isert_conn);
+   isert_conn->conn_qp = isert_create_qp(isert_conn, comp, cma_id);
+   if (IS_ERR(isert_conn->conn_qp)) {
+   ret = PTR_ERR(isert_conn->conn_qp);
goto err;
}
-   isert_conn->conn_qp = cma_id->qp;
 
return 0;
 err:
-   mutex_lock(&device_list_mutex);
-   comp->active_qps--;
-   mutex_unlock(&device_list_mutex);
-
+   isert_comp_put(comp);
return ret;
 }
 
@@ -736,11 +768,7 @@ isert_connect_release(struct isert_conn *isert_conn)
if (isert_conn->conn_qp) {
struct isert_comp *comp = isert_conn->conn_qp->recv_cq->cq_context;
 
-   isert_dbg("dec completion context %p active_qps\n", comp);
-   mutex_lock(&device_list_mutex);
-   comp->active_qps--;
-   mutex_unlock(&device_list_mutex);
-
+   isert_comp_put(comp);
ib_destroy_qp(isert_conn->conn_qp);
}
 
-- 
1.7.1



[PATCH 13/18] iser-target: Split some logic in isert_connect_request to routines

2015-03-29 Thread Sagi Grimberg
Move login buffer alloc/free code to dedicated
routines and introduce isert_conn_init which
initializes the connection lists and locks.

Simplifies and cleans up the code a little bit.
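
The alloc routine relies on goto-based unwinding, which can be sketched
in userspace roughly as follows. The mapping helpers (demo_map,
demo_unmap), the sizes and the fail_req/fail_rsp switches are
hypothetical stand-ins for the DMA mapping calls, and unlike the patch
this sketch also clears login_buf after freeing it.

```c
#include <stdlib.h>

#define DEMO_REQ_LEN 8192
#define DEMO_RSP_LEN 8192

/* Hypothetical connection holding one buffer split into req and rsp halves. */
struct demo_conn {
	char *login_buf;
	char *login_req_buf;
	char *login_rsp_buf;
	int req_mapped;
	int rsp_mapped;
};

/* Stand-in for a DMA mapping call; fails when 'fail' is non-zero. */
static int demo_map(int *mapped, int fail)
{
	if (fail)
		return -1;
	*mapped = 1;
	return 0;
}

static void demo_unmap(int *mapped)
{
	*mapped = 0;
}

/* Allocate and map; on failure undo only the steps that already succeeded. */
static int demo_alloc_login_buf(struct demo_conn *conn, int fail_req, int fail_rsp)
{
	int ret;

	conn->login_buf = calloc(1, DEMO_REQ_LEN + DEMO_RSP_LEN);
	if (!conn->login_buf)
		return -1;

	conn->login_req_buf = conn->login_buf;
	conn->login_rsp_buf = conn->login_buf + DEMO_REQ_LEN;

	ret = demo_map(&conn->req_mapped, fail_req);
	if (ret)
		goto out_login_buf;

	ret = demo_map(&conn->rsp_mapped, fail_rsp);
	if (ret)
		goto out_req_map;

	return 0;

out_req_map:
	demo_unmap(&conn->req_mapped);
out_login_buf:
	free(conn->login_buf);
	conn->login_buf = NULL;
	return ret;
}

/* Full teardown, mirroring the successful path in reverse order. */
static void demo_free_login_buf(struct demo_conn *conn)
{
	demo_unmap(&conn->rsp_mapped);
	demo_unmap(&conn->req_mapped);
	free(conn->login_buf);
	conn->login_buf = NULL;
}
```

Keeping the labels in reverse order of acquisition means each failure
point jumps to exactly the cleanup it needs and falls through the rest.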

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |  118 ++-
 1 files changed, 70 insertions(+), 48 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index d19271b..4fddc08 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -636,32 +636,9 @@ err:
return ret;
 }
 
-static int
-isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
+static void
+isert_init_conn(struct isert_conn *isert_conn)
 {
-   struct isert_np *isert_np = cma_id->context;
-   struct iscsi_np *np = isert_np->np;
-   struct isert_conn *isert_conn;
-   struct isert_device *device;
-   struct ib_device *ib_dev = cma_id->device;
-   int ret = 0;
-
-   spin_lock_bh(&np->np_thread_lock);
-   if (!np->enabled) {
-   spin_unlock_bh(&np->np_thread_lock);
-   isert_dbg("iscsi_np is not enabled, reject connect request\n");
-   return rdma_reject(cma_id, NULL, 0);
-   }
-   spin_unlock_bh(&np->np_thread_lock);
-
-   isert_dbg("cma_id: %p, portal: %p\n",
-cma_id, cma_id->context);
-
-   isert_conn = kzalloc(sizeof(struct isert_conn), GFP_KERNEL);
-   if (!isert_conn) {
-   isert_err("Unable to allocate isert_conn\n");
-   return -ENOMEM;
-   }
isert_conn->state = ISER_CONN_INIT;
INIT_LIST_HEAD(&isert_conn->conn_accept_node);
init_completion(&isert_conn->conn_login_comp);
@@ -671,20 +648,38 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
mutex_init(&isert_conn->conn_mutex);
spin_lock_init(&isert_conn->conn_lock);
INIT_LIST_HEAD(&isert_conn->conn_fr_pool);
+}
 
-   isert_conn->conn_cm_id = cma_id;
+static void
+isert_free_login_buf(struct isert_conn *isert_conn)
+{
+   struct ib_device *ib_dev = isert_conn->conn_device->ib_device;
+
+   ib_dma_unmap_single(ib_dev, isert_conn->login_rsp_dma,
+   ISER_RX_LOGIN_SIZE, DMA_TO_DEVICE);
+   ib_dma_unmap_single(ib_dev, isert_conn->login_req_dma,
+   ISCSI_DEF_MAX_RECV_SEG_LEN,
+   DMA_FROM_DEVICE);
+   kfree(isert_conn->login_buf);
+}
+
+static int
+isert_alloc_login_buf(struct isert_conn *isert_conn,
+ struct ib_device *ib_dev)
+{
+   int ret;
 
isert_conn->login_buf = kzalloc(ISCSI_DEF_MAX_RECV_SEG_LEN +
ISER_RX_LOGIN_SIZE, GFP_KERNEL);
if (!isert_conn->login_buf) {
isert_err("Unable to allocate isert_conn->login_buf\n");
-   ret = -ENOMEM;
-   goto out;
+   return -ENOMEM;
}
 
isert_conn->login_req_buf = isert_conn->login_buf;
isert_conn->login_rsp_buf = isert_conn->login_buf +
ISCSI_DEF_MAX_RECV_SEG_LEN;
+
isert_dbg("Set login_buf: %p login_req_buf: %p login_rsp_buf: %p\n",
 isert_conn->login_buf, isert_conn->login_req_buf,
 isert_conn->login_rsp_buf);
@@ -695,8 +690,7 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
 
ret = ib_dma_mapping_error(ib_dev, isert_conn->login_req_dma);
if (ret) {
-   isert_err("ib_dma_mapping_error failed for login_req_dma: %d\n",
-  ret);
+   isert_err("login_req_dma mapping error: %d\n", ret);
isert_conn->login_req_dma = 0;
goto out_login_buf;
}
@@ -707,12 +701,52 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
 
ret = ib_dma_mapping_error(ib_dev, isert_conn->login_rsp_dma);
if (ret) {
-   isert_err("ib_dma_mapping_error failed for login_rsp_dma: %d\n",
-  ret);
+   isert_err("login_rsp_dma mapping error: %d\n", ret);
isert_conn->login_rsp_dma = 0;
goto out_req_dma_map;
}
 
+   return 0;
+
+out_req_dma_map:
+   ib_dma_unmap_single(ib_dev, isert_conn->login_req_dma,
+   ISCSI_DEF_MAX_RECV_SEG_LEN, DMA_FROM_DEVICE);
+out_login_buf:
+   kfree(isert_conn->login_buf);
+   return ret;
+}
+
+static int
+isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
+{
+   struct isert_np *isert_np = cma_id->context;
+   struct iscsi_np *np = isert_np->np;
+   struct isert_conn *isert_conn;
+   struct isert_device *device;
+   int ret = 0;
+
+   spin_lock_bh(&np->np_thread_lock);
+   if (!np->enabled) {
+   spin_unlock_bh(&np->n

[PATCH 06/18] iser-target: Remove redundant local variable

2015-03-29 Thread Sagi Grimberg
No need for this assignment.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 5b086b3..ae09561 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1385,13 +1385,12 @@ isert_rx_opcode(struct isert_conn *isert_conn, struct iser_rx_desc *rx_desc,
 {
struct iscsi_hdr *hdr = &rx_desc->iscsi_header;
struct iscsi_conn *conn = isert_conn->conn;
-   struct iscsi_session *sess = conn->sess;
struct iscsi_cmd *cmd;
struct isert_cmd *isert_cmd;
int ret = -EINVAL;
u8 opcode = (hdr->opcode & ISCSI_OPCODE_MASK);
 
-   if (sess->sess_ops->SessionType &&
+   if (conn->sess->sess_ops->SessionType &&
   (!(opcode & ISCSI_OP_TEXT) || !(opcode & ISCSI_OP_LOGOUT))) {
isert_err("Got illegal opcode: 0x%02x in SessionType=Discovery,"
  " ignoring\n", opcode);
-- 
1.7.1



[PATCH 09/18] iser-target: Introduce isert_[alloc|free]_comps

2015-03-29 Thread Sagi Grimberg
Move the code for completion context handling to dedicated
routines. This simplifies the code and removes code duplication.
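
One way the duplication goes away is that the free routine tolerates a
partially initialized array, so the alloc routine can call it from its
own error path. A userspace sketch of that idea, with hypothetical names
(demo_cq, demo_alloc_comps, demo_free_comps) and a 'fail_at' index that
simulates a CQ creation failure:

```c
#include <stdlib.h>

/* Hypothetical completion queue and its per-vector context. */
struct demo_cq {
	int vector;
};

struct demo_comp {
	struct demo_cq *cq;
};

struct demo_device {
	struct demo_comp *comps;
	int comps_used;
};

/* Safe on a partially initialized array: entries never reached are NULL. */
static void demo_free_comps(struct demo_device *dev)
{
	int i;

	for (i = 0; i < dev->comps_used; i++) {
		if (dev->comps[i].cq) {
			free(dev->comps[i].cq);
			dev->comps[i].cq = NULL;
		}
	}
	free(dev->comps);
	dev->comps = NULL;
}

/* fail_at: index at which creation is forced to fail, or -1 for none. */
static int demo_alloc_comps(struct demo_device *dev, int n, int fail_at)
{
	int i;

	dev->comps_used = n;
	dev->comps = calloc(n, sizeof(*dev->comps));
	if (!dev->comps)
		return -1;

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto out_cq;
		dev->comps[i].cq = malloc(sizeof(struct demo_cq));
		if (!dev->comps[i].cq)
			goto out_cq;
		dev->comps[i].cq->vector = i;
	}
	return 0;

out_cq:
	/* Reuse the normal teardown instead of a hand-rolled unwind loop. */
	demo_free_comps(dev);
	return -1;
}
```

The zero-filled allocation is what makes the per-entry NULL check in the
free routine a reliable "was this one created" test.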

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |  106 +-
 1 files changed, 60 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 2a4a435..db5460a 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -275,39 +275,31 @@ isert_free_rx_descriptors(struct isert_conn *isert_conn)
 static void isert_cq_work(struct work_struct *);
 static void isert_cq_callback(struct ib_cq *, void *);
 
-static int
-isert_create_device_ib_res(struct isert_device *device)
+static void
+isert_free_comps(struct isert_device *device)
 {
-   struct ib_device *ib_dev = device->ib_device;
-   struct ib_device_attr *dev_attr;
-   int ret = 0, i;
-   int max_cqe;
-
-   dev_attr = &device->dev_attr;
-   ret = isert_query_device(ib_dev, dev_attr);
-   if (ret)
-   return ret;
+   int i;
 
-   max_cqe = min(ISER_MAX_CQ_LEN, dev_attr->max_cqe);
+   for (i = 0; i < device->comps_used; i++) {
+   struct isert_comp *comp = &device->comps[i];
 
-   /* asign function handlers */
-   if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
-   dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
-   device->use_fastreg = 1;
-   device->reg_rdma_mem = isert_reg_rdma;
-   device->unreg_rdma_mem = isert_unreg_rdma;
-   } else {
-   device->use_fastreg = 0;
-   device->reg_rdma_mem = isert_map_rdma;
-   device->unreg_rdma_mem = isert_unmap_cmd;
+   if (comp->cq) {
+   cancel_work_sync(&comp->work);
+   ib_destroy_cq(comp->cq);
+   }
}
+   kfree(device->comps);
+}
 
-   /* Check signature cap */
-   device->pi_capable = dev_attr->device_cap_flags &
-IB_DEVICE_SIGNATURE_HANDOVER ? true : false;
+static int
+isert_alloc_comps(struct isert_device *device,
+ struct ib_device_attr *attr)
+{
+   int i, max_cqe, ret = 0;
 
device->comps_used = min(ISERT_MAX_CQ, min_t(int, num_online_cpus(),
-   device->ib_device->num_comp_vectors));
+device->ib_device->num_comp_vectors));
+
isert_info("Using %d CQs, %s supports %d vectors support "
   "Fast registration %d pi_capable %d\n",
   device->comps_used, device->ib_device->name,
@@ -321,6 +313,8 @@ isert_create_device_ib_res(struct isert_device *device)
return -ENOMEM;
}
 
+   max_cqe = min(ISER_MAX_CQ_LEN, attr->max_cqe);
+
for (i = 0; i < device->comps_used; i++) {
struct isert_comp *comp = &device->comps[i];
 
@@ -332,6 +326,7 @@ isert_create_device_ib_res(struct isert_device *device)
(void *)comp,
max_cqe, i);
if (IS_ERR(comp->cq)) {
+   isert_err("Unable to allocate cq\n");
ret = PTR_ERR(comp->cq);
comp->cq = NULL;
goto out_cq;
@@ -342,6 +337,40 @@ isert_create_device_ib_res(struct isert_device *device)
goto out_cq;
}
 
+   return 0;
+out_cq:
+   isert_free_comps(device);
+   return ret;
+}
+
+static int
+isert_create_device_ib_res(struct isert_device *device)
+{
+   struct ib_device *ib_dev = device->ib_device;
+   struct ib_device_attr *dev_attr;
+   int ret = 0;
+
+   dev_attr = &device->dev_attr;
+   ret = isert_query_device(ib_dev, dev_attr);
+   if (ret)
+   return ret;
+
+   /* asign function handlers */
+   if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
+   dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
+   device->use_fastreg = 1;
+   device->reg_rdma_mem = isert_reg_rdma;
+   device->unreg_rdma_mem = isert_unreg_rdma;
+   } else {
+   device->use_fastreg = 0;
+   device->reg_rdma_mem = isert_map_rdma;
+   device->unreg_rdma_mem = isert_unmap_cmd;
+   }
+
+   ret = isert_alloc_comps(device, dev_attr);
+   if (ret)
+   return ret;
+
device->pd = ib_alloc_pd(device->ib_device);
if (IS_ERR(device->pd)) {
ret = PTR_ERR(device->pd);
@@ -358,42 +387,27 @@ isert_create_device_ib_res(struct isert_device *device)
goto out_mr;
}
 
+   /* Check signature cap */
+   device->pi_capable = dev_attr->device_cap_flags &
+IB_DEVICE_SIGNATURE_HANDOVER ? t

[PATCH 15/18] iser-target: Remove redundant check on the device

2015-03-29 Thread Sagi Grimberg
In isert_connect_release there is no chance that
the isert device is set to NULL; if this happens
we have a bug, so use BUG_ON.

Signed-off-by: Sagi Grimberg 
---
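A minimal userspace sketch of the release ordering in this patch, under
stated assumptions: `struct dev`, `struct conn`, `dev_put` and
`conn_release` are hypothetical stand-ins (not the isert types), and
`assert` plays the role of BUG_ON.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the device and connection objects. */
struct dev  { int refcount; };
struct conn { struct dev *device; };

/* Drop one reference on the device. */
static void dev_put(struct dev *d)
{
    d->refcount--;
}

/* Same shape as the patched release path: assert, put, then free. */
static void conn_release(struct conn *c)
{
    struct dev *d = c->device;

    assert(d != NULL);  /* userspace analogue of BUG_ON(!device) */
    dev_put(d);         /* device reference is dropped first ... */
    free(c);            /* ... and the connection is freed last */
}
```

A caller would allocate a `struct conn`, point it at a device and call
`conn_release` once; a NULL device aborts instead of being silently
skipped.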
 drivers/infiniband/ulp/isert/ib_isert.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 97cee96..357d481 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -797,7 +797,9 @@ isert_connect_release(struct isert_conn *isert_conn)
 
isert_dbg("conn %p\n", isert_conn);
 
-   if (device && device->use_fastreg)
+   BUG_ON(!device);
+
+   if (device->use_fastreg)
isert_conn_free_fastreg_pool(isert_conn);
 
isert_free_rx_descriptors(isert_conn);
@@ -814,10 +816,9 @@ isert_connect_release(struct isert_conn *isert_conn)
if (isert_conn->login_buf)
isert_free_login_buf(isert_conn);
 
-   kfree(isert_conn);
+   isert_device_put(device);
 
-   if (device)
-   isert_device_put(device);
+   kfree(isert_conn);
 }
 
 static void
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/18] iser-target: Remove dead code

2015-03-29 Thread Sagi Grimberg
unmap_list is unused.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 9b40b37..5b086b3 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1640,7 +1640,6 @@ static void
 isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
 {
struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
-   LIST_HEAD(unmap_list);
 
isert_dbg("Cmd %p\n", isert_cmd);
 
-- 
1.7.1



[PATCH 03/18] iser-target: Use a single DMA MR and PD per device

2015-03-29 Thread Sagi Grimberg
Using fewer MRs and PDs favors the HCA cache hit rate.
This commit partially reverts commit:
"eb6ab13 IB/isert: separate connection protection domains and dma MRs"

At the time I thought per-connection PDs and MRs would be needed.

Signed-off-by: Sagi Grimberg 
---
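A rough userspace model of the ownership change, assuming hypothetical
types (`struct pd`, `struct mr`, `struct device`, `struct conn`) that only
mimic the shape of the driver structures and do not use the verbs API:
the PD/MR pair lives in the device and every connection borrows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the verbs API. */
struct pd { int id; };
struct mr { unsigned int lkey; };

struct device {
    struct pd pd;   /* one protection domain per device */
    struct mr mr;   /* one DMA MR per device */
    int allocs;     /* number of PD/MR pairs created so far */
};

struct conn {
    struct device *device;
};

/* Create the shared PD/MR pair once, at device setup time. */
static void device_init(struct device *dev, unsigned int lkey)
{
    dev->pd.id = 1;
    dev->mr.lkey = lkey;
    dev->allocs = 1;
}

/* A connection only records its device; it allocates nothing itself. */
static void conn_init(struct conn *c, struct device *dev)
{
    c->device = dev;
}

/* Every connection resolves the same lkey through its device. */
static unsigned int conn_lkey(const struct conn *c)
{
    return c->device->mr.lkey;
}
```

With this layout the number of PD/MR pairs stays at one per device no
matter how many connections are opened.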
 drivers/infiniband/ulp/isert/ib_isert.c |   99 --
 drivers/infiniband/ulp/isert/ib_isert.h |4 +-
 2 files changed, 55 insertions(+), 48 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 147029a..506c2eb 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -149,7 +149,7 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id)
if (device->pi_capable)
attr.create_flags |= IB_QP_CREATE_SIGNATURE_EN;
 
-   ret = rdma_create_qp(cma_id, isert_conn->conn_pd, &attr);
+   ret = rdma_create_qp(cma_id, device->pd, &attr);
if (ret) {
isert_err("rdma_create_qp failed for cma_id %d\n", ret);
goto err;
@@ -174,7 +174,8 @@ isert_cq_event_callback(struct ib_event *e, void *context)
 static int
 isert_alloc_rx_descriptors(struct isert_conn *isert_conn)
 {
-   struct ib_device *ib_dev = isert_conn->conn_cm_id->device;
+   struct isert_device *device = isert_conn->conn_device;
+   struct ib_device *ib_dev = device->ib_device;
struct iser_rx_desc *rx_desc;
struct ib_sge *rx_sg;
u64 dma_addr;
@@ -198,7 +199,7 @@ isert_alloc_rx_descriptors(struct isert_conn *isert_conn)
rx_sg = &rx_desc->rx_sg;
rx_sg->addr = rx_desc->dma_addr;
rx_sg->length = ISER_RX_PAYLOAD_SIZE;
-   rx_sg->lkey = isert_conn->conn_mr->lkey;
+   rx_sg->lkey = device->mr->lkey;
}
 
isert_conn->conn_rx_desc_head = 0;
@@ -309,8 +310,27 @@ isert_create_device_ib_res(struct isert_device *device)
goto out_cq;
}
 
+   device->pd = ib_alloc_pd(device->ib_device);
+   if (IS_ERR(device->pd)) {
+   ret = PTR_ERR(device->pd);
+   isert_err("failed to allocate pd, device %p, ret=%d\n",
+ device, ret);
+   goto out_cq;
+   }
+
+   device->mr = ib_get_dma_mr(device->pd, IB_ACCESS_LOCAL_WRITE);
+   if (IS_ERR(device->mr)) {
+   ret = PTR_ERR(device->mr);
+   isert_err("failed to create dma mr, device %p, ret=%d\n",
+ device, ret);
+   goto out_mr;
+   }
+
+
return 0;
 
+out_mr:
+   ib_dealloc_pd(device->pd);
 out_cq:
for (i = 0; i < device->comps_used; i++) {
struct isert_comp *comp = &device->comps[i];
@@ -332,6 +352,8 @@ isert_free_device_ib_res(struct isert_device *device)
 
isert_info("device %p\n", device);
 
+   ib_dereg_mr(device->mr);
+   ib_dealloc_pd(device->pd);
for (i = 0; i < device->comps_used; i++) {
struct isert_comp *comp = &device->comps[i];
 
@@ -547,7 +569,7 @@ isert_conn_create_fastreg_pool(struct isert_conn *isert_conn)
}
 
ret = isert_create_fr_desc(device->ib_device,
-  isert_conn->conn_pd, fr_desc);
+  device->pd, fr_desc);
if (ret) {
isert_err("Failed to create fastreg descriptor err=%d\n",
   ret);
@@ -659,22 +681,6 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
isert_dbg("Using initiator_depth: %u\n", isert_conn->initiator_depth);
 
isert_conn->conn_device = device;
-   isert_conn->conn_pd = ib_alloc_pd(isert_conn->conn_device->ib_device);
-   if (IS_ERR(isert_conn->conn_pd)) {
-   ret = PTR_ERR(isert_conn->conn_pd);
-   isert_err("ib_alloc_pd failed for conn %p: ret=%d\n",
-  isert_conn, ret);
-   goto out_pd;
-   }
-
-   isert_conn->conn_mr = ib_get_dma_mr(isert_conn->conn_pd,
-  IB_ACCESS_LOCAL_WRITE);
-   if (IS_ERR(isert_conn->conn_mr)) {
-   ret = PTR_ERR(isert_conn->conn_mr);
-   isert_err("ib_get_dma_mr failed for conn %p: ret=%d\n",
-  isert_conn, ret);
-   goto out_mr;
-   }
 
ret = isert_conn_setup_qp(isert_conn, cma_id);
if (ret)
@@ -697,10 +703,6 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
return 0;
 
 out_conn_dev:
-   ib_dereg_mr(isert_conn->conn_mr);
-out_mr:
-   ib_dealloc_pd(isert_conn->conn_pd);
-out_pd:
isert_device_try_release(device);
 out_rsp_dma_map:
ib_dma_unmap_single(ib_dev, isert_conn->login_rsp_dma,
@@ -742,9 +744,6 @@ isert_connect_release(struct isert_conn *isert_conn)

[PATCH 18/18] iser-target: Bump version to 1.0

2015-03-29 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 8f452f6..327529e 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -3439,7 +3439,7 @@ static void __exit isert_exit(void)
 }
 
 MODULE_DESCRIPTION("iSER-Target for mainline target infrastructure");
-MODULE_VERSION("0.1");
+MODULE_VERSION("1.0");
 MODULE_AUTHOR("n...@linux-iscsi.org");
 MODULE_LICENSE("GPL");
 
-- 
1.7.1



[PATCH 04/18] iser-target: Remove redundant check on recv completion

2015-03-29 Thread Sagi Grimberg
We have a switch default for this.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/isert/ib_isert.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 506c2eb..9b40b37 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1946,10 +1946,6 @@ isert_send_completion(struct iser_tx_desc *tx_desc,
isert_dbg("Cmd %p iser_ib_op %d\n", isert_cmd, wr->iser_ib_op);
 
switch (wr->iser_ib_op) {
-   case ISER_IB_RECV:
-   isert_err("Got ISER_IB_RECV\n");
-   dump_stack();
-   break;
case ISER_IB_SEND:
isert_response_completion(tx_desc, isert_cmd,
  isert_conn, ib_dev);
-- 
1.7.1



[PATCH 01/18] iser-target: Fix session hang in case of an rdma read DIF error

2015-03-29 Thread Sagi Grimberg
This hang was the result of a missing command put when
a DIF error occurred during an rdma read (and we sent
a CHECK_CONDITION error without passing it to the
backend).

Signed-off-by: Sagi Grimberg 
---
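A simplified reference-count model of the hang, not the target core
itself. It assumes (this is an assumption of the sketch, not something the
patch states) that the command holds two references at this point, one
normally dropped by the backend and one dropped by the response path.
`struct cmd` and all helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command with a plain reference counter. */
struct cmd { int refs; };

static void cmd_put(struct cmd *c)
{
    c->refs--;
}

/* Normal path: the backend executes the command and drops its reference. */
static void backend_execute(struct cmd *c)
{
    cmd_put(c);
}

/* The response path drops the remaining reference in both cases. */
static void send_response(struct cmd *c)
{
    cmd_put(c);
}

/* Returns true when the command ended up fully released. */
static bool complete_read(struct cmd *c, bool dif_error, bool extra_put)
{
    if (dif_error) {
        if (extra_put)
            cmd_put(c);     /* the put the bypassed backend never does */
        send_response(c);   /* error response, backend not involved */
    } else {
        backend_execute(c);
        send_response(c);
    }
    return c->refs == 0;
}
```

In this model the error path without the extra put leaves one reference
behind, which is the kind of leak that keeps a session from shutting down.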
 drivers/infiniband/ulp/isert/ib_isert.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 075b19c..4b8d518 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1861,11 +1861,13 @@ isert_completion_rdma_read(struct iser_tx_desc *tx_desc,
cmd->i_state = ISTATE_RECEIVED_LAST_DATAOUT;
spin_unlock_bh(&cmd->istate_lock);
 
-   if (ret)
+   if (ret) {
+   target_put_sess_cmd(se_cmd->se_sess, se_cmd);
transport_send_check_condition_and_sense(se_cmd,
 se_cmd->pi_err, 0);
-   else
+   } else {
target_execute_cmd(se_cmd);
+   }
 }
 
 static void
-- 
1.7.1



[PATCH 02/18] iser-target: Fix possible deadlock in RDMA_CM connection error

2015-03-29 Thread Sagi Grimberg
Before we reach the connection established state we may get an
error event. In this case the core won't tear down this
connection (it was never established), so we take care of freeing
it ourselves.

Signed-off-by: Sagi Grimberg 
---
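A userspace sketch of the hand-off this patch sets up, using hypothetical
stand-ins (`struct cm_id`, `struct conn`, `core_dispatch_error`). It
assumes the RDMA CM behaviour that a nonzero return from the event handler
makes the core destroy the cm_id itself, so the release path must not
destroy it a second time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CM id and the connection. */
struct cm_id { int destroyed; };
struct conn  { struct cm_id *cm_id; };

static void destroy_id(struct cm_id *id)
{
    id->destroyed++;
}

/* Release path: only destroy the id if the connection still owns it. */
static void conn_release(struct conn *c)
{
    if (c->cm_id)
        destroy_id(c->cm_id);
}

/* Error handler: disown the id, release the connection, return nonzero. */
static int on_connect_error(struct conn *c)
{
    c->cm_id = NULL;
    conn_release(c);
    return -1;
}

/* Stand-in for the core: it destroys the id when the handler says so. */
static void core_dispatch_error(struct conn *c, struct cm_id *id)
{
    if (on_connect_error(c))
        destroy_id(id);
}
```

The id ends up destroyed exactly once, by the caller of the handler
rather than from inside it.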
 drivers/infiniband/ulp/isert/ib_isert.c |   14 +-
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 4b8d518..147029a 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -222,7 +222,7 @@ fail:
 static void
 isert_free_rx_descriptors(struct isert_conn *isert_conn)
 {
-   struct ib_device *ib_dev = isert_conn->conn_cm_id->device;
+   struct ib_device *ib_dev = isert_conn->conn_device->ib_device;
struct iser_rx_desc *rx_desc;
int i;
 
@@ -719,8 +719,8 @@ out:
 static void
 isert_connect_release(struct isert_conn *isert_conn)
 {
-   struct ib_device *ib_dev = isert_conn->conn_cm_id->device;
struct isert_device *device = isert_conn->conn_device;
+   struct ib_device *ib_dev = device->ib_device;
 
isert_dbg("conn %p\n", isert_conn);
 
@@ -728,7 +728,8 @@ isert_connect_release(struct isert_conn *isert_conn)
isert_conn_free_fastreg_pool(isert_conn);
 
isert_free_rx_descriptors(isert_conn);
-   rdma_destroy_id(isert_conn->conn_cm_id);
+   if (isert_conn->conn_cm_id)
+   rdma_destroy_id(isert_conn->conn_cm_id);
 
if (isert_conn->conn_qp) {
struct isert_comp *comp = isert_conn->conn_qp->recv_cq->cq_context;
@@ -878,12 +879,15 @@ isert_disconnected_handler(struct rdma_cm_id *cma_id,
return 0;
 }
 
-static void
+static int
 isert_connect_error(struct rdma_cm_id *cma_id)
 {
struct isert_conn *isert_conn = cma_id->qp->qp_context;
 
+   isert_conn->conn_cm_id = NULL;
isert_put_conn(isert_conn);
+
+   return -1;
 }
 
 static int
@@ -912,7 +916,7 @@ isert_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
case RDMA_CM_EVENT_REJECTED:   /* FALLTHRU */
case RDMA_CM_EVENT_UNREACHABLE:/* FALLTHRU */
case RDMA_CM_EVENT_CONNECT_ERROR:
-   isert_connect_error(cma_id);
+   ret = isert_connect_error(cma_id);
break;
default:
isert_err("Unhandled RDMA CMA event: %d\n", event->event);
-- 
1.7.1



iser target fixes for kernel 4.1

2015-03-29 Thread Sagi Grimberg
Hi Nic,

This set consists of:
- Bug fixes (1-2)
- Performance Optimization (3)
- Code refactoring (3-10,13-16)
- Renaming (11-12,17)
- Version bump (18)

Sagi Grimberg (18):
  iser-target: Fix session hang in case of an rdma read DIF error
  iser-target: Fix possible deadlock in RDMA_CM connection error
  iser-target: Use a single DMA MR and PD per device
  iser-target: Remove redundant check on recv completion
  iser-target: Remove dead code
  iser-target: Remove redundant local variable
  iser-target: Remove redundant casting on void pointers
  iser-target: Split isert_setup_qp
  iser-target: Introduce isert_[alloc|free]_comps
  iser-target: Remove redundant assignment to local variable
  iser-target: Rename rend/recv completion routines
  iser-target: Rename device find/release routines
  iser-target: Split some logic in isert_connect_request to routines
  iser-target: Get rid of redundant max_accept
  iser-target: Remove redundant check on the device
  iser-target: Remove un-needed rdma_listen backlog
  iser-target: Remove conn_ prefix from struct isert_conn members
  iser-target: Bump version to 1.0

 drivers/infiniband/ulp/isert/ib_isert.c |  691 +--
 drivers/infiniband/ulp/isert/ib_isert.h |   37 +-
 2 files changed, 398 insertions(+), 330 deletions(-)



[PATCH 15/18] IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr

2015-03-29 Thread Sagi Grimberg
Instead of passing ib_sge as an output variable, we pass the mem_reg
pointer to have the routines fill the rkey as well. This reduces
code duplication and extra assignments. This is a preparation step
to unify some of the registration logic.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
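A small sketch of the output-parameter change, with hypothetical userspace
types (`struct sge`, `struct mem_reg`, `struct mr`, `fill_reg`) that only
mirror the field names visible in the diff: one call fills both the local
key and the remote key.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of ib_sge, iser_mem_reg and ib_mr. */
struct sge     { uint64_t addr; uint32_t length; uint32_t lkey; };
struct mem_reg { struct sge sge; uint32_t rkey; };
struct mr      { uint32_t lkey; uint32_t rkey; };

/* Fill the whole registration descriptor, rkey included, in one place. */
static void fill_reg(struct mem_reg *reg, const struct mr *mr,
                     uint64_t addr, uint32_t len)
{
    reg->sge.lkey = mr->lkey;
    reg->rkey = mr->rkey;
    reg->sge.addr = addr;
    reg->sge.length = len;
}
```

Callers that previously copied the rkey by hand after the call no longer
need the extra assignment.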
 drivers/infiniband/ulp/iser/iser_memory.c |   68 
 1 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 40d22d5..b4b0098 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -590,8 +590,10 @@ iser_inv_rkey(struct ib_send_wr *inv_wr, struct ib_mr *mr)
 
 static int
 iser_reg_sig_mr(struct iscsi_iser_task *iser_task,
-   struct fast_reg_descriptor *desc, struct ib_sge *data_sge,
-   struct ib_sge *prot_sge, struct ib_sge *sig_sge)
+   struct fast_reg_descriptor *desc,
+   struct iser_mem_reg *data_reg,
+   struct iser_mem_reg *prot_reg,
+   struct iser_mem_reg *sig_reg)
 {
struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
struct iser_pi_context *pi_ctx = desc->pi_ctx;
@@ -615,12 +617,12 @@ iser_reg_sig_mr(struct iscsi_iser_task *iser_task,
memset(&sig_wr, 0, sizeof(sig_wr));
sig_wr.opcode = IB_WR_REG_SIG_MR;
sig_wr.wr_id = ISER_FASTREG_LI_WRID;
-   sig_wr.sg_list = data_sge;
+   sig_wr.sg_list = &data_reg->sge;
sig_wr.num_sge = 1;
sig_wr.wr.sig_handover.sig_attrs = &sig_attrs;
sig_wr.wr.sig_handover.sig_mr = pi_ctx->sig_mr;
if (scsi_prot_sg_count(iser_task->sc))
-   sig_wr.wr.sig_handover.prot = prot_sge;
+   sig_wr.wr.sig_handover.prot = &prot_reg->sge;
sig_wr.wr.sig_handover.access_flags = IB_ACCESS_LOCAL_WRITE |
  IB_ACCESS_REMOTE_READ |
  IB_ACCESS_REMOTE_WRITE;
@@ -637,13 +639,14 @@ iser_reg_sig_mr(struct iscsi_iser_task *iser_task,
}
desc->reg_indicators &= ~ISER_SIG_KEY_VALID;
 
-   sig_sge->lkey = pi_ctx->sig_mr->lkey;
-   sig_sge->addr = 0;
-   sig_sge->length = scsi_transfer_length(iser_task->sc);
+   sig_reg->sge.lkey = pi_ctx->sig_mr->lkey;
+   sig_reg->rkey = pi_ctx->sig_mr->rkey;
+   sig_reg->sge.addr = 0;
+   sig_reg->sge.length = scsi_transfer_length(iser_task->sc);
 
-   iser_dbg("sig_sge: addr: 0x%llx  length: %u lkey: 0x%x\n",
-sig_sge->addr, sig_sge->length,
-sig_sge->lkey);
+   iser_dbg("sig_sge: lkey: 0x%x, rkey: 0x%x, addr: 0x%llx, length: %u\n",
+sig_reg->sge.lkey, sig_reg->rkey, sig_reg->sge.addr,
+sig_reg->sge.length);
 err:
return ret;
 }
@@ -652,7 +655,7 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
struct iser_mem_reg *mem_reg,
struct iser_data_buf *mem,
enum iser_reg_indicator ind,
-   struct ib_sge *sge)
+   struct iser_mem_reg *reg)
 {
struct fast_reg_descriptor *desc = mem_reg->mem_h;
struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
@@ -668,12 +671,13 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
if (mem->dma_nents == 1) {
struct scatterlist *sg = mem->sg;
 
-   sge->lkey = device->mr->lkey;
-   sge->addr   = ib_sg_dma_address(ibdev, &sg[0]);
-   sge->length  = ib_sg_dma_len(ibdev, &sg[0]);
+   reg->sge.lkey = device->mr->lkey;
+   reg->rkey = device->mr->rkey;
+   reg->sge.addr = ib_sg_dma_address(ibdev, &sg[0]);
+   reg->sge.length = ib_sg_dma_len(ibdev, &sg[0]);
 
iser_dbg("Single DMA entry: lkey=0x%x, addr=0x%llx, length=0x%x\n",
-sge->lkey, sge->addr, sge->length);
+reg->sge.lkey, reg->sge.addr, reg->sge.length);
return 0;
}
 
@@ -723,9 +727,10 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
}
desc->reg_indicators &= ~ind;
 
-   sge->lkey = mr->lkey;
-   sge->addr = frpl->page_list[0] + offset;
-   sge->length = size;
+   reg->sge.lkey = mr->lkey;
+   reg->rkey = mr->rkey;
+   reg->sge.addr = frpl->page_list[0] + offset;
+   reg->sge.length = size;
 
return ret;
 }
@@ -745,7 +750,6 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
struct iser_data_buf *mem = &iser_task->data[cmd_dir];
struct iser_mem_reg *mem_reg = &iser_task->rdma_reg[cmd_dir];
struct fast_reg_descriptor *desc = NULL;
-   struct ib_sge data_sge;
int err, aligned_len;
 

[PATCH 10/18] IB/iser: Merge build page-vec into register page-vec

2015-03-29 Thread Sagi Grimberg
There is no need to keep these two separate. Keep them in a single
routine like in the fastreg case. This will also bring the
iser_reg_page_vec arguments closer to those of iser_fast_reg_mr.
This is a preparation step for the registration flow refactor.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
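A sketch of the length check that moves into the merged routine, assuming
a hypothetical helper name (`check_page_vec`) and a userspace `SIZE_4K`.
It returns an error code when the page vector cannot cover the data, which
is the shape of the check in the merged routine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SIZE_4K 4096UL  /* page granularity assumed by the sketch */

/*
 * plen is the number of 4K pages in the vector, data_size the number of
 * bytes that must fit. Returns 0 if they fit, -EINVAL otherwise.
 */
static int check_page_vec(size_t plen, size_t data_size)
{
    if (plen * SIZE_4K < data_size)
        return -EINVAL;
    return 0;
}
```

For instance, two pages cover 8192 bytes but not 8193.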
 drivers/infiniband/ulp/iser/iser_memory.c |   91 ++--
 1 files changed, 33 insertions(+), 58 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 0b8656f..6e6b753 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -280,31 +280,6 @@ static void iser_dump_page_vec(struct iser_page_vec *page_vec)
iser_err("%d %lx\n",i,(unsigned long)page_vec->pages[i]);
 }
 
-static void iser_page_vec_build(struct iser_data_buf *data,
-   struct iser_page_vec *page_vec,
-   struct ib_device *ibdev)
-{
-   int page_vec_len = 0;
-
-   page_vec->length = 0;
-   page_vec->offset = 0;
-
-   iser_dbg("Translating sg sz: %d\n", data->dma_nents);
-   page_vec_len = iser_sg_to_page_vec(data, ibdev, page_vec->pages,
-  &page_vec->offset,
-  &page_vec->data_size);
-   iser_dbg("sg len %d page_vec_len %d\n", data->dma_nents, page_vec_len);
-
-   page_vec->length = page_vec_len;
-
-   if (page_vec_len * SIZE_4K < page_vec->data_size) {
-   iser_err("page_vec too short to hold this SG\n");
-   iser_data_buf_dump(data, ibdev);
-   iser_dump_page_vec(page_vec);
-   BUG();
-   }
-}
-
 int iser_dma_map_task_data(struct iscsi_iser_task *iser_task,
struct iser_data_buf *data,
enum iser_data_dir iser_dir,
@@ -367,43 +342,44 @@ static int fall_to_bounce_buf(struct iscsi_iser_task *iser_task,
  * returns: 0 on success, errno code on failure
  */
 static
-int iser_reg_page_vec(struct ib_conn *ib_conn,
+int iser_reg_page_vec(struct iscsi_iser_task *iser_task,
+ struct iser_data_buf *mem,
  struct iser_page_vec *page_vec,
- struct iser_mem_reg  *mem_reg)
+ struct iser_mem_reg *mem_reg)
 {
-   struct ib_pool_fmr *mem;
-   u64io_addr;
-   u64*page_list;
-   intstatus;
-
-   page_list = page_vec->pages;
-   io_addr   = page_list[0];
+   struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
+   struct iser_device *device = ib_conn->device;
+   struct ib_pool_fmr *fmr;
+   int ret, plen;
+
+   plen = iser_sg_to_page_vec(mem, device->ib_device,
+  page_vec->pages,
+  &page_vec->offset,
+  &page_vec->data_size);
+   page_vec->length = plen;
+   if (plen * SIZE_4K < page_vec->data_size) {
+   iser_err("page vec too short to hold this SG\n");
+   iser_data_buf_dump(mem, device->ib_device);
+   iser_dump_page_vec(page_vec);
+   return -EINVAL;
+   }
 
-   mem  = ib_fmr_pool_map_phys(ib_conn->fmr.pool,
-   page_list,
+   fmr  = ib_fmr_pool_map_phys(ib_conn->fmr.pool,
+   page_vec->pages,
page_vec->length,
-   io_addr);
-
-   if (IS_ERR(mem)) {
-   status = (int)PTR_ERR(mem);
-   iser_err("ib_fmr_pool_map_phys failed: %d\n", status);
-   return status;
+   page_vec->pages[0]);
+   if (IS_ERR(fmr)) {
+   ret = PTR_ERR(fmr);
+   iser_err("ib_fmr_pool_map_phys failed: %d\n", ret);
+   return ret;
}
 
-   mem_reg->lkey  = mem->fmr->lkey;
-   mem_reg->rkey  = mem->fmr->rkey;
-   mem_reg->len   = page_vec->data_size;
-   mem_reg->va= io_addr + page_vec->offset;
-   mem_reg->mem_h = (void *)mem;
-
-   iser_dbg("PHYSICAL Mem.register, [PHYS p_array: 0x%p, sz: %d, "
-"entry[0]: (0x%08lx,%ld)] -> "
-"[lkey: 0x%08X mem_h: 0x%p va: 0x%08lX sz: %ld]\n",
-page_vec, page_vec->length,
-(unsigned long)page_vec->pages[0],
-(unsigned long)page_vec->data_size,
-(unsigned int)mem_reg->lkey, mem_reg->mem_h,
-(unsigned long)mem_reg->va, (unsigned long)mem_reg->len);
+   mem_reg->lkey = fmr->fmr->lkey;
+   mem_reg->rkey = fmr->fmr->rkey;
+   mem_reg->va = page_vec->pages[0] + page_vec->offset;
+   mem_reg->len = page_vec->data_size;
+   mem_reg->mem_h = fmr;
+
return 0;
 }
 
@@ -493,8 +4

[PATCH 18/18] IB/iser: Rewrite bounce buffer code path

2015-03-29 Thread Sagi Grimberg
In some rare cases, IO operations may not be aligned to page
boundaries. This prevents iser from performing fast memory
registration. To overcome this, iser uses a bounce buffer to
carry the transaction: we allocate a buffer the size of the
transaction and perform a copy.

Allocating the buffer with kmalloc is too restrictive, since it
requires higher order (atomic) allocations for large transactions
(which may exhaust memory fairly fast for some workloads).
We rewrite the bounce buffer code path to allocate scattered pages
and perform a copy between the transaction sg and the bounce sg.

Signed-off-by: Sagi Grimberg 
---
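A userspace sketch of the idea, not the driver code: instead of one large
contiguous buffer, allocate ceil(len / page size) independent pages and
copy through them. `struct seg`, `struct bounce`, `PAGE_SZ` and the
`bounce_*` helpers are hypothetical names, and `malloc` stands in for page
allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096u   /* assumed page size for the sketch */

struct seg    { unsigned char *page; size_t len; };
struct bounce { struct seg *segs; size_t nents; };

/* Free every page that was allocated, then the segment table. */
static void bounce_free(struct bounce *b)
{
    for (size_t i = 0; i < b->nents; i++)
        free(b->segs[i].page);
    free(b->segs);
    b->segs = NULL;
    b->nents = 0;
}

/* Allocate one page per PAGE_SZ chunk of len; 0 on success, -1 on error. */
static int bounce_alloc(struct bounce *b, size_t len)
{
    size_t nents = (len + PAGE_SZ - 1) / PAGE_SZ;

    b->segs = NULL;
    b->nents = 0;
    if (!len)
        return -1;
    b->segs = calloc(nents, sizeof(*b->segs));
    if (!b->segs)
        return -1;

    while (len) {
        size_t chunk = len < PAGE_SZ ? len : PAGE_SZ;
        unsigned char *page = malloc(PAGE_SZ);

        if (!page) {
            bounce_free(b);     /* undo the partial allocation */
            return -1;
        }
        b->segs[b->nents].page = page;
        b->segs[b->nents].len = chunk;
        b->nents++;
        len -= chunk;
    }
    return 0;
}

/* Copy a flat buffer into the segments (to_bounce != 0) or back out. */
static void bounce_copy(struct bounce *b, unsigned char *flat, int to_bounce)
{
    size_t off = 0;

    for (size_t i = 0; i < b->nents; i++) {
        if (to_bounce)
            memcpy(b->segs[i].page, flat + off, b->segs[i].len);
        else
            memcpy(flat + off, b->segs[i].page, b->segs[i].len);
        off += b->segs[i].len;
    }
}
```

Each allocation is a single page, so no high-order contiguous allocation
is needed even for large transfers; the price is a per-segment copy.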
 drivers/infiniband/ulp/iser/iscsi_iser.h |8 +-
 drivers/infiniband/ulp/iser/iser_initiator.c |8 +-
 drivers/infiniband/ulp/iser/iser_memory.c|  211 --
 3 files changed, 138 insertions(+), 89 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 9c15f37..6d0493a 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -222,12 +222,9 @@ enum iser_data_dir {
  * @size: num entries of this sg
  * @data_len: total beffer byte len
  * @dma_nents:returned by dma_map_sg
- * @copy_buf: allocated copy buf for SGs unaligned
- *for rdma which are copied
  * @orig_sg:  pointer to the original sg list (in case
  *we used a copy)
- * @sg_single:SG-ified clone of a non SG SC or
- *unaligned SG
+ * @orig_size:num entris of orig sg list
  */
 struct iser_data_buf {
struct scatterlist *sg;
@@ -235,8 +232,7 @@ struct iser_data_buf {
unsigned long  data_len;
unsigned int   dma_nents;
struct scatterlist *orig_sg;
-   char   *copy_buf;
-   struct scatterlist sg_single;
+   unsigned int   orig_size;
   };
 
 /* fwd declarations */
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index b2e3b77..3e2118e 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -674,28 +674,28 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
/* if we were reading, copy back to unaligned sglist,
 * anyway dma_unmap and free the copy
 */
-   if (iser_task->data[ISER_DIR_IN].copy_buf) {
+   if (iser_task->data[ISER_DIR_IN].orig_sg) {
is_rdma_data_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->data[ISER_DIR_IN],
ISER_DIR_IN);
}
 
-   if (iser_task->data[ISER_DIR_OUT].copy_buf) {
+   if (iser_task->data[ISER_DIR_OUT].orig_sg) {
is_rdma_data_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->data[ISER_DIR_OUT],
ISER_DIR_OUT);
}
 
-   if (iser_task->prot[ISER_DIR_IN].copy_buf) {
+   if (iser_task->prot[ISER_DIR_IN].orig_sg) {
is_rdma_prot_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->prot[ISER_DIR_IN],
ISER_DIR_IN);
}
 
-   if (iser_task->prot[ISER_DIR_OUT].copy_buf) {
+   if (iser_task->prot[ISER_DIR_OUT].orig_sg) {
is_rdma_prot_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->prot[ISER_DIR_OUT],
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index e78ce53..223ef7a 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -39,7 +39,112 @@
 
 #include "iscsi_iser.h"
 
-#define ISER_KMALLOC_THRESHOLD 0x2 /* 128K - kmalloc limit */
+static void
+iser_free_bounce_sg(struct iser_data_buf *data)
+{
+   struct scatterlist *sg;
+   int count;
+
+   for_each_sg(data->sg, sg, data->size, count)
+   __free_page(sg_page(sg));
+
+   kfree(data->sg);
+
+   data->sg = data->orig_sg;
+   data->size = data->orig_size;
+   data->orig_sg = NULL;
+   data->orig_size = 0;
+}
+
+static int
+iser_alloc_bounce_sg(struct iser_data_buf *data)
+{
+   struct scatterlist *sg;
+   struct page *page;
+   unsigned long length = data->data_len;
+   int i = 0, nents = DIV_ROUND_UP(length, PAGE_SIZE);
+
+   sg = kcalloc(nents, sizeof(*sg), GFP_ATOMIC);
+   if (!sg)
+   goto err;
+
+   sg_init_table(sg, nents);
+   while (length) {
+   u32 page_len = min_t(u32, length, PAGE_SIZE);
+
+   page = 

[PATCH 16/18] IB/iser: Remove code duplication for a single DMA entry

2015-03-29 Thread Sagi Grimberg
For singleton scatterlists, the same DMA memory registration code
is duplicated in both the Fastreg and FMR code paths. Move it to
a function.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
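A sketch of the de-duplication, with hypothetical userspace types and
names (`struct sg_entry`, `reg_dma`, `reg_fmr`, `reg_fastreg`); the
multi-entry registration paths are deliberately not modelled and return a
placeholder value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the iser structures. */
struct sg_entry { uint64_t dma_addr; uint32_t dma_len; };
struct device   { uint32_t lkey; uint32_t rkey; };
struct mem_reg  { uint64_t addr; uint32_t length; uint32_t lkey; uint32_t rkey; };

#define REG_NOT_MODELLED 1  /* placeholder for the multi-entry paths */

/* Shared helper: one DMA entry is described directly by the device keys. */
static int reg_dma(const struct device *dev, const struct sg_entry *sg,
                   struct mem_reg *reg)
{
    reg->lkey = dev->lkey;
    reg->rkey = dev->rkey;
    reg->addr = sg[0].dma_addr;
    reg->length = sg[0].dma_len;
    return 0;
}

/* FMR flavour: short-circuits to the helper for a single entry. */
static int reg_fmr(const struct device *dev, const struct sg_entry *sg,
                   unsigned int nents, struct mem_reg *reg)
{
    if (nents == 1)
        return reg_dma(dev, sg, reg);
    return REG_NOT_MODELLED;
}

/* Fastreg flavour: same short-circuit, same helper. */
static int reg_fastreg(const struct device *dev, const struct sg_entry *sg,
                       unsigned int nents, struct mem_reg *reg)
{
    if (nents == 1)
        return reg_dma(dev, sg, reg);
    return REG_NOT_MODELLED;
}
```

Both flavours produce the same descriptor for a singleton scatterlist,
which is what makes the shared helper safe to factor out.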
 drivers/infiniband/ulp/iser/iser_memory.c |   48 
 1 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index b4b0098..e78ce53 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -334,6 +334,24 @@ void iser_dma_unmap_task_data(struct iscsi_iser_task *iser_task,
ib_dma_unmap_sg(dev, data->sg, data->size, dir);
 }
 
+static int
+iser_reg_dma(struct iser_device *device, struct iser_data_buf *mem,
+struct iser_mem_reg *reg)
+{
+   struct scatterlist *sg = mem->sg;
+
+   reg->sge.lkey = device->mr->lkey;
+   reg->rkey = device->mr->rkey;
+   reg->sge.addr = ib_sg_dma_address(device->ib_device, &sg[0]);
+   reg->sge.length = ib_sg_dma_len(device->ib_device, &sg[0]);
+
+   iser_dbg("Single DMA entry: lkey=0x%x, rkey=0x%x, addr=0x%llx,"
+" length=0x%x\n", reg->sge.lkey, reg->rkey,
+reg->sge.addr, reg->sge.length);
+
+   return 0;
+}
+
 static int fall_to_bounce_buf(struct iscsi_iser_task *iser_task,
  struct iser_data_buf *mem,
  enum iser_data_dir cmd_dir,
@@ -461,7 +479,6 @@ int iser_reg_rdma_mem_fmr(struct iscsi_iser_task *iser_task,
int aligned_len;
int err;
int i;
-   struct scatterlist *sg;
 
mem_reg = &iser_task->rdma_reg[cmd_dir];
 
@@ -477,19 +494,7 @@ int iser_reg_rdma_mem_fmr(struct iscsi_iser_task *iser_task,
 
/* if there a single dma entry, FMR is not needed */
if (mem->dma_nents == 1) {
-   sg = mem->sg;
-
-   mem_reg->sge.lkey = device->mr->lkey;
-   mem_reg->rkey = device->mr->rkey;
-   mem_reg->sge.length = ib_sg_dma_len(ibdev, &sg[0]);
-   mem_reg->sge.addr = ib_sg_dma_address(ibdev, &sg[0]);
-
-   iser_dbg("PHYSICAL Mem.register: lkey: 0x%08X rkey: 0x%08X  "
-"va: 0x%08lX sz: %ld]\n",
-(unsigned int)mem_reg->sge.lkey,
-(unsigned int)mem_reg->rkey,
-(unsigned long)mem_reg->sge.addr,
-(unsigned long)mem_reg->sge.length);
+   return iser_reg_dma(device, mem, mem_reg);
} else { /* use FMR for multiple dma entries */
err = iser_reg_page_vec(iser_task, mem, ib_conn->fmr.page_vec,
mem_reg);
@@ -660,7 +665,6 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
struct fast_reg_descriptor *desc = mem_reg->mem_h;
struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
struct iser_device *device = ib_conn->device;
-   struct ib_device *ibdev = device->ib_device;
struct ib_mr *mr;
struct ib_fast_reg_page_list *frpl;
struct ib_send_wr fastreg_wr, inv_wr;
@@ -668,18 +672,8 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
int ret, offset, size, plen;
 
/* if there a single dma entry, dma mr suffices */
-   if (mem->dma_nents == 1) {
-   struct scatterlist *sg = mem->sg;
-
-   reg->sge.lkey = device->mr->lkey;
-   reg->rkey = device->mr->rkey;
-   reg->sge.addr = ib_sg_dma_address(ibdev, &sg[0]);
-   reg->sge.length = ib_sg_dma_len(ibdev, &sg[0]);
-
-   iser_dbg("Single DMA entry: lkey=0x%x, addr=0x%llx, length=0x%x\n",
-reg->sge.lkey, reg->sge.addr, reg->sge.length);
-   return 0;
-   }
+   if (mem->dma_nents == 1)
+   return iser_reg_dma(device, mem, mem_reg);
 
if (ind == ISER_DATA_KEY_VALID) {
mr = desc->data_mr;
-- 
1.7.1



[PATCH 01/18] IB/iser: Fix unload during ep_poll wrong dereference

2015-03-29 Thread Sagi Grimberg
In case the user unloaded ib_iser while ep_connect is in
progress, we need to destroy the endpoint although ep_disconnect
wasn't invoked (we detect this by the iser conn state != DOWN).
However, if we got a REJECTED/UNREACHABLE CM event we move the
connection state to DOWN, which prevents us from destroying
the endpoint in the module unload stage. Fix this by setting the
connection state to TERMINATING in iser_connect_error so we can still
destroy the endpoint at the unload stage.

Reported-by: Ariel Nahum 
Signed-off-by: Sagi Grimberg 
---
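A toy state model of the problem, with hypothetical names (`enum
conn_state`, `struct conn`, `on_connect_error`, `module_unload`) and a
`fixed` flag that selects between the old and the new behaviour; it only
illustrates why the DOWN state hides the endpoint from the unload path.

```c
#include <assert.h>
#include <stdbool.h>

enum conn_state { CONN_PENDING, CONN_UP, CONN_TERMINATING, CONN_DOWN };

struct conn {
    enum conn_state state;
    bool ep_destroyed;
};

/* CM error event: the old code went to DOWN, the fix uses TERMINATING. */
static void on_connect_error(struct conn *c, bool fixed)
{
    c->state = fixed ? CONN_TERMINATING : CONN_DOWN;
}

/* Unload: endpoints that are not DOWN are the ones still to be destroyed. */
static void module_unload(struct conn *c)
{
    if (c->state != CONN_DOWN)
        c->ep_destroyed = true;
}
```

With the old transition the endpoint is skipped at unload; with
TERMINATING it is still picked up and destroyed.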
 drivers/infiniband/ulp/iser/iser_verbs.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 4065abe..070c5af 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -721,7 +721,7 @@ static void iser_connect_error(struct rdma_cm_id *cma_id)
struct iser_conn *iser_conn;
 
iser_conn = (struct iser_conn *)cma_id->context;
-   iser_conn->state = ISER_CONN_DOWN;
+   iser_conn->state = ISER_CONN_TERMINATING;
 }
 
 /**
-- 
1.7.1



[PATCH 07/18] IB/iser: Move memory reg/dereg routines to iser_memory.c

2015-03-29 Thread Sagi Grimberg
These are memory registration/de-registration methods, so let's
move them to their natural location. While we're at it,
make the iser_reg_page_vec routine static.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/iser/iscsi_iser.h  |4 -
 drivers/infiniband/ulp/iser/iser_memory.c |   88 +
 drivers/infiniband/ulp/iser/iser_verbs.c  |   87 
 3 files changed, 88 insertions(+), 91 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 5c7036c..d5e5288 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -632,10 +632,6 @@ int  iser_connect(struct iser_conn *iser_conn,
  struct sockaddr *dst_addr,
  int non_blocking);
 
-int  iser_reg_page_vec(struct ib_conn *ib_conn,
-  struct iser_page_vec *page_vec,
-  struct iser_mem_reg *mem_reg);
-
 void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
enum iser_data_dir cmd_dir);
 void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 9c60ff1..4e0cbbb 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -362,6 +362,94 @@ static int fall_to_bounce_buf(struct iscsi_iser_task *iser_task,
 }
 
 /**
+ * iser_reg_page_vec - Register physical memory
+ *
+ * returns: 0 on success, errno code on failure
+ */
+static
+int iser_reg_page_vec(struct ib_conn *ib_conn,
+ struct iser_page_vec *page_vec,
+ struct iser_mem_reg  *mem_reg)
+{
+   struct ib_pool_fmr *mem;
+   u64io_addr;
+   u64*page_list;
+   intstatus;
+
+   page_list = page_vec->pages;
+   io_addr   = page_list[0];
+
+   mem  = ib_fmr_pool_map_phys(ib_conn->fmr.pool,
+   page_list,
+   page_vec->length,
+   io_addr);
+
+   if (IS_ERR(mem)) {
+   status = (int)PTR_ERR(mem);
+   iser_err("ib_fmr_pool_map_phys failed: %d\n", status);
+   return status;
+   }
+
+   mem_reg->lkey  = mem->fmr->lkey;
+   mem_reg->rkey  = mem->fmr->rkey;
+   mem_reg->len   = page_vec->length * SIZE_4K;
+   mem_reg->va= io_addr;
+   mem_reg->mem_h = (void *)mem;
+
+   mem_reg->va   += page_vec->offset;
+   mem_reg->len   = page_vec->data_size;
+
+   iser_dbg("PHYSICAL Mem.register, [PHYS p_array: 0x%p, sz: %d, "
+"entry[0]: (0x%08lx,%ld)] -> "
+"[lkey: 0x%08X mem_h: 0x%p va: 0x%08lX sz: %ld]\n",
+page_vec, page_vec->length,
+(unsigned long)page_vec->pages[0],
+(unsigned long)page_vec->data_size,
+(unsigned int)mem_reg->lkey, mem_reg->mem_h,
+(unsigned long)mem_reg->va, (unsigned long)mem_reg->len);
+   return 0;
+}
+
+/**
+ * Unregister (previously registered using FMR) memory.
+ * If memory is non-FMR does nothing.
+ */
+void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
+   enum iser_data_dir cmd_dir)
+{
+   struct iser_mem_reg *reg = &iser_task->rdma_regd[cmd_dir].reg;
+   int ret;
+
+   if (!reg->mem_h)
+   return;
+
+   iser_dbg("PHYSICAL Mem.Unregister mem_h %p\n", reg->mem_h);
+
+   ret = ib_fmr_pool_unmap((struct ib_pool_fmr *)reg->mem_h);
+   if (ret)
+   iser_err("ib_fmr_pool_unmap failed %d\n", ret);
+
+   reg->mem_h = NULL;
+}
+
+void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
+   enum iser_data_dir cmd_dir)
+{
+   struct iser_mem_reg *reg = &iser_task->rdma_regd[cmd_dir].reg;
+   struct iser_conn *iser_conn = iser_task->iser_conn;
+   struct ib_conn *ib_conn = &iser_conn->ib_conn;
+   struct fast_reg_descriptor *desc = reg->mem_h;
+
+   if (!desc)
+   return;
+
+   reg->mem_h = NULL;
+   spin_lock_bh(&ib_conn->lock);
+   list_add_tail(&desc->list, &ib_conn->fastreg.pool);
+   spin_unlock_bh(&ib_conn->lock);
+}
+
+/**
  * iser_reg_rdma_mem_fmr - Registers memory intended for RDMA,
  * using FMR (if possible) obtaining rkey and va
  *
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 7ee4926..986b5f4 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -992,93 +992,6 @@ connect_failure:
return err;
 }
 
-/**
- * iser_reg_page_vec - Register physical memory
- *
- * returns: 0 on success, errno code on failure
- */
-int iser_reg_page_vec(struct ib_conn *ib_conn,
- struct 

[PATCH 05/18] IB/iser: Remove a redundant struct iser_data_buf

2015-03-29 Thread Sagi Grimberg
No need to keep two iser_data_buf structures just in case we use
mem copy. We can avoid that by adding a pointer to the original
sg. So keep only two iser_data_buf structures per command (data and
protection) and pass the relevant data_buf to the bounce buffer routine.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
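Note for readers (not part of the patch): the idea of keeping a pointer to
the original sg instead of a second iser_data_buf can be sketched in user
space roughly as follows. This is only an illustration under stated
assumptions: flat byte arrays stand in for scatterlists, and struct data_buf,
start_bounce() and finalize_bounce() are hypothetical names, not the
driver's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for iser_data_buf; a byte array replaces the sg list. */
struct data_buf {
	unsigned char *sg;       /* buffer currently used for the transfer */
	size_t         len;      /* total byte length */
	unsigned char *orig_sg;  /* original buffer, set only while bouncing */
	unsigned char *copy_buf; /* allocated bounce buffer, or NULL */
};

/* Switch the descriptor to a bounce buffer, remembering the original. */
static int start_bounce(struct data_buf *d, int copy_out)
{
	unsigned char *copy = malloc(d->len);

	if (!copy)
		return -1;
	if (copy_out)                    /* write direction: data goes out */
		memcpy(copy, d->sg, d->len);
	d->copy_buf = copy;
	d->orig_sg = d->sg;
	d->sg = copy;
	return 0;
}

/* Copy back if we were reading, free the bounce buffer, restore the original. */
static void finalize_bounce(struct data_buf *d, int copy_in)
{
	assert(d->copy_buf != NULL && d->orig_sg != NULL);
	if (copy_in)                     /* read direction: data came in */
		memcpy(d->orig_sg, d->copy_buf, d->len);
	free(d->copy_buf);
	d->sg = d->orig_sg;
	d->orig_sg = NULL;
	d->copy_buf = NULL;
}
```

With this layout a single descriptor per direction is enough: a non-NULL
copy_buf tells the finalize path that a bounce buffer is in use, which is
the same test the patch applies to iser_task->data[dir].copy_buf.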
 drivers/infiniband/ulp/iser/iscsi_iser.h |   12 ++---
 drivers/infiniband/ulp/iser/iser_initiator.c |   16 +++
 drivers/infiniband/ulp/iser/iser_memory.c|   58 ++---
 3 files changed, 34 insertions(+), 52 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index b47aea1..5c7036c 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -218,20 +218,23 @@ enum iser_data_dir {
 /**
  * struct iser_data_buf - iSER data buffer
  *
- * @buf:  pointer to the sg list
+ * @sg:   pointer to the sg list
  * @size: num entries of this sg
  * @data_len: total beffer byte len
  * @dma_nents:returned by dma_map_sg
  * @copy_buf: allocated copy buf for SGs unaligned
  *for rdma which are copied
+ * @orig_sg:  pointer to the original sg list (in case
+ *we used a copy)
  * @sg_single:SG-ified clone of a non SG SC or
  *unaligned SG
  */
 struct iser_data_buf {
-   void   *buf;
+   struct scatterlist *sg;
unsigned int   size;
unsigned long  data_len;
unsigned int   dma_nents;
+   struct scatterlist *orig_sg;
char   *copy_buf;
struct scatterlist sg_single;
   };
@@ -536,9 +539,7 @@ struct iser_conn {
  * @dir:  iser data direction
  * @rdma_regd:task rdma registration desc
  * @data: iser data buffer desc
- * @data_copy:iser data copy buffer desc (bounce buffer)
  * @prot: iser protection buffer desc
- * @prot_copy:iser protection copy buffer desc (bounce buffer)
  */
 struct iscsi_iser_task {
struct iser_tx_desc  desc;
@@ -549,9 +550,7 @@ struct iscsi_iser_task {
int  dir[ISER_DIRS_NUM];
struct iser_regd_buf rdma_regd[ISER_DIRS_NUM];
struct iser_data_buf data[ISER_DIRS_NUM];
-   struct iser_data_buf data_copy[ISER_DIRS_NUM];
struct iser_data_buf prot[ISER_DIRS_NUM];
-   struct iser_data_buf prot_copy[ISER_DIRS_NUM];
 };
 
 struct iser_page_vec {
@@ -621,7 +620,6 @@ void iser_free_rx_descriptors(struct iser_conn *iser_conn);
 
 void iser_finalize_rdma_unaligned_sg(struct iscsi_iser_task *iser_task,
 struct iser_data_buf *mem,
-struct iser_data_buf *mem_copy,
 enum iser_data_dir cmd_dir);
 
 int  iser_reg_rdma_mem_fmr(struct iscsi_iser_task *task,
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 76eb57b..0e414db 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -401,13 +401,13 @@ int iser_send_command(struct iscsi_conn *conn,
}
 
if (scsi_sg_count(sc)) { /* using a scatter list */
-   data_buf->buf  = scsi_sglist(sc);
+   data_buf->sg = scsi_sglist(sc);
data_buf->size = scsi_sg_count(sc);
}
data_buf->data_len = scsi_bufflen(sc);
 
if (scsi_prot_sg_count(sc)) {
-   prot_buf->buf  = scsi_prot_sglist(sc);
+   prot_buf->sg  = scsi_prot_sglist(sc);
prot_buf->size = scsi_prot_sg_count(sc);
prot_buf->data_len = (data_buf->data_len >>
 ilog2(sc->device->sector_size)) * 8;
@@ -674,35 +674,31 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
/* if we were reading, copy back to unaligned sglist,
 * anyway dma_unmap and free the copy
 */
-   if (iser_task->data_copy[ISER_DIR_IN].copy_buf != NULL) {
+   if (iser_task->data[ISER_DIR_IN].copy_buf) {
is_rdma_data_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->data[ISER_DIR_IN],
-   &iser_task->data_copy[ISER_DIR_IN],
ISER_DIR_IN);
}
 
-   if (iser_task->data_copy[ISER_DIR_OUT].copy_buf != NULL) {
+   if (iser_task->data[ISER_DIR_OUT].copy_buf) {
is_rdma_data_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_task,
&iser_task->data[ISER_DIR_OUT],
-   
&iser_task->data_c

[PATCH 14/18] IB/iser: Modify struct iser_mem_reg members

2015-03-29 Thread Sagi Grimberg
No need to keep separate lkey, va and len variables; we can keep
them as a struct ib_sge. This will help when we change the
memory registration logic.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
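Note for readers (not part of the patch): a small user-space sketch of why
embedding the sge helps. The field names addr, length and lkey are the ones
the hunks below use; struct sge, struct mem_reg and data_out_sge() are
hypothetical stand-ins, and the bounds check is an assumption added for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct ib_sge, using the field names seen in the patch. */
struct sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Stand-in for the reworked struct iser_mem_reg. */
struct mem_reg {
	struct sge sge;   /* local key, address and length in one element */
	uint32_t   rkey;  /* remote key */
	void      *mem_h; /* registration context */
};

/* Build the send element for a data-out segment from the registered sge. */
static struct sge data_out_sge(const struct mem_reg *reg,
			       uint32_t buf_offset, uint32_t seg_len)
{
	struct sge s = reg->sge; /* one struct copy carries addr and lkey */

	assert((uint64_t)buf_offset + seg_len <= reg->sge.length);
	s.addr += buf_offset;
	s.length = seg_len;
	return s;
}
```

The point of the sketch is that a consumer can start from one struct copy
and adjust it, rather than reading three separate members.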
 drivers/infiniband/ulp/iser/iscsi_iser.h |   14 ---
 drivers/infiniband/ulp/iser/iser_initiator.c |   18 +++---
 drivers/infiniband/ulp/iser/iser_memory.c|   30 +-
 3 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 43e4912..b2074e0 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -247,18 +247,14 @@ struct iscsi_endpoint;
 /**
  * struct iser_mem_reg - iSER memory registration info
  *
- * @lkey: MR local key
- * @rkey: MR remote key
- * @va:   MR start address (buffer va)
- * @len:  MR length
+ * @sge:  memory region sg element
+ * @rkey: memory region remote key
  * @mem_h:pointer to registration context (FMR/Fastreg)
  */
 struct iser_mem_reg {
-   u32  lkey;
-   u32  rkey;
-   u64  va;
-   u64  len;
-   void *mem_h;
+   struct ib_sgesge;
+   u32  rkey;
+   void*mem_h;
 };
 
 enum iser_desc_type {
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 420a613..b2e3b77 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -82,11 +82,11 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
 
hdr->flags|= ISER_RSV;
hdr->read_stag = cpu_to_be32(mem_reg->rkey);
-   hdr->read_va   = cpu_to_be64(mem_reg->va);
+   hdr->read_va   = cpu_to_be64(mem_reg->sge.addr);
 
iser_dbg("Cmd itt:%d READ tags RKEY:%#.4X VA:%#llX\n",
 task->itt, mem_reg->rkey,
-(unsigned long long)mem_reg->va);
+(unsigned long long)mem_reg->sge.addr);
 
return 0;
 }
@@ -139,20 +139,20 @@ iser_prepare_write_cmd(struct iscsi_task *task,
if (unsol_sz < edtl) {
hdr->flags |= ISER_WSV;
hdr->write_stag = cpu_to_be32(mem_reg->rkey);
-   hdr->write_va   = cpu_to_be64(mem_reg->va + unsol_sz);
+   hdr->write_va   = cpu_to_be64(mem_reg->sge.addr + unsol_sz);
 
iser_dbg("Cmd itt:%d, WRITE tags, RKEY:%#.4X "
 "VA:%#llX + unsol:%d\n",
 task->itt, mem_reg->rkey,
-(unsigned long long)mem_reg->va, unsol_sz);
+(unsigned long long)mem_reg->sge.addr, unsol_sz);
}
 
if (imm_sz > 0) {
iser_dbg("Cmd itt:%d, WRITE, adding imm.data sz: %d\n",
 task->itt, imm_sz);
-   tx_dsg->addr   = mem_reg->va;
+   tx_dsg->addr = mem_reg->sge.addr;
tx_dsg->length = imm_sz;
-   tx_dsg->lkey   = mem_reg->lkey;
+   tx_dsg->lkey = mem_reg->sge.lkey;
iser_task->desc.num_sge = 2;
}
 
@@ -479,9 +479,9 @@ int iser_send_data_out(struct iscsi_conn *conn,
 
mem_reg = &iser_task->rdma_reg[ISER_DIR_OUT];
tx_dsg = &tx_desc->tx_sg[1];
-   tx_dsg->addr= mem_reg->va + buf_offset;
-   tx_dsg->length  = data_seg_len;
-   tx_dsg->lkey= mem_reg->lkey;
+   tx_dsg->addr = mem_reg->sge.addr + buf_offset;
+   tx_dsg->length = data_seg_len;
+   tx_dsg->lkey = mem_reg->sge.lkey;
tx_desc->num_sge = 2;
 
if (buf_offset + data_seg_len > iser_task->data[ISER_DIR_OUT].data_len) {
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 45f5120..40d22d5 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -400,10 +400,10 @@ int iser_reg_page_vec(struct iscsi_iser_task *iser_task,
return ret;
}
 
-   mem_reg->lkey = fmr->fmr->lkey;
+   mem_reg->sge.lkey = fmr->fmr->lkey;
mem_reg->rkey = fmr->fmr->rkey;
-   mem_reg->va = page_vec->pages[0] + page_vec->offset;
-   mem_reg->len = page_vec->data_size;
+   mem_reg->sge.addr = page_vec->pages[0] + page_vec->offset;
+   mem_reg->sge.length = page_vec->data_size;
mem_reg->mem_h = fmr;
 
return 0;
@@ -479,17 +479,17 @@ int iser_reg_rdma_mem_fmr(struct iscsi_iser_task *iser_task,
if (mem->dma_nents == 1) {
sg = mem->sg;
 
-   mem_reg->lkey = device->mr->lkey;
+   mem_reg->sge.lkey = device->mr->lkey;
mem_reg->rkey = device->mr->rkey;
-   mem_reg->len  = ib_sg_dma_len(ibdev, &sg[0]);
-   mem_reg->va   = ib_sg_dma_address(ibdev, &sg[0]);
+   mem_reg->sge.leng

[PATCH 13/18] IB/iser: Make fastreg pool cache friendly

2015-03-29 Thread Sagi Grimberg
Memory regions are resources that are saved
in the device caches. Increase the probability of
a cache hit by adding the MRU descriptor to the pool
head.

Signed-off-by: Sagi Grimberg 
---
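Note for readers (not part of the patch): the effect of returning a
descriptor to the head rather than the tail can be seen with a tiny
user-space list. This is a sketch only; struct node, add_head(), add_tail(),
pop_first() and next_id_after_put() are hypothetical helpers that imitate
the kernel list primitives, and no device cache is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct node {
	struct node *prev, *next;
	int id;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* Like list_add(): insert right after the head. */
static void add_head(struct node *n, struct node *head)
{
	insert_between(n, head, head->next);
}

/* Like list_add_tail(): insert right before the head. */
static void add_tail(struct node *n, struct node *head)
{
	insert_between(n, head->prev, head);
}

/* Remove and return the first entry, or NULL if the list is empty. */
static struct node *pop_first(struct node *head)
{
	struct node *n = head->next;

	if (n == head)
		return NULL;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	return n;
}

/* Get a descriptor, put it back, and report which one is handed out next. */
static int next_id_after_put(int put_at_head)
{
	struct node head, descs[3];
	struct node *d;
	int i;

	list_init(&head);
	for (i = 0; i < 3; i++) {
		descs[i].id = i;
		add_tail(&descs[i], &head);
	}

	d = pop_first(&head);            /* most recently used descriptor */
	assert(d != NULL);
	if (put_at_head)
		add_head(d, &head);      /* behaviour after this patch */
	else
		add_tail(d, &head);      /* behaviour before this patch */

	d = pop_first(&head);
	assert(d != NULL);
	return d->id;
}
```

Putting at the head hands the same descriptor out again on the next get,
while putting at the tail cycles through every descriptor in the pool first.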
 drivers/infiniband/ulp/iser/iser_memory.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 17a5d70..45f5120 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -63,7 +63,7 @@ iser_reg_desc_put(struct ib_conn *ib_conn,
unsigned long flags;
 
spin_lock_irqsave(&ib_conn->lock, flags);
-   list_add_tail(&desc->list, &ib_conn->fastreg.pool);
+   list_add(&desc->list, &ib_conn->fastreg.pool);
spin_unlock_irqrestore(&ib_conn->lock, flags);
 }
 
-- 
1.7.1



[PATCH 12/18] IB/iser: Move PI context alloc/free to routines

2015-03-29 Thread Sagi Grimberg
Make iser_[create|destroy]_fastreg_desc shorter, more
readable and easily extendable.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
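Note for readers (not part of the patch): the allocation helper follows the
usual goto-based unwind pattern, where each failure label releases what was
acquired before it, in reverse order. A user-space sketch, assuming plain
malloc() in place of the verbs allocators; struct pi_ctx, alloc_step(),
pi_ctx_alloc() and pi_ctx_free() are hypothetical names, and the fail_at
parameter exists only so each unwind path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in for the PI context: three resources acquired in order. */
struct pi_ctx {
	void *prot_frpl;
	void *prot_mr;
	void *sig_mr;
};

/* Hypothetical allocator that can be told to fail. */
static void *alloc_step(int fail)
{
	return fail ? NULL : malloc(16);
}

/* fail_at: 0 = succeed, 1..3 = make the Nth resource allocation fail. */
static int pi_ctx_alloc(struct pi_ctx **out, int fail_at)
{
	struct pi_ctx *ctx;

	*out = NULL;
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;

	ctx->prot_frpl = alloc_step(fail_at == 1);
	if (!ctx->prot_frpl)
		goto prot_frpl_failure;

	ctx->prot_mr = alloc_step(fail_at == 2);
	if (!ctx->prot_mr)
		goto prot_mr_failure;

	ctx->sig_mr = alloc_step(fail_at == 3);
	if (!ctx->sig_mr)
		goto sig_mr_failure;

	*out = ctx;
	return 0;

	/* Unwind in reverse order of acquisition. */
sig_mr_failure:
	free(ctx->prot_mr);
prot_mr_failure:
	free(ctx->prot_frpl);
prot_frpl_failure:
	free(ctx);
	return -ENOMEM;
}

/* Counterpart of the alloc helper: release everything in one place. */
static void pi_ctx_free(struct pi_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->prot_frpl);
	free(ctx->prot_mr);
	free(ctx->sig_mr);
	free(ctx);
}
```

Keeping the labels inside the helper means the caller only needs a single
error check, which is what shortens iser_create_fastreg_desc below.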
 drivers/infiniband/ulp/iser/iser_verbs.c |  118 --
 1 files changed, 63 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 20eec09..cc2dd35 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -274,6 +274,65 @@ void iser_free_fmr_pool(struct ib_conn *ib_conn)
 }
 
 static int
+iser_alloc_pi_ctx(struct ib_device *ib_device, struct ib_pd *pd,
+ struct fast_reg_descriptor *desc)
+{
+   struct iser_pi_context *pi_ctx = NULL;
+   struct ib_mr_init_attr mr_init_attr = {.max_reg_descriptors = 2,
+  .flags = IB_MR_SIGNATURE_EN};
+   int ret = 0;
+
+   desc->pi_ctx = kzalloc(sizeof(*desc->pi_ctx), GFP_KERNEL);
+   if (!desc->pi_ctx)
+   return -ENOMEM;
+
+   pi_ctx = desc->pi_ctx;
+
+   pi_ctx->prot_frpl = ib_alloc_fast_reg_page_list(ib_device,
+   ISCSI_ISER_SG_TABLESIZE);
+   if (IS_ERR(pi_ctx->prot_frpl)) {
+   ret = PTR_ERR(pi_ctx->prot_frpl);
+   goto prot_frpl_failure;
+   }
+
+   pi_ctx->prot_mr = ib_alloc_fast_reg_mr(pd,
+   ISCSI_ISER_SG_TABLESIZE + 1);
+   if (IS_ERR(pi_ctx->prot_mr)) {
+   ret = PTR_ERR(pi_ctx->prot_mr);
+   goto prot_mr_failure;
+   }
+   desc->reg_indicators |= ISER_PROT_KEY_VALID;
+
+   pi_ctx->sig_mr = ib_create_mr(pd, &mr_init_attr);
+   if (IS_ERR(pi_ctx->sig_mr)) {
+   ret = PTR_ERR(pi_ctx->sig_mr);
+   goto sig_mr_failure;
+   }
+   desc->reg_indicators |= ISER_SIG_KEY_VALID;
+   desc->reg_indicators &= ~ISER_FASTREG_PROTECTED;
+
+   return 0;
+
+sig_mr_failure:
+   ib_dereg_mr(desc->pi_ctx->prot_mr);
+prot_mr_failure:
+   ib_free_fast_reg_page_list(desc->pi_ctx->prot_frpl);
+prot_frpl_failure:
+   kfree(desc->pi_ctx);
+
+   return ret;
+}
+
+static void
+iser_free_pi_ctx(struct iser_pi_context *pi_ctx)
+{
+   ib_free_fast_reg_page_list(pi_ctx->prot_frpl);
+   ib_dereg_mr(pi_ctx->prot_mr);
+   ib_destroy_mr(pi_ctx->sig_mr);
+   kfree(pi_ctx);
+}
+
+static int
 iser_create_fastreg_desc(struct ib_device *ib_device, struct ib_pd *pd,
 bool pi_enable, struct fast_reg_descriptor *desc)
 {
@@ -297,59 +356,12 @@ iser_create_fastreg_desc(struct ib_device *ib_device, struct ib_pd *pd,
desc->reg_indicators |= ISER_DATA_KEY_VALID;
 
if (pi_enable) {
-   struct ib_mr_init_attr mr_init_attr = {0};
-   struct iser_pi_context *pi_ctx = NULL;
-
-   desc->pi_ctx = kzalloc(sizeof(*desc->pi_ctx), GFP_KERNEL);
-   if (!desc->pi_ctx) {
-   iser_err("Failed to allocate pi context\n");
-   ret = -ENOMEM;
+   ret = iser_alloc_pi_ctx(ib_device, pd, desc);
+   if (ret)
goto pi_ctx_alloc_failure;
-   }
-   pi_ctx = desc->pi_ctx;
-
-   pi_ctx->prot_frpl = ib_alloc_fast_reg_page_list(ib_device,
-   ISCSI_ISER_SG_TABLESIZE);
-   if (IS_ERR(pi_ctx->prot_frpl)) {
-   ret = PTR_ERR(pi_ctx->prot_frpl);
-   iser_err("Failed to allocate prot frpl ret=%d\n",
-ret);
-   goto prot_frpl_failure;
-   }
-
-   pi_ctx->prot_mr = ib_alloc_fast_reg_mr(pd,
-   ISCSI_ISER_SG_TABLESIZE + 1);
-   if (IS_ERR(pi_ctx->prot_mr)) {
-   ret = PTR_ERR(pi_ctx->prot_mr);
-   iser_err("Failed to allocate prot frmr ret=%d\n",
-ret);
-   goto prot_mr_failure;
-   }
-   desc->reg_indicators |= ISER_PROT_KEY_VALID;
-
-   mr_init_attr.max_reg_descriptors = 2;
-   mr_init_attr.flags |= IB_MR_SIGNATURE_EN;
-   pi_ctx->sig_mr = ib_create_mr(pd, &mr_init_attr);
-   if (IS_ERR(pi_ctx->sig_mr)) {
-   ret = PTR_ERR(pi_ctx->sig_mr);
-   iser_err("Failed to allocate signature enabled mr err=%d\n",
-ret);
-   goto sig_mr_failure;
-   }
-   desc->reg_indicators |= ISER_SIG_KEY_VALID;
}
-   desc->reg_indicators &= ~ISER_FASTREG_PROTECTED;
-
-   iser_dbg("Create fr_desc %p page_list %p\n",
-desc, desc->data_frpl->page_list);
 
retur

[PATCH 11/18] IB/iser: Move fastreg descriptor pool get/put to helper functions

2015-03-29 Thread Sagi Grimberg
Instead of open-coding the connection fastreg pool get/put,
introduce iser_reg_desc_[get|put] helpers.

We aren't making these static as this will be a per-device
routine later on. Also, clean up iser_unreg_mem_fastreg
a bit.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
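Note for readers (not part of the patch): the refactoring wraps "take the
lock, touch the pool, drop the lock" in two helpers. A user-space sketch
under stated assumptions: a C11 atomic_flag spin loop stands in for the
connection spinlock, an array stack stands in for the list (so pool
ordering is not modelled), and struct desc_pool, desc_get() and desc_put()
are hypothetical names. The empty-pool check in desc_get() is an addition
for the example and is not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE 4

struct desc {
	int id;
};

struct desc_pool {
	atomic_flag  lock;              /* stand-in for the spinlock */
	struct desc *free[POOL_SIZE];   /* free descriptors */
	size_t       nfree;
	struct desc  storage[POOL_SIZE];
};

static void pool_lock(struct desc_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		; /* spin until the flag was clear */
}

static void pool_unlock(struct desc_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void pool_init(struct desc_pool *p)
{
	size_t i;

	atomic_flag_clear(&p->lock);
	p->nfree = 0;
	for (i = 0; i < POOL_SIZE; i++) {
		p->storage[i].id = (int)i;
		p->free[p->nfree++] = &p->storage[i];
	}
}

/* Take one descriptor out of the pool, or NULL if none is left. */
static struct desc *desc_get(struct desc_pool *p)
{
	struct desc *d = NULL;

	pool_lock(p);
	if (p->nfree > 0)
		d = p->free[--p->nfree];
	pool_unlock(p);
	return d;
}

/* Return a descriptor to the pool. */
static void desc_put(struct desc_pool *p, struct desc *d)
{
	pool_lock(p);
	assert(p->nfree < POOL_SIZE);
	p->free[p->nfree++] = d;
	pool_unlock(p);
}
```

Callers then never touch the lock themselves, which is why the registration
and unregistration paths in the hunks below lose their flags variables and
open-coded list calls.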
 drivers/infiniband/ulp/iser/iscsi_iser.h  |5 +++
 drivers/infiniband/ulp/iser/iser_memory.c |   50 ++--
 2 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 198f928..43e4912 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -644,4 +644,9 @@ int iser_create_fastreg_pool(struct ib_conn *ib_conn, unsigned cmds_max);
 void iser_free_fastreg_pool(struct ib_conn *ib_conn);
 u8 iser_check_task_pi_status(struct iscsi_iser_task *iser_task,
 enum iser_data_dir cmd_dir, sector_t *sector);
+struct fast_reg_descriptor *
+iser_reg_desc_get(struct ib_conn *ib_conn);
+void
+iser_reg_desc_put(struct ib_conn *ib_conn,
+ struct fast_reg_descriptor *desc);
 #endif
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 6e6b753..17a5d70 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -41,6 +41,32 @@
 
 #define ISER_KMALLOC_THRESHOLD 0x2 /* 128K - kmalloc limit */
 
+struct fast_reg_descriptor *
+iser_reg_desc_get(struct ib_conn *ib_conn)
+{
+   struct fast_reg_descriptor *desc;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ib_conn->lock, flags);
+   desc = list_first_entry(&ib_conn->fastreg.pool,
+   struct fast_reg_descriptor, list);
+   list_del(&desc->list);
+   spin_unlock_irqrestore(&ib_conn->lock, flags);
+
+   return desc;
+}
+
+void
+iser_reg_desc_put(struct ib_conn *ib_conn,
+ struct fast_reg_descriptor *desc)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ib_conn->lock, flags);
+   list_add_tail(&desc->list, &ib_conn->fastreg.pool);
+   spin_unlock_irqrestore(&ib_conn->lock, flags);
+}
+
 /**
  * iser_start_rdma_unaligned_sg
  */
@@ -409,17 +435,13 @@ void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
enum iser_data_dir cmd_dir)
 {
struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir];
-   struct iser_conn *iser_conn = iser_task->iser_conn;
-   struct ib_conn *ib_conn = &iser_conn->ib_conn;
-   struct fast_reg_descriptor *desc = reg->mem_h;
 
-   if (!desc)
+   if (!reg->mem_h)
return;
 
+   iser_reg_desc_put(&iser_task->iser_conn->ib_conn,
+ reg->mem_h);
reg->mem_h = NULL;
-   spin_lock_bh(&ib_conn->lock);
-   list_add_tail(&desc->list, &ib_conn->fastreg.pool);
-   spin_unlock_bh(&ib_conn->lock);
 }
 
 /**
@@ -725,7 +747,6 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
struct fast_reg_descriptor *desc = NULL;
struct ib_sge data_sge;
int err, aligned_len;
-   unsigned long flags;
 
aligned_len = iser_data_buf_aligned_len(mem, ibdev);
if (aligned_len != mem->dma_nents) {
@@ -739,11 +760,7 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
 
if (mem->dma_nents != 1 ||
scsi_get_prot_op(iser_task->sc) != SCSI_PROT_NORMAL) {
-   spin_lock_irqsave(&ib_conn->lock, flags);
-   desc = list_first_entry(&ib_conn->fastreg.pool,
-   struct fast_reg_descriptor, list);
-   list_del(&desc->list);
-   spin_unlock_irqrestore(&ib_conn->lock, flags);
+   desc = iser_reg_desc_get(ib_conn);
mem_reg->mem_h = desc;
}
 
@@ -799,11 +816,8 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
 
return 0;
 err_reg:
-   if (desc) {
-   spin_lock_irqsave(&ib_conn->lock, flags);
-   list_add_tail(&desc->list, &ib_conn->fastreg.pool);
-   spin_unlock_irqrestore(&ib_conn->lock, flags);
-   }
+   if (desc)
+   iser_reg_desc_put(ib_conn, desc);
 
return err;
 }
-- 
1.7.1



[PATCH 17/18] IB/iser: Bump version to 1.6

2015-03-29 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg 
---
 drivers/infiniband/ulp/iser/iscsi_iser.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index b2074e0..9c15f37 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -69,7 +69,7 @@
 
 #define DRV_NAME   "iser"
 #define PFXDRV_NAME ": "
-#define DRV_VER"1.5"
+#define DRV_VER"1.6"
 
 #define iser_dbg(fmt, arg...)   \
do { \
-- 
1.7.1



[PATCH 04/18] IB/iser: Remove redundant cmd_data_len calculation

2015-03-29 Thread Sagi Grimberg
This code was added before we had the protection data length
calculation (in iser_send_command), so we needed to calculate
the sg data length from the sg itself. This is not needed
anymore.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
 drivers/infiniband/ulp/iser/iser_memory.c |5 +
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 341040b..32ccd5c 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -53,12 +53,9 @@ static int iser_start_rdma_unaligned_sg(struct iscsi_iser_task *iser_task,
struct scatterlist *sgl = (struct scatterlist *)data->buf;
struct scatterlist *sg;
char *mem = NULL;
-   unsigned long  cmd_data_len = 0;
+   unsigned long  cmd_data_len = data->data_len;
int dma_nents, i;
 
-   for_each_sg(sgl, sg, data->size, i)
-   cmd_data_len += ib_sg_dma_len(dev, sg);
-
if (cmd_data_len > ISER_KMALLOC_THRESHOLD)
mem = (void *)__get_free_pages(GFP_ATOMIC,
  ilog2(roundup_pow_of_two(cmd_data_len)) - PAGE_SHIFT);
-- 
1.7.1



[PATCH 06/18] IB/iser: Don't pass ib_device to fall_to_bounce_buff routine

2015-03-29 Thread Sagi Grimberg
No need to pass the ib_device; we can take it from the task.
At a later stage, this function will be invoked
according to a device capability.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
 drivers/infiniband/ulp/iser/iser_memory.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index beeabd0..9c60ff1 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -334,19 +334,19 @@ void iser_dma_unmap_task_data(struct iscsi_iser_task *iser_task,
 }
 
 static int fall_to_bounce_buf(struct iscsi_iser_task *iser_task,
- struct ib_device *ibdev,
  struct iser_data_buf *mem,
  enum iser_data_dir cmd_dir,
  int aligned_len)
 {
-   struct iscsi_conn*iscsi_conn = iser_task->iser_conn->iscsi_conn;
+   struct iscsi_conn *iscsi_conn = iser_task->iser_conn->iscsi_conn;
+   struct iser_device *device = iser_task->iser_conn->ib_conn.device;
 
iscsi_conn->fmr_unalign_cnt++;
iser_warn("rdma alignment violation (%d/%d aligned) or FMR not supported\n",
  aligned_len, mem->size);
 
if (iser_debug_level > 0)
-   iser_data_buf_dump(mem, ibdev);
+   iser_data_buf_dump(mem, device->ib_device);
 
/* unmap the command data before accessing it */
iser_dma_unmap_task_data(iser_task, mem,
@@ -384,7 +384,7 @@ int iser_reg_rdma_mem_fmr(struct iscsi_iser_task *iser_task,
 
aligned_len = iser_data_buf_aligned_len(mem, ibdev);
if (aligned_len != mem->dma_nents) {
-   err = fall_to_bounce_buf(iser_task, ibdev, mem,
+   err = fall_to_bounce_buf(iser_task, mem,
 cmd_dir, aligned_len);
if (err) {
iser_err("failed to allocate bounce buffer\n");
@@ -669,7 +669,7 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
 
aligned_len = iser_data_buf_aligned_len(mem, ibdev);
if (aligned_len != mem->dma_nents) {
-   err = fall_to_bounce_buf(iser_task, ibdev, mem,
+   err = fall_to_bounce_buf(iser_task, mem,
 cmd_dir, aligned_len);
if (err) {
iser_err("failed to allocate bounce buffer\n");
@@ -700,7 +700,7 @@ int iser_reg_rdma_mem_fastreg(struct iscsi_iser_task *iser_task,
mem = &iser_task->prot[cmd_dir];
aligned_len = iser_data_buf_aligned_len(mem, ibdev);
if (aligned_len != mem->dma_nents) {
-   err = fall_to_bounce_buf(iser_task, ibdev, mem,
+   err = fall_to_bounce_buf(iser_task, mem,
 cmd_dir, aligned_len);
if (err) {
iser_err("failed to allocate bounce buffer\n");
-- 
1.7.1



[PATCH 08/18] IB/iser: Remove redundant assignments in iser_reg_page_vec

2015-03-29 Thread Sagi Grimberg
The buffer length was assigned twice, and there is no reason to set va
to io_addr and then add the offset; just set va to io_addr + offset.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
 drivers/infiniband/ulp/iser/iser_memory.c |7 ++-
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 4e0cbbb..cb30865 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -392,12 +392,9 @@ int iser_reg_page_vec(struct ib_conn *ib_conn,
 
mem_reg->lkey  = mem->fmr->lkey;
mem_reg->rkey  = mem->fmr->rkey;
-   mem_reg->len   = page_vec->length * SIZE_4K;
-   mem_reg->va= io_addr;
-   mem_reg->mem_h = (void *)mem;
-
-   mem_reg->va   += page_vec->offset;
mem_reg->len   = page_vec->data_size;
+   mem_reg->va= io_addr + page_vec->offset;
+   mem_reg->mem_h = (void *)mem;
 
iser_dbg("PHYSICAL Mem.register, [PHYS p_array: 0x%p, sz: %d, "
 "entry[0]: (0x%08lx,%ld)] -> "
-- 
1.7.1



[PATCH 02/18] IB/iser: Handle fastreg/local_inv completion errors

2015-03-29 Thread Sagi Grimberg
Fast registration and local invalidate work requests can
also fail. We should call the error completion handler for them.

Reported-by: Roi Dayan 
Signed-off-by: Sagi Grimberg 
---
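Note for readers (not part of the patch): the resulting control flow for
error completions can be sketched in user space as below. Assumptions are
labelled: the two sentinel values are placeholders, struct conn_events and
both functions are hypothetical, and the condition under which the driver
reports a connection failure is not visible in the hunks, so it is
simplified here to "always".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder sentinels; the driver's actual values are not shown here. */
#define FASTREG_LI_WRID UINT64_MAX
#define BEACON_WRID     (UINT64_MAX - 1)

/* Counters standing in for the side effects of the real handlers. */
struct conn_events {
	int conn_failures;   /* connection failure reported to iSCSI */
	int desc_cleanups;   /* tx/rx descriptor bookkeeping performed */
	int flush_completed; /* beacon seen, flush is done */
};

/* Error handler: fastreg/local-invalidate ids have no descriptor to clean. */
static void handle_comp_error(struct conn_events *ev, uint64_t wr_id)
{
	ev->conn_failures++;
	if (wr_id == FASTREG_LI_WRID)
		return;
	ev->desc_cleanups++;
}

/* Dispatch for a failed work completion, as restructured by the patch. */
static void handle_error_wc(struct conn_events *ev, uint64_t wr_id)
{
	assert(ev != NULL);
	if (wr_id == BEACON_WRID)
		ev->flush_completed = 1; /* all flush errors were consumed */
	else
		handle_comp_error(ev, wr_id);
}
```

Before the patch, a failed fastreg or local-invalidate request skipped the
error handler entirely; in this sketch it now reaches the connection
failure step but returns before any descriptor handling.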
 drivers/infiniband/ulp/iser/iser_verbs.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 070c5af..7ee4926 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -1210,6 +1210,9 @@ iser_handle_comp_error(struct ib_conn *ib_conn,
iscsi_conn_failure(iser_conn->iscsi_conn,
   ISCSI_ERR_CONN_FAILED);
 
+   if (wc->wr_id == ISER_FASTREG_LI_WRID)
+   return;
+
if (is_iser_tx_desc(iser_conn, wr_id)) {
struct iser_tx_desc *desc = wr_id;
 
@@ -1254,13 +1257,11 @@ static void iser_handle_wc(struct ib_wc *wc)
else
iser_dbg("flush error: wr id %llx\n", wc->wr_id);
 
-   if (wc->wr_id != ISER_FASTREG_LI_WRID &&
-   wc->wr_id != ISER_BEACON_WRID)
-   iser_handle_comp_error(ib_conn, wc);
-
-   /* complete in case all flush errors were consumed */
if (wc->wr_id == ISER_BEACON_WRID)
+   /* all flush errors were consumed */
complete(&ib_conn->flush_comp);
+   else
+   iser_handle_comp_error(ib_conn, wc);
}
 }
 
-- 
1.7.1



[PATCH 03/18] IB/iser: Fix wrong calculation of protection buffer length

2015-03-29 Thread Sagi Grimberg
This length miscalculation may cause silent data corruption
in the DIX case and cause the device to reference an unmapped area.

Fixes: d77e65350f2d ('libiscsi, iser: Adjust data_length to include protection information')
Signed-off-by: Sagi Grimberg 
---
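Note for readers (not part of the patch): the bug is an operator precedence
issue, since in C '*' binds tighter than '>>'. A user-space sketch, assuming
power-of-two sector sizes; log2_pow2(), prot_len_buggy() and
prot_len_fixed() are hypothetical names, with log2_pow2() standing in for
the kernel's ilog2(). For a 512-byte sector the unparenthesized shift count
is 9 * 8 = 72, which exceeds the width of a 64-bit value, so the C result is
undefined; the sketch returns 0 in that case purely as a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2 of a power of two (stand-in for the kernel's ilog2()). */
static unsigned int log2_pow2(uint32_t v)
{
	unsigned int n = 0;

	assert(v != 0 && (v & (v - 1)) == 0); /* power of two only */
	while (v > 1) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Pre-patch expression: data_len >> (ilog2(sector_size) * 8). */
static uint64_t prot_len_buggy(uint64_t data_len, uint32_t sector_size)
{
	unsigned int shift = log2_pow2(sector_size) * 8;

	/* Shifting by >= 64 is undefined in C; 0 is a placeholder here. */
	return shift >= 64 ? 0 : data_len >> shift;
}

/* Post-patch expression: (data_len >> ilog2(sector_size)) * 8. */
static uint64_t prot_len_fixed(uint64_t data_len, uint32_t sector_size)
{
	return (data_len >> log2_pow2(sector_size)) * 8;
}
```

For example, 4096 bytes of data on 512-byte sectors is 8 sectors, so the
fixed form yields 8 * 8 = 64 bytes of protection data, matching the "* 8"
per-sector factor in the patch.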
 drivers/infiniband/ulp/iser/iser_initiator.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 20e859a..76eb57b 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -409,8 +409,8 @@ int iser_send_command(struct iscsi_conn *conn,
if (scsi_prot_sg_count(sc)) {
prot_buf->buf  = scsi_prot_sglist(sc);
prot_buf->size = scsi_prot_sg_count(sc);
-   prot_buf->data_len = data_buf->data_len >>
-ilog2(sc->device->sector_size) * 8;
+   prot_buf->data_len = (data_buf->data_len >>
+ilog2(sc->device->sector_size)) * 8;
}
 
if (hdr->flags & ISCSI_FLAG_CMD_READ) {
-- 
1.7.1



[PATCH 09/18] IB/iser: Get rid of struct iser_rdma_regd

2015-03-29 Thread Sagi Grimberg
The members of this struct other than struct iser_mem_reg are unused,
so remove the struct altogether.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg 
Signed-off-by: Adir Lev 
---
 drivers/infiniband/ulp/iser/iscsi_iser.h |   19 +
 drivers/infiniband/ulp/iser/iser_initiator.c |   44 ++--
 drivers/infiniband/ulp/iser/iser_memory.c|   56 +-
 drivers/infiniband/ulp/iser/iser_verbs.c |2 +-
 4 files changed, 52 insertions(+), 69 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index d5e5288..198f928 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -261,23 +261,6 @@ struct iser_mem_reg {
void *mem_h;
 };
 
-/**
- * struct iser_regd_buf - iSER buffer registration desc
- *
- * @reg:  memory registration info
- * @virt_addr:virtual address of buffer
- * @device:   reference to iser device
- * @direction:dma direction (for dma_unmap)
- * @data_size:data buffer size in bytes
- */
-struct iser_regd_buf {
-   struct iser_mem_reg reg;
-   void*virt_addr;
-   struct iser_device  *device;
-   enum dma_data_direction direction;
-   unsigned intdata_size;
-};
-
 enum iser_desc_type {
ISCSI_TX_CONTROL ,
ISCSI_TX_SCSI_COMMAND,
@@ -548,7 +531,7 @@ struct iscsi_iser_task {
struct scsi_cmnd *sc;
int  command_sent;
int  dir[ISER_DIRS_NUM];
-   struct iser_regd_buf rdma_regd[ISER_DIRS_NUM];
+   struct iser_mem_reg  rdma_reg[ISER_DIRS_NUM];
struct iser_data_buf data[ISER_DIRS_NUM];
struct iser_data_buf prot[ISER_DIRS_NUM];
 };
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 0e414db..420a613 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -50,7 +50,7 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
 {
struct iscsi_iser_task *iser_task = task->dd_data;
struct iser_device  *device = iser_task->iser_conn->ib_conn.device;
-   struct iser_regd_buf *regd_buf;
+   struct iser_mem_reg *mem_reg;
int err;
struct iser_hdr *hdr = &iser_task->desc.iser_header;
struct iser_data_buf *buf_in = &iser_task->data[ISER_DIR_IN];
@@ -78,15 +78,15 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
iser_err("Failed to set up Data-IN RDMA\n");
return err;
}
-   regd_buf = &iser_task->rdma_regd[ISER_DIR_IN];
+   mem_reg = &iser_task->rdma_reg[ISER_DIR_IN];
 
hdr->flags    |= ISER_RSV;
-   hdr->read_stag = cpu_to_be32(regd_buf->reg.rkey);
-   hdr->read_va   = cpu_to_be64(regd_buf->reg.va);
+   hdr->read_stag = cpu_to_be32(mem_reg->rkey);
+   hdr->read_va   = cpu_to_be64(mem_reg->va);
 
iser_dbg("Cmd itt:%d READ tags RKEY:%#.4X VA:%#llX\n",
-task->itt, regd_buf->reg.rkey,
-(unsigned long long)regd_buf->reg.va);
+task->itt, mem_reg->rkey,
+(unsigned long long)mem_reg->va);
 
return 0;
 }
@@ -104,7 +104,7 @@ iser_prepare_write_cmd(struct iscsi_task *task,
 {
struct iscsi_iser_task *iser_task = task->dd_data;
struct iser_device  *device = iser_task->iser_conn->ib_conn.device;
-   struct iser_regd_buf *regd_buf;
+   struct iser_mem_reg *mem_reg;
int err;
struct iser_hdr *hdr = &iser_task->desc.iser_header;
struct iser_data_buf *buf_out = &iser_task->data[ISER_DIR_OUT];
@@ -134,25 +134,25 @@ iser_prepare_write_cmd(struct iscsi_task *task,
return err;
}
 
-   regd_buf = &iser_task->rdma_regd[ISER_DIR_OUT];
+   mem_reg = &iser_task->rdma_reg[ISER_DIR_OUT];
 
if (unsol_sz < edtl) {
hdr->flags |= ISER_WSV;
-   hdr->write_stag = cpu_to_be32(regd_buf->reg.rkey);
-   hdr->write_va   = cpu_to_be64(regd_buf->reg.va + unsol_sz);
+   hdr->write_stag = cpu_to_be32(mem_reg->rkey);
+   hdr->write_va   = cpu_to_be64(mem_reg->va + unsol_sz);
 
iser_dbg("Cmd itt:%d, WRITE tags, RKEY:%#.4X "
 "VA:%#llX + unsol:%d\n",
-task->itt, regd_buf->reg.rkey,
-(unsigned long long)regd_buf->reg.va, unsol_sz);
+task->itt, mem_reg->rkey,
+(unsigned long long)mem_reg->va, unsol_sz);
}
 
if (imm_sz > 0) {
iser_dbg("Cmd itt:%d, WRITE, adding imm.data sz: %d\n",
 task->itt, imm_sz);
-   tx_dsg->addr   = regd_buf->reg.va;
+   tx_dsg->addr   = mem_reg->va;
 

iser patches for kernel 4.1

2015-03-29 Thread Sagi Grimberg
Hi Roland,

This set contains bug fixes as well as code refactoring patches:

- Patches 1-3:     Bug fixes (stable material)
- Patches 4-6:     Bounce buffer related code cleanups
- Patches 7-12,16: Refactoring
- Patches 13-15:   Minor optimizations
- Patch 17:        Version bump
- Patch 18:        Bounce buffer rewrite to handle situations where
                   large transfers are unaligned causing iser to
                   use high order allocations
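
To make "high order allocations" concrete: copying an unaligned transfer into one physically contiguous bounce buffer needs a page allocation whose order grows with the transfer size. The sketch below only illustrates that arithmetic. It assumes 4 KiB pages, `sketch_get_order` and `sketch_nr_pages` are hypothetical helpers rather than iser functions, and it does not describe how patch 18 is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * This is the quantity the kernel's get_order() computes.
 */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Number of independent single pages needed to cover the same size. */
static size_t sketch_nr_pages(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions a 512 KiB transfer needs an order-7 contiguous allocation and a 1 MiB transfer an order-8 one, whereas the same data fits in 128 or 256 separately allocated order-0 pages.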

Sagi Grimberg (18):
  IB/iser: Fix unload during ep_poll wrong dereference
  IB/iser: Handle fastreg/local_inv completion errors
  IB/iser: Fix wrong calculation of protection buffer length
  IB/iser: Remove redundant cmd_data_len calculation
  IB/iser: Remove a redundant struct iser_data_buf
  IB/iser: Don't pass ib_device to fall_to_bounce_buff routine
  IB/iser: Move memory reg/dereg routines to iser_memory.c
  IB/iser: Remove redundant assignments in iser_reg_page_vec
  IB/iser: Get rid of struct iser_rdma_regd
  IB/iser: Merge build page-vec into register page-vec
  IB/iser: Move fastreg descriptor pool get/put to helper functions
  IB/iser: Move PI context alloc/free to routines
  IB/iser: Make fastreg pool cache friendly
  IB/iser: Modify struct iser_mem_reg members
  IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and
    iser_reg_sig_mr
  IB/iser: Remove code duplication for a single DMA entry
  IB/iser: Bump version to 1.6
  IB/iser: Rewrite bounce buffer code path

 drivers/infiniband/ulp/iser/iscsi_iser.h     |   64 +---
 drivers/infiniband/ulp/iser/iser_initiator.c |   66 ++--
 drivers/infiniband/ulp/iser/iser_memory.c    |  524 +++---
 drivers/infiniband/ulp/iser/iser_verbs.c     |  220 
 4 files changed, 431 insertions(+), 443 deletions(-)
