[patch] mlx5: use after free in mlx5_cmd_comp_handler()

2013-07-22 Thread Dan Carpenter
We can't dereference ent after passing it to free_cmd().

Signed-off-by: Dan Carpenter dan.carpen...@oracle.com

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c 
b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 205753a..4037406 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1113,7 +1113,13 @@ void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, 
unsigned long vector)
 
for (i = 0; i  (1  cmd-log_sz); i++) {
if (test_bit(i, vector)) {
+   struct semaphore *sem;
+
ent = cmd-ent_arr[i];
+   if (ent-page_queue)
+   sem = cmd-pages_sem;
+   else
+   sem = cmd-sem;
ktime_get_ts(ent-ts2);
memcpy(ent-out-first.data, ent-lay-out, 
sizeof(ent-lay-out));
dump_command(dev, ent, 0);
@@ -1136,10 +1142,7 @@ void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, 
unsigned long vector)
} else {
complete(ent-done);
}
-   if (ent-page_queue)
-   up(cmd-pages_sem);
-   else
-   up(cmd-sem);
+   up(sem);
}
}
 }
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-3.11 7/7] IB/iser: Introduce fast memory registration model (FRWR)

2013-07-22 Thread Bart Van Assche

On 07/18/13 15:25, Or Gerlitz wrote:

+static int iser_fast_reg_mr(struct fast_reg_descriptor *desc,
+   struct iser_conn *ib_conn,
+   struct iser_regd_buf *regd_buf,
+   u32 offset, unsigned int data_size,
+   unsigned int page_list_len)
+{
+   struct ib_send_wr fastreg_wr, inv_wr;
+   struct ib_send_wr *bad_wr, *wr = NULL;
+   u8 key;
+   int ret;
+
+   if (!desc-valid) {
+   memset(inv_wr, 0, sizeof(inv_wr));
+   inv_wr.opcode = IB_WR_LOCAL_INV;
+   inv_wr.send_flags = IB_SEND_SIGNALED;
+   inv_wr.ex.invalidate_rkey = desc-data_mr-rkey;
+   wr = inv_wr;
+   /* Bump the key */
+   key = (u8)(desc-data_mr-rkey  0x00FF);
+   ib_update_fast_reg_key(desc-data_mr, ++key);
+   }
+
+   /* Prepare FASTREG WR */
+   memset(fastreg_wr, 0, sizeof(fastreg_wr));
+   fastreg_wr.opcode = IB_WR_FAST_REG_MR;
+   fastreg_wr.send_flags = IB_SEND_SIGNALED;
+   fastreg_wr.wr.fast_reg.iova_start = desc-data_frpl-page_list[0] + 
offset;
+   fastreg_wr.wr.fast_reg.page_list = desc-data_frpl;
+   fastreg_wr.wr.fast_reg.page_list_len = page_list_len;
+   fastreg_wr.wr.fast_reg.page_shift = SHIFT_4K;
+   fastreg_wr.wr.fast_reg.length = data_size;
+   fastreg_wr.wr.fast_reg.rkey = desc-data_mr-rkey;
+   fastreg_wr.wr.fast_reg.access_flags = (IB_ACCESS_LOCAL_WRITE  |
+  IB_ACCESS_REMOTE_WRITE |
+  IB_ACCESS_REMOTE_READ);


Hello Sagi,

If I interpret the above code correctly the rkey used in the previous 
FRWR is invalidated as soon as a new FRWR is queued. Does this mean that 
the iSER initiator limits queue depth to one ?


Another question: is it on purpose that iscsi_iser_cleanup_task() does 
not invalidate an rkey if a command has been aborted successfully ? A 
conforming iSER target does not send a response for aborted commands. 
Will successful command abortion result in the rkey not being 
invalidated ? What will happen if a new FRWR is submitted with an rkey 
that is still valid ?


Thanks,

Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-3.11 7/7] IB/iser: Introduce fast memory registration model (FRWR)

2013-07-22 Thread Sagi Grimberg

On 7/22/2013 2:46 PM, Bart Van Assche wrote:

On 07/18/13 15:25, Or Gerlitz wrote:

+static int iser_fast_reg_mr(struct fast_reg_descriptor *desc,
+struct iser_conn *ib_conn,
+struct iser_regd_buf *regd_buf,
+u32 offset, unsigned int data_size,
+unsigned int page_list_len)
+{
+struct ib_send_wr fastreg_wr, inv_wr;
+struct ib_send_wr *bad_wr, *wr = NULL;
+u8 key;
+int ret;
+
+if (!desc-valid) {
+memset(inv_wr, 0, sizeof(inv_wr));
+inv_wr.opcode = IB_WR_LOCAL_INV;
+inv_wr.send_flags = IB_SEND_SIGNALED;
+inv_wr.ex.invalidate_rkey = desc-data_mr-rkey;
+wr = inv_wr;
+/* Bump the key */
+key = (u8)(desc-data_mr-rkey  0x00FF);
+ib_update_fast_reg_key(desc-data_mr, ++key);
+}
+
+/* Prepare FASTREG WR */
+memset(fastreg_wr, 0, sizeof(fastreg_wr));
+fastreg_wr.opcode = IB_WR_FAST_REG_MR;
+fastreg_wr.send_flags = IB_SEND_SIGNALED;
+fastreg_wr.wr.fast_reg.iova_start = 
desc-data_frpl-page_list[0] + offset;

+fastreg_wr.wr.fast_reg.page_list = desc-data_frpl;
+fastreg_wr.wr.fast_reg.page_list_len = page_list_len;
+fastreg_wr.wr.fast_reg.page_shift = SHIFT_4K;
+fastreg_wr.wr.fast_reg.length = data_size;
+fastreg_wr.wr.fast_reg.rkey = desc-data_mr-rkey;
+fastreg_wr.wr.fast_reg.access_flags = (IB_ACCESS_LOCAL_WRITE  |
+   IB_ACCESS_REMOTE_WRITE |
+   IB_ACCESS_REMOTE_READ);


Hello Sagi,

If I interpret the above code correctly the rkey used in the previous 
FRWR is invalidated as soon as a new FRWR is queued. Does this mean 
that the iSER initiator limits queue depth to one ?


Another question: is it on purpose that iscsi_iser_cleanup_task() does 
not invalidate an rkey if a command has been aborted successfully ? A 
conforming iSER target does not send a response for aborted commands. 
Will successful command abortion result in the rkey not being 
invalidated ? What will happen if a new FRWR is submitted with an rkey 
that is still valid ?


Thanks,

Bart.



Hey Bart,

You interpret correctly, iSER will local invalidate the rkey just before 
re-using it (conditioned that it is not previously invalidated - 
remotely by the target).
This code is still missing the remote invalidate part, then iSER 
initiator will publish its remote invalidate support and in case the 
target may remote invalidate (seen in the RSP completion) the rkey
and the Initiator will pick it up in the RSP completion and mark the 
associated MR as valid (valid for use again).


Not sure what you meant in your question, but this does not mean that 
iSER initiator limits the queue depth to 1,
initiator manages a pool of fastreg descriptors of size == max queued 
commands (per connection) each containing an ib_mr,
For each concurrent IOP it takes a fastreg descriptor from the pool, and 
uses it for registration (if marked as not valid - will local invalidate 
the rkey and then use it for registration).
When cleanup_task - iser_task_rdma_finalize - iser_unreg_rdma_mem is 
called,
it just returns the fastreg to the pool (without local invalidate - as 
it is done when it will be reused).


The reason I chose to do that is that if I locally invalidate the rkey 
upon task cleanup then
Only after the completion I'm allowed to return it back to the pool 
(only then I know it's ready for reuse) and assuming that I still want 
to evacuate the task and not wait in my fast-path,
I may end-up in certain conditions in a situation that I have no 
resources to handle the next IOP since all MRs are waiting for LOCAL_INV 
completions.
A possible solution here was to heuristically use a larger pool - but I 
wanted to avoid that...


So just to clarify the flow:
. at connection establishment allocate pool of fastreg descriptors
. upon each IOP take a fastreg descriptor from the pool
. if it is not invalidated - invalidate it.
. register using FRWR.
. when cleanup_task is called - just return the fastreg descriptor to 
the pool.

. at connection teardown free all resources.
Still to come:
. upon each IOP response, check if the target used remote invalidate - 
if so mark relevant fastreg as valid.


Hope this helps.

-Sagi
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-3.11 7/7] IB/iser: Introduce fast memory registration model (FRWR)

2013-07-22 Thread Or Gerlitz
On Mon, Jul 22, 2013 at 2:46 PM, Bart Van Assche bvanass...@acm.org wrote:
[...]
 What will happen if a new FRWR is submitted with an rkey that is still valid ?

The HW will fail this WR, we must invalidate the mapping before re-using the MR.

Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH, RFC] mlx4: Avoid that mlx4_cmd_wait() contributes to the system load

2013-07-22 Thread Bart Van Assche
Avoid that kernel threads running mlx4_cmd_wait() contribute to the
system load by setting the task state to TASK_INTERRUPTIBLE instead
of TASK_UNINTERRUPTIBLE while waiting. This patch reduces the load
average from about 0.5 to about 0.0 on an idle system with one mlx4
HCA and no IB cables connected.

Note: I'm posting this patch as an RFC since it involves a behavior
change (a signal sent to a worker thread that is waiting for a
command to finish causes the command to fail) and since I'm not sure
this behavior change is acceptable.

Signed-off-by: Bart Van Assche bvanass...@acm.org
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c 
b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 299d018..fb7f7fa 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -559,8 +559,8 @@ static int mlx4_cmd_wait(struct mlx4_dev *dev, u64 
in_param, u64 *out_param,
mlx4_cmd_post(dev, in_param, out_param ? *out_param : 0,
  in_modifier, op_modifier, op, context-token, 1);
 
-   if (!wait_for_completion_timeout(context-done,
-msecs_to_jiffies(timeout))) {
+   if (wait_for_completion_interruptible_timeout(context-done,
+msecs_to_jiffies(timeout)) = 0) {
mlx4_warn(dev, command 0x%x timed out (go bit not cleared)\n,
  op);
err = -EBUSY;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] OpenSM: don't remove our own SM port.

2013-07-22 Thread Sasha Khapyorsky

When cleaning up remotes of non reachable ports anymore. Make exception
for our own local SM port. In that case port still be in SM DB and
running single port mode is possible.

Signed-off-by: Sasha Khapyorsky sash...@gmail.com
---
 opensm/osm_drop_mgr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/opensm/osm_drop_mgr.c b/opensm/osm_drop_mgr.c
index b309273..5113fa2 100644
--- a/opensm/osm_drop_mgr.c
+++ b/opensm/osm_drop_mgr.c
@@ -114,7 +114,9 @@ static void drop_mgr_clean_physp(osm_sm_t * sm, IN 
osm_physp_t * p_physp)
   the remote port, since it is no longer reachable.
   This can be done if we reset the discovery count
   of the remote port. */
-   if (!p_remote_physp-p_node-sw) {
+   if (!p_remote_physp-p_node-sw 
+   p_remote_physp-port_guid !=
+   sm-p_subn-sm_port_guid) {
p_remote_port-discovery_count = 0;
OSM_LOG(sm-p_log, OSM_LOG_DEBUG,
Resetting discovery count of node: 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: update internal PortInfo state for any ports

2013-07-22 Thread Sasha Khapyorsky
Hi Hal,

On 09:37 Fri 12 Jul , Hal Rosenstock wrote:
 
 What difference does it make ? Does this patch fix some operational
 issue ? Is there any harm in not keeping the complete PortInfo when port
 is DOWN ?

AFAIK it doesn't harm in the current state. However, I'm prepering the
patches, which will let to OpenSM work on stand alone port. For such
mode this fix is critical.

Will recent with other patches shortly.

Sasha
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] opensm: update internal PortInfo state for any ports

2013-07-22 Thread Sasha Khapyorsky

Should not be matter to keep internal SM's PortInfo data for ports
in any states.

Signed-off-by: Sasha Khapyorsky sash...@gmail.com
---
 opensm/osm_port.c | 33 -
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/opensm/osm_port.c b/opensm/osm_port.c
index 6e73e66..d59d404 100644
--- a/opensm/osm_port.c
+++ b/opensm/osm_port.c
@@ -669,25 +669,16 @@ void osm_physp_set_port_info(IN osm_physp_t * p_physp,
CL_ASSERT(p_pi);
CL_ASSERT(osm_physp_is_valid(p_physp));
 
-   if (ib_port_info_get_port_state(p_pi) == IB_LINK_DOWN) {
-   /* If PortState is down, only copy PortState */
-   /* and PortPhysicalState per C14-24-2.1 */
-   ib_port_info_set_port_state(p_physp-port_info, IB_LINK_DOWN);
-   ib_port_info_set_port_phys_state
-   (ib_port_info_get_port_phys_state(p_pi),
-p_physp-port_info);
-   } else {
-   p_physp-port_info = *p_pi;
-
-   /* The MKey in p_pi can only be considered valid if it's
-* for a HCA/router or switch port 0, and it's either
-* non-zero or the MKeyProtect bits are also zero.
-*/
-   if ((osm_node_get_type(p_physp-p_node) !=
-IB_NODE_TYPE_SWITCH || p_physp-port_num == 0) 
-   (p_pi-m_key != 0 || ib_port_info_get_mpb(p_pi) == 0))
-   osm_db_guid2mkey_set(p_sm-p_subn-p_g2m,
-cl_ntoh64(p_physp-port_guid),
-cl_ntoh64(p_pi-m_key));
-   }
+   p_physp-port_info = *p_pi;
+
+   /* The MKey in p_pi can only be considered valid if it's
+* for a HCA/router or switch port 0, and it's either
+* non-zero or the MKeyProtect bits are also zero.
+*/
+   if ((osm_node_get_type(p_physp-p_node) != IB_NODE_TYPE_SWITCH ||
+p_physp-port_num == 0) 
+   (p_pi-m_key != 0 || ib_port_info_get_mpb(p_pi) == 0))
+   osm_db_guid2mkey_set(p_sm-p_subn-p_g2m,
+cl_ntoh64(p_physp-port_guid),
+cl_ntoh64(p_pi-m_key));
 }
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] OpenSM: single port sweep

2013-07-22 Thread Sasha Khapyorsky

This provides possibility to keep SM/SA operational even in case when
the local SM port was disconnected. It is needed in order to not break
existing loopback connections.
As side effect it let us to startup OpenSM on disconnected port.

Signed-off-by: Sasha Khapyorsky sash...@gmail.com
---
 opensm/osm_state_mgr.c | 95 +-
 1 file changed, 87 insertions(+), 8 deletions(-)

diff --git a/opensm/osm_state_mgr.c b/opensm/osm_state_mgr.c
index 1b73834..c586e64 100644
--- a/opensm/osm_state_mgr.c
+++ b/opensm/osm_state_mgr.c
@@ -1075,6 +1075,90 @@ int wait_for_pending_transactions(osm_stats_t * stats)
return osm_exit_flag;
 }
 
+static void single_node_sweep(osm_sm_t *sm)
+{
+   osm_opensm_report_event(sm-p_subn-p_osm,
+   OSM_EVENT_ID_HEAVY_SWEEP_DONE, NULL);
+
+   OSM_LOG_MSG_BOX(sm-p_log, OSM_LOG_VERBOSE, HEAVY SWEEP COMPLETE);
+
+   osm_drop_mgr_process(sm);
+
+   /*
+* If we are not MASTER already - this means that we are
+* in discovery state. call osm_sm_state_mgr with signal
+* DISCOVERY_COMPLETED
+*/
+   if (sm-p_subn-sm_state == IB_SMINFO_STATE_DISCOVERING)
+   osm_sm_state_mgr_process(sm, OSM_SM_SIGNAL_DISCOVERY_COMPLETED);
+
+   osm_pkey_mgr_process(sm-p_subn-p_osm);
+
+   /* try to restore SA DB (this should be before lid_mgr
+  because we may want to disable clients reregistration
+  when SA DB is restored) */
+   osm_sa_db_file_load(sm-p_subn-p_osm);
+
+   if (wait_for_pending_transactions(sm-p_subn-p_osm-stats))
+   return;
+
+   OSM_LOG_MSG_BOX(sm-p_log, OSM_LOG_VERBOSE,
+   PKEY setup completed - STARTING SM LID CONFIG);
+
+   osm_lid_mgr_process_sm(sm-lid_mgr);
+   if (wait_for_pending_transactions(sm-p_subn-p_osm-stats))
+   return;
+
+   state_mgr_notify_lid_change(sm);
+
+   /* At this point we need to check the consistency of
+* the port_lid_tbl under the subnet. There might be
+* errors in it if PortInfo Set requests didn't reach
+* their destination. */
+   state_mgr_check_tbl_consistency(sm);
+
+   OSM_LOG_MSG_BOX(sm-p_log, OSM_LOG_VERBOSE, LID ASSIGNMENT COMPLETE);
+
+   /* in any case we zero this flag */
+   sm-p_subn-coming_out_of_standby = FALSE;
+
+   /* If there were errors - then the subnet is not really up */
+   if (sm-p_subn-subnet_initialization_error == TRUE) {
+   osm_log_v2(sm-p_log, OSM_LOG_SYS, FILE_ID,
+  Errors during initialization\n);
+   OSM_LOG_MSG_BOX(sm-p_log, OSM_LOG_ERROR,
+   ERRORS DURING INITIALIZATION);
+   } else {
+   sm-p_subn-need_update = 0;
+   osm_dump_all(sm-p_subn-p_osm);
+   state_mgr_up_msg(sm);
+   sm-p_subn-first_time_master_sweep = FALSE;
+   sm-p_subn-set_client_rereg_on_sweep = FALSE;
+
+   if (OSM_LOG_IS_ACTIVE_V2(sm-p_log, OSM_LOG_VERBOSE) ||
+   sm-p_subn-opt.sa_db_dump)
+   osm_sa_db_file_dump(sm-p_subn-p_osm);
+   }
+
+   /*
+* Finally signal the subnet up event
+*/
+   cl_event_signal(sm-subnet_up_event);
+
+   osm_opensm_report_event(sm-p_subn-p_osm, OSM_EVENT_ID_SUBNET_UP,
+   NULL);
+
+   /* if we got a signal to force heavy sweep or errors
+* in the middle of the sweep - try another sweep. */
+   if (sm-p_subn-force_heavy_sweep
+   || sm-p_subn-subnet_initialization_error)
+   osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
+
+   /* Write a new copy of our persistent guid2mkey database */
+   osm_db_store(sm-p_subn-p_g2m);
+   osm_db_store(sm-p_subn-p_neighbor);
+}
+
 static void do_sweep(osm_sm_t * sm)
 {
ib_api_status_t status;
@@ -1234,15 +1318,10 @@ repeat_discovery:
SM PORT DOWN);
}
 
-   /* Run the drop manager - we want to clear all records */
-   osm_drop_mgr_process(sm);
-
-   /* Move to DISCOVERING state */
-   if (sm-p_subn-sm_state != IB_SMINFO_STATE_DISCOVERING)
-   osm_sm_state_mgr_process(sm, OSM_SM_SIGNAL_DISCOVER);
-   osm_opensm_report_event(sm-p_subn-p_osm,
-   OSM_EVENT_ID_STATE_CHANGE, NULL);
+   /* special case - just loopback on disconnected node */
+   single_node_sweep(sm);
return;
+
} else {
if (!sm-p_subn-last_sm_port_state) {
sm-p_subn-last_sm_port_state = 1;
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH opensm] libvendor/osm_pkt_randomizer.c: Fix broken compilation with vendor sim

2013-07-22 Thread Hal Rosenstock

Due to change in number of args to osm_dr_path_init.
   
Signed-off-by: Eitan Zahavi ei...@mellanox.com
Signed-off-by: Alex Netes ale...@mellanox.com
Signed-off-by: Hal Rosenstock h...@mellanox.com
---
diff --git a/libvendor/osm_pkt_randomizer.c b/libvendor/osm_pkt_randomizer.c
index cfc5a8f..5ea1f4a 100644
--- a/libvendor/osm_pkt_randomizer.c
+++ b/libvendor/osm_pkt_randomizer.c
@@ -234,8 +234,7 @@ osm_pkt_randomizer_mad_drop(IN osm_log_t * p_log,
/* This is a lid route mad. Don't drop it */
goto Exit;
 
-   osm_dr_path_init(dr_path, 0,   /* The h_bind is not really important 
for us to save */
-p_smp-hop_count, p_smp-initial_path);
+   osm_dr_path_init(dr_path, p_smp-hop_count, p_smp-initial_path);
 
if (__osm_pkt_randomizer_process_path
(p_log, p_pkt_randomizer, dr_path)) {
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html