[openib-general] History change in srptools.git

2007-01-17 Thread ishai
Hi

I changed the history in srptools.git (Some e-mail addresses errors)
The change is only on the log information. The files was not changed.
The change: version 19c761889b9bd86abc027a13c1c6d0a96607fe79 become version 
2088b76f62cd0e94d2c8415a6a328dc818d200f1

If you are working on it you will need to perform rebase.
Sorry for any inconvenient.

Thanks
Ishai

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [srptools] [PATCH] Added checks to memory allocation failure when using asprintf

2007-01-17 Thread ishai
Applied,

Thanks
Ishai


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [srptools] [PATCH] style fix in asprintf result check

2007-01-17 Thread ishai
Applied,

Thanks
Ishai



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP add need_reset

2007-01-16 Thread ishai

When there is a call to send_tsk_mgmt it posts a send and waits for 5 seconds
to get a response.
When the QP is in an error state it is obvious that there will be no response
so it is quite useless to wait.
This timeout causes SRP to wait a long time to reconnect. (Each abort and
each reset_device call send_tsk_mgmt that waits for the timeout).
The following patch solves this problem by identifying the failure 
and returning an immediate error code.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
---
Hi Roland,

This is an old patch. We thought at first that the timeout is because there
is a bug in low level driver. After rechecking it, I found that the bug is
internal to SRP.


Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-09-25 
13:51:47.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-09-25 
15:40:04.0 +0300
@@ -543,6 +543,7 @@ static int srp_reconnect_target(struct s
target->tx_head  = 0;
target->tx_tail  = 0;
 
+   target->need_reset = 0;
ret = srp_connect_target(target);
if (ret)
goto err;
@@ -858,6 +859,7 @@ static void srp_completion(struct ib_cq 
printk(KERN_ERR PFX "failed %s status %d\n",
   wc.wr_id & SRP_OP_RECV ? "receive" : "send",
   wc.status);
+   target->need_reset = 1;
break;
}
 
@@ -1313,6 +1315,8 @@ static int srp_abort(struct scsi_cmnd *s
 
printk(KERN_ERR "SRP abort called\n");
 
+   if (target->need_reset)
+   return FAILED;
if (srp_find_req(target, scmnd, &req))
return FAILED;
if (srp_send_tsk_mgmt(target, req, SRP_TSK_ABORT_TASK))
@@ -1341,6 +1345,8 @@ static int srp_reset_device(struct scsi_
 
printk(KERN_ERR "SRP reset_device called\n");
 
+   if (target->need_reset)
+   return FAILED;
if (srp_find_req(target, scmnd, &req))
return FAILED;
if (srp_send_tsk_mgmt(target, req, SRP_TSK_LUN_RESET))
@@ -1750,6 +1756,7 @@ static ssize_t srp_create_target(struct 
goto err_free;
}
 
+   target->need_reset = 0;
ret = srp_connect_target(target);
if (ret) {
printk(KERN_ERR PFX "Connection failed\n");
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.h
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.h2006-09-25 
13:51:47.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.h 2006-09-25 
14:00:36.0 +0300
@@ -158,6 +158,7 @@ struct srp_target_port {
struct completion   done;
int status;
enum srp_target_state   state;
+   int need_reset;
 };
 
 struct srp_iu {

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP: check memory allocation

2007-01-16 Thread ishai
This patch checks if the kmalloc in match_strdup was successful.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]> 
---
Index: gen2_devel_kernel/drivers/infiniband/ulp/srp/ib_srp.c
===
--- gen2_devel_kernel.orig/drivers/infiniband/ulp/srp/ib_srp.c  2007-01-16 
16:12:02.0 +0200
+++ gen2_devel_kernel/drivers/infiniband/ulp/srp/ib_srp.c   2007-01-16 
16:48:24.0 +0200
@@ -1627,18 +1627,30 @@ static int srp_parse_options(const char 
switch (token) {
case SRP_OPT_ID_EXT:
p = match_strdup(args);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
target->id_ext = cpu_to_be64(simple_strtoull(p, NULL, 
16));
kfree(p);
break;
 
case SRP_OPT_IOC_GUID:
p = match_strdup(args);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
target->ioc_guid = cpu_to_be64(simple_strtoull(p, NULL, 
16));
kfree(p);
break;
 
case SRP_OPT_DGID:
p = match_strdup(args);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
if (strlen(p) != 32) {
printk(KERN_WARNING PFX "bad dest GID parameter 
'%s'\n", p);
kfree(p);
@@ -1662,6 +1674,10 @@ static int srp_parse_options(const char 
 
case SRP_OPT_SERVICE_ID:
p = match_strdup(args);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
target->service_id = cpu_to_be64(simple_strtoull(p, 
NULL, 16));
kfree(p);
break;
@@ -1699,6 +1715,10 @@ static int srp_parse_options(const char 
 
case SRP_OPT_INITIATOR_EXT:
p = match_strdup(args);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
target->initiator_ext = cpu_to_be64(simple_strtoull(p, 
NULL, 16));
kfree(p);
break;

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] A question about sa_query

2006-10-22 Thread Ishai Rabinovitz
Hi,

There is something that bothers me in sa_query.
 
According to table 115 in the IB-SPEC when the status in the MAD hdr is 
1,2 or 3 it shouldn't be considered to as an error. (1 means busy, 2 
means redirection, and 3 means both).
 
The function "recv_handler" in core/sa_query.c sets the status of the 
sa_query before calling the callback function.
It sets the status according to the status returned in the mad header. 
(mad_recv_wc->recv_buf.mad->mad_hdr.status)
 
If the status in the mad_hdr is different from 0 it sets the return 
status to -EINVAL.
 
This mean that the higher layers (e.g., SRP) do not know what was the 
exact status and therefore treat status 1 (busy) as an error.
 
Am I missing something?
 
Ishai

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] IB/SRP Userspace: srptools/srp_daemon - Fix connect bug and add support for user specified initiator extension

2006-10-19 Thread Ishai Rabinovitz
Thanks for your patch.

I agree with some of the changes you suggest and disagree with others. It
will be useful to post a different patch for each logical change.

> 1. Fixes bug in srp_daemon for the case where if it is invoked with the
'-e' option, it fails to connect to the SRP targets because of a newline
character in the parameter string.

You are right that the '\n' is redundant, but I have not seen the bug it
creates. The last parameter in the string is considered to be a string by
ib_srp and therefore ib_srp will ignore the newline.


> 2. Changes the name of the constant 'MAX_TRAGET_CONFIG_STR_STRING' to
'MAX_TARGET_CONFIG_STR_STRING'.

Thanks, I will apply this change.


> 3. Changes the behavior of the '-n' option to srp_daemon. The earlier
behavior printed the initiator extension. The new behavior allows the
user to specify an initiator extension as an argument to the '-n'
option.

I do not think we want to change the -n behavior to this one.
First of all your approach induces the same initiator_ext to all the
targets discovered by this srp_daemon.
Secondly If someone uses random values for the initiator_ext it may cause
a waste of resources in the target: the target can never tell when a
connection had failed (or an initiator performed a boot)  and will keep
the connection alive. When there is an attempt to reconnect to this target
with the same initiator_ext, the target knows he can close the old
connection.
This is the reason we decided to have a convention. The convention is to
use the target port guid. The advantage of this convention is that it
allows us to have two connection on the same time from an initiator to
both ports of the target.

I understand that some targets may need a different initiator_ext, but you
should add a new flag for actually setting the initiator_ext and leave -n
untouched.
You are welcome to send such a patch.

Ishai



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-17 Thread Ishai Rabinovitz




Hi,
Let 
me first explain why the current OFED release does not support SRP-HA on 
RHEL4.
SRP-HA is using Device Mapper 
multipath.
Multipath prerequisites include udev of higher version 
than 050.
RHEL4 distributions includes udev 039. udev is an 
important part of the distribution and I do not think that users will be ready 
to upgrade it in order to have SRP-HA.
To 
my best knowledge the main reason that multipath needs at least udev 050 is 
because it uses the RUN option (This option executes its given parameter after 
the device exist). Multipath uses the RUN option to execute kpartx 
that handles the partitions of the new device. SRP-HA also uses the RUN 
option to execute the multipath command.
I 
have an idea on how to overcome this problem. I want to implement a 
srp-multipath-daemon. This daemon will get kpartx and multipath requests using a 
shared message queue. The udev will use the PROGRAM option (That executes its 
given parameter immediately - before the device exist) to post request to this 
shared message queue and return immediately. The daemon will wait for the device 
to create and only than it will execute the commands.
In 
any case this technique will not be a part of the coming OFED 
release.
Ishai
 
 
-Original 
Message-From: Sharma, 
Karun [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 17, 2006 5:11 
AMTo: Tziporet Koren; Open 
FabricsCc: openibSubject: RE: [openfabrics-ewg] OFED 1.1 
release schedule
 


The plan is OK with 
Silverstorm.

I have a question though. What 
are the plans to support SRP-HA feature on RHEL4 kernels 
?

 

Thanks

Karun
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] [PATCH] IB/SRP set initiator_extention from user space

2006-10-04 Thread Ishai Rabinovitz

There is a need for an initiator to connect to the same target
several times, e.g., once from each IB port of the target.

Some targets do not support multichannel. In order to work with them as well:

1) Use port_guid instead of node_guid.
2) Allow the user to set the identifier_extension when providing the
target attributes. 

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

---

Roland, Madhu and MST,

I think this summarizes our discussion.

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-10-03 
15:38:16.0 +0200
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-10-03 
18:10:34.0 +0200
@@ -329,25 +329,29 @@ static int srp_send_req(struct srp_targe
req->priv.req_it_iu_len = cpu_to_be32(srp_max_iu_len);
req->priv.req_buf_fmt   = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
  SRP_BUF_FORMAT_INDIRECT);
+
/*
 * In the published SRP specification (draft rev. 16a), the 
 * port identifier format is 8 bytes of ID extension followed
-* by 8 bytes of GUID.  Older drafts put the two halves in the
-* opposite order, so that the GUID comes first.
+* by 8 bytes of port GUID.  Older drafts put the two halves in the
+* opposite order, so that the port GUID comes first.
 *
 * Targets conforming to these obsolete drafts can be
 * recognized by the I/O Class they report.
 */
+
if (target->io_class == SRP_REV10_IB_IO_CLASS) {
memcpy(req->priv.initiator_port_id,
-  target->srp_host->initiator_port_id + 8, 8);
+  &target->path.sgid.global.interface_id, 8);
memcpy(req->priv.initiator_port_id + 8,
-  target->srp_host->initiator_port_id, 8);
+  &target->initiator_ext, 8);
memcpy(req->priv.target_port_id, &target->ioc_guid, 8);
memcpy(req->priv.target_port_id + 8, &target->id_ext, 8);
} else {
memcpy(req->priv.initiator_port_id,
-  target->srp_host->initiator_port_id, 16);
+  &target->initiator_ext, 8);
+   memcpy(req->priv.initiator_port_id + 8,
+  &target->path.sgid.global.interface_id, 8);
memcpy(req->priv.target_port_id, &target->id_ext, 8);
memcpy(req->priv.target_port_id + 8, &target->ioc_guid, 8);
}
@@ -1557,6 +1561,7 @@ enum {
SRP_OPT_MAX_SECT= 1 << 5,
SRP_OPT_MAX_CMD_PER_LUN = 1 << 6,
SRP_OPT_IO_CLASS= 1 << 7,
+   SRP_OPT_INITIATOR_EXT   = 1 << 8,
SRP_OPT_ALL = (SRP_OPT_ID_EXT   |
   SRP_OPT_IOC_GUID |
   SRP_OPT_DGID |
@@ -1573,6 +1578,7 @@ static match_table_t srp_opt_tokens = {
{ SRP_OPT_MAX_SECT, "max_sect=%d"   },
{ SRP_OPT_MAX_CMD_PER_LUN,  "max_cmd_per_lun=%d"},
{ SRP_OPT_IO_CLASS, "io_class=%x"   },
+   { SRP_OPT_INITIATOR_EXT,"initiator_ext=%s"  },
{ SRP_OPT_ERR,  NULL}
 };
 
@@ -1672,6 +1678,12 @@ static int srp_parse_options(const char 
target->io_class = token;
break;
 
+   case SRP_OPT_INITIATOR_EXT:
+   p = match_strdup(args);
+   target->initiator_ext = cpu_to_be64(simple_strtoull(p, 
NULL, 16));
+   kfree(p);
+   break;
+
default:
printk(KERN_WARNING PFX "unknown parameter or missing 
value "
   "'%s' in target creation request\n", p);
@@ -1820,9 +1832,6 @@ static struct srp_host *srp_add_port(str
host->dev  = device;
host->port = port;
 
-   host->initiator_port_id[7] = port;
-   memcpy(host->initiator_port_id + 8, &device->dev->node_guid, 8);
-
host->class_dev.class = &srp_class;
host->class_dev.dev   = device->dev->dma_device;
snprintf(host->class_dev.class_id, BUS_ID_SIZE, "srp-%s-%d",
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.h
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.h2006-10-03 
15:38:16.0 +0200
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.h 2006-10-03 
18:05:50.0 +0200
@@ -91,7 +91,6

[openib-general] [PATCH] IB_CM: Limit the MRA timeout

2006-10-03 Thread Ishai Rabinovitz

There is a bug in SRP Engenio target that send a large value as service
timeout. (It gets 30 which mean timeout of (2^(30-8))=4195 sec.)
Such a long timeout is not reasonable and it may leave the kernel module 
waiting on wait_for_completion and may stuck a lot of processes.

The following patch allows the load of ib_cm module with a limit on the timeout.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

---


Index: last_stable/drivers/infiniband/core/cm.c
===
--- last_stable.orig/drivers/infiniband/core/cm.c   2006-10-03 
15:30:38.0 +0200
+++ last_stable/drivers/infiniband/core/cm.c2006-10-03 15:39:53.0 
+0200
@@ -54,6 +54,13 @@ MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("InfiniBand CM");
 MODULE_LICENSE("Dual BSD/GPL");
 
+static int mra_timeout_limit = 0;
+
+module_param(mra_timeout_limit, int, 0444);
+MODULE_PARM_DESC(mra_timeout_limit,
+ "Limit the MRA timeout according to this value if != 0");
+
+
 static void cm_add_one(struct ib_device *device);
 static void cm_remove_one(struct ib_device *device);
 
@@ -2297,6 +2304,9 @@ static int cm_mra_handler(struct cm_work
timeout = cm_convert_to_ms(cm_mra_get_service_timeout(mra_msg)) +
  cm_convert_to_ms(cm_id_priv->av.packet_life_time);
 
+   if (mra_timeout_limit && timeout > mra_timeout_limit)
+   timeout = mra_timeout_limit;
+
spin_lock_irqsave(&cm_id_priv->lock, flags);
    switch (cm_id_priv->id.state) {
case IB_CM_REQ_SENT:
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP: Remove redundant memset of the target

2006-10-03 Thread Ishai Rabinovitz

scsi_host_alloc already sets the entire scsi_host (including the privsize to
zero)

This patch removes the redundant memset.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-10-03 
15:41:49.0 +0200
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-10-03 
15:55:19.0 +0200
@@ -1731,7 +1731,6 @@ static ssize_t srp_create_target(struct 
target_host->max_lun = SRP_MAX_LUN;
 
target = host_to_target(target_host);
-   memset(target, 0, sizeof *target);
 
target->io_class   = SRP_REV16A_IB_IO_CLASS;
target->scsi_host  = target_host;
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Ishai Rabinovitz
Roland Dreier wrote:
> Maybe we should just use the port GUID instead of the node GUID to
> form the initiator ID?  That would solve this pretty cleanly I think.


This is also Vu's idea.

There are two issues:

1) My patch allows a sophisticated user to have two logical connections on
the same physical solution. He can have different connection parameters
(e.g., MAX_CMD_PER_LUN) according to the application needs.
 Do you think there is such need?

2) In the current implementation there is a problem when there are two
connections on the same physical connection - when the second connection
sends REQ to the target, the target sends a DREQ to the first connection,
but when someone tries to access the first scsi_host, ib_srp tries to
reconnect the first connection and then the second connection gets a DREQ
- and so the ping pong goes.
And if there is a multipath daemon that checks the status of the
connections this ping pong can be for ever.
We need to find a way to eliminate this behavior.

Ishai


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP: allowing multiple connections from taregt to initiator

2006-09-27 Thread Ishai Rabinovitz

SRP High Availability should enable an initiator to connect to the same target
several times, e.g., once from each IB port of the target.

Some targets do not support multichannel. In order to work with them as well
we will use another identifier_extension to the initiator port for each target
connection.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

---

I think this is the best solution. It allows users to use all four physical
connections from the initiator to target.

It also allows users to have several logical connections on one physical
connection (If they want connection with different attributes - for example
different max_cmd_per_lun).

It is SRP spec compliant.

I also added a module param, so it is possible to turn this option off.

Index: latest/drivers/infiniband/ulp/srp/ib_srp.c
===
--- latest.orig/drivers/infiniband/ulp/srp/ib_srp.c 2006-09-27 
10:36:13.0 +0300
+++ latest/drivers/infiniband/ulp/srp/ib_srp.c  2006-09-27 16:48:12.0 
+0300
@@ -85,6 +85,13 @@ MODULE_PARM_DESC(mellanox_workarounds,
 
 static const u8 mellanox_oui[3] = { 0x00, 0x02, 0xc9 };
 
+static int variable_identifier_extension = 1;
+
+module_param(variable_identifier_extension, int, 0444);
+MODULE_PARM_DESC(variable_identifier_extension,
+"Use another identifier_extension on each connection to target"
+", allows multichannel connection on all targets if != 0");
+
 static void srp_add_one(struct ib_device *device);
 static void srp_remove_one(struct ib_device *device);
 static void srp_completion(struct ib_cq *cq, void *target_ptr);
@@ -329,6 +336,7 @@ static int srp_send_req(struct srp_targe
req->priv.req_it_iu_len = cpu_to_be32(srp_max_iu_len);
req->priv.req_buf_fmt   = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
  SRP_BUF_FORMAT_INDIRECT);
+
/*
 * In the published SRP specification (draft rev. 16a), the 
 * port identifier format is 8 bytes of ID extension followed
@@ -341,13 +349,23 @@ static int srp_send_req(struct srp_targe
if (target->io_class == SRP_REV10_IB_IO_CLASS) {
memcpy(req->priv.initiator_port_id,
   target->srp_host->initiator_port_id + 8, 8);
-   memcpy(req->priv.initiator_port_id + 8,
-  target->srp_host->initiator_port_id, 8);
+   if (variable_identifier_extension)
+   memcpy(req->priv.initiator_port_id + 8,
+  &target, sizeof target);
+   else
+   memcpy(req->priv.initiator_port_id + 8,
+  target->srp_host->initiator_port_id, 8);
memcpy(req->priv.target_port_id, &target->ioc_guid, 8);
memcpy(req->priv.target_port_id + 8, &target->id_ext, 8);
} else {
-   memcpy(req->priv.initiator_port_id,
-  target->srp_host->initiator_port_id, 16);
+   if (variable_identifier_extension)
+   memcpy(req->priv.initiator_port_id,
+  &target, sizeof target);
+   else
+   memcpy(req->priv.initiator_port_id,
+  target->srp_host->initiator_port_id, 8);
+   memcpy(req->priv.initiator_port_id + 8,
+  target->srp_host->initiator_port_id + 8, 8);
memcpy(req->priv.target_port_id, &target->id_ext, 8);
memcpy(req->priv.target_port_id + 8, &target->ioc_guid, 8);
}
@@ -1823,7 +1841,8 @@ static struct srp_host *srp_add_port(str
host->dev  = device;
host->port = port;
 
-   host->initiator_port_id[7] = port;
+   if (!variable_identifier_extension)
+   host->initiator_port_id[7] = port;
memcpy(host->initiator_port_id + 8, &device->dev->node_guid, 8);
 
host->class_dev.class = &srp_class;
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] 90-ib.rules incorrect?

2006-09-27 Thread ishai
In early versions of udev the syntax was different. The syntax used (=)
and not (==).
RHEL4 for example is still using such old version of udev.

Apparently the new udev versions (used for example in SLES10) still
supports the old syntax.

So this way we can have one file that suits both udev versions.

Ishai


> Isn't the format of 90-ib.rules in
> https://openfabrics.org/svn/gen2/trunk/ofed/openib/scripts/90-ib.rulesincorrect.
>
> We have
>
> KERNEL="umad*", NAME="infiniband/%k", which should be
> KERNEL=="umad*", NAME="infiniband/%k"
>
> Am I missing something?
>
> Eugene
> ___
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-26 Thread Ishai Rabinovitz
Hi Roland,

SRP High Availability needs an initiator to connect to the same target 
several times, e.g., once from each IB port of the target (this way we can use
device mapper multipath for failover). Note that both connections are actually
active, e.g. multipath is issuing commands to to get the remote scsi id.

Since multiple channel operation is currently disabled in connection request,
each new connection request will cause the target to disconnect
the existing connection which forces us to bounce a lot between the two 
channels.

This patch enables multiple channel operation in connection requests, to avoid 
getting
disconnects when multiple connections are active. There does not seem to be any 
harm
in doing this even when multipath is not used.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

---

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-09-26 
09:22:13.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-09-26 
14:54:35.0 +0300
@@ -329,6 +329,7 @@ static int srp_send_req(struct srp_targe
req->priv.req_it_iu_len = cpu_to_be32(srp_max_iu_len);
req->priv.req_buf_fmt   = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
  SRP_BUF_FORMAT_INDIRECT);
+   req->priv.req_flags = SRP_MULTICHAN_MULTI;
/*
 * In the published SRP specification (draft rev. 16a), the 
 * port identifier format is 8 bytes of ID extension followed
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/SRP identify QP in error state

2006-09-26 Thread Ishai Rabinovitz
There is a bug in mthca low level driver. 
A call to ib_post_send that tries to post to a QP that is in error state does
not return immediately with error. It terminates with errors after a timeout.

This causes SRP to wait a long time to reconnect. (Each abort call and
each reset_device call performs post_send and waits for the timeout).
The following patch solves this problem by identifying the failure 
and returning an immediate error code.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
---
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-09-25 
13:51:47.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-09-25 
15:40:04.0 +0300
@@ -543,6 +543,7 @@ static int srp_reconnect_target(struct s
target->tx_head  = 0;
target->tx_tail  = 0;
 
+   target->need_reset = 0;
ret = srp_connect_target(target);
if (ret)
goto err;
@@ -858,6 +859,7 @@ static void srp_completion(struct ib_cq 
printk(KERN_ERR PFX "failed %s status %d\n",
   wc.wr_id & SRP_OP_RECV ? "receive" : "send",
   wc.status);
+   target->need_reset = 1;
break;
}
 
@@ -1313,6 +1315,8 @@ static int srp_abort(struct scsi_cmnd *s
 
printk(KERN_ERR "SRP abort called\n");
 
+   if (target->need_reset)
+   return FAILED;
if (srp_find_req(target, scmnd, &req))
return FAILED;
if (srp_send_tsk_mgmt(target, req, SRP_TSK_ABORT_TASK))
@@ -1341,6 +1345,8 @@ static int srp_reset_device(struct scsi_
 
printk(KERN_ERR "SRP reset_device called\n");
 
+   if (target->need_reset)
+   return FAILED;
if (srp_find_req(target, scmnd, &req))
return FAILED;
if (srp_send_tsk_mgmt(target, req, SRP_TSK_LUN_RESET))
@@ -1750,6 +1756,7 @@ static ssize_t srp_create_target(struct 
goto err_free;
}
 
+   target->need_reset = 0;
ret = srp_connect_target(target);
if (ret) {
printk(KERN_ERR PFX "Connection failed\n");
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.h
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.h2006-09-25 
13:51:47.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.h 2006-09-25 
14:00:36.0 +0300
@@ -158,6 +158,7 @@ struct srp_target_port {
struct completion   done;
int status;
enum srp_target_state   state;
+   int need_reset;
 };
 
 struct srp_iu {
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] High Availability status in OFED (was Re: [openfabrics-ewg] Mellanox/Voltaire/QLogic/IBM SQA results for OFED 1.1?)

2006-09-25 Thread ishai
Hi Scott,

The IPoIB HA (High Availability)solution in OFED 1.1 is a short term
solution. (There is an on going work on a full solution, that uses
bonding).
This short term solution for IPoIB HA uses the command "ip monitor link"
to find out when a link goes down, and then updates the ip address of the
other port.

Apparently RHEL4 uses an old version of iproute package (iproute-2.6.9-3
with ip utility, iproute2-ss040831 in RHEL4.0 U4) in which there is no
unique indication when a port goes down. (It gives the same indication
when a port goes up or down).

In SLES10 there is a newer version of iproute and our solution works well
with this version.

In order to solve the problem, The next RC will include also an
installation of a version of iproute (iproute2-2.6.16-060323 with ip
utility, iproute2-ss060323). This version will be installed only for OFED
installation that includes the IPoIB HA option and only on RHEL4. The
package will be installed in a private directory inside the OFED directory
(It will not replace the iproute version of the distribution) and will be
accessed by the IPoIB scripts using the exact path.


As for SRP HA:
SRP HA is currently available only for SLES10. The reason is that SRP HA
uses the device-mapper multipath that needs high version of udev (>050).
RHEL4 uses udev 039.


Ishai

>> As for the HA it works on SuSE but not on RH. Ishai will
>> issue a report.
>
> This will be fixed for 1.1, right?
>
> Scott
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
>



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] FW: OFED 1.1 rc3 srp driver panic

2006-09-07 Thread ishai
I think I found the race that causes this NULL Dereference.

1) There is a connection error.

2) srp_completion gets bad status and schedules a call to srp_reconnect_work.

3) srp_reconnect_work is scheduled to run and calls srp_reconnect_target.

4) srp_reconnect_target starts to run, changes the target state to
SRP_TARGET_CONNECTING but there is a context switch before it gets to
execute srp_reset_req.

5) The scsi error handling calls to srp_reset_host.

6) srp_reset_host calls srp_reconnect_target that returns -EAGAIN
(because the target state is not SRP_TARGET_LIVE).

7) srp_reset_host returns FAILED and therefore the device goes offline.

8) Because the device goes offline the commands are being freed (In the
scsi mid-layer).

9) The first execution of srp_reconnect_target resumes and calls to
srp_reset_req that tries to access the commands that were freed.

10) NULL deref.

Ishai


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] FW: OFED 1.1 rc3 srp driver panic

2006-09-07 Thread Ishai Rabinovitz



Here 
is an Oops in ib_srp


From: Dachepalli, Sudhir 
[mailto:[EMAIL PROTECTED] Sent: Tuesday, September 05, 2006 
11:09 PMTo: Vu PhamCc: Richard, BillSubject: 
OFED 1.1 rc3 srp driver panic

Hello 
Vu,
 
I am trying to 
integrate MPP and OFED 1.1 rc3 srp.  
 
Status on following 
2 issues.

  New Host number 
  allocation for controller offline / online - MPP will handle this with out the 
  need to run hot_add. we need to use srp_daemon. 
  scsi error handler 
  invocation - we need to figure out how to cleanly exit out of error handler 
  after cleaning up all the IO's - THIS IS THE BIGGEST ISSUE 
  NOW.
 
Panic
 
 I noticed the 
following panic when I performed sysreboot on controller A while IO is going on 
:
 
ib_srp: SRP reset_host calledib_srp: QP event 
1ib_srp: QP event 1ib_srp: connection closedUnable to handle kernel 
NULL pointer dereference at  
RIP:[<>]PML4 214f0d067 PGD 214657067 PMD 
0Oops: 0010 [1] SMPCPU 1Modules linked in: parport_pc lp parport 
autofs4 i2c_dev i2c_core sunrpc rdma_ucm(U) rdma_cm(U) ib_addr(U) ib_srp(U) ds 
yenta_socket pcmcia_core dm_mirror dm_mod button battery ac md5 ipv6 
uhci_hcd ehci_hcd ib_mthca(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) 
ib_sa(U) ib_mad(U) ib_core(U) e1000 ext3 jbd mppVhba(U) qla2400 qla2322 
qla2xxx scsi_transport_fc mptscsih mptsas mptspi mptfc mptscsi mptbase 
ata_piix libata mppUpper(U) sg sd_mod scsi_modPid: 4991, comm: scsi_eh_7 Not 
tainted 2.6.9-34.ELsmpRIP: 0010:[<>] 
[<>]RSP: 0018:01021202dd70  EFLAGS: 
00010006RAX: 010210234100 RBX: 0102114b0a28 RCX: 
0102114b08a0RDX: 0102114b08b0 RSI: 0102114b0a28 RDI: 
010210234100RBP: 0102114b0c08 R08:  R09: 
000210d6c000R10: 0102114b03c8 R11: 0102114b03c8 R12: 
0102114b03c8R13: 01021202dee8 R14: 0102114b R15: 
01021202ded8FS:  002a9589a760() GS:804d7b80() 
knlGS:CS:  0010 DS:  ES:  CR0: 
8005003bCR2:  CR3: cfe58000 CR4: 
06e0Process scsi_eh_7 (pid: 4991, threadinfo 01021202c000, 
task 010211e9b030)Stack: a02a1792 0102114b08b0 
0102114b03c8    
a02a18bf 0038 01021202de78 
01021202ddb8   010211e9b030 
801333c8Call 
Trace:{:ib_srp:srp_reset_req+37} 
{:ib_srp:srp_reconnect_target+288}   
{default_wake_function+0} 
{kobject_release+0}   
{:ib_srp:srp_reset_host+51} 
{:ib_srp:srp_reset_host+59}   
{:scsi_mod:scsi_try_host_reset+118}   
{:scsi_mod:scsi_error_handler+2347}   
{child_rip+8} 
{:scsi_mod:scsi_error_handler+0}   
{child_rip+0}
 
Code:  Bad RIP value.RIP 
[<>] RSP <01021202dd70>CR2: 
 <0>Kernel panic - not syncing: Oops
 
 
 
Sudhir Dachepalli 
Engenio Storage Group 


LSI Logic Corporation 


12331 Riata Trace Parkway 


Suite B200 


Austin , Texas   
78727 
512 794 3706 phone512 794 3702 
fax[EMAIL PROTECTED] 

www.lsilogic.com/engenio 

 
 
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] A new version for srp daemon

2006-08-15 Thread Ishai Rabinovitz
 See Below

-Original Message-
From: Roland Dreier [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, August 15, 2006 1:48 AM
To: Ishai Rabinovitz
Cc: openib-general@openib.org; Tziporet Koren
Subject: Re: A new version for srp daemon

 > I put the code in
 >
https://openib.org/svn/trunk/contrib/mellanox/gen2/src/userspace/srptool
s/srp_daemon

Seems like a bizarre place for it -- a package inside of the srptools
package??

    [ishai] This is another tool for srp. I thought that we want to
put several tools in srptools directory.  I will put it in
https://openib.org/svn/gen2/trunk/src/userspace/srp_daemon



 >7) Uses the umad package.

This seems like it adds a fairly complicated dependency (since umad
depends on something else, etc) for minimal gain.  Was it really worth
it in terms of your code?  I found the umad API more trouble than it was
worth for the original srptools.

    [ishai] You may be right, but this way I can gain from future
improvements in the umad package (I'm optimistic)

 - R.



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] A new version for srp daemon

2006-08-14 Thread Ishai Rabinovitz



Adding 
a subject


From: Ishai Rabinovitz Sent: Tuesday, 
August 15, 2006 12:32 AMTo: 'Roland Dreier'Cc: 
'openib-general@openib.org'; Tziporet KorenSubject: 


Hi 
Roland,
 
In order to support 
High-Availability in OFED 1.1, we need more functionality to the srp 
daemon.
Based on your code I 
implemented a new srp daemon (I listed its new features below) . 

I put the code in https://openib.org/svn/trunk/contrib/mellanox/gen2/src/userspace/srptools/srp_daemon
and I'm in an initial testing stage (there are some known 
bugs).
Since I think that 
people may still want to use your original ibsrpdm, I think we should keep your 
version and start a new tool from my code. What do you 
think?
 
Ishai
 

The new tool main 
features:
 
    
1) Register to Traps 64 and 144.
    2) Can ask 
ib_srp to connect to the targets it finds.
    
3) Can check if the target is already connected by ib_srp from the same 
port.
    
4) Can perform rescan of the fabric every X seconds.
    
5) Identify SA changes and other events and act accordingly.
    
6) Can get an hca name and a port number as input (not only a umad 
device).

    
7) Uses the umad package.
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] (no subject)

2006-08-14 Thread Ishai Rabinovitz



Hi 
Roland,
 
In order to support 
High-Availability in OFED 1.1, we need more functionality to the srp 
daemon.
Based on your code I 
implemented a new srp daemon (I listed its new features below) . 

I put 
the code in https://openib.org/svn/trunk/contrib/mellanox/gen2/src/userspace/srptools/srp_daemon
and I'm in an initial testing stage (there are some known 
bugs).
Since I think that 
people may still want to use your original ibsrpdm, I think we should keep your 
version and start a new tool from my code. What do you 
think?
 
Ishai
 

The new tool main 
features:
 
    
1) Register to Traps 64 and 144.
    2) Can ask 
ib_srp to connect to the targets it finds.
    
3) Can check if the target is already connected by ib_srp from the same 
port.
    
4) Can perform rescan of the fabric every X seconds.
    
5) Identify SA changes and other events and act accordingly.
    
6) Can get an hca name and a port number as input (not only a umad 
device).

    
7) Uses the umad package.
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [SRP] [RFC] Needed changes to support fail-over drivers

2006-07-25 Thread Ishai Rabinovitz
On Tue, Jul 25, 2006 at 04:58:48PM +0300, [EMAIL PROTECTED] wrote:
> [CC'ing linux-scsi as well -- I think we'll get better insight from =
> there]
> 
> OK, but is this a valid assumption?  What happens for iSCSI and/or iSER?

>From Mike's response I understand that it is a reasonable behavior to keep the
host (at least for a period of time) and let the userspace daemon be
responsible to the reconnection or deallocating of that host.

> 
> How does the daemon know when something is gone for good vs. when it
> might come back?
> 

I think we should use a time out in the daemon.

> 
> Why does userspace need to be able to disconnect a connection?
> 

There are two options on who will initiate the disconnection: the userspace
daemon or the ib_srp module.  I considered both options and I was not sure
which one is better.  I choose to do it in userspace because it looks a good
symmetry that both the disconnection and reconnection will be initiate in the
same place.  I will accept your comment and change it to the kernel.


> 
> Why the asymmetry here?  In other words, why does anything work for
> reconnect_target but only the literal "erase" work for erase_target?
> 

Because erase_target is a destructive command that can not be reversed I think
it should use a more safe approach.

-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [SRP] [RFC] Needed changes to support fail-over drivers

2006-07-24 Thread Ishai Rabinovitz
Hi,

The current SRP initiator code cannot work with several fail-over mechanisms. 

The current srp driver's behavior when a target off-line then online:
1) The target is offline.
2) the initiator tries to reconnect and fails
3) The initiator calls srp_remove_work that removes the scsi_host.
4) The target is back online.
5) the user (or the ibsrpdm daemon) is expected to execute a new add_target.
6) This creates a new scsi_host (with new names to the devices and new index in
the scsi_host directory in sysfs) for this target.

Fail-over drivers (e.g., MPP that is used by Engenio and XVM that is used by
SGI) have problems with this behavior (item 3). They need the scsi_host to keep
exist and return errors in the meanwhile until the connection to the target
resumes.

In addition remove/re-alloc scsi host is a "heavy" operation instead of
disconnect/reconnect the connection only.

In order to support these tools I propose the following changes that will allow
the user to move the srp initiator to a disconnected state (when the target
leaves the fabric) and reconnect it later (when the target returns to the
fabric).

After these changes will be in the ib_srp module, the ibsrpdm daemon will be
able to monitor the presence of targets in the fabric and to use this interface
(When targets leave or rejoin the fabric).

Here is the description of the new design: (I already implemented most of the
code)

1) Split the function srp_reconnect_target into two functions:
_srp_disconnect_target and _srp_reconnect_target 

2) Adding two new states: SRP_TARGET_DISCONNECTED (The state after
_srp_disconnect_target was executed and before _srp_reconnect_target is
executed) and SRP_TARGET_DISCONNECTING (The state while in srp_remove_target).

3) Adding new input files in sysfs:
/sys/class/scsi_host/host?/{disconnect_target,connect_target,erase_target}

4) Writing the string "remove" to /sys/class/scsi_host/host?/disconnect_target
calls srp_disconnect_target that moves the corresponding target to a
SRP_TARGET_DISCONNECTED state (After closing the cm, and reset all pending
requests).  Now when the scsi performs queuecommand to this host the result is
DID_NO_CONNECT.  This causes the scsi mid-layer to return to the user with an
IO error without initiating the scsi error auto recovery chain.

5) Writing anything to /sys/class/scsi_host/host?/reconnect_target calls
_srp_reconnect_target that move the target to SRP_TARGET_LIVE state again.

6) Writing "erase" to /sys/class/scsi_host/host?/erase_target calls
srp_remove_work that removes the scsi_host.

7) Adding output files in sysfs to present the HCA and port that the initiator
used to connect to the target. Using these files and the target GUID the
ibsrpdm can know on which scsi_host to perform the reconnect_target.

Please comment.

-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] SRP: Avoid a potential race on target->req_queue

2006-06-13 Thread Ishai Rabinovitz
Hi Roland,

There is a potential race between srp_reconnect_target and srp_reset_device
when they access the target->req_queue.
These functions can execute in the same time because srp_reconnect_target is
called form srp_reconnect_work that is scheduled by srp_completion, while
srp_reset_device is called from the scsi layer.

The race is caused because srp_reconnect_target is not holding host_lock while
accessing target->req_queue. It assumes that since the state is CONNECTING no
other function will access target->req_queue (and this is the case with
srp_reset_host for example).

There are two possible solutions: 
1) Change srp_reset_device: after locking host_lock, it will check the
   state. Only if the state is LIVE it will execute the loop that access
   target->req_queue.
2) Change srp_reconnect_target. Before executing the loop that access
   target->req_queue it will lock host_lock and will release it after
   the loop.

I'm sending a patch for the second solution. If you prefer the first, I have 
another patch for it (It is a bit longer).
Which solution do you like better?

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-06-13 
02:24:22.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-06-13 
02:26:07.0 +0300
@@ -641,8 +641,10 @@ static int srp_reconnect_target(struct s
while (ib_poll_cq(target->cq, 1, &wc) > 0)
; /* nothing */
 
+   spin_lock_irq(target->scsi_host->host_lock);
list_for_each_entry_safe(req, tmp, &target->req_queue, list)
srp_reset_req(target, req);
+   spin_unlock_irq(target->scsi_host->host_lock);
 
target->rx_head  = 0;
target->tx_head  = 0;
-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] Re: SRP [PATCH 0/4] Kernel support for removal and restoration of target

2006-06-07 Thread Ishai Rabinovitz
The idea is that the daemon will notice targets that leave the fabric (for a 
short time), and will activate remove_target. When the target will return to
the fabric, the daemon will activate restore_target.
This will make sure that the scsi_host won't go offline (From where there is no
return)
I'm waiting for suggestion about the mechanism that will be responsible to 
remove the scsi_host when the target does not return to the fabric after a 
while. (See my previous mail for details)


On Tue, Jun 06, 2006 at 03:11:46PM -0700, Roland Dreier wrote:
> I haven't read too deeply yet, but something that would help me
> understand the overall plan here would be an explanation of how one
> would use the restore_target function.  Why would I want to disconnect
> from a target but keep the kernel's SCSI device hanging around?
> 
>  - R.

-- 
Ishai Rabinovitz

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] SRP [PATCH 4/4] show_srp_state

2006-06-05 Thread Ishai Rabinovitz

Add query for srp_state in sysfs.
Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-31 
18:52:14.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-06-04 
14:21:52.0 +0300
@@ -1362,6 +1362,26 @@ static int srp_reset_host(struct scsi_cm
return ret;
 }
 
+static ssize_t show_srp_state(struct class_device *cdev, char *buf)
+{
+   struct srp_target_port *target = host_to_target(class_to_shost(cdev));
+   enum srp_target_state target_state = target->state;
+
+   static const char *state_name[] = {
+   [SRP_TARGET_LIVE]   = "LIVE",
+   [SRP_TARGET_CONNECTING] = "CONNECTING",
+   [SRP_TARGET_DISCONNECTING]  = "DISCONNECTING",
+   [SRP_TARGET_DISCONNECTED]   = "DISCONNECTED",
+   [SRP_TARGET_DEAD]   = "DEAD",
+   [SRP_TARGET_REMOVED]= "REMOVED",
+   };
+
+   if (target_state >= 0 && target_state < ARRAY_SIZE(state_name))
+   return sprintf(buf, "%s\n", state_name[target_state]);
+
+   return sprintf(buf, "UNKNOWN\n");
+}
+
 static ssize_t show_id_ext(struct class_device *cdev, char *buf)
 {
struct srp_target_port *target = host_to_target(class_to_shost(cdev));
@@ -1439,6 +1459,7 @@ static ssize_t show_zero_req_lim(struct 
 
 static CLASS_DEVICE_ATTR(remove_target,  S_IWUSR, NULL, srp_remove_target);
 static CLASS_DEVICE_ATTR(restore_target, S_IWUSR, NULL, srp_restore_target);
+static CLASS_DEVICE_ATTR(srp_state, S_IRUGO, show_srp_state,   NULL);
 static CLASS_DEVICE_ATTR(id_ext,S_IRUGO, show_id_ext,  NULL);
 static CLASS_DEVICE_ATTR(ioc_guid,  S_IRUGO, show_ioc_guid,NULL);
 static CLASS_DEVICE_ATTR(service_id,S_IRUGO, show_service_id,  NULL);
@@ -1447,6 +1468,7 @@ static CLASS_DEVICE_ATTR(dgid,S_IRUGO,
 static struct class_device_attribute *srp_host_attrs[] = {
&class_device_attr_remove_target,
&class_device_attr_restore_target,
+   &class_device_attr_srp_state,
    &class_device_attr_id_ext,
&class_device_attr_ioc_guid,
&class_device_attr_service_id,
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] SRP [PATCH 3/4] restore target

2006-06-05 Thread Ishai Rabinovitz

Add support to restore_target from sysfs.
Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-06-04 
10:01:50.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-06-04 
10:02:27.0 +0300
@@ -1551,16 +1551,21 @@ static ssize_t show_zero_req_lim(struct 
 static ssize_t srp_remove_target(struct class_device *cdev,
 const char *buf, size_t count);
 
-static CLASS_DEVICE_ATTR(remove_target, S_IWUSR, NULL, srp_remove_target);
-static CLASS_DEVICE_ATTR(id_ext,   S_IRUGO, show_id_ext,   NULL);
-static CLASS_DEVICE_ATTR(ioc_guid, S_IRUGO, show_ioc_guid, NULL);
-static CLASS_DEVICE_ATTR(service_id,   S_IRUGO, show_service_id,   NULL);
-static CLASS_DEVICE_ATTR(pkey, S_IRUGO, show_pkey, NULL);
-static CLASS_DEVICE_ATTR(dgid, S_IRUGO, show_dgid, NULL);
-static CLASS_DEVICE_ATTR(zero_req_lim, S_IRUGO, show_zero_req_lim, NULL);
+static ssize_t srp_restore_target(struct class_device *cdev,
+ const char *buf, size_t count);
+
+static CLASS_DEVICE_ATTR(remove_target,  S_IWUSR, NULL, srp_remove_target);
+static CLASS_DEVICE_ATTR(restore_target, S_IWUSR, NULL, srp_restore_target);
+static CLASS_DEVICE_ATTR(id_ext,S_IRUGO, show_id_ext,  NULL);
+static CLASS_DEVICE_ATTR(ioc_guid,  S_IRUGO, show_ioc_guid,NULL);
+static CLASS_DEVICE_ATTR(service_id,S_IRUGO, show_service_id,  NULL);
+static CLASS_DEVICE_ATTR(pkey,  S_IRUGO, show_pkey,NULL);
+static CLASS_DEVICE_ATTR(dgid,  S_IRUGO, show_dgid,NULL);
+static CLASS_DEVICE_ATTR(zero_req_lim,  S_IRUGO, show_zero_req_lim,NULL);
 
 static struct class_device_attribute *srp_host_attrs[] = {
&class_device_attr_remove_target,
+   &class_device_attr_restore_target,
&class_device_attr_id_ext,
&class_device_attr_ioc_guid,
&class_device_attr_service_id,
@@ -1861,6 +1866,17 @@ static ssize_t srp_remove_target(struct 
return count;
 }
 
+static ssize_t srp_restore_target(struct class_device *cdev,
+const char *buf, size_t count)
+{
+   int ret = _srp_restore_target(host_to_target(class_to_shost(cdev)));
+
+   if (ret)
+   return ret;
+
+   return count;
+}
+
 static ssize_t show_ibdev(struct class_device *class_dev, char *buf)
 {
struct srp_host *host =
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] SRP [PATCH 2/4] remove target

2006-06-05 Thread Ishai Rabinovitz

Add support to remove_target from sysfs.
Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-06-05 
16:46:55.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-06-05 
17:11:11.0 +0300
@@ -1516,6 +1516,10 @@
return sprintf(buf, "%d\n", target->zero_req_lim);
 }
 
+static ssize_t srp_remove_target(struct class_device *cdev,
+const char *buf, size_t count);
+
+static CLASS_DEVICE_ATTR(remove_target, S_IWUSR, NULL, srp_remove_target);
 static CLASS_DEVICE_ATTR(id_ext,   S_IRUGO, show_id_ext,   NULL);
 static CLASS_DEVICE_ATTR(ioc_guid, S_IRUGO, show_ioc_guid, NULL);
 static CLASS_DEVICE_ATTR(service_id,   S_IRUGO, show_service_id,   NULL);
@@ -1524,6 +1528,7 @@
 static CLASS_DEVICE_ATTR(zero_req_lim, S_IRUGO, show_zero_req_lim, NULL);
 
 static struct class_device_attribute *srp_host_attrs[] = {
+   &class_device_attr_remove_target,
&class_device_attr_id_ext,
&class_device_attr_ioc_guid,
&class_device_attr_service_id,
@@ -1814,6 +1819,23 @@
 
 static CLASS_DEVICE_ATTR(add_target, S_IWUSR, NULL, srp_create_target);
 
+static ssize_t srp_remove_target(struct class_device *cdev,
+const char *buf, size_t count)
+{
+   int ret;
+   const char const remove_str[] = "remove";
+
+   if (strncmp(buf, "remove", sizeof(remove_str)))
+   return -EINVAL;
+
+   ret = _srp_remove_target(host_to_target(class_to_shost(cdev)));
+
+   if (ret)
+   return ret;
+
+   return count;
+}
+
 static ssize_t show_ibdev(struct class_device *class_dev, char *buf)
 {
struct srp_host *host =
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] SRP [PATCH 1/4] split srp_reconnect_target

2006-06-05 Thread Ishai Rabinovitz

Split the srp_reconnect_target to two functions _srp_remove_target and
_srp_restore_target. These functions will be used later in patch series also
to allow removal and restoration of a target from the sysfs.

I made some changes in order to support this:
1) There are two new states:
SRP_TARGET_DISCONNECTED - The state after _srp_remove_target was successfully 
  executed and before _srp_restore_target is executed.
SRP_TARGET_DISCONNECTING - The state while _srp_remove_target is executed.
SRP_TARGET_CONNECTING is now the state while _srp_restore_target is executed.

2) The value of target->cm_id can be NULL. This happens after _srp_remove_target
   destroyed the old cm_id and before _srp_restore_target created the new cm_id.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-06-04 
10:03:25.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-06-04 
10:54:26.0 +0300
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -373,7 +374,8 @@ static void srp_remove_work(void *target
spin_unlock(&target->srp_host->target_lock);
 
scsi_remove_host(target->scsi_host);
-   ib_destroy_cm_id(target->cm_id);
+   if (target->cm_id)
+   ib_destroy_cm_id(target->cm_id);
srp_free_target_ib(target);
scsi_host_put(target->scsi_host);
 }
@@ -464,20 +466,57 @@ static void srp_reset_req(struct srp_tar
srp_remove_req(target, req);
 }
 
-static int srp_reconnect_target(struct srp_target_port *target)
+static void srp_remove_target_port(struct srp_target_port *target)
+{
+   /*
+* Kill our target port off.
+* However, we have to defer the real removal because we might
+* be in the context of the SCSI error handler now, which
+* would deadlock if we call scsi_remove_host().
+*/
+   spin_lock_irq(target->scsi_host->host_lock);
+   if (target->state != SRP_TARGET_REMOVED) {
+   target->state = SRP_TARGET_DEAD;
+   INIT_WORK(&target->work, srp_remove_work, target);
+   schedule_work(&target->work);
+   }
+   spin_unlock_irq(target->scsi_host->host_lock);
+}
+
+static int _srp_remove_target(struct srp_target_port *target)
 {
-   struct ib_cm_id *new_cm_id;
struct ib_qp_attr qp_attr;
struct srp_request *req, *tmp;
struct ib_wc wc;
-   int ret;
+   int ret = 0;
 
spin_lock_irq(target->scsi_host->host_lock);
-   if (target->state != SRP_TARGET_LIVE) {
+   switch (target->state) {
+   case SRP_TARGET_REMOVED:
+   case SRP_TARGET_DEAD:
+   ret = -ENOENT;
+   break;
+
+   case SRP_TARGET_DISCONNECTING:
+   case SRP_TARGET_CONNECTING:
+   ret = -EAGAIN; /* So that the caller will try again later -
+ after the connection ends one way or another 
*/
+   break;
+
+   case SRP_TARGET_DISCONNECTED:
+   ret = -ENOTCONN;
+   break;
+
+   case SRP_TARGET_LIVE:
+   break;
+   }
+
+   if (ret) {
spin_unlock_irq(target->scsi_host->host_lock);
-   return -EAGAIN;
+   return ret;
}
-   target->state = SRP_TARGET_CONNECTING;
+
+   target->state = SRP_TARGET_DISCONNECTING;
spin_unlock_irq(target->scsi_host->host_lock);
 
srp_disconnect_target(target);
@@ -485,24 +525,14 @@ static int srp_reconnect_target(struct s
 * Now get a new local CM ID so that we avoid confusing the
 * target in case things are really fouled up.
 */
-   new_cm_id = ib_create_cm_id(target->srp_host->dev->dev,
-   srp_cm_handler, target);
-   if (IS_ERR(new_cm_id)) {
-   ret = PTR_ERR(new_cm_id);
-   goto err;
-   }
ib_destroy_cm_id(target->cm_id);
-   target->cm_id = new_cm_id;
+   target->cm_id = NULL;
 
qp_attr.qp_state = IB_QPS_RESET;
ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
if (ret)
goto err;
 
-   ret = srp_init_qp(target, target->qp);
-   if (ret)
-   goto err;
-
while (ib_poll_cq(target->cq, 1, &wc) > 0)
; /* nothing */
 
@@ -513,6 +543,49 @@ static int srp_reconnect_target(struct s
target->tx_head  = 0;
target->tx_tail  = 0;
 
+   spin_lock_irq(target->scsi_host->host_lock);
+   if (target->state == SRP_TARGET_DISCONNECTING) {
+   ret = 0;
+   target->state = SRP_TARGET_DISCONNECTED;
+ 

[openib-general] SRP [PATCH 0/4] Kernel support for removal and restoration of target

2006-06-05 Thread Ishai Rabinovitz
Hi Roland,

I'm sending 4 patches that implement kernel support for removal and restoration
of a target (will be used by ibsrpdm).

Some comments about them

1) The first patch splits reconnect to 2 two functions: _srp_remove_target 
   and _srp_restore_target. _srp_remove_target uses the functions I sent in 
   previous patch (Misc cleanups in ib_srp). If you want I can resend this 
   patch without using the previous patch (But then there will be a problem 
   with the previous patch :(  ).

2) These patches implement the following behavior: When someone writes the
   string "remove" to /sys/class/scsi_host/host?/remove_target the corresponding
   target goes to a DISCONNECTED state (After closing the cm, and reset all 
   pending requests).
   Now when the scsi performs queuecommand to this host a SCSI_MLQUEUE_HOST_BUSY
   is returned. These causes the scsi layer to wait until the target turns
   to LIVE state. This is very nice if the user that initiated the remove_target
   knows what he is doing and will perform a restore_target later. On the other 
   hand it may be problematic if the target remains DISCONNECTED and user
   applications that try to access this target remain stuck in the kernel (in 
   the scsi layer)
   I've several ideas on how to handle it (have a timeout after which 
   queuecommand will return fail, try to perform a restore_target after a
   timeout, make sure the daemon will run a restore_target after a timeout) 
   but I'm not sure they are the correct thing to do. I'm waiting for
   suggestions.
   In any case I believe we should apply these patches and add solution to
   this problem later.

Please comment.

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] SRP: [PATCH] Misc cleanups in ib_srp

2006-06-04 Thread Ishai Rabinovitz
Hi,

Misc cleanups in ib_srp. Please consider for 2.6.18.
1) I think that it is more efficient to move the req entries from req_list
   to free_list in srp_reconnect_target (rather than rebuild the free_list).
   (In any case this code is shorter).
2) This allows us to reuse code in srp_reset_device and srp_reconnect_target
   and call a new function srp_reset_req.
3) We can use list_move_tail in srp_remove_req.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-19 
11:14:35.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-21 
17:41:25.0 +0300
@@ -451,14 +451,26 @@ static void srp_unmap_data(struct scsi_c
 scmnd->sc_data_direction);
 }
 
+static void srp_remove_req(struct srp_target_port *target, struct srp_request 
*req)
+{
+   srp_unmap_data(req->scmnd, target, req);
+   list_move_tail(&req->list, &target->free_reqs);
+}
+
+static void srp_reset_req(struct srp_target_port *target, struct srp_request 
*req)
+{
+   req->scmnd->result = DID_RESET << 16;
+   req->scmnd->scsi_done(req->scmnd);
+   srp_remove_req(target, req);
+}
+
 static int srp_reconnect_target(struct srp_target_port *target)
 {
struct ib_cm_id *new_cm_id;
struct ib_qp_attr qp_attr;
-   struct srp_request *req;
+   struct srp_request *req, *tmp;
struct ib_wc wc;
int ret;
-   int i;
 
spin_lock_irq(target->scsi_host->host_lock);
if (target->state != SRP_TARGET_LIVE) {
@@ -494,19 +506,12 @@ static int srp_reconnect_target(struct s
while (ib_poll_cq(target->cq, 1, &wc) > 0)
; /* nothing */
 
-   list_for_each_entry(req, &target->req_queue, list) {
-   req->scmnd->result = DID_RESET << 16;
-   req->scmnd->scsi_done(req->scmnd);
-   srp_unmap_data(req->scmnd, target, req);
-   }
+   list_for_each_entry_safe(req, tmp, &target->req_queue, list)
+   srp_reset_req(target, req);
 
target->rx_head  = 0;
target->tx_head  = 0;
target->tx_tail  = 0;
-   INIT_LIST_HEAD(&target->free_reqs);
-   INIT_LIST_HEAD(&target->req_queue);
-   for (i = 0; i < SRP_SQ_SIZE; ++i)
-   list_add_tail(&target->req_ring[i].list, &target->free_reqs);
 
ret = srp_connect_target(target);
if (ret)
@@ -706,13 +711,6 @@ static int srp_map_data(struct scsi_cmnd
return len;
 }
 
-static void srp_remove_req(struct srp_target_port *target, struct srp_request 
*req)
-{
-   srp_unmap_data(req->scmnd, target, req);
-   list_del(&req->list);
-   list_add_tail(&req->list, &target->free_reqs);
-}
-
 static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp 
*rsp)
 {
struct srp_request *req;
@@ -1349,11 +1347,8 @@ static int srp_reset_device(struct scsi_
spin_lock_irq(target->scsi_host->host_lock);
 
list_for_each_entry_safe(req, tmp, &target->req_queue, list)
-   if (req->scmnd->device == scmnd->device) {
-   req->scmnd->result = DID_RESET << 16;
-   req->scmnd->scsi_done(req->scmnd);
-   srp_remove_req(target, req);
-   }
+   if (req->scmnd->device == scmnd->device)
+   srp_reset_req(target, req);
 
spin_unlock_irq(target->scsi_host->host_lock);
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] ibsrpdm - allocate agent

2006-05-28 Thread Ishai Rabinovitz

The array agent is allocated with no entries.
When agent is being accessed in create_agent there is a memory corruption.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-05-28 
14:24:25.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-05-28 
14:55:41.0 +0300
@@ -536,7 +536,7 @@ static int get_port_list(int fd, uint32_
 int main(int argc, char *argv[])
 {
int fd;
-   uint32_tagent[0];
+   uint32_tagent[2];
char   *cmd_name = strdup(argv[0]);
 
while (1) {
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Get email notification to svn commit

2006-05-23 Thread Ishai Rabinovitz
Hi,

I understand that there is a way to be informed by mail when there is a commit 
to the svn repository.
How can I register?
Can I register to get notification only on svn commit to specific directories?

Thanks
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP: [PATCH] Releasing the scsi_host when unloading

2006-05-23 Thread Ishai Rabinovitz
On Wed, May 17, 2006 at 02:55:57AM +0300, Roland Dreier wrote:
>  > +  /*
>  > +   * We need 2 scsi_host_put becuase there are two get:
>  > +   *  in scsi_host_alloc and in scsi_add_host
>  > +   */
>  > +  scsi_host_put(target->scsi_host);
>  >scsi_host_put(target->scsi_host);
> 
> Hmm, this doesn't seem right to me.  If I try this, then I get a crash
> because the scsi_host is already gone after the first put.  I verified
> that the reference count is 1 before these puts, and with the
> unmodified module I don't see anything left in /sys/class/scsi_host
> after unloading the module.
> 
> What kernel are you seeing problems with?  I'm testing with an
> up-to-date git kernel, although I doubt it makes a difference (did
> SCSI reference counting change recently??).
> 
> I do think there are some extra scsi_host_put() calls in
> srp_remove_work() -- I think the double scsi_host_put() dates back to
> a version (which I may never even have checked in) where there was a
> scsi_host_get() to avoid the scsi_host going away between the
> schedule_work() and srp_remove_work() actually running.
> 
> So the patch below seems correct to me.
> 
> What do you think?
> 
> --- linux-kernel/infiniband/ulp/srp/ib_srp.c  (revision 7245)
> +++ linux-kernel/infiniband/ulp/srp/ib_srp.c  (working copy)
> @@ -353,7 +356,6 @@ static void srp_remove_work(void *target
>   spin_lock_irq(target->scsi_host->host_lock);
>   if (target->state != SRP_TARGET_DEAD) {
>   spin_unlock_irq(target->scsi_host->host_lock);
> - scsi_host_put(target->scsi_host);
>   return;
>   }
>   target->state = SRP_TARGET_REMOVED;
> @@ -367,8 +369,6 @@ static void srp_remove_work(void *target
>   ib_destroy_cm_id(target->cm_id);
>   srp_free_target_ib(target);
>   scsi_host_put(target->scsi_host);
> - /* And another put to really free the target port... */
> - scsi_host_put(target->scsi_host);
>  }
>  
>  static int srp_connect_target(struct srp_target_port *target)
> 
> 

Roland, 

As I told you before, your patch looks correct.
Are you going to apply it?

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Re: Fwd: [Bug 91] sizeof(srp_indirect_buf) wrongon 64-bit platforms

2006-05-23 Thread Ishai Rabinovitz
On Tue, May 23, 2006 at 10:10:05AM +0300, Arne Redlich wrote:
> Roland Dreier <[EMAIL PROTECTED]> writes:
> 
> >  > Roland, maybe this means we need scsi/srp.h in svn for now?
> >  > svn is supposed to work on 2.6.16 ...
> >
> > As far as I can tell the bug in the header has no effect on how the IB
> > SRP initiator works.
> 
> Roland,
> 
> I'm afraid it *does* have an effect, unfortunately. There's the following 
> code in ib_srp.c::srp_map_data(), around the lines 540 - 550:
> 
>   struct srp_indirect_buf *buf = (void *) cmd->add_data;
> 
>   /* snip */
> 
>   buf->table_desc.va  = cpu_to_be64(req->cmd->dma +
> sizeof *cmd +
> sizeof *buf);
> 
> So if a target actually RDMA Reads the indirect descriptor table, it will use 
> a wrong address.
> 

It looks to me that there is no effect after all.
This buf->table_desc.va should point to the desc_list array in the 
srp_indirect_buf.
When the code enters the values to this array (buf->desc_list[i]) it uses the 
address that is corresponding to sizeof *buf.

To sum it up, there will be a change in the address the target sees but the 
data will still be in the address the target sees.

> Arne
> -- 
> Arne Redlich
> Xiranet Communications GmbH
> ___
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP: [PATCH] Handling DREQ

2006-05-22 Thread Ishai Rabinovitz
On Fri, May 19, 2006 at 02:54:35AM +0300, Ishai Rabinovitz wrote:
Hi

Fixed the patch a little bit.

> Hi,
> 
> I got "Unhandled CM event 6" (IB_CM_DREQ_ERROR) and "Unhandled CM event 7" 
> (IB_CM_DREQ_RECEIVED).
> 
> So here is a patch that handles these CM events.
> 
> This is an initial patch. Maybe it will be more efficient to initiate a 
> reconnect 
> in case we get IB_CM_DREQ_RECEIVED.  What do you think?
> 
> Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
> 


Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-21 
18:19:58.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-22 
09:32:00.0 +0300
@@ -1217,6 +1217,20 @@ static int srp_cm_handler(struct ib_cm_i
target->status = 0;
break;
 
+   case IB_CM_DREQ_ERROR:
+   printk(KERN_ERR PFX
+  "IB_CM_DREQ_ERROR received - connection closed\n");
+   /* no need to set comp - there will be a TIMEWAIT_EXIT */
+   break;
+
+   case IB_CM_DREQ_RECEIVED:
+   printk(KERN_WARNING PFX
+  "IB_CM_DREQ_RECEIVED received - connection closed\n");
+   if (ib_send_cm_drep(cm_id, NULL, 0))
+   printk(KERN_ERR PFX "ib_send_cm_drep failed\n");
+   /* no need to set comp - there will be a TIMEWAIT_EXIT */
+   break;
+
default:
    printk(KERN_WARNING PFX "Unhandled CM event %d\n", 
event->event);
break;

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] SRP: [PATCH] Handling DREQ

2006-05-18 Thread Ishai Rabinovitz
Hi,

I got "Unhandled CM event 6" (IB_CM_DREQ_ERROR) and "Unhandled CM event 7" 
(IB_CM_DREQ_RECEIVED).

So here is a patch that handles these CM events.

This is an initial patch. Maybe it will be more efficient to initiate a 
reconnect 
in case we get IB_CM_DREQ_RECEIVED.  What do you think?

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-19 
00:05:30.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-19 
01:35:37.0 +0300
@@ -1214,13 +1231,29 @@ static int srp_cm_handler(struct ib_cm_i
target->status = 0;
break;
 
+   case IB_CM_DREQ_ERROR:
+   printk(KERN_ERR PFX
+  "IB_CM_DREQ_ERROR received - connection closed\n");
+   /* no need to set comp - there will be a TIMEWAIT_EXIT */
+   break;
+
+   case IB_CM_DREQ_RECEIVED:
+   printk(KERN_ERR PFX
+  "IB_CM_DREQ_RECEIVED received - connection closed\n");
+   if (ib_send_cm_drep(target->cm_id, NULL, 0))
+   printk(KERN_ERR PFX "ib_send_cm_drep failed\n");
+   /* no need to set comp - there will be a TIMEWAIT_EXIT */
+   break;
+
default:
printk(KERN_WARNING PFX "Unhandled CM event %d\n", 
event->event);
break;
}
 
-   if (comp)
+   if (comp) {
+   printk(KERN_ERR PFX "srp_cm_handler: complete to %p\n", target);
    complete(&target->done);
+   }
 
kfree(qp_attr);
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] SRP [PATCH] Cleaning in srp_remove_one

2006-05-17 Thread Ishai Rabinovitz
3 changes in the same place:

1) The if statement is redundant.
2) There is no need to save the flags - it is inside a mutex_lock.
3) We hold the mutex for the list and we are not deleting from the list so 
   there is no need for list_for_each_entry_safe.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-14 
14:22:12.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-14 
14:26:54.0 +0300
@@ -1750,7 +1750,6 @@ static void srp_remove_one(struct ib_dev
struct srp_host *host, *tmp_host;
LIST_HEAD(target_list);
struct srp_target_port *target, *tmp_target;
-   unsigned long flags;
 
dev_list = ib_get_client_data(device, &srp_client);
 
@@ -1767,12 +1766,10 @@ static void srp_remove_one(struct ib_dev
 * commands and don't try to reconnect.
 */
mutex_lock(&host->target_mutex);
-   list_for_each_entry_safe(target, tmp_target,
-&host->target_list, list) {
-   spin_lock_irqsave(target->scsi_host->host_lock, flags);
-   if (target->state != SRP_TARGET_REMOVED)
-   target->state = SRP_TARGET_REMOVED;
-   spin_unlock_irqrestore(target->scsi_host->host_lock, 
flags);
+   list_for_each_entry(target, &host->target_list, list) {
+   spin_lock_irq(target->scsi_host->host_lock);
+   target->state = SRP_TARGET_REMOVED;
+   spin_unlock_irq(target->scsi_host->host_lock);
    }
mutex_unlock(&host->target_mutex);
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP [PATCH] Looks like a potantial bug

2006-05-17 Thread Ishai Rabinovitz
On Wed, May 17, 2006 at 06:40:04PM +0300, Ishai Rabinovitz wrote:
> Hi,
> 
> While doing a code review I found a potential bug.
> I did not manage to execute a test to check this code.
> Please take a look:

Sorry, I made a mistake in the patch.
Please look at this one.

In srp_reconnect_target it uses req->scmnd->scsi_done(req->scmnd); (like in the 
patch)

Ishai

> Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
> --
> Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
> ===
> --- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c  2006-05-17 
> 16:24:24.0 +0300
> +++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c   2006-05-17 
> 17:13:47.0 +0300
> @@ -1326,7 +1326,7 @@ static int srp_reset_device(struct scsi_
>   list_for_each_entry_safe(req, tmp, &target->req_queue, list)
>   if (req->scmnd->device == scmnd->device) {
>   req->scmnd->result = DID_RESET << 16;
> - scmnd->scsi_done(scmnd);
> + req->scmnd->scsi_done(req->scmnd);
>   srp_remove_req(target, req);
>   }
>  
> -- 
> Ishai Rabinovitz
> ___
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] SRP [PATCH] Looks like a potantial bug

2006-05-17 Thread Ishai Rabinovitz
Hi,

While doing a code review I found a potential bug.
I did not manage to execute a test to check this code.
Please take a look:
Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
--
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-17 
16:24:24.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-17 
17:13:47.0 +0300
@@ -1326,7 +1326,7 @@ static int srp_reset_device(struct scsi_
list_for_each_entry_safe(req, tmp, &target->req_queue, list)
if (req->scmnd->device == scmnd->device) {
req->scmnd->result = DID_RESET << 16;
-   scmnd->scsi_done(scmnd);
+   req->scmnd->scsi_done(scmnd);
srp_remove_req(target, req);
}
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP: [PATCH] Releasing the scsi_host when unloading

2006-05-17 Thread Ishai Rabinovitz
On Wed, May 17, 2006 at 02:55:57AM +0300, Roland Dreier wrote:
> Hmm, this doesn't seem right to me.  If I try this, then I get a crash
> because the scsi_host is already gone after the first put.  I verified
> that the reference count is 1 before these puts, and with the
> unmodified module I don't see anything left in /sys/class/scsi_host
> after unloading the module.
> 
> What kernel are you seeing problems with?  I'm testing with an
> up-to-date git kernel, although I doubt it makes a difference (did
> SCSI reference counting change recently??).
> 
> I do think there are some extra scsi_host_put() calls in
> srp_remove_work() -- I think the double scsi_host_put() dates back to
> a version (which I may never even have checked in) where there was a
> scsi_host_get() to avoid the scsi_host going away between the
> schedule_work() and srp_remove_work() actually running.
> 
> So the patch below seems correct to me.
> 
> What do you think?

I could not reproduce the problem again, so this patch works for me.

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP: [PATCH] Releasing the scsi_host when unloading

2006-05-17 Thread Ishai Rabinovitz
On Wed, May 17, 2006 at 02:56:58AM +0300, Roland Dreier wrote:
> BTW, I think the patch below is correct as well.  This avoids problems
> where the SRP driver waits forever for a completion, for example if
> sending the DREQ fails because the connection has already been
> disconnected by the target.
> 
> Does this scenario seem like the deadlock you thought you saw?
> 
> --- linux-kernel/infiniband/ulp/srp/ib_srp.c  (revision 7245)
> +++ linux-kernel/infiniband/ulp/srp/ib_srp.c  (working copy)
> @@ -342,7 +342,10 @@ static void srp_disconnect_target(struct
>   /* XXX should send SRP_I_LOGOUT request */
>  
>   init_completion(&target->done);
> - ib_send_cm_dreq(target->cm_id, NULL, 0);
> + if (ib_send_cm_dreq(target->cm_id, NULL, 0)) {
> + printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
> + return;
> + }
>   wait_for_completion(&target->done);
>  }
>  

I don't think this caused the deadlock I had.
Still it looks like an important patch.
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] SRP: [PATCH] Releasing the scsi_host when unloading

2006-05-14 Thread Ishai Rabinovitz
On Sun, May 14, 2006 at 02:02:01PM +0300, Ishai Rabinovitz wrote:
> Hi,
> 
> After loading ib_srp module, adding a target and then unloading the ib_srp 
> target the scsi_host directory in /sys/class/scsi_host/ still exists.
> It looks like the srp code does not release the scsi_host it had allocated.
> 
> After examining the code I found out that when executing srp_remove_work 
> (the removal of one target) scsi_host_put is called twice, but when unloading 
> the module in srp_remove_one scsi_host_put is called only once.
> 
> It looks like the correct thing is to execute scsi_host_put twice (once for 
> the call to scsi_add_host and once for the call to scsi_host_alloc).
> 
> So, I suggest the next patch:

Forgot to sign. Here it is again with the Signed-off-by line.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>


Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-14 
13:09:23.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-14 
13:25:48.0 +0300
@@ -357,7 +357,6 @@ static void srp_remove_work(void *target
spin_lock_irq(target->scsi_host->host_lock);
if (target->state != SRP_TARGET_DEAD) {
spin_unlock_irq(target->scsi_host->host_lock);
-   scsi_host_put(target->scsi_host);
return;
}
target->state = SRP_TARGET_REMOVED;
@@ -1790,6 +1789,11 @@ static void srp_remove_one(struct ib_dev
srp_disconnect_target(target);
ib_destroy_cm_id(target->cm_id);
srp_free_target_ib(target);
+   /*
+* We need 2 scsi_host_put becuase there are two get:
+*  in scsi_host_alloc and in scsi_add_host
+*/
+   scsi_host_put(target->scsi_host);
scsi_host_put(target->scsi_host);
}
 
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] SRP: [PATCH] Releasing the scsi_host when unloading

2006-05-14 Thread Ishai Rabinovitz
Hi,

After loading ib_srp module, adding a target and then unloading the ib_srp 
target the scsi_host directory in /sys/class/scsi_host/ still exists.
It looks like the srp code does not release the scsi_host it had allocated.

After examining the code I found out that when executing srp_remove_work 
(the removal of one target) scsi_host_put is called twice, but when unloading 
the module in srp_remove_one scsi_host_put is called only once.

It looks like the correct thing is to execute scsi_host_put twice (once for 
the call to scsi_add_host and once for the call to scsi_host_alloc).

So, I suggest the next patch:

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-14 
13:09:23.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-14 
13:25:48.0 +0300
@@ -357,7 +357,6 @@ static void srp_remove_work(void *target
spin_lock_irq(target->scsi_host->host_lock);
if (target->state != SRP_TARGET_DEAD) {
spin_unlock_irq(target->scsi_host->host_lock);
-   scsi_host_put(target->scsi_host);
return;
}
target->state = SRP_TARGET_REMOVED;
@@ -1790,6 +1789,11 @@ static void srp_remove_one(struct ib_dev
srp_disconnect_target(target);
ib_destroy_cm_id(target->cm_id);
srp_free_target_ib(target);
+   /*
+* We need 2 scsi_host_put becuase there are two get:
+*  in scsi_host_alloc and in scsi_add_host
+*/
+   scsi_host_put(target->scsi_host);
scsi_host_put(target->scsi_host);
}
 
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [openfabrics-ewg] SRP: changes to ibsrpdm

2006-05-05 Thread Ishai Rabinovitz
On Thu, May 04, 2006 at 11:05:51PM +0300, Michael S. Tsirkin wrote:
> Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > Why can't the daemon keep
> > track of which targets it has added and not add a target twice?  Is
> > there some reason that the kernel has to be involved?
> 
> What if the daemon is killed/restarted?
> 
> -- 
> MST
What if there are two daemons by accident?
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] SRP: changes to ibsrpdm

2006-05-04 Thread Ishai Rabinovitz



Hi,
 
After implementing 
and submitting the ibsrpdm patches I got the following important 
remarks:
 
1) Sometimes there 
is a need to add a target twice.
 
2) It it unnecessary 
complication to the kernel to look for a target in the targets list in 
order to remove it.
 
3) There is a 
conceptual problem in the list_target query - In sysfs each file should report 
only one value.
 
 
Before implementing 
the fixes according to this remarks, I want to see if there are any comments to 
the changes I'm going to do.
 
I'm going to change 
ibsrpdm and SRP driver code in the following manner:
 
1) There is going to 
be an attribute for a target indicating if it was added by the daemon (Named 
daemons). 
Only the daemon should add targets with this attribute set.
 
2) The kernel will 
not allow the daemon to add the same target twice. Regular activation of 
add_target can add multiple instances of the same target.
 
3) list_target query 
will be removed. The information will be in several directories in sysfs (One 
for each target) - Roland, vu, can you send me a pointer that 
explains which target information can be found in the sysfs 
today?
 
4) I'll change the way I've implemented 
the activation of remove target. There will be a remove target file for 
each existing target directory in sysfs, echo 1 to this file will remove the 
corresponding target.
 
5) The daemon will remove only targets 
that it added. (Has the daemons attribute 
set).
 
6) Adding execution modes to 
ibsrpdm:
    a) When activated 
without flags or with -c flag, ibsrpdm will executes once (has before) 
and display the targets in the network.
    b) When activated with 
-l flag,  ibsrpdm will be activated in a loop and display at each 
cycle the targets that join the network and the targets the leaves the 
network.
c) When activated 
with -l and -a flags, ibsrpdm will be activated as a daemon 
that adds targets that join the 
network.

d) When activated 
with -l and -r flags, ibsrpdm will be activated as a daemon that removes 
targets that leave the network.

e) When activated 
with -l, -a and -r flags, ibsrpdm will be activated as a daemon 
that adds targets that join the network, and removes targets that 
leave the network.
 
The reason we need option b, is because 
costumers may want to add the targets in a certain order (for binding 
purposes).
 
Any 
comments?
 
Ishai 
Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] Re: [openfabrics-ewg] Current OFED kernel snapshot - problems in back porting SRP to RH4

2006-05-02 Thread Ishai Rabinovitz
On Tue, May 02, 2006 at 10:34:29AM +0300, Aviram Gutman wrote:
>  
> Yes, we have uploaded a pre version of RC4. You can find it in:
> 
> URL: https://openib.org/svn/gen2/branches/1.0/ofed/releases
> 
> OFED-1.0-rc4-pre4.tgz
> OFED-1.0-rc4-pre4.tgz.md5sum
> 
> Please pay attention that we still face issues. The following is the diff 
> between RC3 and the pre:
> 
> 1.   Bug fixes according to problems reported.
> 
> 2.   SRP - with new features: FMR, tunable parameters, SRP daemon - We have 
> an issue with the back port of SRP to RH4 U2 and U3. Ishai will issue a mail 
> with explanation.
> 
> 3.   Open MPI - new package based on 1.1a3 - Please be noted that RPM 
> building process failed. Vlad will 
> 
> 4.   RDS - new version from main trunk
> 
> 5.   Kernel code based on git
> 
> 6.   Standard network configuration
> 
> 
> Known issues:
> 1. ipath installation fails on 2.6.9 - 2.6.11* kernels
> 2. OSU MPI compilation fails on SLES10, PPC64
> 3. SRP is not supported on 2.6.9 - 2.6.13* kernels - Ishai will follow up 
> with details 
> 4. Open MPI RPM build process fails - Jeff, will you be able to send us fixes 
> by Wed?
> 
> 
> Regards,
>Aviram
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Doug Ledford
> Sent: Monday, May 01, 2006 11:53 PM
> To: [EMAIL PROTECTED]
> Subject: [openfabrics-ewg] Current OFED kernel snapshot
> 
> There's rumored to be a significant number of changes between rc3 and rc4 on 
> the kernel module front.  I would like to get started on integrating those 
> changes sooner rather than later.  So, where would I go to get a snapshot of 
> the latest OFED kernel
> tree.  So far I've only found kernel trees under the tags directory and 
> obviously the rc4 tag hasn't been populated yet.
> 
> --
>   Doug Ledford <[EMAIL PROTECTED]>
>  Red Hat, Inc. 
>  1801 Varsity Dr.
>  Raleigh, NC 27606
>   
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg


Hi

We have a problem when trying to back port SRP to RH4 U2 and U3 (Actually to 
any kernel earlier than 2.6.13).
The problem is when the SCSI driver is calling to eh_abort_handler,
or to eh_device_reset_handler.
In the current kernel (starting from 2.6.13) this call is made without 
host_lock spin-lock locked.
In the SRP code that performs the abort and the reset (srp_send_tsk_mgmt) we 
send a message to the target and we wait for a response from the target.

In early versions of the kernel the SCSI driver performs irq_spinlock_save to 
the host_lock before calling to the abort or reset handlers.
This creates a problem: The SRP driver can not go to sleep until the target will
answer.

Any ideas?


-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH] SRP: Avoid a potential deadlock

2006-05-01 Thread Ishai Rabinovitz
On Mon, May 01, 2006 at 11:27:24AM -0700, Roland Dreier wrote:
>     Ishai> Avoid a potential dead-lock.  In srp_disconnect_target
> Ishai> there is a call to ib_send_cm_dreq and a wait for
> Ishai> completion If when getting DREP there is no comp no one
> Ishai> will end this wait
> 
> I thought that after the DREP is received, the CM will go through
> timewait and we will eventually get a TIMEWAIT_EXIT event (with a
> completion).  Am I wrong?  Have you actually seen this deadlock happen
> in practice?
> 
>  - R.

I had a deadlock and I suspected at this. I'm not sure that it was the 
reason for the deadlock.

Vu, What do you think?

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 00/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz
On Mon, May 01, 2006 at 10:53:36AM -0700, Roland Dreier wrote:
>     Ishai> Hi, I'm going to send 12 patches. 6 patches for the kernel,
> Ishai> and 6 for the userspace ibsrpdm.  The kernel patches avoid
> Ishai> adding the same target twice, allow the removal of a
> Ishai> target, and add a query about the connected targets.
> 
> In the future can you use a different descriptive title for each patch?

OK

> 
> Also (although I haven't reviewed the actual code yet) this mostly
> makes sense, but I'm not sure we want to disallow connecting to the
> same target twice.  Userspace may want to implement a policy of one
> conncetion per target, but having multiple connections to the same
> target for multipathing/failover seems like something the kernel
> should allow.  What was your reason for forbidding this?
> 
>  - R.

As I understand it, the path in multipathing is going to be part of the 
attributes of the connection to the target. So there will be no problem
to add the same target twice, if it has a different path leading to it.
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] [PATCH 06/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz
On Mon, May 01, 2006 at 04:50:32PM +0300, Muli Ben-Yehuda wrote:
> On Mon, May 01, 2006 at 02:28:48PM +0300, Ishai Rabinovitz wrote:
> > 
> > Support a display of list of target from user level.
> > 
> > Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
> > Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
> > ===
> > --- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-21 
> > 01:13:04.0 +0300
> > +++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-21 
> > 03:56:05.0 +0300
> > @@ -1730,6 +1730,63 @@ end:
> >  
> >  static CLASS_DEVICE_ATTR(remove_target, S_IWUSR, NULL, srp_remove_target);
> >  
> > +#define TARGET_INFO_BUF_SIZE 126
> > +
> > +static ssize_t list_targets(struct class_device *class_dev, char *buf)
> > +{
> > +   struct srp_host *host =
> > +   container_of(class_dev, struct srp_host, class_dev);
> > +   struct srp_target_port *target;
> > +   int printed=0, ret;
> > +
> > +   mutex_lock(&host->target_mutex);
> > +   list_for_each_entry(target, &host->target_list, list)
> 
> Can this race with list addition / removal? I saw that you removed the
> lock in an earlier patch?

No, In an erlier patch I did not removed the lock, I enlarged it scope to 
include 
the entire call to srp_find_target.

> 
> > +   if (target->state == SRP_TARGET_LIVE) {
> 
> You'd have an easier time with the indentation if you'd do 
> 
> if (target->state != SRP_TARGET_LIVE)
> continue;
> 
> here
> 
> > +   ret = sprintf(buf+printed,
> > +   "id_ext=%016llx,ioc_guid=%016llx,"
> > +   "dgid=%04x%04x%04x%04x%04x%04x%04x%04x,"
> > +   "pkey=%04x,service_id=%016llx\n",
> > +   (unsigned long long)
> > +   be64_to_cpu(target->id_ext),
> > +   (unsigned long long)
> > +   be64_to_cpu(target->ioc_guid),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[0]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[2]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[4]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[6]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[8]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[10]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[12]),
> > +   (int) be16_to_cpu(*(__be16 *)
> > +   &target->path.dgid.raw[14]),
> 
> This is pretty horrible - could you use show_dgid() here?

Id will add a redundant copy of the buffer.
> 
> Cheers,
> Muli

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] [PATCH 04/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz
On Mon, May 01, 2006 at 04:43:23PM +0300, Muli Ben-Yehuda wrote:
> On Mon, May 01, 2006 at 02:27:39PM +0300, Ishai Rabinovitz wrote:
> > 
> > Do not add the same target twice.
> > 
> > Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
> > Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
> > ===
> > --- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-25 
> > 15:17:34.0 +0300
> > +++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-25 
> > 15:19:37.0 +0300
> > @@ -1478,7 +1478,8 @@ static int srp_parse_options(const char 
> > printk(KERN_WARNING PFX "bad max sect parameter 
> > '%s'\n", p);
> > goto out;
> > }
> > -   target->scsi_host->max_sectors = token;
> > +   if (target->scsi_host != NULL)
> > +   target->scsi_host->max_sectors = token;
> > break;
> 
> This chunk does not look related to the rest. Is a NULL
> target->scsi_host legal here? if not, the check should be removed as
> we'd rather take an oops here than hide the problem behind the NULL
> pointer check.
> 
> > +/* srp_find_target - If the target exists return it in target,
> > + otherwise target is set to NULL.
> > + host->target_mutex should be hold */
> 
> Please use the usual kernel
> /*
>  * stuff
>  */
> style for multi line comments.

OK, Thanks.
> 
> > +static int srp_find_target(const char *buf, struct srp_host *host,
> > +  struct srp_target_port **target)
> > +{
> > +   struct srp_target_port *target_to_find, *curr_target;
> > +   int ret, i;
> > +
> > +   target_to_find = kzalloc(sizeof *target_to_find, GFP_KERNEL);
> > +   ret = srp_parse_options(buf, target_to_find);
> > +   if (ret)
> > +   goto free;
> > +
> > +   list_for_each_entry(curr_target, &host->target_list, list)
> > +   if (target_to_find->ioc_guid == curr_target->ioc_guid &&
> > +   target_to_find->id_ext == curr_target->id_ext &&
> > +   target_to_find->path.pkey == curr_target->path.pkey &&
> > +   target_to_find->service_id == curr_target->service_id) {
> > +   for (i = 0; i < 16; ++i)
> > +   if (target_to_find->path.dgid.raw[i] != 
> > curr_target->path.dgid.raw[i])
> > +   break;
> 
> The conditional and check here probably deserves an inline helper
> called same_target() or some such.
> 
> > +   if (i == 16) {
> > +   *target = curr_target;
> > +   goto free;
> > +   }
> > +   }
> > +
> > +   *target = NULL;
> > +
> > +free:
> > +   kfree(target_to_find);
> > +   return 0;
> 
> We always return 0 - either this should return void, or you meant to
> return ret here instead of 0?

You are right as usual, We should return ret.
> 
> > +}
> > +
> >  static ssize_t srp_create_target(struct class_device *class_dev,
> >  const char *buf, size_t count)
> >  {
> > struct srp_host *host =
> > container_of(class_dev, struct srp_host, class_dev);
> > struct Scsi_Host *target_host;
> > -   struct srp_target_port *target;
> > +   struct srp_target_port *target, *existing_target = NULL;
> > int ret;
> > int i;
> >  
> > +   /* first check if the target already exists */
> > +
> > +   mutex_lock(&host->target_mutex);
> > +   ret = srp_find_target(buf, host, &existing_target);
> > +   if (ret)
> > +   goto unlock_mutex;
> > +
> > +   if (existing_target) {
> > +   /* target already exists */
> > +   spin_lock_irq(existing_target->scsi_host->host_lock);
> 
> why _irq and not _irqsave? Are you sure this code can't ever be called
> with interrupts off via some other path?

This function is being called from userspace 
(writing to /sys/class/infiniband_srp/.../add_target) so no need for irqsave.

Do you think we should always use irqsave just to be on the safe side (Maybe
in the future someone else will call us)? 
> 
> Cheers,
> Muli

---  Resending the fixed patch -

Re: [openib-general] [PATCHE 02/12] SRP: changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz
On Mon, May 01, 2006 at 04:33:42PM +0300, Muli Ben-Yehuda wrote:
> On Mon, May 01, 2006 at 02:25:46PM +0300, Ishai Rabinovitz wrote:
> > 
> > Move the destruction of the host and the removal from a list to a function.
> > 
> > Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
> > 
> > Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
> > ===
> > --- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-23 
> > 14:08:03.0 +0300
> > +++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-24 
> > 10:47:00.0 +0300
> > @@ -344,6 +344,16 @@ static void srp_disconnect_target(struct
> > wait_for_completion(&target->done);
> >  }
> >  
> > +static void destruct_scsi_host_and_target(struct srp_target_port *target, 
> > int disconnect_target)
> > +{
> > +   scsi_remove_host(target->scsi_host);
> > +   if (disconnect_target)
> > +   srp_disconnect_target(target);
> > +   ib_destroy_cm_id(target->cm_id);
> > +   srp_free_target_ib(target);
> > +   scsi_host_put(target->scsi_host);
> > +}
> > +
> >  static void srp_remove_work(void *target_ptr)
> >  {
> > struct srp_target_port *target = target_ptr;
> > @@ -357,10 +374,7 @@ static void srp_remove_work(void *target
> > list_del(&target->list);
> > mutex_unlock(&target->srp_host->target_mutex);
> >  
> > -   scsi_remove_host(target->scsi_host);
> > -   ib_destroy_cm_id(target->cm_id);
> > -   srp_free_target_ib(target);
> > -   scsi_host_put(target->scsi_host);
> > +   destruct_scsi_host_and_target(target, 0);
> 
> Is not disconnecting from the target here actually the right thing to
> do? considering we're then destroying the target's queue pairs and
> freeing it?
> 
> Cheers,
> Muli

Hi Muli,

srp_remove_target is being called only when we were unable to reconnect 
in srp_reconnect_target so the target is already disconnected.

Ishai

-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH] SRP: Avoid a potential deadlock

2006-05-01 Thread Ishai Rabinovitz
Hi,

I think there is a potential deadlock when disconnecting from the CM.
Roland, can you look at this patch and check if it is needed.

Thanks
Ishai

--

Avoid a potential dead-lock.
In srp_disconnect_target there is a call to ib_send_cm_dreq 
and a wait for completion 
If when getting DREP there is no comp no one will end this wait

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-17 
10:03:08.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-17 
10:06:19.0 +0300
@@ -1194,6 +1194,7 @@ static int srp_cm_handler(struct ib_cm_i
break;
 
case IB_CM_DREP_RECEIVED:
+   comp = 1;
break;
 
case IB_CM_TIMEWAIT_EXIT:
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 12/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

The query can be improved if working against OpenSM that supports
the option to ask about a certain bit in the capability mask.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-21 
06:26:25.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-21 
06:30:36.0 +0300
@@ -97,10 +97,16 @@ static inline uint64_t htonll(uint64_t x
 
 #define SIZE_OF_QUERY_RESPONSE (1 << 18)
 
+#define SM_SUPPORTS_QUERY_OF_PART_OF_CAP_MASK_BIT_MASK (1 << 13)
+
+#define TEST_ONLY_SET_BIT_BIT_MASK (1 << 31)
+
 #define N_COMP_MASK_NODE_TYPE htonll(1 << 4);
 
 #define N_COMP_MASK_LID htonll(1);
 
+#define N_COMP_MASK_CAPABILITY_MASK htonll(1 << 7);
+
 #define INITIAL_SIZE_OF_TARGET_TABLE 10
 
 #define SLEEP_TIME 60
@@ -180,8 +186,6 @@ static void empty_set(targets_set *set)
 
 static void destroy_set(targets_set *set)
 {
-   int i;
-
empty_set(set);
free(set->array);
 }
@@ -679,6 +683,63 @@ static int get_port_info(int fd, uint32_
return 0;
 }
 
+int get_class_port_info(int fd, uint32_t agent[2], uint16_t dlid,
+   int *is_mask_match_supported)
+{
+   struct ib_user_mad  out_mad, in_mad;
+   struct srp_dm_rmpp_sa_mad  *out_sa_mad, *in_sa_mad;
+   struct srp_dm_mad  *in_dm_mad;
+   struct srp_dm_class_port_info  *class_port_info;
+
+   in_sa_mad  = (void *) in_mad.data;
+   in_dm_mad  = (void *) in_mad.data;
+   out_sa_mad = (void *) out_mad.data;
+
+   init_srp_dm_mad(&out_mad, agent[1], sm_lid, 
SRP_DM_ATTR_CLASS_PORT_INFO, 0);
+
+   out_sa_mad->mgmt_class= SRP_MGMT_CLASS_SA;
+   out_sa_mad->class_version = 2;
+
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
+   return -1;
+
+/* TODO: to handle forwarding */
+   class_port_info = (void *) in_sa_mad->data;
+   *is_mask_match_supported =
+   !!(ntohs(class_port_info->cap_mask) &
+  SM_SUPPORTS_QUERY_OF_PART_OF_CAP_MASK_BIT_MASK);
+
+   return 0;
+}
+
+int get_node_info(int fd, uint32_t agent[2], uint16_t dlid, uint64_t *n_guid)
+{
+   struct ib_user_mad  out_mad, in_mad;
+   struct srp_dm_rmpp_sa_mad  *out_sa_mad, *in_sa_mad;
+   struct srp_dm_mad  *in_dm_mad;
+   struct srp_sa_node_rec *node_info;
+
+   in_sa_mad  = (void *) in_mad.data;
+   in_dm_mad  = (void *) in_mad.data;
+   out_sa_mad = (void *) out_mad.data;
+
+   init_srp_dm_mad(&out_mad, agent[1], sm_lid, SRP_SA_ATTR_NODE, 0);
+
+   out_sa_mad->mgmt_class= SRP_MGMT_CLASS_SA;
+   out_sa_mad->class_version = 2;
+   out_sa_mad->comp_mask = htonll((uint64_t)1); /* LID */
+   node_info = (void *) out_sa_mad->data;
+   node_info->lid= htons(dlid);
+
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
+   return -1;
+
+   node_info = (void *) in_sa_mad->data;
+   *n_guid   = node_info->port_guid;
+
+   return 0;
+}
+
 static int get_port_list(int fd, uint32_t agent[2])
 {
uint8_t in_mad_space[SIZE_OF_QUERY_RESPONSE];
@@ -686,19 +747,11 @@ static int get_port_list(int fd, uint32_
struct srp_dm_rmpp_sa_mad  *out_sa_mad, *in_sa_mad;
struct srp_sa_node_rec *node;
ssize_t len;
-   char val[64];
int size;
int i;
uint64_t subnet_prefix;
int isdm;
 
-   if (read_file(port_sysfs_path, "sm_lid", val, sizeof val) < 0) {
-   pr_err("Couldn't read SM LID\n");
-   return -1;
-   }
-
-   sm_lid = strtol(val, NULL, 0);
-
in_sa_mad  = (void *) in_mad->data;
out_sa_mad = (void *) out_mad.data;
 
@@ -762,6 +815,57 @@ static int get_existing_targets()
return 0;
 }
 
+int get_port_list_new(int fd, uint32_t agent[2])
+{
+   uint8_t in_mad_space[SIZE_OF_QUERY_RESPONSE];
+   struct ib_user_mad  out_mad, *in_mad=(void *) in_mad_space;
+   struct srp_dm_rmpp_sa_mad   *out_sa_mad, *in_sa_mad;
+   struct srp_sa_port_info_rec *port_info;
+   ssize_t len;
+   int size;
+   int i;
+   uint64_t subnet_prefix;
+   uint16_t lid;
+   uint64_t guid;
+
+   in_sa_mad  = (void *) in_mad->data;
+   out_sa_mad = (void *) out_mad.data;
+
+   init_srp_dm_mad(&out_mad, agent[1], sm_lid, SRP_SA_ATTR_PORT_INFO,
+   TEST_ONLY_SET_BIT_BIT_MASK);
+
+   out_sa_mad->mgmt_class = SRP_MGMT_CLASS_SA;
+   out_sa_mad->method = SRP_SA_METHOD_GET_TABLE;
+   out_sa_mad->class_version  = 2;
+   out_sa_mad-

[openib-general] [PATCH 11/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Add -l option to ibsrpdm.
This option activates a daemon that queries for the targets in a loop and
tells ib_srp about new target that appears and old target that disappears.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-21 
04:47:54.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-21 
05:16:11.0 +0300
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ib_user_mad.h"
 #include "srp-dm.h"
@@ -49,17 +50,39 @@ static uint32_t tid = 1;
 
 static int cmd = 0;
 static int verbose = 0;
+static int loop= 0;
+static int add_target_fd;
+static int remove_target_fd;
+static char *list_targets_path;
+
+#define pr_log(arg...) \
+   do {\
+   if (verbose) {  \
+   if (loop)   \
+   syslog(LOG_WARNING, arg);   \
+   else if (!cmd)  \
+   printf(arg);\
+   }   \
+   } while (0)
 
 #define pr_human(arg...)   \
do {\
-   if (!cmd)   \
+   if (!cmd && !loop)  \
printf(arg);\
} while (0)
 
 #define pr_cmd(arg...) \
do {\
-   if (cmd)\
-   printf(arg);\
+   if (cmd && !loop)   \
+   printf(arg);\
+   } while (0)
+
+#define pr_err(arg...) \
+   do {\
+   if (loop)   \
+   syslog(LOG_WARNING, arg);   \
+   else\
+   fprintf(stderr, arg);   \
} while (0)
 
 #if __BYTE_ORDER == __LITTLE_ENDIAN
@@ -78,11 +101,106 @@ static inline uint64_t htonll(uint64_t x
 
 #define N_COMP_MASK_LID htonll(1);
 
+#define INITIAL_SIZE_OF_TARGET_TABLE 10
+
+#define SLEEP_TIME 60
+
+static int size_of_target_table = INITIAL_SIZE_OF_TARGET_TABLE;
+
+/* Implementaion of target set in an array.
+*  Assumption: there will be small number of targets
+*  TODO: If this assumption does not hold,
+*change the implemantaion to a hash or a tree
+*/
+
+typedef struct {
+   char **array;
+   unsigned int next_index;
+   unsigned int size;
+} targets_set;
+
+static int create_set(targets_set *set)
+{
+   set->next_index = 0;
+   set->size = size_of_target_table;
+   set->array = calloc(set->size, sizeof(char *));
+   if (set->array == NULL) {
+   perror("calloc:");
+   return -1;
+   }
+
+   return 0;
+}
+
+static int add_to_set(targets_set *set, char *target_info)
+{
+   if (set->next_index == set->size) {
+   if (set->size == size_of_target_table)
+   size_of_target_table *= 2;
+   set->size = size_of_target_table;
+   set->array = realloc(set->array, set->size * sizeof(char *));
+   if (set->array == NULL) {
+   pr_err("realloc: %s\n", strerror(errno));
+   return -1;
+   }
+   }
+   set->array[set->next_index] = strdup(target_info);
+   if (set->array[set->next_index] == NULL) {
+   pr_err("strdup: %s\n", strerror(errno));
+   return -1;
+   }
+   ++set->next_index;
+
+   return 0;
+}
+
+static int remove_from_set(targets_set *set, char *target_info)
+{
+   int i;
+
+   for (i = 0; i < set->next_index; ++i)
+   if (!strcmp(set->array[i], target_info)) {
+   free(set->array[i]);
+   set->array[i] = set->array[set->next_index];
+   --set->next_index;
+   return 0;
+   }
+
+   return -1;
+}
+
+static void empty_set(targets_set *set)
+{
+   int i;
+
+   for (i = 0; i < set->next_index; ++i)
+   free(set->array[i]);
+   set->next_index = 0;
+}
+
+static void destroy_set(targets_set *set)
+{
+   int i;
+
+   empty_set(set);
+   free(set->array);
+}
+
+/* for_each_entry_in_set(char *tar

[openib-general] [PATCH 10/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Add a function send_and_get that handles the communication and retries.
Reduce redundancy.
Increment TID on retry.
Bound the number of retries.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-21 
01:35:10.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-21 
01:41:28.0 +0300
@@ -85,6 +85,36 @@ static void usage(const char *argv0)
fprintf(stderr, "Usage: %s [-vc] [-d ]\n", argv0);
 }
 
+#define NUM_OF_RETRIES 3
+int send_and_get(int fd, struct ib_user_mad *out_mad,
+struct ib_user_mad *in_mad, long in_mad_size)
+{
+   int i, len, in_mad_real_size;
+   struct srp_dm_mad *out_dm_mad;
+
+   in_mad_real_size = (in_mad_size ? in_mad_size : sizeof(struct 
ib_user_mad));
+   for (i = 0; i < NUM_OF_RETRIES; ++i)
+   {
+   len = write(fd, out_mad, sizeof(struct ib_user_mad));
+  if (len != sizeof(struct ib_user_mad)) {
+   fprintf(stderr, "write: %s\n", strerror(errno));
+   return -1;
+   }
+
+   len = read(fd, in_mad, in_mad_real_size);
+   if ((in_mad_size == 0 && len == in_mad_real_size) ||
+   (in_mad_size != 0 && len > 0))
+   return len;
+   else if (in_mad->hdr.status != ETIMEDOUT) {
+   fprintf(stderr, "%s/%d: read: %s\n", __func__, 
__LINE__, strerror(errno));
+   return -1;
+   }
+   out_dm_mad = (void *) out_mad->data;
+   ((uint32_t *) &out_dm_mad->tid)[1] = tid++;
+   }
+   return -1;
+}
+
 static int read_file(const char *dir, const char *file, char *buf, size_t size)
 {
char *path;
@@ -234,19 +264,8 @@ static int set_class_port_info(int fd, u
((uint16_t *) cpi->trap_gid)[i] = htons(strtol(val + i * 5, 
NULL, 16));
}
 
-again:
-   if (write(fd, &out_mad, sizeof out_mad) != sizeof out_mad) {
-   perror("write");
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
return -1;
-   }
-
-   if (read(fd, &in_mad, sizeof in_mad) != sizeof in_mad) {
-   if (in_mad.hdr.status == ETIMEDOUT)
-   goto again;
-   fprintf(stderr, "%s/%d: ", __func__, __LINE__);
-   perror("read");
-   return -1;
-   }
 
in_dm_mad = (void *) in_mad.data;
if (in_dm_mad->status) {
@@ -266,19 +285,8 @@ static int get_iou_info(int fd, uint32_t
 
init_srp_dm_mad(&out_mad, agent[1], dlid, SRP_DM_ATTR_IO_UNIT_INFO, 0);
 
-again:
-   if (write(fd, &out_mad, sizeof out_mad) != sizeof out_mad) {
-   perror("write");
-   return -1;
-   }
-
-   if (read(fd, &in_mad, sizeof in_mad) != sizeof in_mad) {
-   if (in_mad.hdr.status == ETIMEDOUT)
-   goto again;
-   fprintf(stderr, "%s/%d: ", __func__, __LINE__);
-   perror("read");
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
return -1;
-   }
 
in_dm_mad = (void *) in_mad.data;
if (in_dm_mad->status) {
@@ -300,19 +308,8 @@ static int get_ioc_prof(int fd, uint32_t
 
init_srp_dm_mad(&out_mad, agent[1], dlid, 
SRP_DM_ATTR_IO_CONTROLLER_PROFILE, ioc);
 
-again:
-   if (write(fd, &out_mad, sizeof out_mad) != sizeof out_mad) {
-   perror("write");
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
return -1;
-   }
-
-   if (read(fd, &in_mad, sizeof in_mad) != sizeof in_mad) {
-   if (in_mad.hdr.status == ETIMEDOUT)
-   goto again;
-   fprintf(stderr, "%s/%d: ", __func__, __LINE__);
-   perror("read");
-   return -1;
-   }
 
if (in_mad.hdr.status != 0) {
fprintf(stderr, "IO Controller Profile query timed out\n");
@@ -340,19 +337,8 @@ static int get_svc_entries(int fd, uint3
init_srp_dm_mad(&out_mad, agent[1], dlid, SRP_DM_ATTR_SERVICE_ENTRIES,
(ioc << 16) | (end << 8) | start);
 
-again:
-   if (write(fd, &out_mad, sizeof out_mad) != sizeof out_mad) {
-   perror("write");
+   if (send_and_get(fd, &out_mad, &in_mad, 0) < 0)
return -1;
-   }
-
-   if (read(fd, &in_mad, sizeof in_mad) != sizeof in_mad) {
-   if (in_mad.hdr.status == ETIMEDOUT)
-   goto again;
- 

[openib-general] [PATCH 09/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

alloca man page on my system says:
The alloca()  function is machine and compiler dependent. 
On many systems its implementation is buggy. Its use is discouraged.
Lets not use it.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-16 
13:09:07.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-16 
13:11:17.0 +0300
@@ -510,7 +510,8 @@ again:
 
 int get_port_list(int fd, uint32_t agent[2])
 {
-   struct ib_user_mad  out_mad, *in_mad;
+   uint8_t in_mad_space[SIZE_OF_QUERY_RESPONSE];
+   struct ib_user_mad  out_mad, *in_mad=(void *) in_mad_space;
struct srp_dm_rmpp_sa_mad  *out_sa_mad, *in_sa_mad;
struct srp_sa_node_rec *node;
ssize_t len;
@@ -521,8 +522,6 @@ int get_port_list(int fd, uint32_t agent
uint64_t subnet_prefix;
int isdm;
 
-   in_mad= alloca(SIZE_OF_QUERY_RESPONSE);
-
in_sa_mad  = (void *) in_mad->data;
out_sa_mad = (void *) out_mad.data;
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 08/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Use constants for bits in masks. Improves readability.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-21 
01:18:55.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-21 
03:41:39.0 +0300
@@ -70,6 +70,14 @@ static inline uint64_t ntohll(uint64_t x
 static inline uint64_t htonll(uint64_t x) { return x; }
 #endif
 
+#define IS_DM_MASK (1 << 19)
+
+#define SIZE_OF_QUERY_RESPONSE (1 << 18)
+
+#define N_COMP_MASK_NODE_TYPE htonll(1 << 4);
+
+#define N_COMP_MASK_LID htonll(1);
+
 static char *sysfs_path = "/sys";
 
 static void usage(const char *argv0)
@@ -474,7 +482,7 @@ static int get_port_info(int fd, uint32_
 
out_sa_mad->mgmt_class= SRP_MGMT_CLASS_SA;
out_sa_mad->class_version = 2;
-   out_sa_mad->comp_mask = htonll(1); /* LID */
+   out_sa_mad->comp_mask = N_COMP_MASK_LID;
port_info = (void *) out_sa_mad->data;
port_info->endport_lid= htons(dlid);
 
@@ -495,7 +503,7 @@ again:
 
port_info = (void *) in_sa_mad->data;
*subnet_prefix = ntohll(port_info->subnet_prefix);
-   *isdm  = !!(ntohl(port_info->capability_mask) & (1 << 19));
+   *isdm  = !!(ntohl(port_info->capability_mask) & IS_DM_MASK);
 
return 0;
 }
@@ -519,7 +527,7 @@ static int get_port_list(int fd, uint32_
 
sm_lid = strtol(val, NULL, 0);
 
-   in_mad= alloca(1 << 18);
+   in_mad= alloca(SIZE_OF_QUERY_RESPONSE);
 
in_sa_mad  = (void *) in_mad->data;
out_sa_mad = (void *) out_mad.data;
@@ -529,7 +537,7 @@ static int get_port_list(int fd, uint32_
out_sa_mad->mgmt_class= SRP_MGMT_CLASS_SA;
out_sa_mad->method= SRP_SA_METHOD_GET_TABLE;
out_sa_mad->class_version = 2;
-   out_sa_mad->comp_mask = htonll(1ul << 4); /* node type */
+   out_sa_mad->comp_mask = N_COMP_MASK_NODE_TYPE;
out_sa_mad->rmpp_version  = 1;
out_sa_mad->rmpp_type = 1;
node  = (void *) out_sa_mad->data;
@@ -541,7 +549,7 @@ again:
return -1;
}
 
-   len = read(fd, in_mad, 1 << 18);
+   len = read(fd, in_mad, SIZE_OF_QUERY_RESPONSE);
    if (len < 0) {
fprintf(stderr, "%s/%d: ", __func__, __LINE__);
perror("read");
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 07/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Remove trailing spaces and arranging tabs.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/src/userspace/srptools/src/srp-dm.c
===
--- last_stable.orig/src/userspace/srptools/src/srp-dm.c2006-04-21 
03:54:05.0 +0300
+++ last_stable/src/userspace/srptools/src/srp-dm.c 2006-04-21 
04:20:43.0 +0300
@@ -102,7 +102,6 @@ static int read_file(const char *dir, co
return len;
 }
 
-
 static int setup_port_sysfs_path(void) {
char *env;
char class_dev_path[256];
@@ -135,7 +134,7 @@ static int setup_port_sysfs_path(void) {
fprintf(stderr, "Couldn't read ibdev attribute\n");
return -1;
}
-   
+
if (read_file(class_dev_path, "port", ibport, sizeof ibport) < 0) {
fprintf(stderr, "Couldn't read port attribute\n");
return -1;
@@ -385,7 +384,7 @@ static int do_port(int fd, uint32_t agen
pr_human("change ID:   %04x\n", ntohs(iou_info.change_id));
pr_human("max controllers: 0x%02x\n", iou_info.max_controllers);
 
-   if (verbose > 0) 
+   if (verbose > 0)
for (i = 0; i < iou_info.max_controllers; ++i) {
pr_human("controller[%3d]: ", i + 1);
switch ((iou_info.controller_list[i / 2] >>
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 06/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Support a display of list of target from user level.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-21 
01:13:04.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-21 
03:56:05.0 +0300
@@ -1730,6 +1730,63 @@ end:
 
 static CLASS_DEVICE_ATTR(remove_target, S_IWUSR, NULL, srp_remove_target);
 
+#define TARGET_INFO_BUF_SIZE 126
+
+static ssize_t list_targets(struct class_device *class_dev, char *buf)
+{
+   struct srp_host *host =
+   container_of(class_dev, struct srp_host, class_dev);
+   struct srp_target_port *target;
+   int printed=0, ret;
+
+   mutex_lock(&host->target_mutex);
+   list_for_each_entry(target, &host->target_list, list)
+   if (target->state == SRP_TARGET_LIVE) {
+   ret = sprintf(buf+printed,
+   "id_ext=%016llx,ioc_guid=%016llx,"
+   "dgid=%04x%04x%04x%04x%04x%04x%04x%04x,"
+   "pkey=%04x,service_id=%016llx\n",
+   (unsigned long long)
+   be64_to_cpu(target->id_ext),
+   (unsigned long long)
+   be64_to_cpu(target->ioc_guid),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[0]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[2]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[4]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[6]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[8]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[10]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[12]),
+   (int) be16_to_cpu(*(__be16 *)
+   &target->path.dgid.raw[14]),
+   be16_to_cpu(target->path.pkey),
+   (unsigned long long)
+   be64_to_cpu(target->service_id));
+   if (ret <= 0)
+   goto end;
+
+   printed += ret;
+
+   if (printed + TARGET_INFO_BUF_SIZE > PAGE_SIZE - 1)
+   break;
+   }
+
+   ret = printed;
+
+end:
+   mutex_unlock(&host->target_mutex);
+   return ret;
+}
+
+static CLASS_DEVICE_ATTR(list_targets, S_IRUGO, list_targets, NULL);
+
 static ssize_t show_ibdev(struct class_device *class_dev, char *buf)
 {
struct srp_host *host =
@@ -1789,6 +1846,8 @@ static struct srp_host *srp_add_port(str
goto err_class;
if (class_device_create_file(&host->class_dev, 
&class_device_attr_remove_target))
goto err_class;
+   if (class_device_create_file(&host->class_dev, 
&class_device_attr_list_targets))
+   goto err_class;
if (class_device_create_file(&host->class_dev, 
&class_device_attr_ibdev))
goto err_class;
if (class_device_create_file(&host->class_dev, &class_device_attr_port))
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 05/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Support a remove of a target from user level.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-05-01 
12:30:01.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-05-01 
12:36:22.0 +0300
@@ -960,10 +960,12 @@ static int srp_queuecommand(struct scsi_
long req_index;
int len;
 
-   if (target->state == SRP_TARGET_CONNECTING)
+   if (target->state == SRP_TARGET_CONNECTING ||
+   target->state == SRP_TARGET_RECONNECTING)
goto err;
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED) {
scmnd->result = DID_BAD_TARGET << 16;
done(scmnd);
@@ -1254,6 +1256,7 @@ static int srp_send_tsk_mgmt(struct scsi
spin_lock_irq(target->scsi_host->host_lock);
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED) {
scmnd->result = DID_BAD_TARGET << 16;
goto out;
@@ -1359,6 +1362,7 @@ static ssize_t show_ioc_guid(struct clas
struct srp_target_port *target = host_to_target(class_to_shost(cdev));
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED)
return -ENODEV;
 
@@ -1371,6 +1375,7 @@ static ssize_t show_service_id(struct cl
struct srp_target_port *target = host_to_target(class_to_shost(cdev));
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED)
return -ENODEV;
 
@@ -1383,6 +1388,7 @@ static ssize_t show_pkey(struct class_de
struct srp_target_port *target = host_to_target(class_to_shost(cdev));
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED)
return -ENODEV;
 
@@ -1394,6 +1400,8 @@ static ssize_t show_dgid(struct class_de
struct srp_target_port *target = host_to_target(class_to_shost(cdev));
 
if (target->state == SRP_TARGET_DEAD ||
+   target->state == SRP_TARGET_DISCONNECTED ||
+   target->state == SRP_TARGET_DISCONNECTED ||
target->state == SRP_TARGET_REMOVED)
return -ENODEV;
 
@@ -1447,11 +1455,11 @@ static int srp_add_target(struct srp_hos
if (scsi_add_host(target->scsi_host, host->dev->dev->dma_device))
return -ENODEV;
 
-   mutex_lock(&host->target_mutex);
list_add_tail(&target->list, &host->target_list);
-   mutex_unlock(&host->target_mutex);
 
+   spin_lock_irq(target->scsi_host->host_lock);
target->state = SRP_TARGET_LIVE;
+   spin_unlock_irq(target->scsi_host->host_lock);
 
/* XXX: are we supposed to have a definition of SCAN_WILD_CARD ?? */
scsi_scan_target(&target->scsi_host->shost_gendev,
@@ -1642,7 +1650,6 @@ static ssize_t srp_create_target(struct 
 {
struct srp_host *host =
container_of(class_dev, struct srp_host, class_dev);
-   struct Scsi_Host *target_host;
struct srp_target_port *target, *existing_target = NULL;
int ret;
int i;
@@ -1663,6 +1670,7 @@ static ssize_t srp_create_target(struct 
   buf);
ret = -EEXIST;
break;
+   case SRP_TARGET_RECONNECTING:
case SRP_TARGET_CONNECTING:
/* It is in the middle of reconnecting */
ret = -EALREADY;
@@ -1671,6 +1679,10 @@ static ssize_t srp_create_target(struct 
/* It will be removed soon - create a new one */
case SRP_TARGET_REMOVED:
/* target is dead, create a new one */
+   existing_target = NULL;
+   break;
+   case SRP_TARGET_DISCONNECTED:
+   existing_target->state = SRP_TARGET_RECONNECTING;
break;
}
spin_unlock_irq(existing_target->scsi_host->host_lock);
@@ -1678,26 +1690,30 @@ static ssize_t srp_create_target(struct 
goto unlock_mutex;
}
 
-   /* really create the target */
-   target_host = scsi_host_alloc(&srp_template,
- sizeof (struct srp_target_port));
-   if (!target_h

[openib-general] [PATCH 04/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Do not add the same target twice.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-25 
15:17:34.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-25 
15:19:37.0 +0300
@@ -1478,7 +1478,8 @@ static int srp_parse_options(const char 
printk(KERN_WARNING PFX "bad max sect parameter 
'%s'\n", p);
goto out;
}
-   target->scsi_host->max_sectors = token;
+   if (target->scsi_host != NULL)
+   target->scsi_host->max_sectors = token;
break;
 
default:
@@ -1503,20 +1504,89 @@ out:
return ret;
 }
 
+/* srp_find_target - If the target exists return it in target,
+ otherwise target is set to NULL.
+ host->target_mutex should be hold */
+static int srp_find_target(const char *buf, struct srp_host *host,
+  struct srp_target_port **target)
+{
+   struct srp_target_port *target_to_find, *curr_target;
+   int ret, i;
+
+   target_to_find = kzalloc(sizeof *target_to_find, GFP_KERNEL);
+   ret = srp_parse_options(buf, target_to_find);
+   if (ret)
+   goto free;
+
+   list_for_each_entry(curr_target, &host->target_list, list)
+   if (target_to_find->ioc_guid == curr_target->ioc_guid &&
+   target_to_find->id_ext == curr_target->id_ext &&
+   target_to_find->path.pkey == curr_target->path.pkey &&
+   target_to_find->service_id == curr_target->service_id) {
+   for (i = 0; i < 16; ++i)
+   if (target_to_find->path.dgid.raw[i] != 
curr_target->path.dgid.raw[i])
+   break;
+   if (i == 16) {
+   *target = curr_target;
+   goto free;
+   }
+   }
+
+   *target = NULL;
+
+free:
+   kfree(target_to_find);
+   return 0;
+}
+
 static ssize_t srp_create_target(struct class_device *class_dev,
 const char *buf, size_t count)
 {
struct srp_host *host =
container_of(class_dev, struct srp_host, class_dev);
struct Scsi_Host *target_host;
-   struct srp_target_port *target;
+   struct srp_target_port *target, *existing_target = NULL;
int ret;
int i;
 
+   /* first check if the target already exists */
+
+   mutex_lock(&host->target_mutex);
+   ret = srp_find_target(buf, host, &existing_target);
+   if (ret)
+   goto unlock_mutex;
+
+   if (existing_target) {
+   /* target already exists */
+   spin_lock_irq(existing_target->scsi_host->host_lock);
+   switch (existing_target->state) {
+   case SRP_TARGET_LIVE:
+   printk(KERN_WARNING PFX "target %s already exists\n",
+  buf);
+   ret = -EEXIST;
+   break;
+   case SRP_TARGET_CONNECTING:
+   /* It is in the middle of reconnecting */
+   ret = -EALREADY;
+   break;
+   case SRP_TARGET_DEAD:
+   /* It will be removed soon - create a new one */
+   case SRP_TARGET_REMOVED:
+   /* target is dead, create a new one */
+   break;
+   }
+   spin_unlock_irq(existing_target->scsi_host->host_lock);
+   if (ret)
+   goto unlock_mutex;
+   }
+
+   /* really create the target */
target_host = scsi_host_alloc(&srp_template,
  sizeof (struct srp_target_port));
-   if (!target_host)
-   return -ENOMEM;
+   if (!target_host) {
+   ret = -ENOMEM;
+   goto unlock_mutex;
+   }
 
target_host->max_lun = SRP_MAX_LUN;
 
@@ -1533,7 +1603,7 @@ static ssize_t srp_create_target(struct 
 
ret = srp_parse_options(buf, target);
if (ret)
-   goto err;
+   goto err_put_scsi_host;
 
ib_get_cached_gid(host->dev, host->port, 0, &target->path.sgid);
 
@@ -1554,7 +1624,7 @@ static ssize_t srp_create_target(struct 
 
ret = srp_create_target_ib(target);
if (ret)
-   goto err;
+   goto err_put_scsi_host;
 
target->cm_id = 

[openib-general] [PATCH 03/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

It is nicer to perform the init_work just before the call to schedule_work.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-17 
10:57:59.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-18 
02:26:29.0 +0300
@@ -828,8 +829,10 @@ static void srp_completion(struct ib_cq 
   wc.wr_id & SRP_OP_RECV ? "receive" : "send",
   wc.status);
spin_lock_irqsave(target->scsi_host->host_lock, flags);
-   if (target->state == SRP_TARGET_LIVE)
+   if (target->state == SRP_TARGET_LIVE) {
+   INIT_WORK(&target->work, srp_reconnect_work, 
target);
schedule_work(&target->work);
+   }
spin_unlock_irqrestore(target->scsi_host->host_lock, 
flags);
break;
}
@@ -1601,8 +1684,6 @@ static ssize_t srp_create_target(struct 
target->scsi_host  = target_host;
target->srp_host   = host;
 
-   INIT_WORK(&target->work, srp_reconnect_work, target);
-
for (i = 0; i < SRP_SQ_SIZE - 1; ++i)
target->req_ring[i].next = i + 1;
target->req_ring[SRP_SQ_SIZE - 1].next = -1;
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCHE 02/12] SRP: changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Move the destruction of the host and the removal from a list to a function.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-23 
14:08:03.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-24 
10:47:00.0 +0300
@@ -344,6 +344,16 @@ static void srp_disconnect_target(struct
wait_for_completion(&target->done);
 }
 
+static void destruct_scsi_host_and_target(struct srp_target_port *target, int 
disconnect_target)
+{
+   scsi_remove_host(target->scsi_host);
+   if (disconnect_target)
+   srp_disconnect_target(target);
+   ib_destroy_cm_id(target->cm_id);
+   srp_free_target_ib(target);
+   scsi_host_put(target->scsi_host);
+}
+
 static void srp_remove_work(void *target_ptr)
 {
struct srp_target_port *target = target_ptr;
@@ -357,10 +374,7 @@ static void srp_remove_work(void *target
list_del(&target->list);
mutex_unlock(&target->srp_host->target_mutex);
 
-   scsi_remove_host(target->scsi_host);
-   ib_destroy_cm_id(target->cm_id);
-   srp_free_target_ib(target);
-   scsi_host_put(target->scsi_host);
+   destruct_scsi_host_and_target(target, 0);
/* And another put to really free the target port... */
scsi_host_put(target->scsi_host);
 }
@@ -1734,13 +1746,8 @@ static void srp_remove_one(struct ib_dev
flush_scheduled_work();
 
list_for_each_entry_safe(target, tmp_target,
-&host->target_list, list) {
-   scsi_remove_host(target->scsi_host);
-   srp_disconnect_target(target);
-   ib_destroy_cm_id(target->cm_id);
-   srp_free_target_ib(target);
-   scsi_host_put(target->scsi_host);
-   }
+&host->target_list, list)
+   destruct_scsi_host_and_target(target, 1);
 
    ib_dereg_mr(host->mr);
ib_dealloc_pd(host->pd);
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCHE 01/12] SRP: changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz

Remove a redundant if

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-17 
10:06:19.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-17 
10:11:24.0 +0300
@@ -1798,8 +1798,7 @@ static void srp_remove_one(struct ib_dev
list_for_each_entry_safe(target, tmp_target,
 &host->target_list, list) {
spin_lock_irqsave(target->scsi_host->host_lock, flags);
-   if (target->state != SRP_TARGET_REMOVED)
-   target->state = SRP_TARGET_REMOVED;
+   target->state = SRP_TARGET_REMOVED;
spin_unlock_irqrestore(target->scsi_host->host_lock, 
flags);
}
mutex_unlock(&host->target_mutex);
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH 00/12] SRP: Changing ibsrpdm

2006-05-01 Thread Ishai Rabinovitz
Hi,

I'm going to send 12 patches. 6 patches for the kernel, and 6 for the 
userspace ibsrpdm.
The kernel patches avoid adding the same target twice, allow the removal of 
a target, and add a query about the connected targets.

The userspace patches change ibsrpdm to a real daemon - that runs all the time 
and updates the kernel with the visible targets.

Some of the kernel patches should be applied after Vu patches for fmr. 
The functionality of the changes can work without Vu's patches, but they are
changing code in the same functions, so there may be some simple conflicts 
when applied without Vu's patches.
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH] SRP: fix crash in srp_process_rsp

2006-04-26 Thread Ishai Rabinovitz
Hi

srp_process_rsp crashes on NULL pointer dereference.
The following fixes the crash.
Is this a correct fix?

---

Avoiding dereference of a null pointer.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c2006-04-26 
15:38:23.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c 2006-04-26 
17:45:22.0 +0300
@@ -655,9 +655,11 @@ static void srp_process_rsp(struct srp_t
complete(&req->done);
} else {
scmnd = req->scmnd;
-   if (!scmnd)
+   if (!scmnd) {
printk(KERN_ERR "Null scmnd for RSP w/tag %016llx\n",
   (unsigned long long) rsp->tag);
+   goto unlock;
+   }
scmnd->result = rsp->status;
 
if (rsp->flags & SRP_RSP_FLAG_SNSVALID) {
@@ -683,7 +685,7 @@ static void srp_process_rsp(struct srp_t
} else
req->cmd_done = 1;
}
-
+unlock:
spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 }
 
-- 
Ishai Rabinovitz
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] FW: [openfabrics-ewg] Changes in SM for the new SRP daemon

2006-04-06 Thread Ishai Rabinovitz



Yes it 
should.
 
Ishai


From: Woodruff, Robert J 
[mailto:[EMAIL PROTECTED] Sent: Thursday, April 06, 2006 
1:31 AMTo: Ishai Rabinovitz; 
[EMAIL PROTECTED]Subject: RE: [openfabrics-ewg] Changes in 
SM for the new SRP daemon

Shouldn't this discussion be on openib-general 
?


From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Ishai 
RabinovitzSent: Wednesday, April 05, 2006 2:14 PMTo: 
[EMAIL PROTECTED]Subject: [openfabrics-ewg] Changes in SM 
for the new SRP daemon

 
Summary:


  In the next release of SRP we want to add a 
  daemon that is executed in each initiator and finds out which targets exist in 
  the fabric.
  The new daemon can use SA capabilities 
  from IBTA 1.2 errata (details at the bottom) to improve performance. If 
  you are using SM other then openIB's openSM please support this feature in 
  your SM.
Details:

  When one wants to find all SRP targets and 
  their information in the fabric, he/she currently run "ibsrpdm""ibsrpdm" 
  uses the following procedure to discover all SRP targets available in the 
  fabric
  
"ibsrpdm" sends a query get_table 
node info with node type is CA. 
It 
gets quite a big table and then for every node it sends a port_info query. 
From the response to this query the 
initiator can check if this port is an SRP target. (dm bit capability is set)
  The problem with this procedure is that it may create too much traffic on the fabric.
  Let's assume there is a cluster of 4096 nodes booting together. Each of this 4096 nodes is 
  sending the first query and gets a list of 4096 nodes. This list is divided 
  into a long number of UD messages and may cause retransmit. After getting this 
  list each node sends 4096 queries for the port of each node.
  This traffic is quite huge.
  The SA has a new capability to answer the query: 
  "please return a list of the ports that has the dm bit set" (meaning return 
  ports of SRP targets). (This capability of the SA is part of Errata Release 
  Version: 1.2 1/26/2006 Chapter/Subsection: 15.2.5.3  - quoted at the bottom). 
  Using this capability we can use the 
  following procedure:
  
The daemon will send "get table port_info 
of ports that has dm bit set" query and gets a table of small number of 
port_info.  
For each port It queries for 
the guid of this ports.
  This will significantly reduce the traffic on the fabric. 
  Actually, in this solution there is so little 
  traffic that the new daemon will run it 
  periodically (every minute) to look for changes (There will be 
  less traffic than registering 
  to Trap 64 and Trap 65.)
   
Quoting the 
errata:

  
  Errata Tracking Number: MGTWG8372 
  Sub-Case Number: 0 
  Reference ID: 4291 
  Title: Enhanced SA PortInfoRecord searches 
  Submitter: Livingston, James ([EMAIL PROTECTED]) 
  
  Volume: 1 
  Revision: 1.2 
  Errata Release Version: 1.2 1/26/2006 
  Chapter/Subsection: 15 
  Page: 885 
  Line: 20 
  AssignedIntensity: 
  Status/Disposition: WG_Approved 
  Problem Description: Add a new row to Table 186 SA-Specific 
  
  ClassPortInfo:CapabilityMask Bits: 
  Problem Resolution: Add a new row to Table 186 SA-Specific 
  
  ClassPortInfo:CapabilityMask Bits 
  Original Text:  
  Corrected Text:  
  IsPortInfoCapMaskMatchSupported 
   
  13 
   
  If this value is 1, SA shall support matching the 
  PortInfo:CapabilityMask component as described in 
  15.2.5.3>. 
  Comment History: Dec 14, 2005 08:11:26 PM Old: New:Pending 
  By:Benner, Alan 
  ([EMAIL PROTECTED]) 
  
  Errata Tracking Number: MGTWG8372 
  Sub-Case Number: 1 
  Reference ID: 4292 
  Title: Enhanced SA PortInfoRecord searches 
  Submitter: Livingston, James ([EMAIL PROTECTED]) 
  
  Volume: 1 
  Revision: 1.2 
  Errata Release Version: 1.2 1/26/2006 
  Chapter/Subsection: 15.2.5.3 
  Page: 891 
  Line: 36 
  AssignedIntensity: 
  Status/Disposition: WG_Approved 
  Problem Description: Add optional compliance statement for new 
  capability. 
  Problem Resolution: Make the proposed change to the spec text. 
  
  Original Text:  
  Corrected Text: o15-0.x.y: If SA's 
  ClassPortInfo:CapabilityMask.IsPortInfoCapMaskMatchSupported 
  is 1, 
  then the AttributeModifier of the SubnAdmGet() and 
  SubnAdmGetTable() 
  methods affects the matching behavior on the 
  PortInfo:CapabilityMask 
  component. If the high-order bit (bit 31) of the 
  AttributeModifier 
  is set to 1, matching on the CapabilityMask component will not 
  be an 
  exact bitwise match as described in . 
  Instead, 
  matching will only be performed on those bits which are set to 
  1 in 
  the PortInfo:CapabilityMask embedded in the query. 
  
  In , bits in the 
  PortInfo:CapabilityMask embedded 
  in the query that are set to 0 are bitwi