[no subject]

2014-04-07 Thread Michael Reed
unsubscribe linux-scsi

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


panic in sg_add when class_device_create() fails

2008-01-14 Thread Michael Reed
We're seeing an occasional panic in sg_add() when class_device_create()
fails.  It's obvious in the code that it uses the pointer to sg_class_member
even though it's invalid.  We do see the "class_device_create failed" message.

class_set_devdata(cl_dev, sdp);
error = cdev_add(cdev, MKDEV(SCSI_GENERIC_MAJOR, sdp->index), 1);
if (error)
goto cdev_add_err;

if (sg_sysfs_valid) {
struct class_device * sg_class_member;

sg_class_member = class_device_create(sg_sysfs_class, NULL,
MKDEV(SCSI_GENERIC_MAJOR, sdp->index),
cl_dev->dev, "%s",
disk->disk_name);
if (IS_ERR(sg_class_member))
printk(KERN_WARNING "sg_add: "
"class_device_create failed\n");
class_set_devdata(sg_class_member, sdp);
  
error = sysfs_create_link(&scsidp->sdev_gendev.kobj,
  &sg_class_member->kobj, "generic");
if (error)
printk(KERN_ERR "sg_add: unable to make symlink "
"'generic' back to sg%d\n", sdp->index);
} else
printk(KERN_WARNING "sg_add: sg_sys Invalid\n");

I'm uncertain of the correct fix.  Perhaps it involves a call to cdev_unmap()? 
I don't have a good way to test a fix as this problem is not easily reproduced.

Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-10-08 Thread Michael Reed


David Miller wrote:
> From: James Bottomley <[EMAIL PROTECTED]>
> Date: Sat, 06 Oct 2007 10:04:55 -0500
> 
>> If you remember Rusty's guide to interfaces, this is a level 14 easy to
>> misuse interface: "The obvious use is wrong"; since the obvious use is
>> to put it in module parameters and have the problem go away (for
>> now ...).  Actually, I could be harsher and say it's level 17 "There's
>> no correct use" because statistically every time you use it, you expose
>> yourself to potential duplicate WWNs.
> 
> Every time you use GIT you expose yourself to potential duplicate SHAs
> which will corrupt the tree.
> 
> This argument borders on lunacy, give it up :-)

As long as two systems on the same SAN are quite unlikely to wind up with
the same WWN.  override_wwn=5001 comes to mind.  Two months
later, another admin does the same thing to another system.  As Jeff has
pointed out, there really should be some method in place that allows for
some randomness in the wwn.  (Other than fixing the broken hardware. :)
Handing the admins the rope doesn't mean we want to see it around their necks!

And, while I'm thinking about it, whatever implementation is chosen, doesn't
it need to allow for the targeting of a particular sas host adapter and
perhaps even a particular sas port on the adapter?

Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-10-03 Thread Michael Reed


Jeff Garzik wrote:
> Luben Tuikov wrote:
>> Do you mean:
>>   "The admin will have the option to SPECIFY(SET) a WWN, should they
>> sodesire."
>> OR do you mean:
>>   "The admin will have the option to HAVE THE KERNEL auto-generate a WWN,
>>should they so desire."
> 
> Both.  It is up to admin policy to decide if they wish to use the
> available board's WWN (if present/valid), manually apportion WWNs, or
> whether the kernel's generation algo is fine with them.

Could the reassignment of a board's WWN present any security issues?
A board which is allowed to access only some targets is now allowed to
access others.  I would assume that most sites that care about security
would have other measures in place to guard against this, but it's still
worth pondering.

Mike

> 
> Jeff
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-09-27 Thread Michael Reed


Michael Reed wrote:
> 
> Matthew Wilcox wrote:
>> On Thu, Sep 27, 2007 at 09:16:13AM -0500, [EMAIL PROTECTED] wrote:
>>>> Unfortunately, it looks like IEEE doesn't have any OID's registered for
>>>> Linux or other reserved areas
>>>> (http://standards.ieee.org/regauth/oui/oui.txt). However, it does look
>>>> like they go in order... so maybe if you used an OID of 0xFF you
>>>> could at least guarantee that you didn't conflict with any company's SAS
>>>> WWNs.
>> It's something that happens frequently enough that we should come up
>> with a proper way of handling this.  I heard a story of someone at HP
>> taking an old computer, reading the MAC address from the motherboard,
>> then snapping the board in half.  I suppose if you're going to use a MAC
>> address from a 10Mbit ethernet card for a SAS WWN, there's no chance of
>> conflict, but still ... 3com might choose to do the same thing, and then
>> we're in trouble.
> 
> Record the WWN of your SAS / FC port so that if/when it goes missing you
> can put it back?  Have spares on site?

Our vendor recommends fixing the board, not applying a band-aid.  A random
WWN will potentially not work as the board firmware does some level of
sanity checking.  YMMV.

Mike

> 
>> I don't have a good solution for WWN assignment.  Even if we get a
>> 24-bit OID assignment for 'software use' or something, how do we control
>> the use withi the SAN to be sure we get no overlapping WWNs?
> 
> Your point here is that overriding a WWN assignment is asking for trouble.
> 
> 
> Mike
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-09-27 Thread Michael Reed


Matthew Wilcox wrote:
> On Thu, Sep 27, 2007 at 09:16:13AM -0500, [EMAIL PROTECTED] wrote:
>> > Unfortunately, it looks like IEEE doesn't have any OID's registered for
>> > Linux or other reserved areas
>> > (http://standards.ieee.org/regauth/oui/oui.txt). However, it does look
>> > like they go in order... so maybe if you used an OID of 0xFF you
>> > could at least guarantee that you didn't conflict with any company's SAS
>> > WWNs.
> 
> It's something that happens frequently enough that we should come up
> with a proper way of handling this.  I heard a story of someone at HP
> taking an old computer, reading the MAC address from the motherboard,
> then snapping the board in half.  I suppose if you're going to use a MAC
> address from a 10Mbit ethernet card for a SAS WWN, there's no chance of
> conflict, but still ... 3com might choose to do the same thing, and then
> we're in trouble.

Record the WWN of your SAS / FC port so that if/when it goes missing you
can put it back?  Have spares on site?

> 
> I don't have a good solution for WWN assignment.  Even if we get a
> 24-bit OID assignment for 'software use' or something, how do we control
> the use withi the SAN to be sure we get no overlapping WWNs?

Your point here is that overriding a WWN assignment is asking for trouble.


Mike
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-09-27 Thread Michael Reed


Jeff Garzik wrote:
> James Smart wrote:
>> Uh, although this may work very well for small constrained configs, as one
>> who debugs larger environments (and things always grow or get connected in
>> ways you don't expect), depending on a random number for uniqueness makes
>> me very unsettled. Debugging that small-percentage potential, when/if it
>> does occur can be an absolute nightmare with all kinds of weird behavior.
>> Rarely will you solve it from the context of a single server. There are
>> lots of ways to cheat, but I believe they all have to be based on some
>> small part that is guaranteed to be unique (like the IEEE id's).
> 
> That's the ideal world, sure  :) 
> 
> But the reality is, just like with ethernet MAC addresses, you don't always 
> have one when you "supposed to" have one.
> 

I'm curious about the problem you're trying to solve.  Generally a missing WWN,
as having a unique WWN should be a product requirement, is an indicator of
a problem.  Shouldn't the proper solution be to have the card repaired?
Arbitrarily assigning a WWN could mask or introduce other problems, in 
particular
with regard to storage connectivity.

> There are obvious limitations and problems arising from a generated WWN -- 
> particularly for an HBA, where serial numbers and other unique identifiers 
> are scarce.  But that doesn't wish the missing-WWN problem away.  Decades of 
> networking experience have taught us a few things  :) 
> 
> As an aside, we also need a way to override the HBA's WWN, just like we do 
> with ethernet MAC address.

We do?  Perhaps if it's corrupt and you know the correct value.  But, in that 
case
shouldn't the card be replaced?  When would you need to override the default 
WWN?

Linux: it's not just for desktops anymore.  :)

Mike


> 
> Jeff
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: generating a Linux WWN?

2007-09-27 Thread Michael Reed


Jeff Garzik wrote:
> Is there an accepted way to generate a SAS address, when the adapter
> does not supply one (NVRAM invalid or missing, etc.)?
> 
> Unless somebody complains, I was just planning to use

I'm not complaining.  I'm circulating this for discussion with our
SAS hardware provider and internally.  Give us a little time to
mull this over.

Thanks,
 Mike


> get_random_bytes().  But maybe Linux has an IEEE id we can use, to make
> the practice a bit more legitimate and avoid stomping on others?  Or an
> IEEE id specifically designed for generated-address purposes?
> 
> Jeff
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] mpt fusion: error recovery improvements,andsynchronizing internal commands

2007-09-25 Thread Michael Reed

Jeff Garzik wrote:
> Moore, Eric wrote:
>> On Tuesday, September 25, 2007 11:32 AM,  James Bottomley  wrote:
>>  
 Youve rejected the error recovery patchs, which is fair enough.  
>>> Just the separation ... the actual patch looks OK.
>>>
>>
>> I'll will continue working to separate the "error recovery
>> improvements:" into smaller feature add, but will take some time.   I
>> want some constructive feedback, and too big of patch is deterring some
>> people from looking at it.
> 
> I think it's fair to break it up, as long as its clearly noted that the
> patch is for review only.

Any chance the changes, properly documented, could be accepted as
one big patch?  Breaking this code into smaller patches, when it's
really intended as a driver update, could prove problematic.

Mike


> 
> Jeff
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] stale residual returned on write following BUSY retry

2007-09-17 Thread Michael Reed
A BUSY status returned on a write request results in a stale residual
being returned when the write ultimately successfully completes.

This can be reproduced as follows:

1) issue immediate mode rewind to scsi tape drive
2) issue write request

The tape drive returns busy.  The low level driver detects underrun and
sets the residual into the scsi command.  The low level driver responds
with (DID_OK << 16) | scsi_status.  scsi_status is 8, hence
status_byte(result) == 4, i.e., BUSY.

scsi_softirq_done() calls scsi_decide_disposition() which returns
ADD_TO_MLQUEUE.  scsi_softirq_done() then calls scsi_queue_insert()
which, on the way to resubmitting the request to the driver, calls
scsi_init_cmd_errh().

The attached patch modifies scsi_init_cmd_errh() to clear the resid
field.  This prevents a "stale" residual from being returned when the
scsi command finally completes without a BUSY status.

This patch applies to 2.6.23-rc6-git7.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>


--- linux-2.6.23-rc6-git7.orig/drivers/scsi/scsi_lib.c	2007-09-17 14:02:03.0 -0700
+++ linux-2.6.23-rc6-git7/drivers/scsi/scsi_lib.c	2007-09-17 14:05:51.0 -0700
@@ -443,6 +443,7 @@ EXPORT_SYMBOL_GPL(scsi_execute_async);
 static void scsi_init_cmd_errh(struct scsi_cmnd *cmd)
 {
 	cmd->serial_number = 0;
+	cmd->resid = 0;
 	memset(cmd->sense_buffer, 0, sizeof cmd->sense_buffer);
 	if (cmd->cmd_len == 0)
 		cmd->cmd_len = COMMAND_SIZE(cmd->cmnd[0]);


Re: persistent scsi with udev

2007-02-08 Thread Michael Reed
While you didn't explicitly mention which distro you're using, some already
have udev in their initrd.  If this applies in your case, you should be able
to change the root= parameter passed to the kernel to specify your root device
via a persistent name.  For example:

root=/dev/disk/by-path/pci-:01:03.0-scsi-0:0:1:0p1

You can do the same in your /etc/fstab for the root device.

You can ALSO change the order in which modules are loaded via 
/etc/modprobe.conf,
or /etc/sysconfig/kernel, or something similar for your distro, so that the 
driver
for your root device is always loaded first.

I hope this helps.

Mike


James Smart wrote:
> You can see if a white paper we put together, which includes information
> on tapes, can help you. There's also a small section on "boot from SAN"
> which talks about configuring udev in initrd for boot devices. Although
> the content references distros, it should be applicable to the kernel
> generically.
> http://www.emulex.com/white/hba/wp_linux26udev.pdf
> 
> -- james s
> 
> 
> Andreas Moroder wrote:
>> Hello,
>>
>> we have a big problem with scsi device mapping.
>>
>> We have a LSI Megaraid controller and a qlogic HBA with a attached SAN
>>
>> When we boot the machine without the SAN then  the first disk on the
>> MEGARAID becomes \dev\sda
>>
>> When we boot with the SAN attached then the first disk in the san
>> becomes \dev\sda and the system stop.
>>
>> Is there a step by step tot availabe on how to configure the udev
>> files when two SCSI devices are installed in one machine ?
>>
>> Thanks
>> Andreas
>>
>> -
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to [EMAIL PROTECTED]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] Prevent infinite retries due to DID_RESET return status

2007-01-31 Thread Michael Reed
Limit lifetime of scsi commands receiving DID_RESET status.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>

Limit lifetime of scsi commands receiving DID_RESET status.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>

--- rg61u/include/scsi/scsi.h	2006-10-31 21:08:47.0 -0600
+++ rg61/include/scsi/scsi.h	2007-01-29 15:58:40.633779567 -0600
@@ -353,6 +353,7 @@ struct scsi_lun {
 #define SCSI_MLQUEUE_HOST_BUSY   0x1055
 #define SCSI_MLQUEUE_DEVICE_BUSY 0x1056
 #define SCSI_MLQUEUE_EH_RETRY0x1057
+#define SCSI_MLQUEUE_DID_RESET   0x1058
 
 /*
  *  Use these to separate status msg and our bytes
--- rg61u/drivers/scsi/scsi_lib.c	2007-01-29 15:11:11.466213588 -0600
+++ rg61/drivers/scsi/scsi_lib.c	2007-01-29 15:58:40.637779387 -0600
@@ -101,10 +101,10 @@ static void scsi_unprep_request(struct r
  *
  * Returns: Nothing.
  *
- * Notes:   We do this for one of two cases.  Either the host is busy
+ * Notes:   We do this for one of three cases.  1) the host is busy
  *  and it cannot accept any more commands for the time being,
- *  or the device returned QUEUE_FULL and can accept no more
- *  commands.
+ *  2) the device returned QUEUE_FULL and can accept no more
+ *  commands, or 3) the LLDD returned DID_RESET.
  * Notes:   This could be called either from an interrupt context or a
  *  normal process context.
  */
@@ -138,9 +138,11 @@ int scsi_queue_insert(struct scsi_cmnd *
 
 	/*
 	 * Decrement the counters, since these commands are no longer
-	 * active on the host/device.
+	 * active on the host/device.  If the reason is SCSI_MLQUEUE_DID_RESET
+	 * then scsi_device_unbusy() was called by scsi_finish_command().
 	 */
-	scsi_device_unbusy(device);
+	if (reason != SCSI_MLQUEUE_DID_RESET)
+		scsi_device_unbusy(device);
 
 	/*
 	 * Requeue this command.  It will go before all other commands
@@ -792,6 +794,33 @@ static void scsi_release_buffers(struct 
 }
 
 /*
+ * Function:scsi_command_expired()
+ *
+ * Purpose: Check scsi a command's age before retrying it.
+ *
+ * Arguments:   cmd	- command that we are checking for timeout.
+ *
+ * Returns: non-zero if command has exceeded its lifetime
+ *  zero otherwise
+ *
+ * Notes:   A command's lifetime is considered to be the number
+ *  of (retries permitted plus one) * command timeout.
+ *
+ */
+static int scsi_command_expired(struct scsi_cmnd *cmd)
+{
+	int ret = 0;
+	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
+	if (time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
+		sdev_printk(KERN_ERR, cmd->device,
+			"timing out command, waited %lus\n",
+			wait_for/HZ);
+		ret = 1;
+	}
+	return ret;
+}
+
+/*
  * Function:scsi_io_completion()
  *
  * Purpose: Completion processing for block device I/O requests.
@@ -962,12 +991,26 @@ void scsi_io_completion(struct scsi_cmnd
 		}
 	}
 	if (host_byte(result) == DID_RESET) {
-		/* Third party bus reset or reset for error recovery
-		 * reasons.  Just retry the request and see what
-		 * happens.
+		/*
+		 * Third party bus reset or reset for error recovery reasons.
+		 * If no data was transferred and the command has not expired,
+		 * just reinsert the command  on the queue.  If the command
+		 * HAS expired, we fall through to call scsi_end_request.
+		 *
+		 * If data was transferred, regenerate the command to transfer
+		 * only untransferred data by calling scsi_requeue_command().
 		 */
-		scsi_requeue_command(q, cmd);
-		return;
+		if (!good_bytes) {
+			if (!(scsi_command_expired(cmd))) {
+scsi_queue_insert(cmd, SCSI_MLQUEUE_DID_RESET);
+return;
+			}
+			/* fall through to call scsi_end_request() */
+		}
+		else {
+			scsi_requeue_command(q, cmd);
+			return;
+		}
 	}
 	if (result) {
 		if (!(req->cmd_flags & REQ_QUIET)) {
@@ -1381,17 +1424,12 @@ static void scsi_kill_request(struct req
 static void scsi_softirq_done(struct request *rq)
 {
 	struct scsi_cmnd *cmd = rq->completion_data;
-	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
 	int disposition;
 
 	INIT_LIST_HEAD(&cmd->eh_entry);
 
 	disposition = scsi_decide_disposition(cmd);
-	if (disposition != SUCCESS &&
-	time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
-		sdev_printk(KERN_ERR, cmd->device,
-			"timing out command, waited %lus\n",
-			wait_for/HZ);
+	if (disposition != SUCCESS && scsi_command_expired(cmd)) {
 		disposition = SUCCESS;
 	}
 			


[PATCH 1/2] Prevent infinite retries due to DID_RESET return status

2007-01-31 Thread Michael Reed
Move scsi_release_buffers() call so that buffers are retained until
after a determination is made about whether the command will be
requeued via scsi_queue_insert().  Doing so allows a command which
receives DID_RESET to be immediately queued to the driver using
the original scsi_cmnd, thus retaining it's original jiffies_at_alloc.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>
Move scsi_release_buffers() call so that buffers are retained until
after a determination is made about whether the command will be
requeued via scsi_queue_insert().  Doing so allows a command which
receives DID_RESET to be immediately queued to the driver using
the original scsi_cmnd, thus retaining it's original jiffies_at_alloc.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>

--- rg61u/drivers/scsi/scsi_lib.c	2007-01-02 23:11:18.0 -0600
+++ rg61/drivers/scsi/scsi_lib.c	2007-01-29 15:03:36.602482680 -0600
@@ -65,6 +65,7 @@ static struct scsi_host_sg_pool scsi_sg_
 #undef SP
 
 static void scsi_run_queue(struct request_queue *q);
+static void scsi_release_buffers(struct scsi_cmnd *cmd);
 
 /*
  * Function:	scsi_unprep_request()
@@ -599,6 +600,7 @@ static void scsi_requeue_command(struct 
 	struct request *req = cmd->request;
 	unsigned long flags;
 
+	scsi_release_buffers(cmd);
 	scsi_unprep_request(req);
 	spin_lock_irqsave(q->queue_lock, flags);
 	blk_requeue_request(q, req);
@@ -644,6 +646,7 @@ void scsi_run_host_queues(struct Scsi_Ho
  * Lock status: Assumed that lock is not held upon entry.
  *
  * Returns: cmd if requeue required, NULL otherwise.
+ * 		If cmd is returned then its buffers have not been released.
  *
  * Notes:   This is called for block device requests in order to
  *  mark some number of sectors as complete.
@@ -686,6 +689,7 @@ static struct scsi_cmnd *scsi_end_reques
 		}
 	}
 
+	scsi_release_buffers(cmd);
 	add_disk_randomness(req->rq_disk);
 
 	spin_lock_irqsave(q->queue_lock, flags);
@@ -826,8 +830,6 @@ void scsi_io_completion(struct scsi_cmnd
 	int sense_valid = 0;
 	int sense_deferred = 0;
 
-	scsi_release_buffers(cmd);
-
 	if (result) {
 		sense_valid = scsi_command_normalize_sense(cmd, &sshdr);
 		if (sense_valid)


[PATCH 0/2] Prevent infinite retries due to DID_RESET return status

2007-01-31 Thread Michael Reed
I have implemented Christoph's suggestions and comments in his reply
to my RFC.


Due to a firmware mismatch between a host and target (names withheld to
protect the innocent?), the LLDD was returning DID_RESET for every
i/o command.  This patch modifies the scsi layer to take into account
when the command which received DID_RESET was issued and eventually
give up on it instead of unconditionally reissuing it forever.
With this patch, on my test system, the command receiving the solid
DID_RESET times out after about 360 seconds.

The premise for this patch is that no command should have an infinite
lifetime.  The impetus for this patch was a system which would not
reach a command prompt without disconnecting the storage from the
host.

The significant change in this patch is to call scsi_queue_insert()
instead of scsi_requeue_command() if the command which receives a
DID_RESET did not complete any i/o (good_bytes==0).  scsi_queue_insert()
does not release the command and regenerate it like scsi_requeue_command()
does, hence jiffies_at_alloc reflects when the command was first issued.

Per Christoph's suggestion, I have broken this into two patches.  The
first implements the moving of the scsi_release_buffer() calls so that
it can be more easily reviewed.  The second patch implements the
lifetime timer for commands receiving DID_RESET.

These patches differ from the first posting in that scsi_queue_insert()
is called directly instead of calling the since removed scsi_retry_command();
comments have been cleaned up; and a comment has been added to indicate
that the code is supposed to fall through to call scsi_end_request()
if the command which received DID_RESET has expired.

These patches were tested by modifying a LLDD to return DID_RESET for
every command received.

Patches apply to 2.6.20-rc6-git1.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>

Christoph Hellwig wrote:
> On Mon, Dec 11, 2006 at 03:42:34PM -0600, Michael Reed wrote:
>> Due to a firmware mismatch between a host and target (names withheld to
>> protect the innocent?), the LLDD was returning DID_RESET for every
>> i/o command.  This patch modifies the scsi layer to take into account
>> when the command which received DID_RESET was issued and eventually
>> give up on it instead of unconditionally reissuing it forever
>> when it receives a DID_RESET.  With this patch, on my test system,
>> the command receiving the constant DID_RESET times out after about
>> 360 seconds.
>>
>> The premise for this patch is that no command should have an infinite
>> lifetime.  The impetus for this patch was a system which would not
>> reach a command prompt without disconnecting the storage from the
>> host.
>>
>> The significant change in this patch is to call scsi_retry_command()
>> instead of scsi_requeue_command() if the command which receives a
>> DID_RESET did not complete any i/o (good_bytes==0).  scsi_retry_command()
>> does not release the command and regenerate it like scsi_requeue_command()
>> does, hence jiffies_at_alloc reflects when the command was first issued.
> 
> Generally this patch looks good to me.  Some comments:
> 
> 
>> -extern int scsi_retry_command(struct scsi_cmnd *cmd);
>> +extern int scsi_retry_command(struct scsi_cmnd *cmd, int reason);
> 
>   I've just sent a patch to kill scsi_retry_command.  Use
>   scsi_queue_insert directly instead.
> 
>> +scsi_release_buffers(cmd);
> 
>   Can you please separate out the moving of the scsi_release_buffer
>   calls into a separate patch so it can be audited better?
> 
>> @@ -961,9 +992,20 @@ void scsi_io_completion(struct scsi_cmnd
>>  /* Third party bus reset or reset for error recovery
>>   * reasons.  Just retry the request and see what
>>   * happens.
>> + * If no data was transferred, just reissue this
>> + * command.  If data was transferred, regenerate
>> + * the command to transfer only untransferred data.
>>   */
> 
> The whole comment should look more like:
> 
>   /*
>* Third party bus reset or reset for error recovery reasons.
>* If no data was transferred, just reissue this command.
>* If data was transferred, regenerate the command to transfer
>* only untransferred data.
>*/
> 
>> -scsi_requeue_command(q, cmd);
>> -return;
>> +if (!good_bytes) {
>> +if (!(scsi_command_expired(cmd))) {
>> +scsi_retry_command(cmd, SCSI_MLQUEUE_DID_RESET);
>

Re: [patch] scsi: use lock per host instead of per device for shared queue tag host

2007-01-24 Thread Michael Reed
How 'bout a comment in scsh_host.h indicating that the pointer will be NULL 
unless
initialized by the driver?

"Protect shared block queue tag" is unique to stex.  Perhaps have no comment on
the variable declaration in scsi_host.h and explain why you use it in stex.

Mike


Ed Lin wrote:
> The block layer uses lock to protect request queue. Every scsi device
> has a unique request queue, and queue lock is the default lock in struct
> request_queue. This is good for normal cases. But for a  host with
> shared queue tag (e.g. stex controllers), a queue lock per device means
> the shared queue tag is not protected when multiple devices are accessed
> at a same time.  This patch is a simple fix for this situation by introducing
> a host queue lock to protect shared queue tag. Without this patch we will
> see various kernel panics (including the BUG() and kernel errors in
> blk_queue_start_tag and blk_queue_end_tag of ll_rw_blk.c) when accessing
> different devices simultaneously (e.g. copying big file from one device to
> another in smp kernels).
> 
> This is against kernel 2.6.20-rc5.
> 
> Signed-off-by: Ed Lin <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/scsi_lib.c  |2 +-
>  drivers/scsi/stex.c  |2 ++
>  include/scsi/scsi_host.h |3 +++
>  3 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff -purN a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> --- a/drivers/scsi/scsi_lib.c 2007-01-23 14:40:28.0 -0800
> +++ b/drivers/scsi/scsi_lib.c 2007-01-23 14:46:43.0 -0800
> @@ -1574,7 +1574,7 @@ struct request_queue *__scsi_alloc_queue
>  {
>   struct request_queue *q;
>  
> - q = blk_init_queue(request_fn, NULL);
> + q = blk_init_queue(request_fn, shost->req_q_lock);
>   if (!q)
>   return NULL;
>  
> diff -purN a/drivers/scsi/stex.c b/drivers/scsi/stex.c
> --- a/drivers/scsi/stex.c 2007-01-23 14:40:28.0 -0800
> +++ b/drivers/scsi/stex.c 2007-01-23 14:48:59.0 -0800
> @@ -1254,6 +1254,8 @@ stex_probe(struct pci_dev *pdev, const s
>   if (err)
>   goto out_free_irq;
>  
> + spin_lock_init(&host->__req_q_lock);
> + host->req_q_lock = &host->__req_q_lock;
>   err = scsi_init_shared_tag_map(host, host->can_queue);
>   if (err) {
>   printk(KERN_ERR DRV_NAME "(%s): init shared queue failed\n",
> diff -purN a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> --- a/include/scsi/scsi_host.h2007-01-23 14:40:29.0 -0800
> +++ b/include/scsi/scsi_host.h2007-01-23 14:57:04.0 -0800
> @@ -508,6 +508,9 @@ struct Scsi_Host {
>   spinlock_t  default_lock;
>   spinlock_t  *host_lock;
>  
> + spinlock_t  __req_q_lock;
> + spinlock_t  *req_q_lock;/* protect shared block queue tag */
> +
>   struct mutexscan_mutex;/* serialize scanning activity */
>  
>   struct list_headeh_cmd_q;
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REGRESSION: 2.6.20-rc3-git4: EIO not returned to direct i/o application following disk error

2007-01-09 Thread Michael Reed
More info.

Linux duck 2.6.20-rc3-git4 #10 SMP PREEMPT Fri Jan 5 10:58:11 CST 2007 ia64 
ia64 ia64 GNU/Linux

For this test I used an Emulex host adapter, but the problem is not
unique to it.  It also occurs when the targets are connected to either
LSI or QLogic adapters.

Using O_DIRECT, once the targets are removed the application fails
to terminate until it completes its write phase and then attempts
to read/verify the written data.  It then reports data miscompares.
It should get an EIO and fail instead of continuing to have the
write (and read) operations succeed.  It used to get EIO

As this is easily reproducible, I'm happy to try any changes which might
address the problem.

duck /root# ./disktest -cvwbB -n 10 -u 16 /dev/sde1
Tue Jan  9 15:21:39 2007: ./disktest: options: c v w b B n u
./disktest: setting O_DIRECT
Tue Jan  9 15:21:39 2007: ./disktest: /dev/sde1: maxiou 4096, secsize 64
Tue Jan  9 15:21:39 2007: ./disktest: /dev/sde1: unique id is 0, random number 
seed is 0
Tue Jan  9 15:21:39 2007: ./disktest: /dev/sde1: starting block 0, using direct 
i/o.
Tue Jan  9 15:21:39 2007: ./disktest: running 10 passes.
Tue Jan  9 15:21:39 2007: ./disktest: /dev/sde1: testing 12799 i/o units, 16 
sectors in length, min io size 1, pass = 1
Tue Jan  9 15:21:39 2007: ./disktest: /dev/sde1: 0x6473 - 
sequential writes
Tue Jan  9 15:21:42 2007: ./disktest: /dev/sde1: 0x6473 - 
sequential reads
Tue Jan  9 15:21:44 2007: ./disktest: /dev/sde1: 0x6473 - random 
writes
Tue Jan  9 15:21:44 2007: ./disktest: /dev/sde1: 0x6473 - 
sequential reads
Tue Jan  9 15:21:47 2007: ./disktest: /dev/sde1: 0x6473 - 256 
random reads
Tue Jan  9 15:21:47 2007: ./disktest: /dev/sde1: testing 12799 i/o units, 16 
sectors in length, min io size 1, pass = 2
Tue Jan  9 15:21:47 2007: ./disktest: /dev/sde1: 0x6473 - 
sequential writes
-- portdisable 
Tue Jan  9 15:22:39 2007: ./disktest: /dev/sde1: 0x6473 - 
sequential reads
Tue Jan  9 15:22:39 2007: ./disktest: /dev/sde1: data compare error number 1 at 
0x2c4000: word 0 of block 0 (0)
act = bad0bad0deaddead
exp = 


and from the console

duck /root# lpfc 0001:00:02.0: 0:1305 Link Down Event x2 received Data: x2 x20 
x110
 rport-2:0-2: blocked FC remote port time out: removing target and saving 
binding
 rport-2:0-3: blocked FC remote port time out: removing target and saving 
binding
lpfc 0001:00:02.0: 0:0203 Devloss timeout on WWPN 10:0:0:6:2b:8:b:9c NPort 
x130100 Data: x208 x7 x5
sd 2:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sde, sector 179986
lpfc 0001:00:02.0: 0:0203 Devloss timeout on WWPN 10:0:0:6:2b:8:b:9d NPort 
x13 Data: x208 x7 x4
scsi 2:0:0:0: rejecting I/O to dead device
scsi 2:0:0:0: rejecting I/O to dead device
scsi 2:0:0:0: rejecting I/O to dead device
scsi 2:0:0:0: rejecting I/O to dead device
scsi 2:0:0:0: rejecting I/O to dead device
(huge numbers of these occur)


I tried the same test using O_FSYNC instead of O_DIRECT and the write error
following the portdisable propagated to the test and it exited with an EIO.

Mike




Michael Reed wrote:
> Testing using 2.6.20-rc3-git4 I observed that my direct i/o test
> application no longer receives an EIO when the fc transport deletes
> a target following a fibre channel switch port disable.
> 
> With 2.6.19 EIO is returned and the application terminates.
> 
> With 2.6.20, the requested read length is returned with incorrect
> data in the buffer.
> 
> (I was playing around with an error recovery patch when I first
> discovered this, and do believe it is not limited in scope to
> targets being removed by the transport.)
> 
> This is a serious regression which puts customer data at risk.
> 
> (Is there a formal mechanism for filing a bug?)
> 
> Mike
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/5] fusion: vmware bug fix prevent inifinite retries

2007-01-09 Thread Michael Reed


Moore, Eric wrote:
> On  Monday, January 08, 2007 3:25 PM, James Bottomley wrote:
> 
>> Right, I sort of suspected something like this.  BUSY/QUEUE_FULL
>> handling was a bit iffy in 2.4; but it was sorted out in the 2003/4
>> timeframe.  Nowadays, I think you want to translate the
>> MPI_SCSI_STATUS_BUSY directly to SAM_STAT_BUSY (i.e. just remove the
>> special casing if).

Christoph put in code to limit a command's lifetime to prevent infinite
loops in the case of QUEUE_FULL and BUSY.  (See scsi_softirq_done()
for implementation.)

DID_OK / COMMAND_COMPLETE / BUSY results in a ADD_TO_MLQUEUE for a retry,
same as QUEUE_FULL.  I don't infinite retries, just a whole lot of them.
See scsi_decide_disposition().

Mike

>>
> 
> I think your'e on the same page with the folks from VMware,
> where the've asked us to go back to our old driver code.
> Meaning we kill the check for "MPI_SCSI_STATUS_BUSY", instead the sam
> status
> is sent back "as is" without changing the DID_OK to DID_BUS_BUSY, etc. 
> 
> My problem with that is whether is breaks the Fibre Channel Folks. 
> Will FC failover solution work properly if we go back to the old code?
> I add Stephen Shirron and Mike Reed. 
> I don't know.   Here is an explanation why that fix was needed back
> about a year ago:
> 
> 
> "When a target device responds with BUSY status, the MPT driver was
> sending DID_OK to the 
> SCSI mid layer, which caused the IO to be retried indefinitely between
> the mid layer and the 
> driver.  By changing the driver return status to DID_BUS_BUSY, the
> target BUSY status can 
> now flow through the mid layer to an upper layer Failover driver, which
> will manage the I/O timeout."
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


REGRESSION: 2.6.20-rc3-git4: EIO not returned to direct i/o application following disk error

2007-01-05 Thread Michael Reed
Testing using 2.6.20-rc3-git4 I observed that my direct i/o test
application no longer receives an EIO when the fc transport deletes
a target following a fibre channel switch port disable.

With 2.6.19 EIO is returned and the application terminates.

With 2.6.20, the requested read length is returned with incorrect
data in the buffer.

(I was playing around with an error recovery patch when I first
discovered this, and do believe it is not limited in scope to
targets being removed by the transport.)

This is a serious regression which puts customer data at risk.

(Is there a formal mechanism for filing a bug?)

Mike
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] fusion: fibre channel: return DID_ERROR for MPI_IOCSTATUS_SCSI_IOC_TERMINATED

2006-12-20 Thread Michael Reed
The fibre channel IOC may kill a request for a variety of
reasons, some of which may be recovered by a retry, some of
which are unlikely to be recovered.  Return DID_ERROR
instead of DID_RESET to permit retry of the command,
just not an infinite number of them.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>
The fibre channel IOC may kill a request for a variety of
reasons, some of which may be recovered by a retry, some of
which are unlikely to be recovered.  Return DID_ERROR
instead of DID_RESET to permit retry of the command,
just not an infinite number of them.

Signed-off-by: Michael Reed <[EMAIL PROTECTED]>

--- kdbu/drivers/message/fusion/mptscsih.c	2006-08-04 02:10:22.0 -0500
+++ kdb/drivers/message/fusion/mptscsih.c	2006-12-16 12:54:13.952128107 -0600
@@ -702,9 +702,14 @@ mptscsih_io_done(MPT_ADAPTER *ioc, MPT_F
 	}
 }
 			}
+			else if (ioc->bus_type == FC) {
+/* i/o killed by IOC */
+sc->result = DID_ERROR << 16;
+break;
+			}
 
 			/*
-			 * Allow non-SAS & non-NEXUS_LOSS to drop into below code
+			 * Allow non-FC, non-SAS & non-NEXUS_LOSS to drop into below code
 			 */
 
 		case MPI_IOCSTATUS_SCSI_TASK_TERMINATED:	/* 0x0048 */


[RFC] Prevent infinite retries due to DID_RESET return status

2006-12-11 Thread Michael Reed
Due to a firmware mismatch between a host and target (names withheld to
protect the innocent?), the LLDD was returning DID_RESET for every
i/o command.  This patch modifies the scsi layer to take into account
when the command which received DID_RESET was issued and eventually
give up on it instead of unconditionally reissuing it forever
when it receives a DID_RESET.  With this patch, on my test system,
the command receiving the constant DID_RESET times out after about
360 seconds.

The premise for this patch is that no command should have an infinite
lifetime.  The impetus for this patch was a system which would not
reach a command prompt without disconnecting the storage from the
host.

The significant change in this patch is to call scsi_retry_command()
instead of scsi_requeue_command() if the command which receives a
DID_RESET did not complete any i/o (good_bytes==0).  scsi_retry_command()
does not release the command and regenerate it like scsi_requeue_command()
does, hence jiffies_at_alloc reflects when the command was first issued.

This patch is based upon 2.6.19.  Thanks for taking the time to
look at this.

Mike








--- kdbu/drivers/scsi/scsi_priv.h	2006-10-09 01:58:19.0 -0500
+++ kdb/drivers/scsi/scsi_priv.h	2006-12-07 14:15:19.925332776 -0600
@@ -28,7 +28,7 @@ extern int scsi_dispatch_cmd(struct scsi
 extern int scsi_setup_command_freelist(struct Scsi_Host *shost);
 extern void scsi_destroy_command_freelist(struct Scsi_Host *shost);
 extern void __scsi_done(struct scsi_cmnd *cmd);
-extern int scsi_retry_command(struct scsi_cmnd *cmd);
+extern int scsi_retry_command(struct scsi_cmnd *cmd, int reason);
 #ifdef CONFIG_SCSI_LOGGING
 void scsi_log_send(struct scsi_cmnd *cmd);
 void scsi_log_completion(struct scsi_cmnd *cmd, int disposition);
--- kdbu/include/scsi/scsi.h	2006-10-31 21:08:47.0 -0600
+++ kdb/include/scsi/scsi.h	2006-12-07 14:13:09.188052974 -0600
@@ -353,6 +353,7 @@ struct scsi_lun {
 #define SCSI_MLQUEUE_HOST_BUSY   0x1055
 #define SCSI_MLQUEUE_DEVICE_BUSY 0x1056
 #define SCSI_MLQUEUE_EH_RETRY0x1057
+#define SCSI_MLQUEUE_DID_RESET   0x1058
 
 /*
  *  Use these to separate status msg and our bytes
--- kdbu/drivers/scsi/scsi.c	2006-10-09 01:58:19.0 -0500
+++ kdb/drivers/scsi/scsi.c	2006-12-07 14:15:49.835794930 -0600
@@ -673,7 +673,7 @@ void __scsi_done(struct scsi_cmnd *cmd)
  *  level drivers should not become re-entrant as a result of
  *  this.
  */
-int scsi_retry_command(struct scsi_cmnd *cmd)
+int scsi_retry_command(struct scsi_cmnd *cmd, int reason)
 {
 /*
  * Zero the sense information from the last time we tried
@@ -681,7 +681,7 @@ int scsi_retry_command(struct scsi_cmnd 
  */
 	memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
 
-	return scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY);
+	return scsi_queue_insert(cmd, reason);
 }
 
 /*
--- kdbu/drivers/scsi/scsi_lib.c	2006-11-29 21:09:07.0 -0600
+++ kdb/drivers/scsi/scsi_lib.c	2006-12-11 14:22:52.756579311 -0600
@@ -65,6 +65,7 @@ static struct scsi_host_sg_pool scsi_sg_
 #undef SP
 
 static void scsi_run_queue(struct request_queue *q);
+static void scsi_release_buffers(struct scsi_cmnd *cmd);
 
 /*
  * Function:	scsi_unprep_request()
@@ -100,10 +101,10 @@ static void scsi_unprep_request(struct r
  *
  * Returns: Nothing.
  *
- * Notes:   We do this for one of two cases.  Either the host is busy
+ * Notes:   We do this for one of three cases.  1) the host is busy
  *  and it cannot accept any more commands for the time being,
- *  or the device returned QUEUE_FULL and can accept no more
- *  commands.
+ *  2) the device returned QUEUE_FULL and can accept no more
+ *  commands, or 3) the LLDD returned DID_RESET.
  * Notes:   This could be called either from an interrupt context or a
  *  normal process context.
  */
@@ -137,9 +138,11 @@ int scsi_queue_insert(struct scsi_cmnd *
 
 	/*
 	 * Decrement the counters, since these commands are no longer
-	 * active on the host/device.
+	 * active on the host/device.  If the reason is SCSI_MLQUEUE_DID_RESET
+	 * then scsi_device_unbusy() was previously called.
 	 */
-	scsi_device_unbusy(device);
+	if (reason != SCSI_MLQUEUE_DID_RESET)
+		scsi_device_unbusy(device);
 
 	/*
 	 * Requeue this command.  It will go before all other commands
@@ -601,6 +604,7 @@ static void scsi_requeue_command(struct 
 	struct request *req = cmd->request;
 	unsigned long flags;
 
+	scsi_release_buffers(cmd);
 	scsi_unprep_request(req);
 	spin_lock_irqsave(q->queue_lock, flags);
 	blk_requeue_request(q, req);
@@ -646,6 +650,7 @@ void scsi_run_host_queues(struct Scsi_Ho
  * Lock status: Assumed that lock is not held upon entry.
  *
  * Returns: cmd if requeue required, NULL otherwise.
+ * 		If cmd is returned then its buffers have not been released.
  *
  * Notes:   This is called for block device requests in order to
  * 

Re: Infinite retries reading the partition table

2006-12-05 Thread Michael Reed


Luben Tuikov wrote:
...snip...
> This statement in scsi_io_completion() causes the infinite retry loop:
>if (scsi_end_request(cmd, 1, good_bytes, !!result) == NULL)
>  return;

The code in 2.6.19 is "result==0", not "!!result", which is logically
the same as "result!=0".  Did you mean to change the logic here?
Am I missing something?

Mike
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html