Re: [PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-06 Thread Brian King
Darrick J. Wong wrote:
 In general, I agree that sas-ata should adopt the new EH.
 Unfortunately, I believe the old way of sas-ata configuring ATA ports is
 somehow not compatible with the new EH stuff and causes a crash during
 the device probe with my patch to move sas-ata to the new EH.  If I
 apply the patch that migrates sas-ata to use brking's latest ata-sas
 configuration mechanism (the one that creates real ata_hosts), I see
 (a) lots and lots of ATA hosts getting created (one per ATA port;
 possibly undesirable if you've a SAS topology with a lot of SATA disks)

The new libata EH ends up spending more time in the error handling thread
than the old code did. One of the reasons having multiple ATA/SCSI hosts
is a good thing is that is the granularity of error handling, so it
prevents stalling all the other devices under that SAS HBA while we are
hitting errors on an ATAPI SATA device, for example.

Arguably, SATA users of libata already have one SCSI host per ATA port,
so my SAS patches really just bring SAS in line with that design...

-Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-04 Thread Darrick J. Wong
Hrm... does this patch help?  You'll get a bunch of ATA/SAS disk errors
printed to the screen if you yank the disk, but at least libsas won't
get stuck waiting for the cache-flush commands to time out.
---
sd will get hung up issuing commands to flush write cache if a SAS device
is unplugged without warning.  Change libsas to reject commands to domain
devices that have already gone away.

Signed-off-by: Darrick J. Wong [EMAIL PROTECTED]
---

 drivers/scsi/libsas/sas_ata.c   |4 
 drivers/scsi/libsas/sas_expander.c  |3 +++
 drivers/scsi/libsas/sas_port.c  |2 ++
 drivers/scsi/libsas/sas_scsi_host.c |7 +++
 include/scsi/libsas.h   |1 +
 5 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 0829b55..f5e5213 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -161,6 +161,10 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd 
*qc)
unsigned int num = 0;
unsigned int xfer = 0;
 
+   /* If the device fell off, no sense in issuing commands */
+   if (dev-gone)
+   return AC_ERR_SYSTEM;
+
task = sas_alloc_task(GFP_ATOMIC);
if (!task)
return AC_ERR_SYSTEM;
diff --git a/drivers/scsi/libsas/sas_expander.c 
b/drivers/scsi/libsas/sas_expander.c
index 27674fe..4ba4d2a 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -1680,6 +1680,7 @@ static void sas_unregister_ex_tree(struct domain_device 
*dev)
struct domain_device *child, *n;
 
list_for_each_entry_safe(child, n, ex-children, siblings) {
+   child-gone = 1;
if (child-dev_type == EDGE_DEV ||
child-dev_type == FANOUT_DEV)
sas_unregister_ex_tree(child);
@@ -1699,6 +1700,7 @@ static void sas_unregister_devs_sas_addr(struct 
domain_device *parent,
list_for_each_entry_safe(child, n, ex_dev-children, siblings) {
if (SAS_ADDR(child-sas_addr) ==
SAS_ADDR(phy-attached_sas_addr)) {
+   child-gone = 1;
if (child-dev_type == EDGE_DEV ||
child-dev_type == FANOUT_DEV)
sas_unregister_ex_tree(child);
@@ -1707,6 +1709,7 @@ static void sas_unregister_devs_sas_addr(struct 
domain_device *parent,
break;
}
}
+   parent-gone = 1;
sas_disable_routing(parent, phy-attached_sas_addr);
memset(phy-attached_sas_addr, 0, SAS_ADDR_SIZE);
sas_port_delete_phy(phy-port, phy-phy);
diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
index b6f0243..2e82097 100644
--- a/drivers/scsi/libsas/sas_port.c
+++ b/drivers/scsi/libsas/sas_port.c
@@ -144,6 +144,8 @@ void sas_deform_port(struct asd_sas_phy *phy)
port-port_dev-pathways--;
 
if (port-num_phys == 1) {
+   if (port-port_dev)
+   port-port_dev-gone = 1;
sas_unregister_domain_devices(port);
sas_port_delete(port-port);
port-port = NULL;
diff --git a/drivers/scsi/libsas/sas_scsi_host.c 
b/drivers/scsi/libsas/sas_scsi_host.c
index c29ba47..61d2679 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -228,6 +228,13 @@ int sas_queuecommand(struct scsi_cmnd *cmd,
goto out;
}
 
+   /* If the device fell off, no sense in issuing commands */
+   if (dev-gone) {
+   cmd-result = DID_BAD_TARGET  16;
+   scsi_done(cmd);
+   goto out;
+   }
+
res = -ENOMEM;
task = sas_create_task(cmd, dev, GFP_ATOMIC);
if (!task)
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 8ad7465..73c5b15 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -207,6 +207,7 @@ struct domain_device {
 };
 
 void *lldd_dev;
+   int gone;
 };
 
 struct sas_discovery_event {
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-04 Thread Jeff Garzik

Darrick J. Wong wrote:

Hrm... does this patch help?  You'll get a bunch of ATA/SAS disk errors
printed to the screen if you yank the disk, but at least libsas won't
get stuck waiting for the cache-flush commands to time out.
---
sd will get hung up issuing commands to flush write cache if a SAS device
is unplugged without warning.  Change libsas to reject commands to domain
devices that have already gone away.

Signed-off-by: Darrick J. Wong [EMAIL PROTECTED]
---

 drivers/scsi/libsas/sas_ata.c   |4 
 drivers/scsi/libsas/sas_expander.c  |3 +++
 drivers/scsi/libsas/sas_port.c  |2 ++
 drivers/scsi/libsas/sas_scsi_host.c |7 +++
 include/scsi/libsas.h   |1 +
 5 files changed, 17 insertions(+), 0 deletions(-)


Seems sane...



diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 0829b55..f5e5213 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -161,6 +161,10 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd 
*qc)
unsigned int num = 0;
unsigned int xfer = 0;
 
+	/* If the device fell off, no sense in issuing commands */

+   if (dev-gone)
+   return AC_ERR_SYSTEM;
+
task = sas_alloc_task(GFP_ATOMIC);
if (!task)
return AC_ERR_SYSTEM;


As an aside, issues like this really really imply a need to move libsas 
away from the old libata EH stuff (like brking did with ipr, in patches).


Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-04 Thread Darrick J. Wong
On Tue, Dec 04, 2007 at 05:48:33PM -0500, Jeff Garzik wrote:

 As an aside, issues like this really really imply a need to move libsas 
 away from the old libata EH stuff (like brking did with ipr, in patches).

Hm... does the new libata EH handle the case of device was
unplugged, don't bother trying to send any more commands?

In general, I agree that sas-ata should adopt the new EH.
Unfortunately, I believe the old way of sas-ata configuring ATA ports is
somehow not compatible with the new EH stuff and causes a crash during
the device probe with my patch to move sas-ata to the new EH.  If I
apply the patch that migrates sas-ata to use brking's latest ata-sas
configuration mechanism (the one that creates real ata_hosts), I see
(a) lots and lots of ATA hosts getting created (one per ATA port;
possibly undesirable if you've a SAS topology with a lot of SATA disks)
and (b) NCQ disks don't seem to work if you unplug the disk and plug
it back in (unless NCQ is disabled entirely).  Jeff, by any chance have
you tried plugging SATA devices into your SAS controllers?

James Bottomley wondered if it would be easier to have sas-ata call only
into the parts of libata that convert SCSI commands to ATA taskfiles,
though I'm unsure how many wormy cans that would open.

--D
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-04 Thread Jeff Garzik

Darrick J. Wong wrote:

On Tue, Dec 04, 2007 at 05:48:33PM -0500, Jeff Garzik wrote:

As an aside, issues like this really really imply a need to move libsas 
away from the old libata EH stuff (like brking did with ipr, in patches).


Hm... does the new libata EH handle the case of device was
unplugged, don't bother trying to send any more commands?

In general, I agree that sas-ata should adopt the new EH.
Unfortunately, I believe the old way of sas-ata configuring ATA ports is
somehow not compatible with the new EH stuff and causes a crash during
the device probe with my patch to move sas-ata to the new EH.  If I
apply the patch that migrates sas-ata to use brking's latest ata-sas
configuration mechanism (the one that creates real ata_hosts), I see
(a) lots and lots of ATA hosts getting created (one per ATA port;
possibly undesirable if you've a SAS topology with a lot of SATA disks)
and (b) NCQ disks don't seem to work if you unplug the disk and plug
it back in (unless NCQ is disabled entirely).  Jeff, by any chance have
you tried plugging SATA devices into your SAS controllers?


aic94xx yes, bcm and mv no.

Will take a look though...



James Bottomley wondered if it would be easier to have sas-ata call only
into the parts of libata that convert SCSI commands to ATA taskfiles,
though I'm unsure how many wormy cans that would open.


You want more than that.

You want to make sure libata is the place for knowledge about weird ATA 
devices, SATA quirks, ATA device error handling (to be distinguished 
from ATA /link/ error handling), and other areas.


That stuff shouldn't be duplicated, and you /really/ do not want to 
re-learn all those lessons all over again ;-)


Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html