Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Douglas Gilbert
Ric Wheeler wrote:
> 
> 
> Mark Lord wrote:
> 
>> Eric D. Mudama wrote:
>>
>>>
>>> Actually, it's possibly worse, since each failure in libata will
>>> generate 3-4 retries.  With existing ATA error recovery in the
>>> drives, that's about 3 seconds per retry on average, or 12 seconds
>>> per failure.  Multiply that by the number of blocks past the error to
>>> complete the request..
>>
>>
>> It really beats the alternative of a forced reboot
>> due to, say, superblock I/O failing because it happened
>> to get merged with an unrelated I/O which then failed..
>> Etc..
>>
>> Definitely an improvement.
>>
>> The number of retries is an entirely separate issue.
>> If we really care about it, then we should fix SD_MAX_RETRIES.
>>
>> The current value of 5 is *way* too high.  It should be zero or one.
>>
>> Cheers
>>
> I think that drives retry enough, we should leave retry at zero for
> normal (non-removable) drives. Should this  be a policy we can set like
> we do with NCQ queue depth via /sys ?

The transport might also want a say. I see ABORTED COMMAND
errors often enough with SAS (e.g. due to expander congestion)
to warrant at least one retry (which works in my testing).
SATA disks behind SAS infrastructure would also be
susceptible to the same "random" failures.

Transport Layer Retries (TLR) in SAS should remove this class
of transport errors but only SAS tape drives support TLR as
far as I know.

Doug Gilbert

> We need to be able to layer things like MD on top of normal drive errors
> in a way that will produce a system that provides reasonable response
> time despite any possible IO error on a single component.  Another case
> that we end up doing on a regular basis is drive recovery. Errors need
> to be limited in scope to just the impacted area and dispatched up to
> the application layer as quickly as we can so that you don't spend days
> watching a copy of a huge drive (think 750GB or more) ;-)
> 
> ric


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread James Bottomley
On Tue, 2007-01-30 at 22:20 -0500, Ric Wheeler wrote:
> Mark Lord wrote:
> > The number of retries is an entirely separate issue.
> > If we really care about it, then we should fix SD_MAX_RETRIES.
> >
> > The current value of 5 is *way* too high.  It should be zero or one.
> >
> > Cheers
> >
> I think that drives retry enough, we should leave retry at zero for 
> normal (non-removable) drives. Should this  be a policy we can set like 
> we do with NCQ queue depth via /sys ?

I don't disagree that it should be settable.  However, retries occur for
other reasons than failures inside the device.  The most standard ones
are unit attentions generated because of other activity (target reset
etc).  The key to the problem is retrying only operations that are
genuinely retryable, which the mid-layer doesn't do such a good job on.

> We need to be able to layer things like MD on top of normal drive errors 
> in a way that will produce a system that provides reasonable response 
> time despite any possible IO error on a single component.  Another case 
> that we end up doing on a regular basis is drive recovery. Errors need 
> to be limited in scope to just the impacted area and dispatched up to 
> the application layer as quickly as we can so that you don't spend days 
> watching a copy of a huge drive (think 750GB or more) ;-)

For the MD case, this is what REQ_FAILFAST is for.

James
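
A minimal sketch of the "retry only what is genuinely retryable" idea from this sub-thread, assuming a per-sense-key policy. The function name, the `failfast` flag, and the retry counts are illustrative assumptions, not existing mid-layer code; the sense key values are the standard SPC ones.

```c
#include <stdbool.h>

/* SCSI sense key values (subset) as defined by SPC. */
enum sense_key {
    SK_NO_SENSE        = 0x0,
    SK_RECOVERED_ERROR = 0x1,
    SK_NOT_READY       = 0x2,
    SK_MEDIUM_ERROR    = 0x3,
    SK_HARDWARE_ERROR  = 0x4,
    SK_ILLEGAL_REQUEST = 0x5,
    SK_UNIT_ATTENTION  = 0x6,
    SK_ABORTED_COMMAND = 0xb
};

/*
 * Hypothetical policy: how many times the host reissues a command that
 * failed with the given sense key.  The counts are illustrative only.
 */
int host_retries_for(enum sense_key key, bool failfast)
{
    if (failfast)
        return 0;               /* MD-style callers want the error now */

    switch (key) {
    case SK_UNIT_ATTENTION:     /* e.g. raised after a target reset */
    case SK_ABORTED_COMMAND:    /* e.g. SAS expander congestion */
        return 1;
    case SK_MEDIUM_ERROR:       /* the drive already retried internally */
    default:
        return 0;
    }
}
```

With a policy of this shape, a transport hiccup still gets one more attempt while a media error is reported at once, and a failfast caller never waits for a retry.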




Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Ric Wheeler



Mark Lord wrote:
> Eric D. Mudama wrote:
>> Actually, it's possibly worse, since each failure in libata will
>> generate 3-4 retries.  With existing ATA error recovery in the
>> drives, that's about 3 seconds per retry on average, or 12 seconds
>> per failure.  Multiply that by the number of blocks past the error to
>> complete the request..
>
> It really beats the alternative of a forced reboot
> due to, say, superblock I/O failing because it happened
> to get merged with an unrelated I/O which then failed..
> Etc..
>
> Definitely an improvement.
>
> The number of retries is an entirely separate issue.
> If we really care about it, then we should fix SD_MAX_RETRIES.
>
> The current value of 5 is *way* too high.  It should be zero or one.
>
> Cheers

I think that drives retry enough, we should leave retry at zero for 
normal (non-removable) drives. Should this  be a policy we can set like 
we do with NCQ queue depth via /sys ?


We need to be able to layer things like MD on top of normal drive errors 
in a way that will produce a system that provides reasonable response 
time despite any possible IO error on a single component.  Another case 
that we end up doing on a regular basis is drive recovery. Errors need 
to be limited in scope to just the impacted area and dispatched up to 
the application layer as quickly as we can so that you don't spend days 
watching a copy of a huge drive (think 750GB or more) ;-)


ric



Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Mark Lord

James Bottomley wrote:
> First off, please send SCSI patches to the SCSI list:

Fixed already, thanks!

>> This patch fixes the behaviour to be similar to what we had originally.
>>
>> When a bad sector is encountered, SCSI will now work around it again,
>> failing *only* the bad sector itself.
>
> Erm, but the corollary is that if we get a large read failure because of
> a bad track, you're going to try and chunk it up a sector at a time

That's better than the huge data-loss scenario that we currently
have for single-sector errors.  MUCH better.

> forcing an individual error for each sector is going to annoy some
> people ... particularly removable medium ones which return this error if
> the medium isn't present ... Are you sure this is really what we want to
> do?

No, for removed-medium everything just fails right away.
This patch is *only* for media errors, not any other failures.
Cheers


Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread James Bottomley
First off, please send SCSI patches to the SCSI list:


On Tue, 2007-01-30 at 19:47 -0500, Mark Lord wrote:
> In ancient kernels, the SCSI disk code used to continue after
> encountering a MEDIUM_ERROR.  It would "complete" the good
> sectors before the error, fail the bad sector/block, and then
> continue with the rest of the request.
> 
> Kernels since about 2.6.16 or so have been broken in this regard.
> They "complete" the good sectors before the error,
> and then fail the entire remaining portions of the request.

What was the commit that introduced the change? ... I have a vague
memory of it being deliberate.

> This is very risky behaviour, as a request is often a merge
> of several bios, and just because one application hits a bad sector
> is no reason to pretend that (for example) an adjacent directory lookup also
> failed.
> 
> This patch fixes the behaviour to be similar to what we had originally.
> 
> When a bad sector is encountered, SCSI will now work around it again,
> failing *only* the bad sector itself.

Erm, but the corollary is that if we get a large read failure because of
a bad track, you're going to try and chunk it up a sector at a time
forcing an individual error for each sector is going to annoy some
people ... particularly removable medium ones which return this error if
the medium isn't present ... Are you sure this is really what we want to
do?

> Signed-off-by:  Mark Lord <[EMAIL PROTECTED]>
> ---
> diff -u --recursive --new-file --exclude-from=linux_17//Documentation/dontdiff old/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- old/drivers/scsi/scsi_lib.c   2007-01-30 13:58:05.0 -0500
> +++ linux/drivers/scsi/scsi_lib.c 2007-01-30 18:30:01.0 -0500
> @@ -865,6 +865,12 @@
>*/
>   if (sense_valid && !sense_deferred) {
>   switch (sshdr.sense_key) {
> + case MEDIUM_ERROR:
> + // Bad sector.  Fail it, and then continue the rest of the request:
> + if (scsi_end_request(cmd, 0, cmd->device->sector_size, 1) == NULL) {

The sense key may have come with additional information  I think we want
to parse that (if it exists) rather than just blindly failing the first
sector of the request.

> + cmd->retries = 0;   // go around again..
> + return;
> + }

This would drop through to the UNIT_ATTENTION case if scsi_end_request()
fails ... I don't think that's correct.
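
On the earlier point about the additional sense information: a rough user-space sketch of what parsing it could look like, assuming fixed-format sense data where the INFORMATION field (bytes 3 to 6, meaningful when the VALID bit is set) carries the LBA of the failing block. The function names and the request bookkeeping are hypothetical; this is not the mid-layer's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_MEDIUM_ERROR 0x3

/*
 * Extract the failing LBA from current (non-deferred) fixed-format sense
 * data, response code 0x70.  Returns false when the buffer is too short,
 * the format or sense key does not match, or the VALID bit is clear.
 */
bool medium_error_lba(const uint8_t *sense, size_t len, uint32_t *lba)
{
    if (len < 7)
        return false;
    if ((sense[0] & 0x7f) != 0x70)
        return false;
    if ((sense[2] & 0x0f) != SENSE_KEY_MEDIUM_ERROR)
        return false;
    if (!(sense[0] & 0x80))          /* VALID bit: INFORMATION is usable */
        return false;

    *lba = ((uint32_t)sense[3] << 24) | ((uint32_t)sense[4] << 16) |
           ((uint32_t)sense[5] << 8)  |  (uint32_t)sense[6];
    return true;
}

/*
 * Bytes that transferred cleanly before the bad sector, or -1 if the
 * reported LBA lies outside the request.
 */
long long good_bytes_before(uint32_t start_lba, uint32_t nr_sectors,
                            uint32_t bad_lba, uint32_t sector_size)
{
    if (bad_lba < start_lba || bad_lba - start_lba >= nr_sectors)
        return -1;
    return (long long)(bad_lba - start_lba) * sector_size;
}
```

Knowing the failing LBA would let the completion path finish everything before it in one step and fail exactly that sector, instead of assuming the first sector of the request is the bad one.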

>   case UNIT_ATTENTION:
>   if (cmd->device->removable) {
>   /* Detected disc change.  Set a bit



Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Mark Lord

Eric D. Mudama wrote:
> Actually, it's possibly worse, since each failure in libata will
> generate 3-4 retries.  With existing ATA error recovery in the drives,
> that's about 3 seconds per retry on average, or 12 seconds per failure.
> Multiply that by the number of blocks past the error to complete the
> request..


It really beats the alternative of a forced reboot
due to, say, superblock I/O failing because it happened
to get merged with an unrelated I/O which then failed..
Etc..

Definitely an improvement.

The number of retries is an entirely separate issue.
If we really care about it, then we should fix SD_MAX_RETRIES.

The current value of 5 is *way* too high.  It should be zero or one.

Cheers
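
For scale, the arithmetic in Eric's estimate can be written out as a small helper. This is only a back-of-the-envelope sketch; the function name is made up, and the worst case assumes every remaining block in the request fails.

```c
/*
 * Worst-case stall, in seconds, if every remaining block in a request
 * fails and each failure costs `retries` attempts of `secs_per_retry`.
 */
unsigned long worst_case_stall_secs(unsigned long blocks_past_error,
                                    unsigned int retries,
                                    unsigned int secs_per_retry)
{
    return blocks_past_error * retries * secs_per_retry;
}
```

With 4 retries of 3 seconds each, one failing block costs 12 seconds; a hypothetical run of 256 failing blocks would cost 3072 seconds, roughly 51 minutes.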


Re: AIC7xxx on 2.6.18

2007-01-30 Thread Wakko Warner
NOTE: I am not on the linux-scsi list, keep me in CC.

Andrew Morton wrote:
> On Tue, 30 Jan 2007 07:18:20 -0500
> Wakko Warner <[EMAIL PROTECTED]> wrote:
> > Andrew Morton wrote:
> > > Yes, getting the oops traces will help, thanks.  And confirmation on a
> > > more recent kernel would be good.
> > 
> > I tested with a 2.6.20-rc6 kernel and the MAC 39160 card.  There was no oops
> > and I was able to access the 2 disks.  This was on a different PC though. 
> > I'll try it again on the original PC.
> 
> Thanks.

The PC was a completely different PC when I tried it that time.  This time,
I tried it on a similar PC (same motherboard model, but not the exact same
machine).  I had no problems with 2.6.18.  I looked a little closer and I
noticed that the original machine was actually overclocked.  I did the same
to the machine that works and it is now not working.  So the problem with
the mac card seems to be the overclocking.  I had completely forgotten about it
since it was a test machine anyway.

So this just leaves the problem I've experienced on the machine with the PC
u160 and the u/uw dual card.

> > Should I try 2.6.19 as well?
> 
> There's not a lot of point in doing so.  If/when we come up with a
> 2.6.20-rc6 fix we'll know whether it is applicable to 2.6.19.x.

I'll try 2.6.19 on the machine with the 2 scsi cards with the option roms
disabled.  I'd rather not run a -rc kernel on this machine.

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
 Got Gas???


[PATCH] RESEND scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Mark Lord
Fixed for 80-columns, and copying linux-scsi this time.

In ancient kernels, the SCSI disk code used to continue after
encountering a MEDIUM_ERROR.  It would "complete" the good
sectors before the error, fail the bad sector/block, and then
continue with the rest of the request.

Kernels since about 2.6.16 or so have been broken in this regard.
They "complete" the good sectors before the error,
and then fail the entire remaining portions of the request.

This is very risky behaviour, as a request is often a merge
of several bios, and just because one application hits a bad sector
is no reason to pretend that (for example) an adjacent directory lookup also
failed.

This patch fixes the behaviour to be similar to what we had originally.

When a bad sector is encountered, SCSI will now work around it again,
failing *only* the bad sector itself.

Signed-off-by:  Mark Lord <[EMAIL PROTECTED]>
---
--- old/drivers/scsi/scsi_lib.c 2007-01-30 20:06:15.0 -0500
+++ linux/drivers/scsi/scsi_lib.c   2007-01-30 20:06:59.0 -0500
@@ -865,6 +865,13 @@
 */
if (sense_valid && !sense_deferred) {
switch (sshdr.sense_key) {
+   case MEDIUM_ERROR:
+   /* Bad sector. Fail it, and continue on with the rest */
+   if (scsi_end_request(cmd, 0,
+   cmd->device->sector_size, 1) == NULL) {
+   cmd->retries = 0;   /* go around again.. */
+   return;
+   }
case UNIT_ATTENTION:
if (cmd->device->removable) {
/* Detected disc change.  Set a bit
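
The difference between the two behaviours described in the changelog can be modelled in user space. The sketch below is a simulation only, with made-up names; it does not call or mirror scsi_end_request(), it just counts which sectors the upper layers would see as failed under each policy.

```c
#include <stdbool.h>
#include <stddef.h>

enum policy {
    FAIL_REMAINDER,   /* behaviour described for kernels since about 2.6.16 */
    FAIL_BAD_ONLY     /* behaviour the patch aims to restore */
};

/*
 * Walk a request of `n` sectors.  bad[i] is true when sector i returns
 * MEDIUM_ERROR.  ok[i] is set for every sector: true if it is reported
 * upward as completed, false if it is reported as failed.
 * Returns the number of sectors reported as failed.
 */
size_t complete_request(const bool *bad, bool *ok, size_t n, enum policy p)
{
    size_t failed = 0;
    size_t i = 0;

    while (i < n) {
        if (!bad[i]) {
            ok[i++] = true;
            continue;
        }
        if (p == FAIL_REMAINDER) {
            for (; i < n; i++) {      /* everything after the error is lost */
                ok[i] = false;
                failed++;
            }
            break;
        }
        ok[i++] = false;              /* fail just this sector, carry on */
        failed++;
    }
    return failed;
}
```

For an 8-sector request with one bad sector at index 2, the first policy reports 6 failed sectors and the second reports 1, which is the data-loss gap the patch is about.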


[PATCH 1/1] scsi: Fix lost EH commands

2007-01-30 Thread Brian King

If an EH command times out today, the LLDD's abort handler
will be called to abort the command. It is assumed that this
completes successfully, which can result in the command getting
completed later resulting in an oops. Improve the current
implementation by escalating all the way to host reset if
necessary in order to clean up the EH command.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/scsi/scsi_error.c |  239 ++--
 1 files changed, 123 insertions(+), 116 deletions(-)

diff -puN drivers/scsi/scsi_error.c~scsi_fix_eh_lost_cmds drivers/scsi/scsi_error.c
--- linux-2.6/drivers/scsi/scsi_error.c~scsi_fix_eh_lost_cmds   2007-01-12 15:42:11.0 -0600
+++ linux-2.6-bjking1/drivers/scsi/scsi_error.c 2007-01-12 15:42:11.0 -0600
@@ -453,6 +453,128 @@ static void scsi_eh_done(struct scsi_cmn
 }
 
 /**
+ * scsi_try_host_reset - ask host adapter to reset itself
+ * @scmd:  SCSI cmd to send host reset.
+ **/
+static int scsi_try_host_reset(struct scsi_cmnd *scmd)
+{
+   unsigned long flags;
+   int rtn;
+
+   SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd Host RST\n",
+ __FUNCTION__));
+
+   if (!scmd->device->host->hostt->eh_host_reset_handler)
+   return FAILED;
+
+   rtn = scmd->device->host->hostt->eh_host_reset_handler(scmd);
+
+   if (rtn == SUCCESS) {
+   if (!scmd->device->host->hostt->skip_settle_delay)
+   ssleep(HOST_RESET_SETTLE_TIME);
+   spin_lock_irqsave(scmd->device->host->host_lock, flags);
+   scsi_report_bus_reset(scmd->device->host,
+ scmd_channel(scmd));
+   spin_unlock_irqrestore(scmd->device->host->host_lock, flags);
+   }
+
+   return rtn;
+}
+
+/**
+ * scsi_try_bus_reset - ask host to perform a bus reset
+ * @scmd:  SCSI cmd to send bus reset.
+ **/
+static int scsi_try_bus_reset(struct scsi_cmnd *scmd)
+{
+   unsigned long flags;
+   int rtn;
+
+   SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd Bus RST\n",
+ __FUNCTION__));
+
+   if (!scmd->device->host->hostt->eh_bus_reset_handler)
+   return FAILED;
+
+   rtn = scmd->device->host->hostt->eh_bus_reset_handler(scmd);
+
+   if (rtn == SUCCESS) {
+   if (!scmd->device->host->hostt->skip_settle_delay)
+   ssleep(BUS_RESET_SETTLE_TIME);
+   spin_lock_irqsave(scmd->device->host->host_lock, flags);
+   scsi_report_bus_reset(scmd->device->host,
+ scmd_channel(scmd));
+   spin_unlock_irqrestore(scmd->device->host->host_lock, flags);
+   }
+
+   return rtn;
+}
+
+/**
+ * scsi_try_bus_device_reset - Ask host to perform a BDR on a dev
+ * @scmd:  SCSI cmd used to send BDR
+ *
+ * Notes:
+ *There is no timeout for this operation.  if this operation is
+ *unreliable for a given host, then the host itself needs to put a
+ *timer on it, and set the host back to a consistent state prior to
+ *returning.
+ **/
+static int scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
+{
+   int rtn;
+
+   if (!scmd->device->host->hostt->eh_device_reset_handler)
+   return FAILED;
+
+   rtn = scmd->device->host->hostt->eh_device_reset_handler(scmd);
+   if (rtn == SUCCESS) {
+   scmd->device->was_reset = 1;
+   scmd->device->expecting_cc_ua = 1;
+   }
+
+   return rtn;
+}
+
+static int __scsi_try_to_abort_cmd(struct scsi_cmnd *scmd)
+{
+   if (!scmd->device->host->hostt->eh_abort_handler)
+   return FAILED;
+
+   return scmd->device->host->hostt->eh_abort_handler(scmd);
+}
+
+/**
+ * scsi_try_to_abort_cmd - Ask host to abort a running command.
+ * @scmd:  SCSI cmd to abort from Lower Level.
+ *
+ * Notes:
+ *This function will not return until the user's completion function
+ *has been called.  there is no timeout on this operation.  if the
+ *author of the low-level driver wishes this operation to be timed,
+ *they can provide this facility themselves.  helper functions in
+ *scsi_error.c can be supplied to make this easier to do.
+ **/
+static int scsi_try_to_abort_cmd(struct scsi_cmnd *scmd)
+{
+   /*
+* scsi_done was called just after the command timed out and before
+* we had a chance to process it. (db)
+*/
+   if (scmd->serial_number == 0)
+   return SUCCESS;
+   return __scsi_try_to_abort_cmd(scmd);
+}
+
+static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd)
+{
+   if (__scsi_try_to_abort_cmd(scmd) != SUCCESS)
+   if (scsi_try_bus_device_reset(scmd) != SUCCESS)
+   if (scsi_try_bus_reset(scmd) != SUCCESS)
+   scsi_try_host_reset(scmd);
+}
+
+/**
  * scsi_send_eh_cmnd  - submit a scsi comma
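
The escalation order the changelog describes (abort, then device reset, then bus reset, then host reset) can also be expressed as a table walk. The following is a stand-alone sketch with hypothetical types and stub handlers, not a drop-in replacement for scsi_abort_eh_cmnd().

```c
#include <stddef.h>

enum eh_result { EH_SUCCESS, EH_FAILED };

typedef enum eh_result (*eh_handler)(void *cmd);

/* Recovery steps in order of increasing disruption. */
enum eh_level {
    EH_ABORT,
    EH_DEVICE_RESET,
    EH_BUS_RESET,
    EH_HOST_RESET,
    EH_LEVELS
};

/*
 * Try each handler in turn and stop at the first one that succeeds.
 * A NULL entry stands for a driver that does not implement that step.
 * Returns the level that succeeded, or EH_LEVELS if every step failed.
 */
enum eh_level escalate(const eh_handler handlers[EH_LEVELS], void *cmd)
{
    for (int level = EH_ABORT; level < EH_LEVELS; level++) {
        if (handlers[level] && handlers[level](cmd) == EH_SUCCESS)
            return (enum eh_level)level;
    }
    return EH_LEVELS;
}

/* Stub handlers, for illustration only. */
enum eh_result always_fail(void *cmd) { (void)cmd; return EH_FAILED; }
enum eh_result always_ok(void *cmd)   { (void)cmd; return EH_SUCCESS; }
```

A table like this makes the ordering explicit and lets the caller see which step finally cleaned up the command; whether that is preferable to the nested ifs in the patch is a matter of taste.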

[PATCH] scsi: Update Aic94xx SAS/SATA Linux open source device driver for new sequence firmware.

2007-01-30 Thread Wu, Gilbert
Subject:  [PATCH] scsi: Update Aic94xx SAS/SATA Linux open source device
driver for new sequence firmware.

 

Contribution:

   Ed Chim <[EMAIL PROTECTED]>
   Gilbert Wu <[EMAIL PROTECTED]>

Change Log:
 
1. Use dword instead of qword to display the value of the Connection
State register for debugging purposes.

2. The locations of some AIC94xx registers have changed with the new
V28 firmware. The patch redefines those register locations and
initializes them.

3. The new sequencer firmware v28 for the Aic94xx SAS/SATA Linux open
source device driver can be downloaded from
http://www.adaptec.com/NR/exeres/35B611BC-9789-4B5B-82C6-85A2CCA8A46A.htm

 

Patch: apply to scsi-misc-2.6.git development tree

Signed-off-by: Gilbert Wu <[EMAIL PROTECTED]>



diff -urN a/drivers/scsi/aic94xx/aic94xx_dump.c b/drivers/scsi/aic94xx/aic94xx_dump.c
--- a/drivers/scsi/aic94xx/aic94xx_dump.c   2007-01-29 10:20:44.0 -0800
+++ b/drivers/scsi/aic94xx/aic94xx_dump.c   2007-01-29 10:31:44.0 -0800
@@ -556,7 +556,7 @@
PRINT_LMIP_word(asd_ha, lseq, Q_TGTXFR_TAIL);
PRINT_LMIP_byte(asd_ha, lseq, LINK_NUMBER);
PRINT_LMIP_byte(asd_ha, lseq, SCRATCH_FLAGS);
-   PRINT_LMIP_qword(asd_ha, lseq, CONNECTION_STATE);
+   PRINT_LMIP_dword(asd_ha, lseq, CONNECTION_STATE);
PRINT_LMIP_word(asd_ha, lseq, CONCTL);
PRINT_LMIP_byte(asd_ha, lseq, CONSTAT);
PRINT_LMIP_byte(asd_ha, lseq, CONNECTION_MODES);
diff -urN a/drivers/scsi/aic94xx/aic94xx_reg_def.h b/drivers/scsi/aic94xx/aic94xx_reg_def.h
--- a/drivers/scsi/aic94xx/aic94xx_reg_def.h 2007-01-29 10:21:14.0 -0800
+++ b/drivers/scsi/aic94xx/aic94xx_reg_def.h 2007-01-29 10:35:54.0 -0800
@@ -2226,9 +2226,10 @@
 #define LmSEQ_SAS_RESET_MODE(LinkNum)  (LmSCRATCH(LinkNum) + 0x0074)
 #define LmSEQ_LINK_RESET_RETRY_COUNT(LinkNum)  (LmSCRATCH(LinkNum) + 0x0075)
 #define LmSEQ_NUM_LINK_RESET_RETRIES(LinkNum)  (LmSCRATCH(LinkNum) + 0x0076)
-#define LmSEQ_OOB_INT_ENABLES(LinkNum) (LmSCRATCH(LinkNum) + 0x007A)
+#define LmSEQ_OOB_INT_ENABLES(LinkNum) (LmSCRATCH(LinkNum) + 0x0078)
+#define LmSEQ_NOTIFY_TIMER_DOWN_COUNT(LinkNum) (LmSCRATCH(LinkNum) + 0x007A)
 #define LmSEQ_NOTIFY_TIMER_TIMEOUT(LinkNum) (LmSCRATCH(LinkNum) + 0x007C)
-#define LmSEQ_NOTIFY_TIMER_DOWN_COUNT(LinkNum) (LmSCRATCH(LinkNum) + 0x007E)
+#define LmSEQ_NOTIFY_TIMER_INITIAL_COUNT(LinkNum) (LmSCRATCH(LinkNum) + 0x007E)
 
 /* Mode dependent scratch page 1, mode 0 and mode 1 */
 #define LmSEQ_SG_LIST_PTR_ADDR0(LinkNum) (LmSCRATCH(LinkNum) + 0x0020)
diff -urN a/drivers/scsi/aic94xx/aic94xx_seq.c b/drivers/scsi/aic94xx/aic94xx_seq.c
--- a/drivers/scsi/aic94xx/aic94xx_seq.c 2007-01-29 10:21:28.0 -0800
+++ b/drivers/scsi/aic94xx/aic94xx_seq.c 2007-01-29 10:42:55.0 -0800
@@ -810,6 +810,8 @@
/* No delay for the first NOTIFY to be sent to the attached target. */
asd_write_reg_word(asd_ha, LmSEQ_NOTIFY_TIMER_DOWN_COUNT(lseq),
   ASD_NOTIFY_DOWN_COUNT);
+   asd_write_reg_word(asd_ha, LmSEQ_NOTIFY_TIMER_INITIAL_COUNT(lseq),
+  ASD_NOTIFY_DOWN_COUNT);
 
/* LSEQ Mode dependent, mode 0 and 1, page 1 setup. */
for (i = 0; i < 2; i++) {


SAS illegal topologies [was Re: [PATCH 1/4 v2] libsas: Don't BUG when connecting two expanders via wide port]

2007-01-30 Thread Douglas Gilbert
Darrick J. Wong wrote:
> libsas: Don't BUG when connecting two expanders via wide port
> 
> When a device is connected to an expander, the discovery process goes through
> sas_ex_discover_dev to figure out what's attached to the phy.  If it is the
> case that the phy being discovered happens to be the second phy of a wide link
> to an expander, that discover_dev function will incorrectly call
> sas_ex_discover_expander, which creates another sas_port and tries to attach
> the other sas_phys to the new port, thus triggering a BUG.  The correct thing
> to do is to check the other ex_phys of the expander to see if there's a
> sas_port for this sas_phy, and attach the sas_phy to the existing sas_port.
> 
> This is easily triggered if one enables the phys of a wide port between
> expanders one by one.
> 
> This second version of the patch fixes a small regression in the case where
> all the phys show up at once and we accidentally try to attach to a port
> that hasn't been created yet.

Darrick,
Okay.

Now I'm wondering what the discovery algorithm in libsas
does if it finds truly illegal connections between expanders.
The spec defines what is illegal but says it is vendor specific
what will be done.

One approach is to use the SMP PHY CONTROL function to disable
the phy (or the phys at both ends of the illegal link). The
next trick is how to tell the user who just connected a cable
between expanders that "you can't do that!". Tools like my
smp_discover could alert a user to a disabled phy but
without turning it back on (and causing the libsas discovery
algorithm another headache) my SMP utilities don't know what
it is connected to.

Another question is which link to disable. Imagine three
expanders interconnected with 3 links which is illegal.
Breaking any one link makes it legal, but which one
to break? Last seen, or perhaps the link which has
the largest SAS address sum ...

Doug Gilbert
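
To make the "largest SAS address sum" tie-break concrete, here is a small sketch of that rule. The struct layout and function names are invented for illustration and are not part of libsas; the only assumption about SAS is that each expander has a 64-bit address.

```c
#include <stddef.h>
#include <stdint.h>

struct sas_link {
    uint64_t addr_a;   /* SAS address of the expander at one end */
    uint64_t addr_b;   /* SAS address of the expander at the other end */
};

/* 65-bit sum of two 64-bit addresses, kept as carry plus low word. */
struct addr_sum {
    unsigned carry;
    uint64_t low;
};

static struct addr_sum link_sum(const struct sas_link *l)
{
    struct addr_sum s;

    s.low = l->addr_a + l->addr_b;
    s.carry = s.low < l->addr_a;   /* unsigned wrap-around means a carry */
    return s;
}

/*
 * Return the index of the link with the largest address sum.
 * Ties keep the lowest index.  n must be at least 1.
 */
size_t pick_link_to_disable(const struct sas_link *links, size_t n)
{
    size_t best = 0;
    struct addr_sum best_sum = link_sum(&links[0]);

    for (size_t i = 1; i < n; i++) {
        struct addr_sum s = link_sum(&links[i]);

        if (s.carry > best_sum.carry ||
            (s.carry == best_sum.carry && s.low > best_sum.low)) {
            best = i;
            best_sum = s;
        }
    }
    return best;
}
```

The appeal of a rule like this is that it depends only on the addresses, so any initiator walking the same topology would pick the same link regardless of discovery order.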


[PATCH] aacraid: Add kernel command line parameter parsing

2007-01-30 Thread Salyzyn, Mark
One shortcoming of the driver relationship with the kernel is that there
is no standard means of having the insmod parameters associated with a
driver to also be parsed and set by the kernel parameter line. The
enclosed patch is a proposal for the aacraid driver to pick up the
kernel parameter line, parse it, and then adjust the insmod parameters.

The format of the kernel parameter line is
aacraid=:[,:]...

There may be a better way of providing this service via the kernel
without any modifications from the driver, since all the characteristics
of the insmod parameters are exported by the MODULE_PARM_* hints. Would
such mods be in insmod/modprobe and not in the kernel or driver?

Signed-off-by: Mark Salyzyn <[EMAIL PROTECTED]>
---

Sincerely -- Mark Salyzyn
Illegitimi Non Carborundum
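
For readers without the attachment, a user-space sketch of the kind of parsing the proposal describes. It assumes the parameter string is a comma-separated list of name:value pairs, which is a guess at the format since the placeholders in the line above did not survive; the table of tunables and every name below are hypothetical and not taken from the aacraid patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical tunables standing in for the driver's insmod parameters. */
struct tunable {
    const char *name;
    int *value;
};

/*
 * Parse "name:value[,name:value]..." and store each value in the matching
 * tunable.  Returns the number of parameters set, or -1 on a malformed
 * or unknown entry.  The input string is not modified.
 */
int parse_param_line(const char *line, const struct tunable *tab, size_t ntab)
{
    int set = 0;

    while (*line) {
        const char *colon = strchr(line, ':');
        const char *end = strchr(line, ',');

        if (!end)
            end = line + strlen(line);
        if (!colon || colon > end || colon == line)
            return -1;

        size_t len = (size_t)(colon - line);
        size_t i;
        for (i = 0; i < ntab; i++) {
            if (strlen(tab[i].name) == len &&
                strncmp(tab[i].name, line, len) == 0)
                break;
        }
        if (i == ntab)
            return -1;              /* unknown parameter name */

        char *stop;
        long v = strtol(colon + 1, &stop, 0);
        if (stop == colon + 1 || stop != end)
            return -1;              /* empty or non-numeric value */
        *tab[i].value = (int)v;
        set++;

        line = *end ? end + 1 : end;
    }
    return set;
}
```

Values are parsed with base 0, so decimal and 0x-prefixed hex both work; an in-kernel version would need the kernel's own string helpers rather than the C library.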


aacraid_command_line.patch
Description: aacraid_command_line.patch


Re: 2.6.20-rc6-mm1

2007-01-30 Thread Anton Altaparmakov
Hi Andrew,

Looks good for NTFS thanks!  The only thing is that I think we 
already have a variable "unsigned long flags" in the function 
ntfs_end_buffer_async_read() so that could be used instead of 
redefining it more locally in the if statements.

Could you send the patch to Linus?

Feel free to add my Acked-by or Signed-off-by "Anton Altaparmakov
<[EMAIL PROTECTED]>" line if you wish (I am not bothered either way)...

Thanks a lot for fixing it!

Best regards,

Anton

On Tue, 30 Jan 2007, Andrew Morton wrote:

> On Mon, 29 Jan 2007 23:27:27 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > On Sun, 28 Jan 2007 11:25:42 +0100
> > Jiri Slaby <[EMAIL PROTECTED]> wrote:
> > 
> > > Andrew Morton wrote:
> > > > Temporarily at
> > > > 
> > > > http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
> > > 
> > > I'm still seeing this during bootup:
> > > BUG: at /home/l/latest/xxx/arch/i386/mm/highmem.c:52 kmap_atomic()
> > >   [] show_trace_log_lvl+0x1a/0x30
> > >   [] show_trace+0x12/0x14
> > >   [] dump_stack+0x16/0x18
> > >   [] kmap_atomic+0x16c/0x20e
> > >   [] ntfs_end_buffer_async_read+0x18e/0x2ed
> > >   [] end_bio_bh_io_sync+0x26/0x3f
> > >   [] bio_endio+0x37/0x62
> > >   [] __end_that_request_first+0x224/0x444
> > >   [] end_that_request_chunk+0x8/0xa
> > >   [] scsi_end_request+0x1f/0xc7
> > >   [] scsi_io_completion+0x7b/0x33a
> > >   [] sd_rw_intr+0x23/0x1ab
> > >   [] scsi_finish_command+0x42/0x47
> > >   [] scsi_softirq_done+0x64/0xcf
> > >   [] blk_done_softirq+0x54/0x62
> > >   [] __do_softirq+0x75/0xde
> > >   [] do_softirq+0x3b/0x3d
> > >   [] irq_exit+0x3b/0x3d
> > >   [] do_IRQ+0x51/0x8d
> > >   [] common_interrupt+0x23/0x28
> > >   [] cpu_idle+0x80/0xc3
> > >   [] rest_init+0x23/0x36
> > >   [] start_kernel+0x3a5/0x43c
> > >   [<>] 0x0
> > >   ===
> > > 
> > > I.e. KM_BIO_SRC_IRQ through softirq path.
> > > 
> > 
> > argh.
> > 
> > ntfs_end_buffer_async_read() doesn't know whether it will be called from
> > hardirq or from softirq context: it depends upon the underlying driver.
> > 
> > In this case, if the CPU running ntfs_end_buffer_async_read() is
> > interrupted by IO completion against a different disk controller and that
> > completion handler uses KM_BIO_SRC_IRQ (as it is allowed to do), it will
> > trash ntfs_end_buffer_async_read()'s atomic kmap and unpleasing things will
> > ensue.
> > 
> > I guess a suitable fix here is to protect that kmap with
> > local_irq_save/restore.
> > 
> > I wonder where else we have that bug?
> 
> Actually, this isn't related to softirq-vs-hardirq.  Most interrupt
> handlers are interruptible, so the rule is simply that KM_BIO_SRC_IRQ must
> always be taken under local_irq_disable().
> 
> A quick scan indicates that the following files might be buggy in this
> regard:
> 
> drivers/mmc/wbsd.c
> drivers/mmc/at91_mci.c
> drivers/mmc/sdhci.c
> drivers/scsi/scsi_lib.c when called from stex.c
> fs/ntfs/aops.c
> 
> Happily, KM_BIO_DST_IRQ has no users and can presumably be removed.
> 
> 
> Fixes for stex and ntfs follow.
> 
> 
> From: Andrew Morton <[EMAIL PROTECTED]>
> 
> The KM_BIO_SRC_IRQ kmap slot requires local irq protection.
> 
> Cc: James Bottomley <[EMAIL PROTECTED]>
> Cc: Ed Lin <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
> 
>  drivers/scsi/stex.c |8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff -puN drivers/scsi/stex.c~stex-kmap_atomic-atomicity-fix drivers/scsi/stex.c
> --- a/drivers/scsi/stex.c~stex-kmap_atomic-atomicity-fix
> +++ a/drivers/scsi/stex.c
> @@ -459,15 +459,19 @@ static void stex_internal_copy(struct sc
>   *count = cmd->request_bufflen;
>   lcount = *count;
>   while (lcount) {
> + unsigned long flags = flags;/* Suppress uninit warning */
> +
>   len = lcount;
>   s = (void *)src;
>   if (cmd->use_sg) {
>   size_t offset = *count - lcount;
>   s += offset;
> + local_irq_save(flags);
>   base = scsi_kmap_atomic_sg(cmd->request_buffer,
>   sg_count, &offset, &len);
>   if (base == NULL) {
>   *count -= lcount;
> + local_irq_restore(flags);
>   return;
>   }
>   d = base + offset;
> @@ -480,8 +484,10 @@ static void stex_internal_copy(struct sc
>   memcpy(s, d, len);
>  
>   lcount -= len;
> - if (cmd->use_sg)
> + if (cmd->use_sg) {
>   scsi_kunmap_atomic_sg(base);
> + local_irq_restore(flags);
> + }
>   }
>  }
>  
> _
> 
> 
> 
> From: Andrew Morton <[EMAIL PROTECTED]>
> 
> The KM_BIO_SRC_IRQ kmap slot requires local irq protection.
> 
> Cc: Anton Altaparmakov <[EMAIL PROTECTED]>
> Signed-off-by: An

[RFC: 2.6.16 patch] add the areca driver

2007-01-30 Thread Adrian Bunk
I'd like to add the areca driver to 2.6.16 - it seems straightforward 
and doesn't touch other code.

Below are the commits I picked from Linus' tree, and the complete patch 
is attached.

Is there any reason I miss why this driver might not work in 2.6.16?

TIA
Adrian


Commit: f6013cc7f40d9b191a6b879a1941871b54552a81 
Author: James Bottomley <[EMAIL PROTECTED]> Sun, 28 Jan 2007 00:54:39 +0100 

[SCSI] arcmsr: fix up sysfs values

The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceding text which
would have to be parsed.  The idea of sysfs files is that the file name
is the description and the contents is a simple value.

Fix up arcmsr to conform to this standard.

Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Commit: e43c51964140ae3b11b320fae451f47ecb7763d4 
Author: Andrew Morton <[EMAIL PROTECTED]> Sun, 28 Jan 2007 00:53:31 +0100 

[SCSI] areca sysfs fix

Remove sysfs_remove_bin_file() return-value checking from the areca driver.

There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Commit: 144d09c6b0f3638ba03f9994a01aa0136b86918c 
Author: Erich Chen <[EMAIL PROTECTED]> Sun, 28 Jan 2007 00:52:30 +0100 

[SCSI] arcmsr: initial driver, version 1.20.00.13

arcmsr is a driver for the Areca Raid controller, a host based RAID
subsystem that speaks SCSI at the firmware level.

This patch is quite a clean up over the initial submission with
contributions from:

Randy Dunlap <[EMAIL PROTECTED]>
Christoph Hellwig <[EMAIL PROTECTED]>
Matthew Wilcox <[EMAIL PROTECTED]>
Adrian Bunk <[EMAIL PROTECTED]>

Signed-off-by: Erich Chen <[EMAIL PROTECTED]>
Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>




patch-areca.gz
Description: Binary data


Re: 2.6.20-rc6-mm1

2007-01-30 Thread Jiri Slaby

Andrew Morton wrote:
> On Mon, 29 Jan 2007 23:27:27 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
>> On Sun, 28 Jan 2007 11:25:42 +0100
>> Jiri Slaby <[EMAIL PROTECTED]> wrote:
>>
>>> Andrew Morton wrote:
>>>> Temporarily at
>>>>
>>>> http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
>>>
>>> I'm still seeing this during bootup:
>>> BUG: at /home/l/latest/xxx/arch/i386/mm/highmem.c:52 kmap_atomic()
>>>   [] show_trace_log_lvl+0x1a/0x30
>>>   [] show_trace+0x12/0x14
>>>   [] dump_stack+0x16/0x18
>>>   [] kmap_atomic+0x16c/0x20e
>>>   [] ntfs_end_buffer_async_read+0x18e/0x2ed
>>>   [] end_bio_bh_io_sync+0x26/0x3f
>>>   [] bio_endio+0x37/0x62
>>>   [] __end_that_request_first+0x224/0x444
>>>   [] end_that_request_chunk+0x8/0xa
>>>   [] scsi_end_request+0x1f/0xc7
>>>   [] scsi_io_completion+0x7b/0x33a
>>>   [] sd_rw_intr+0x23/0x1ab
>>>   [] scsi_finish_command+0x42/0x47
>>>   [] scsi_softirq_done+0x64/0xcf
>>>   [] blk_done_softirq+0x54/0x62
>>>   [] __do_softirq+0x75/0xde
>>>   [] do_softirq+0x3b/0x3d
>>>   [] irq_exit+0x3b/0x3d
>>>   [] do_IRQ+0x51/0x8d
>>>   [] common_interrupt+0x23/0x28
>>>   [] cpu_idle+0x80/0xc3
>>>   [] rest_init+0x23/0x36
>>>   [] start_kernel+0x3a5/0x43c
>>>   [<>] 0x0
>>>   ===
>>>
>>> I.e. KM_BIO_SRC_IRQ through softirq path.

[...]

> Actually, this isn't related to softirq-vs-hardirq.  Most interrupt

I meant that hardirq path was fixed (by adding KM_BIO_SRC_IRQ to kmap_atomic
"type !=" test in arch/i386/mm/highmem.c) and softirq was not yet.

> handlers are interruptible, so the rule is simply that KM_BIO_SRC_IRQ must
> always be taken under local_irq_disable().
>
> A quick scan indicates that the following files might be buggy in this
> regard:
>
> drivers/mmc/wbsd.c
> drivers/mmc/at91_mci.c
> drivers/mmc/sdhci.c
> drivers/scsi/scsi_lib.c when called from stex.c
> fs/ntfs/aops.c
>
> Happily, KM_BIO_DST_IRQ has no users and can presumably be removed.
>
>
> Fixes for stex and ntfs follow.

Clean boot now.

thanks,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20-rc6-mm1

2007-01-30 Thread Pierre Ossman
Andrew Morton wrote:
> 
> A quick scan indicates that the following files might be buggy in this
> regard:
> 
> drivers/mmc/wbsd.c
> drivers/mmc/sdhci.c

These are probably even buggier than that. They really should be using
page_address(); it seems that kmap_atomic() gives the same result when
not using highmem (which they are careful to avoid).

I'll put on the paper bag and whip up a patch.

Rgds
Pierre



signature.asc
Description: OpenPGP digital signature


[PATCH 1/4 v2] libsas: Don't BUG when connecting two expanders via wide port

2007-01-30 Thread Darrick J. Wong
libsas: Don't BUG when connecting two expanders via wide port

When a device is connected to an expander, the discovery process goes through
sas_ex_discover_dev to figure out what's attached to the phy.  If it is the
case that the phy being discovered happens to be the second phy of a wide link
to an expander, that discover_dev function will incorrectly call
sas_ex_discover_expander, which creates another sas_port and tries to attach the
other sas_phys to the new port, thus triggering a BUG.  The correct thing to do
is to check the other ex_phys of the expander to see if there's a sas_port for
this sas_phy, and attach the sas_phy to the existing sas_port.

This is easily triggered if one enables the phys of a wide port between
expanders one by one.

This second version of the patch fixes a small regression in the case where
all the phys show up at once and we accidentally try to attach to a port
that hasn't been created yet.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---
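
The matching rule described above can be tried outside the kernel with a
small model. This is only a sketch: struct model_phy, struct model_port and
model_find_wide_port() are hypothetical stand-ins for ex_phy, sas_port and
sas_ex_join_wide_port(), and the model returns the port instead of attaching
the phy to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SAS_ADDR_SIZE 8

struct model_port {
    int id;
};

struct model_phy {
    unsigned char attached_sas_addr[MODEL_SAS_ADDR_SIZE];
    struct model_port *port;    /* NULL until a port has been created */
};

/*
 * Return the port of another phy attached to the same SAS address as
 * phys[phy_id], or NULL if no such port exists yet.
 */
static struct model_port *model_find_wide_port(struct model_phy *phys,
                                               int num_phys, int phy_id)
{
    int i;

    assert(phys != NULL && phy_id >= 0 && phy_id < num_phys);

    for (i = 0; i < num_phys; i++) {
        if (i == phy_id)
            continue;
        /* Same remote address AND a port already exists for it. */
        if (!memcmp(phys[phy_id].attached_sas_addr,
                    phys[i].attached_sas_addr, MODEL_SAS_ADDR_SIZE) &&
            phys[i].port)
            return phys[i].port;
    }
    return NULL;
}
```

The NULL return when the sibling phy has no port yet corresponds to the case
the second version of the patch handles: normal discovery then proceeds and
creates the port.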

 drivers/scsi/libsas/sas_expander.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 114e26c..2f3b8e1 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -736,6 +736,29 @@ static struct domain_device *sas_ex_disc
return NULL;
 }
 
+/* See if this phy is part of a wide port */
+static int sas_ex_join_wide_port(struct domain_device *parent, int phy_id)
+{
+   struct ex_phy *phy = &parent->ex_dev.ex_phy[phy_id];
+   int i;
+
+   for (i = 0; i < parent->ex_dev.num_phys; i++) {
+   struct ex_phy *ephy = &parent->ex_dev.ex_phy[i];
+
+   if (ephy == phy)
+   continue;
+
+   if (!memcmp(phy->attached_sas_addr, ephy->attached_sas_addr,
+   SAS_ADDR_SIZE) && ephy->port) {
+   sas_port_add_phy(ephy->port, phy->phy);
+   phy->phy_state = PHY_DEVICE_DISCOVERED;
+   return 0;
+   }
+   }
+
+   return -ENODEV;
+}
+
 static struct domain_device *sas_ex_discover_expander(
struct domain_device *parent, int phy_id)
 {
@@ -868,6 +891,13 @@ static int sas_ex_discover_dev(struct do
return res;
}
 
+   res = sas_ex_join_wide_port(dev, phy_id);
+   if (!res) {
+   SAS_DPRINTK("Attaching ex phy%d to wide port %016llx\n",
+   phy_id, SAS_ADDR(ex_phy->attached_sas_addr));
+   return res;
+   }
+
switch (ex_phy->attached_dev_type) {
case SAS_END_DEV:
child = sas_ex_discover_end_dev(dev, phy_id);



Re: 2.6.20-rc6-mm1

2007-01-30 Thread Andrew Morton
On Mon, 29 Jan 2007 23:27:27 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Sun, 28 Jan 2007 11:25:42 +0100
> Jiri Slaby <[EMAIL PROTECTED]> wrote:
> 
> > > Andrew Morton wrote:
> > > Temporarily at
> > > 
> > >   http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
> > 
> > I'm still seeing this during bootup:
> > BUG: at /home/l/latest/xxx/arch/i386/mm/highmem.c:52 kmap_atomic()
> >   [] show_trace_log_lvl+0x1a/0x30
> >   [] show_trace+0x12/0x14
> >   [] dump_stack+0x16/0x18
> >   [] kmap_atomic+0x16c/0x20e
> >   [] ntfs_end_buffer_async_read+0x18e/0x2ed
> >   [] end_bio_bh_io_sync+0x26/0x3f
> >   [] bio_endio+0x37/0x62
> >   [] __end_that_request_first+0x224/0x444
> >   [] end_that_request_chunk+0x8/0xa
> >   [] scsi_end_request+0x1f/0xc7
> >   [] scsi_io_completion+0x7b/0x33a
> >   [] sd_rw_intr+0x23/0x1ab
> >   [] scsi_finish_command+0x42/0x47
> >   [] scsi_softirq_done+0x64/0xcf
> >   [] blk_done_softirq+0x54/0x62
> >   [] __do_softirq+0x75/0xde
> >   [] do_softirq+0x3b/0x3d
> >   [] irq_exit+0x3b/0x3d
> >   [] do_IRQ+0x51/0x8d
> >   [] common_interrupt+0x23/0x28
> >   [] cpu_idle+0x80/0xc3
> >   [] rest_init+0x23/0x36
> >   [] start_kernel+0x3a5/0x43c
> >   [<>] 0x0
> >   ===
> > 
> > I.e. KM_BIO_SRC_IRQ through softirq path.
> > 
> 
> argh.
> 
> ntfs_end_buffer_async_read() doesn't know whether it will be called from
> hardirq or from softirq context: it depends upon the underlying driver.
> 
> In this case, if the CPU running ntfs_end_buffer_async_read() is
> interrupted by IO completion against a different disk controller and that
> completion handler uses KM_BIO_SRC_IRQ (as it is allowed to do), it will
> trash ntfs_end_buffer_async_read()'s atomic kmap and unpleasing things will
> ensue.
> 
> I guess a suitable fix here is to protect that kmap with
> local_irq_save/restore.
> 
> I wonder where else we have that bug?

Actually, this isn't related to softirq-vs-hardirq.  Most interrupt
handlers are interruptible, so the rule is simply that KM_BIO_SRC_IRQ must
always be taken under local_irq_disable().
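
The rule can be illustrated with a small userspace model. This is a sketch
under stated assumptions, not kernel code: struct cpu_model and the model_*
helpers are hypothetical, the single "slot" stands in for the per-CPU
KM_BIO_SRC_IRQ mapping, and toggling irqs_enabled stands in for
local_irq_save()/local_irq_restore().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of one CPU with a single shared atomic-kmap slot. */
struct cpu_model {
    bool irqs_enabled;      /* models the local interrupt flag */
    bool irq_pending;       /* an I/O completion is waiting to fire */
    const void *slot;       /* page currently mapped in the shared slot */
    const void *irq_page;   /* page the interrupting handler will map */
};

static void model_map(struct cpu_model *cpu, const void *page)
{
    cpu->slot = page;
}

static void model_unmap(struct cpu_model *cpu)
{
    cpu->slot = NULL;
}

/* Deliver the pending interrupt, if any, while interrupts are enabled. */
static void model_irq_window(struct cpu_model *cpu)
{
    if (!cpu->irq_pending || !cpu->irqs_enabled)
        return;
    cpu->irq_pending = false;
    model_map(cpu, cpu->irq_page);  /* the handler reuses the same slot */
    model_unmap(cpu);
}

/*
 * Map a page, let an interrupt window open mid-copy, and report whether
 * the mapping was still intact when the copy would have used it.
 */
static bool model_copy(struct cpu_model *cpu, const void *page, bool protect)
{
    bool saved = cpu->irqs_enabled;
    bool intact;

    assert(cpu != NULL && page != NULL);

    if (protect)
        cpu->irqs_enabled = false;  /* stands in for local_irq_save() */
    model_map(cpu, page);
    model_irq_window(cpu);          /* the completion arrives here */
    intact = (cpu->slot == page);
    model_unmap(cpu);
    if (protect)
        cpu->irqs_enabled = saved;  /* stands in for local_irq_restore() */
    model_irq_window(cpu);          /* a deferred interrupt runs now */
    return intact;
}
```

In the model, an unprotected copy loses its mapping as soon as the nested
handler runs, while the protected copy keeps it and the interrupt is simply
delivered afterwards.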

A quick scan indicates that the following files might be buggy in this
regard:

drivers/mmc/wbsd.c
drivers/mmc/at91_mci.c
drivers/mmc/sdhci.c
drivers/scsi/scsi_lib.c when called from stex.c
fs/ntfs/aops.c

Happily, KM_BIO_DST_IRQ has no users and can presumably be removed.


Fixes for stex and ntfs follow.


From: Andrew Morton <[EMAIL PROTECTED]>

The KM_BIO_SRC_IRQ kmap slot requires local irq protection.

Cc: James Bottomley <[EMAIL PROTECTED]>
Cc: Ed Lin <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/scsi/stex.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff -puN drivers/scsi/stex.c~stex-kmap_atomic-atomicity-fix drivers/scsi/stex.c
--- a/drivers/scsi/stex.c~stex-kmap_atomic-atomicity-fix
+++ a/drivers/scsi/stex.c
@@ -459,15 +459,19 @@ static void stex_internal_copy(struct sc
*count = cmd->request_bufflen;
lcount = *count;
while (lcount) {
+   unsigned long flags = flags;/* Suppress uninit warning */
+
len = lcount;
s = (void *)src;
if (cmd->use_sg) {
size_t offset = *count - lcount;
s += offset;
+   local_irq_save(flags);
base = scsi_kmap_atomic_sg(cmd->request_buffer,
sg_count, &offset, &len);
if (base == NULL) {
*count -= lcount;
+   local_irq_restore(flags);
return;
}
d = base + offset;
@@ -480,8 +484,10 @@ static void stex_internal_copy(struct sc
memcpy(s, d, len);
 
lcount -= len;
-   if (cmd->use_sg)
+   if (cmd->use_sg) {
scsi_kunmap_atomic_sg(base);
+   local_irq_restore(flags);
+   }
}
 }
 
_



From: Andrew Morton <[EMAIL PROTECTED]>

The KM_BIO_SRC_IRQ kmap slot requires local irq protection.

Cc: Anton Altaparmakov <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/ntfs/aops.c |6 ++
 1 file changed, 6 insertions(+)

diff -puN fs/ntfs/aops.c~ntfs-kmap_atomic-atomicity-fix fs/ntfs/aops.c
--- a/fs/ntfs/aops.c~ntfs-kmap_atomic-atomicity-fix
+++ a/fs/ntfs/aops.c
@@ -88,14 +88,17 @@ static void ntfs_end_buffer_async_read(s
if (unlikely(file_ofs + bh->b_size > init_size)) {
u8 *kaddr;
int ofs;
+   unsigned long flags;
 
ofs = 0;
if (file_ofs < init_size)
ofs = init_size - file_ofs;
+   local_irq_save(flags);
kaddr = kmap_atomic(page, KM_BIO_SRC_IRQ);
 

Re: AIC7xxx on 2.6.18

2007-01-30 Thread Andrew Morton
On Tue, 30 Jan 2007 07:18:20 -0500
Wakko Warner <[EMAIL PROTECTED]> wrote:

> NOTE: I am not on the linux-scsi list, keep me in CC.
> 
> Andrew Morton wrote:
> > On Sun, 28 Jan 2007 14:46:20 -0500
> > Wakko Warner <[EMAIL PROTECTED]> wrote:
> > 
> > > I have 2 machine that oops with these cards.
> > > 
> > > 1) The bios has the option to enable/disable option roms on individual PCI
> > > slots.  I have an AHA-39160 and an AHA-2940U/UW (dual channel).  If I
> > > disable option roms, the driver oopses when accessing the 2nd card.
> > > 
> > > I can get the oops if really needed as I don't like rebooting this machine.
> > > 
> > > 2) I have an AHA-39160 with Apple/Mac firmware.  When attempting to use it
> > > on a PC, the driver oopses presumably because the card wasn't initialized or
> > > something.  I realize this is probably not a supported configuration, but I
> > > don't believe that it should be oopsing.
> > > 
> > > I can get the oops for this one if it'll help.
> > 
> > Yes, getting the oops traces will help, thanks.  And confirmation on a more
> > recent kernel would be good.
> 
> I tested with a 2.6.20-rc6 kernel and the MAC 39160 card.  There was no oops
> and I was able to access the 2 disks.  This was on a different PC though. 
> I'll try it again on the original PC.

Thanks.

> Should I try 2.6.19 as well?

There's not a lot of point in doing so.  If/when we come up with a
2.6.20-rc6 fix we'll know whether it is applicable to 2.6.19.x.


Re: Fw: SAS1068 PCI-X Fusion-MPT SAS 1000:0055

2007-01-30 Thread Frederic TEMPORELLI
Hi,

Also seen on a NEC server, a 1068 chip with a jumper used to switch chip
PCI ID and its BIOS:
- PCI ID = 0054 => 'MPT Fusion' BIOS
- PCI ID = 0055 => 'MegaRAID' BIOS

I believe I submitted this unusual chip ID to the pciid DB some months ago...

More importantly, there is a driver for this chip when it is used in
'MegaRAID' mode (the standard 'mptsas' driver may be used for MPT Fusion
mode). This driver is named 'megasr' and is available in binary form from
several server vendors (Intel/Supermicro/Hitachi...) for standard distros
(RH, SUSE).
According to modinfo, this driver seems to be provided by LSI.

regards
--
Fred


Moore, Eric wrote:
> On Friday, January 26, 2007 12:53 PM, Jun'ichi Nomura wrote: 
>> Hi,
>>
>>> I have new NEC server with SAS1068 PCI-X Fusion-MPT SAS
>>> pciid: 1000:0055
>>> mptsas from 2.6.20-rc5 doesn't recognize it ;(
>>>
>>> I see that driver support only 1000:0054 and 1000:0058 devices.
>> It might be that the device has software RAID feature and changes
>> device ID based on setup. (1000:0055 when software RAID is enabled
>> and 1000:0054 or something for normal SAS)
>>
>> If so, there is a chance you can disable the software RAID
>> via BIOS setup utility.
>>
>> Thanks,
>> -- 
>> Jun'ichi Nomura, NEC Corporation of America
>>
> 
> You probably want to talk to the megaraid folks and see
> if they have a driver for that.
> 
> I didn't submit a device id of 0055 to sourceforge.
> 
> The only 1068 ids that are claimed by mptsas are 0054 and 0058
> which are the pcix and pcie solutions.   I notice that 0055 is
> listed in repository, but it was not me that submitted that.
> http://pci-ids.ucw.cz/iii/?i=1000 
> 
> Eric Moore
> 



[PATCH] sym53c500_cs: remove bogus call to free_dma()

2007-01-30 Thread Al Viro

What DMA for 16bit pcmcia card, anyway?  We never do request_dma()
there and ->dma_channel never changes since initialization to -1.
IOW, that call is dead code.

Signed-off-by: Al Viro <[EMAIL PROTECTED]>
---
 drivers/scsi/pcmcia/sym53c500_cs.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pcmcia/sym53c500_cs.c b/drivers/scsi/pcmcia/sym53c500_cs.c
index 9fb0ea5..5b458d2 100644
--- a/drivers/scsi/pcmcia/sym53c500_cs.c
+++ b/drivers/scsi/pcmcia/sym53c500_cs.c
@@ -545,8 +545,6 @@ SYM53C500_release(struct pcmcia_device *link)
*/
if (shost->irq)
free_irq(shost->irq, shost);
-   if (shost->dma_channel != 0xff)
-   free_dma(shost->dma_channel);
if (shost->io_port && shost->n_io_port)
release_region(shost->io_port, shost->n_io_port);
 
-- 
1.5.0-rc2.GIT




Re: AIC7xxx on 2.6.18

2007-01-30 Thread Wakko Warner
NOTE: I am not on the linux-scsi list, keep me in CC.

Andrew Morton wrote:
> On Sun, 28 Jan 2007 14:46:20 -0500
> Wakko Warner <[EMAIL PROTECTED]> wrote:
> 
> > I have 2 machine that oops with these cards.
> > 
> > 1) The bios has the option to enable/disable option roms on individual PCI
> > slots.  I have an AHA-39160 and an AHA-2940U/UW (dual channel).  If I
> > disable option roms, the driver oopses when accessing the 2nd card.
> > 
> > I can get the oops if really needed as I don't like rebooting this machine.
> > 
> > 2) I have an AHA-39160 with Apple/Mac firmware.  When attempting to use it
> > on a PC, the driver oopses presumably because the card wasn't initialized or
> > something.  I realize this is probably not a supported configuration, but I
> > don't believe that it should be oopsing.
> > 
> > I can get the oops for this one if it'll help.
> 
> Yes, getting the oops traces will help, thanks.  And confirmation on a more
> recent kernel would be good.

I tested with a 2.6.20-rc6 kernel and the MAC 39160 card.  There was no oops
and I was able to access the 2 disks.  This was on a different PC though. 
I'll try it again on the original PC.

Should I try 2.6.19 as well?

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
 Got Gas???


[PATCH 12/12] sas_ata: Make this a module separate from libsas

2007-01-30 Thread Darrick J. Wong

Break out sas_ata as a free-standing module that provides a SATA
Translation Layer (SATL) for libsas.  This patch requires the libsas
SATL registration patch; the changes to sas_ata itself are rather
minor.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/Makefile  |5 +++--
 drivers/scsi/libsas/sas_ata.c |   37 ++---
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/libsas/Makefile b/drivers/scsi/libsas/Makefile
index 6383eb5..5e95902 100644
--- a/drivers/scsi/libsas/Makefile
+++ b/drivers/scsi/libsas/Makefile
@@ -33,5 +33,6 @@ libsas-y +=  sas_init.o \
sas_dump.o \
sas_discover.o \
sas_expander.o \
-   sas_scsi_host.o \
-   sas_ata.o
+   sas_scsi_host.o
+
+obj-$(CONFIG_SCSI_SAS_SATL) += sas_ata.o
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 1b7221c..f75fa59 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -404,8 +404,8 @@ static struct ata_port_info sata_port_in
.port_ops = &sas_sata_ops
 };
 
-int sas_ata_init_host_and_port(struct domain_device *found_dev,
-  struct scsi_target *starget)
+static int sas_ata_init_host_and_port(struct domain_device *found_dev,
+ struct scsi_target *starget)
 {
struct Scsi_Host *shost = dev_to_shost(&starget->dev);
struct sas_ha_struct *ha = SHOST_TO_SAS_HA(shost);
@@ -431,7 +431,7 @@ int sas_ata_init_host_and_port(struct do
return 0;
 }
 
-void sas_ata_task_abort(struct sas_task *task)
+static void sas_ata_task_abort(struct sas_task *task)
 {
struct ata_queued_cmd *qc = task->uldd_task;
struct completion *waiting;
@@ -450,3 +450,34 @@ void sas_ata_task_abort(struct sas_task 
waiting = qc->private_data;
complete(waiting);
 }
+
+/* Module initialization */
+static struct satl_operations sas_ata_ops = {
+   .owner  = THIS_MODULE,
+   .init_target= sas_ata_init_host_and_port,
+   .queuecommand   = ata_sas_queuecmd,
+   .ioctl  = ata_scsi_ioctl,
+   .configure_port = ata_sas_slave_configure,
+   .deactivate_port= ata_port_disable,
+   .destroy_port   = ata_sas_port_destroy,
+   .init_port  = ata_sas_port_init,
+   .task_abort = sas_ata_task_abort
+};
+
+static int __init sas_ata_init(void)
+{
+   return sas_register_satl(&sas_ata_ops);
+}
+
+static void __exit sas_ata_exit(void)
+{
+   sas_unregister_satl(&sas_ata_ops);
+}
+
+module_init(sas_ata_init);
+module_exit(sas_ata_exit);
+
+MODULE_AUTHOR("Darrick Wong <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("libata SATL for SAS");
+MODULE_LICENSE("GPL v2");
+MODULE_VERSION("1.0");


[PATCH 11/12] libsas: Provide a generic SATL registration function

2007-01-30 Thread Darrick J. Wong

Decouple libsas and sas_ata so that the latter can be provided as a
plug-in module for the former.  Any module wishing to provide SATL
services registers itself with libsas; when SATA devices are
discovered, libsas will module_get/put as necessary to ensure that
the module cannot go away accidentally.  At this time, we cannot
start a SAS HBA without a SATL, load a SATL later, and then
rerun device discovery; that may be addressed in a later patch.
A phy reset will do the job quite nicely.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---
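
A userspace sketch of the register/get/put life cycle described above. The
bodies of sas_register_satl(), sas_unregister_satl() and get_satl() are not
shown in this excerpt, so everything here is an assumption: the model_*
names are hypothetical, a plain counter stands in for the module refcount,
there is no locking, and having unregister return -EBUSY is a modelling
choice rather than the patch's behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_satl_ops {
    const char *name;
    int refcount;                       /* stands in for the module refcount */
    int (*init_target)(int target_id);
};

static struct model_satl_ops *model_satl;   /* at most one provider */

static int model_register_satl(struct model_satl_ops *ops)
{
    assert(ops != NULL);
    if (model_satl)
        return -EBUSY;
    model_satl = ops;
    return 0;
}

static int model_unregister_satl(struct model_satl_ops *ops)
{
    if (model_satl != ops)
        return -ENODEV;
    if (ops->refcount > 0)
        return -EBUSY;                  /* still pinned by a SATA device */
    model_satl = NULL;
    return 0;
}

/* Returns 0 and pins the provider if one is registered. */
static int model_get_satl(void)
{
    if (!model_satl)
        return -ENODEV;
    model_satl->refcount++;
    return 0;
}

static void model_put_satl(void)
{
    if (model_satl && model_satl->refcount > 0)
        model_satl->refcount--;
}

/* Sample provider callback used to exercise the model. */
static int model_init_target(int target_id)
{
    return target_id >= 0 ? 0 : -EINVAL;
}

/* Target allocation: without a provider the SATA device is ignored. */
static int model_target_alloc_sata(int target_id)
{
    if (model_get_satl())
        return -ENODEV;
    return model_satl->init_target(target_id);
}
```

The point of the pin is the one made in the description: once a SATA device
has been set up through the provider, the provider cannot go away until the
reference is dropped.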

 drivers/scsi/libsas/Kconfig |   11 +++
 drivers/scsi/libsas/sas_discover.c  |6 --
 drivers/scsi/libsas/sas_scsi_host.c |  137 ---
 include/scsi/libsas.h   |   30 +---
 include/scsi/sas_ata.h  |   38 +-
 5 files changed, 176 insertions(+), 46 deletions(-)

diff --git a/drivers/scsi/libsas/Kconfig b/drivers/scsi/libsas/Kconfig
index b64e391..9c06eec 100644
--- a/drivers/scsi/libsas/Kconfig
+++ b/drivers/scsi/libsas/Kconfig
@@ -24,12 +24,21 @@ #
 
 config SCSI_SAS_LIBSAS
tristate "SAS Domain Transport Attributes"
-   depends on SCSI && ATA
+   depends on SCSI
select SCSI_SAS_ATTRS
help
  This provides transport specific helpers for SAS drivers which
  use the domain device construct (like the aic94xxx).
 
+config SCSI_SAS_SATL
+   tristate "Serial ATA Translation Layer (SATL) on SAS controllers"
+   depends on SCSI_SAS_LIBSAS && ATA
+   default y
+   help
+ This provides an ATA translation layer between libsas and
+ libata to load SATA devices that are connected to SAS
+ controllers.
+
 config SCSI_SAS_LIBSAS_DEBUG
bool "Compile the SAS Domain Transport Attributes in debug mode"
default y
diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index a18c0f6..56cc8da 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -476,12 +476,6 @@ cont1:
if (!dev->parent)
sas_sata_propagate_sas_addr(dev);
 
-   /* XXX Hint: register this SATA device with SATL.
-  When this returns, dev->sata_dev->lu is alive and
-  present.
-   sas_satl_register_dev(dev);
-   */
-
sas_fill_in_rphy(dev, dev->rphy);
 
return 0;
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index a30c0b7..073b6a7 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -44,6 +44,10 @@ #include 
 
 /* -- SCSI Host glue -- */
 
+
+static DEFINE_SPINLOCK(satl_ops_lock);
+static struct satl_operations *satl_ops;
+
 static void sas_scsi_task_done(struct sas_task *task)
 {
struct task_status_struct *ts = &task->task_status;
@@ -213,8 +217,8 @@ int sas_queuecommand(struct scsi_cmnd *c
unsigned long flags;
 
spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
-   res = ata_sas_queuecmd(cmd, scsi_done,
-  dev->sata_dev.ap);
+   res = satl_ops->queuecommand(cmd, scsi_done,
+dev->sata_dev.ap);
spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
goto out;
}
@@ -663,8 +667,9 @@ int sas_ioctl(struct scsi_device *sdev, 
 {
struct domain_device *dev = sdev_to_domain_dev(sdev);
 
-   if (dev_is_sata(dev))
-   return ata_scsi_ioctl(sdev, cmd, arg);
+   if (dev_is_sata(dev)) {
+   return satl_ops->ioctl(sdev, cmd, arg);
+   }
 
return -EINVAL;
 }
@@ -705,6 +710,29 @@ static inline struct domain_device *sas_
return sas_find_dev_by_rphy(rphy);
 }
 
+static int sas_target_alloc_sata(struct domain_device *dev,
+struct scsi_target *starget)
+{
+   int res = -ENODEV;
+
+   /* Do we have a SATL available? */
+   if (!get_satl())
+   goto satl_found;
+
+   request_module("sas_ata");
+   if (!get_satl())
+   goto satl_found;
+
+   SAS_DPRINTK("sas_ata not loaded, ignoring SATA devices\n");
+   goto no_satl;
+
+satl_found:
+   res = satl_ops->init_target(dev, starget);
+
+no_satl:
+   return res;
+}
+
 int sas_target_alloc(struct scsi_target *starget)
 {
struct domain_device *found_dev = sas_find_target(starget);
@@ -714,7 +742,7 @@ int sas_target_alloc(struct scsi_target 
return -ENODEV;
 
if (dev_is_sata(found_dev)) {
-   res = sas_ata_init_host_and_port(found_dev, starget);
+   res = sas_target_alloc_sata(found_dev, starget);
if (res)
return res;
}
@@ -734,7 +762,7 @@ int sas_slave_configure(struct scsi_devi
BUG_ON(dev->r

[PATCH 10/12] sas_ata: Implement sas_task_abort for ATA devices

2007-01-30 Thread Darrick J. Wong

ATA devices need special handling for sas_task_abort.  If the ATA command
came from SCSI, then we merely need to tell SCSI to abort the scsi_cmnd.
However, internal commands require a bit more work--we need to fill the qc
with the appropriate error status and complete the command, and eventually
post_internal will issue the actual ABORT TASK.
---

 drivers/scsi/libsas/sas_ata.c   |   47 +--
 drivers/scsi/libsas/sas_internal.h  |3 ++
 drivers/scsi/libsas/sas_scsi_host.c |8 --
 include/scsi/sas_ata.h  |2 +
 4 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 8111222..1b7221c 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -30,6 +30,8 @@ #include 
 #include 
 #include 
 #include "../scsi_sas_internal.h"
+#include "../scsi_transport_api.h"
+#include 
 
 static enum ata_completion_errors sas_to_ata_err(struct task_status_struct *ts)
 {
@@ -91,6 +93,7 @@ static void sas_ata_task_done(struct sas
struct domain_device *dev;
struct task_status_struct *stat = &task->task_status;
struct ata_task_resp *resp = (struct ata_task_resp *)stat->buf;
+   struct sas_ha_struct *sas_ha;
enum ata_completion_errors ac;
unsigned long flags;
 
@@ -98,6 +101,7 @@ static void sas_ata_task_done(struct sas
goto qc_already_gone;
 
dev = qc->ap->private_data;
+   sas_ha = dev->port->ha;
 
spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
if (stat->stat == SAS_PROTO_RESPONSE || stat->stat == SAM_GOOD) {
@@ -124,6 +128,20 @@ static void sas_ata_task_done(struct sas
ata_qc_complete(qc);
spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
 
+   /*
+* If the sas_task has an ata qc, a scsi_cmnd and the aborted
+* flag is set, then we must have come in via the libsas EH
+* functions.  When we exit this function, we need to put the
+* scsi_cmnd on the list of finished errors.  The ata_qc_complete
+* call cleans up the libata side of things but we're protected
+* from the scsi_cmnd going away because the scsi_cmnd is owned
+* by the EH, making libata's call to scsi_done a NOP.
+*/
+   spin_lock_irqsave(&task->task_state_lock, flags);
+   if (qc->scsicmd && task->task_state_flags & SAS_TASK_STATE_ABORTED)
+   scsi_eh_finish_cmd(qc->scsicmd, &sas_ha->eh_done_q);
+   spin_unlock_irqrestore(&task->task_state_lock, flags);
+
 qc_already_gone:
list_del_init(&task->list);
sas_free_task(task);
@@ -259,15 +277,18 @@ static void sas_ata_post_internal(struct
 * ought to abort the task.
 */
struct sas_task *task = qc->lldd_task;
-   struct domain_device *dev = qc->ap->private_data;
+   unsigned long flags;
 
qc->lldd_task = NULL;
if (task) {
+   /* Should this be a AT(API) device reset? */
+   spin_lock_irqsave(&task->task_state_lock, flags);
+   task->task_state_flags |= SAS_TASK_NEED_DEV_RESET;
+   spin_unlock_irqrestore(&task->task_state_lock, flags);
+
task->uldd_task = NULL;
__sas_task_abort(task);
}
-
-   sas_phy_reset(dev->port->phy, 1);
}
 }
 
@@ -409,3 +430,23 @@ int sas_ata_init_host_and_port(struct do
 
return 0;
 }
+
+void sas_ata_task_abort(struct sas_task *task)
+{
+   struct ata_queued_cmd *qc = task->uldd_task;
+   struct completion *waiting;
+
+   /* Bounce SCSI-initiated commands to the SCSI EH */
+   if (qc->scsicmd) {
+   scsi_req_abort_cmd(qc->scsicmd);
+   scsi_schedule_eh(qc->scsicmd->device->host);
+   return;
+   }
+
+   /* Internal command, fake a timeout and complete. */
+   qc->flags &= ~ATA_QCFLAG_ACTIVE;
+   qc->flags |= ATA_QCFLAG_FAILED;
+   qc->err_mask |= AC_ERR_TIMEOUT;
+   waiting = qc->private_data;
+   complete(waiting);
+}
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index a78638d..2b8213b 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -39,6 +39,9 @@ #else
 #define SAS_DPRINTK(fmt, ...)
 #endif
 
+#define TO_SAS_TASK(_scsi_cmd)  ((void *)(_scsi_cmd)->host_scribble)
+#define ASSIGN_SAS_TASK(_sc, _t) do { (_sc)->host_scribble = (void *) _t; } while (0)
+
 void sas_scsi_recover_host(struct Scsi_Host *shost);
 
 int sas_show_class(enum sas_class class, char *buf);
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 5b0c471..a30c0b7 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -44,9 +44,6 @@ #include 
 
 /* 

[PATCH 08/12] libsas: Unknown STP devices should be reported to libata as unknown.

2007-01-30 Thread Darrick J. Wong

When libsas encounters a STP device whose protocol isn't recognized (i.e.
not ATA or ATAPI), we should set the ata_device's class to ATA_DEV_UNKNOWN
instead of ATA_DEV_ATA.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 20f3a5e..7ebda69 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -232,7 +232,7 @@ static void sas_ata_phy_reset(struct ata
SAS_DPRINTK("%s: Unknown SATA command set: %d.\n",
__FUNCTION__,
dev->sata_dev.command_set);
-   ap->device[0].class = ATA_DEV_ATA;
+   ap->device[0].class = ATA_DEV_UNKNOWN;
break;
}
 


[PATCH 09/12] sas_ata: Assign sas_task to scsi_cmnd to enable EH for ATA devices

2007-01-30 Thread Darrick J. Wong

The SATL should connect the scsi_cmnd to the sas_task (despite the presence
of libata) so that requests to abort scsi_cmnds headed to the ATA device
can be processed by the EH and aborted correctly.  The abort status should
still be propagated from sas -> ata -> scsi.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---
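
The link/unlink discipline in this patch can be sketched in userspace as
follows. The MODEL_* macros mirror the shape of TO_SAS_TASK and
ASSIGN_SAS_TASK from the sas_internal.h hunk in patch 10/12; struct
model_cmd, struct model_task and the three functions are hypothetical
stand-ins for scsi_cmnd, sas_task and the issue, completion and EH paths.

```c
#include <assert.h>
#include <stddef.h>

struct model_task {
    int tag;
};

struct model_cmd {
    unsigned char *host_scribble;   /* per-command scratch pointer */
};

#define MODEL_TO_TASK(cmd)      ((void *)(cmd)->host_scribble)
#define MODEL_ASSIGN_TASK(cmd, t) \
    do { (cmd)->host_scribble = (void *)(t); } while (0)

/* Issue path: link before handing the task over, unlink on failure. */
static int model_issue(struct model_cmd *cmd, struct model_task *task,
                       int lldd_result)
{
    assert(cmd != NULL && task != NULL);

    MODEL_ASSIGN_TASK(cmd, task);
    if (lldd_result) {
        MODEL_ASSIGN_TASK(cmd, NULL);
        return lldd_result;
    }
    return 0;
}

/* Completion path: unlink before the command is completed. */
static void model_done(struct model_cmd *cmd)
{
    MODEL_ASSIGN_TASK(cmd, NULL);
}

/* What an error handler would do: find the task to abort, if any. */
static struct model_task *model_eh_lookup(struct model_cmd *cmd)
{
    return MODEL_TO_TASK(cmd);
}
```

The lookup only returns a task while the command is actually in flight,
which is the window in which an abort request makes sense.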

 drivers/scsi/libsas/sas_ata.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 7ebda69..8111222 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -119,6 +119,8 @@ static void sas_ata_task_done(struct sas
}
 
qc->lldd_task = NULL;
+   if (qc->scsicmd)
+   ASSIGN_SAS_TASK(qc->scsicmd, NULL);
ata_qc_complete(qc);
spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
 
@@ -184,6 +186,9 @@ static unsigned int sas_ata_qc_issue(str
break;
}
 
+   if (qc->scsicmd)
+   ASSIGN_SAS_TASK(qc->scsicmd, task);
+
if (sas_ha->lldd_max_execute_num < 2)
res = i->dft->lldd_execute_task(task, 1, GFP_ATOMIC);
else
@@ -193,6 +198,8 @@ static unsigned int sas_ata_qc_issue(str
if (res) {
SAS_DPRINTK("lldd_execute_task returned: %d\n", res);
 
+   if (qc->scsicmd)
+   ASSIGN_SAS_TASK(qc->scsicmd, NULL);
sas_free_task(task);
return AC_ERR_SYSTEM;
}


[PATCH 07/12] libsas: Accept SAM_GOOD for ATAPI devices in sas_ata_task_done

2007-01-30 Thread Darrick J. Wong

A sas_task sent to an ATAPI device returns SAM_GOOD if successful.
Therefore, we should treat this the same way we treat ATA commands
that succeed.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 2bb619e..20f3a5e 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -100,7 +100,7 @@ static void sas_ata_task_done(struct sas
dev = qc->ap->private_data;
 
spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
-   if (stat->stat == SAS_PROTO_RESPONSE) {
+   if (stat->stat == SAS_PROTO_RESPONSE || stat->stat == SAM_GOOD) {
ata_tf_from_fis(resp->ending_fis, &dev->sata_dev.tf);
qc->err_mask |= ac_err_mask(dev->sata_dev.tf.command);
dev->sata_dev.sstatus = resp->sstatus;


[PATCH 05/12] sas_ata: Don't copy aic94xx's sactive to ata_port

2007-01-30 Thread Darrick J. Wong

Since the aic94xx sequencer assigns its own NCQ tags to ATA commands, it
no longer makes any sense to copy the sactive field in the STP response
to ata_port->sactive, as that will confuse libata.  Also, libata seems
to be capable of managing sactive on its own.

The attached patch gets rid of one of the causes of the BUG messages in
ata_qc_new, and seems to work without problems on an IBM x206m.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index c8af884..16c3e5a 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -106,7 +106,6 @@ static void sas_ata_task_done(struct sas
 		dev->sata_dev.sstatus = resp->sstatus;
 		dev->sata_dev.serror = resp->serror;
 		dev->sata_dev.scontrol = resp->scontrol;
-		dev->sata_dev.ap->sactive = resp->sactive;
 	} else if (stat->stat != SAM_STAT_GOOD) {
 		ac = sas_to_ata_err(stat);
 		if (ac) {


[PATCH 06/12] sas_ata: Implement SATA PHY control

2007-01-30 Thread Darrick J. Wong

This patch requires "libsas: Add a sysfs knob to enable/disable a phy"
to be applied.  It hooks the SControl write function to provide basic
SATA phy control for phy enable/disable and speed limits.  Power
management is still broken, though it is unclear whether libata actually
uses those SControl bits anyway.
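
For readers unfamiliar with the register, the following standalone sketch shows how an SControl value splits into its fields and how the masking described above behaves. The field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) follows the SATA specification; the struct and function names are hypothetical and exist only for this illustration, not in libsas or libata.

```c
#include <stdint.h>

/* SControl field layout per the SATA specification:
 * bits 3:0 = DET, bits 7:4 = SPD, bits 11:8 = IPM. */
struct scontrol_fields {
	unsigned int det;	/* device detection / phy control */
	unsigned int spd;	/* highest allowed link speed */
	unsigned int ipm;	/* interface power management restrictions */
};

/* Hypothetical helper: split a raw SControl value into its fields. */
static struct scontrol_fields scontrol_decode(uint32_t val)
{
	struct scontrol_fields f;

	f.det = val & 0xF;
	f.spd = (val >> 4) & 0xF;
	f.ipm = (val >> 8) & 0xF;
	return f;
}

/* Hypothetical helper mirroring the three masking steps in the patch:
 * keep only DET and SPD from the request, force IPM to 3 (no host PM),
 * and carry bits 15:12 over from the previous register value. */
static uint32_t scontrol_sanitize(uint32_t requested, uint32_t previous)
{
	uint32_t val = requested & 0x0FF;

	val |= 0x300;
	val |= previous & 0xF000;
	return val;
}
```

With these helpers, a request of 0x014 against a previous value of 0 sanitizes to 0x314, which decodes as DET = 4 (phy disabled), SPD = 1 (limit to 1.5 Gbps) and IPM = 3. Note that the patch itself tests individual bits of DET rather than comparing the whole field.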

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c   |   42 ++-
 drivers/scsi/libsas/sas_scsi_host.c |1 +
 2 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 16c3e5a..2bb619e 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -270,6 +270,46 @@ static void sas_ata_tf_read(struct ata_p
 	memcpy(tf, &dev->sata_dev.tf, sizeof (*tf));
 }
 
+static void sas_ata_scontrol_write(struct domain_device *dev, u32 val)
+{
+	u32 tmp = dev->sata_dev.scontrol;
+	struct sas_phy *phy = dev->port->phy;
+
+	val &= 0x0FF; /* only set max spd and dev ctrl */
+	val |= 0x300; /* disallow host pm */
+	val |= tmp & 0xF000; /* preserve upper bits */
+
+	/* disable phy */
+	if ((val & 0x4) && !(tmp & 0x4))
+		sas_phy_enable(phy, 0);
+
+	/* enable phy */
+	if (!(val & 0x4) && (tmp & 0x4))
+		sas_phy_enable(phy, 1);
+
+	/* reset phy */
+	if ((val & 0x1) && !(tmp & 0x1))
+		sas_phy_reset(phy, 0);
+
+	/* speed limit */
+	if ((val & 0xF0) != (tmp & 0xF0)) {
+		struct sas_phy_linkrates rates = {0};
+
+		switch ((val & 0xF0) >> 4) {
+		case 0:
+		case 2:
+			rates.maximum_linkrate = SAS_LINK_RATE_3_0_GBPS;
+			break;
+		case 1:
+			rates.maximum_linkrate = SAS_LINK_RATE_1_5_GBPS;
+			break;
+		}
+		sas_set_phy_speed(phy, &rates);
+	}
+
+	dev->sata_dev.scontrol = val;
+}
+
 static void sas_ata_scr_write(struct ata_port *ap, unsigned int sc_reg_in,
 			      u32 val)
 {
@@ -281,7 +321,7 @@ static void sas_ata_scr_write(struct ata
 		dev->sata_dev.sstatus = val;
 		break;
 	case SCR_CONTROL:
-		dev->sata_dev.scontrol = val;
+		sas_ata_scontrol_write(dev, val);
 		break;
 	case SCR_ERROR:
 		dev->sata_dev.serror = val;
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index fee9c10..5b0c471 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1040,3 +1040,4 @@ EXPORT_SYMBOL_GPL(sas_eh_device_reset_ha
 EXPORT_SYMBOL_GPL(sas_slave_alloc);
 EXPORT_SYMBOL_GPL(sas_target_destroy);
 EXPORT_SYMBOL_GPL(sas_ioctl);
+EXPORT_SYMBOL_GPL(sas_set_phy_speed);


[PATCH 04/12] sas_ata: ata_post_internal should abort the sas_task

2007-01-30 Thread Darrick J. Wong

This patch adds a new field, lldd_task, to ata_queued_cmd so that libata
users such as libsas can associate some data with a qc.  The particular
ambition with this patch is to associate a sas_task with a qc; that way,
if libata decides to time out a command, we can come back (in
sas_ata_post_internal) and abort the sas task.

One question remains: Is it necessary to reset the phy on error, or will
the libata error handler take care of it?  (Assuming that one is written,
of course.)  This patch, as it is today, works well enough to clean
things up when an ATA device probe attempt fails halfway through the probe,
though I'm not sure this is always the right thing to do.
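
The cross-linking described above can be modelled in isolation. The sketch below is a simplified userspace stand-in, not the libsas code: the `demo_*` types and functions are hypothetical, and only the two pointer names (`lldd_task` on the qc side, `uldd_task` on the task side) are taken from the patch. It shows why each side clears the other's pointer before letting go, so that whichever path runs second finds NULL and backs off.

```c
#include <stddef.h>

struct demo_task;

/* Stand-in for ata_queued_cmd: points at the task that carries it. */
struct demo_qc {
	struct demo_task *lldd_task;
	int completed;
};

/* Stand-in for sas_task: points back at the qc it was issued for. */
struct demo_task {
	struct demo_qc *uldd_task;
	int aborted;
};

/* Issue path: link the qc and the task in both directions. */
static void demo_issue(struct demo_qc *qc, struct demo_task *task)
{
	task->uldd_task = qc;
	task->aborted = 0;
	qc->lldd_task = task;
	qc->completed = 0;
}

/* Error path (libata gave up on the qc): sever both links, then
 * abort the task so a late completion cannot touch the dead qc. */
static void demo_post_internal_error(struct demo_qc *qc)
{
	struct demo_task *task = qc->lldd_task;

	qc->lldd_task = NULL;
	if (task) {
		task->uldd_task = NULL;
		task->aborted = 1;
	}
}

/* Completion path: returns 1 if the qc was completed here, or 0 if
 * the qc was already gone and only the task remains to be freed. */
static int demo_task_done(struct demo_task *task)
{
	struct demo_qc *qc = task->uldd_task;

	if (!qc)
		return 0;

	qc->lldd_task = NULL;
	qc->completed = 1;
	return 1;
}
```

If `demo_post_internal_error()` runs first, a later `demo_task_done()` returns 0 without dereferencing the qc, which corresponds to the `qc_already_gone` label in the diff below. The model ignores locking and the phy reset entirely.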

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c |   30 +++---
 include/linux/libata.h|1 +
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 46e1dbe..c8af884 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -88,12 +88,17 @@ static enum ata_completion_errors sas_to
 static void sas_ata_task_done(struct sas_task *task)
 {
 	struct ata_queued_cmd *qc = task->uldd_task;
-	struct domain_device *dev = qc->ap->private_data;
+	struct domain_device *dev;
 	struct task_status_struct *stat = &task->task_status;
 	struct ata_task_resp *resp = (struct ata_task_resp *)stat->buf;
 	enum ata_completion_errors ac;
 	unsigned long flags;
 
+	if (!qc)
+		goto qc_already_gone;
+
+	dev = qc->ap->private_data;
+
 	spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
 	if (stat->stat == SAS_PROTO_RESPONSE) {
 		ata_tf_from_fis(resp->ending_fis, &dev->sata_dev.tf);
@@ -114,9 +119,11 @@ static void sas_ata_task_done(struct sas
 		}
 	}
 
+	qc->lldd_task = NULL;
 	ata_qc_complete(qc);
 	spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
 
+qc_already_gone:
 	list_del_init(&task->list);
 	sas_free_task(task);
 }
@@ -166,6 +173,7 @@ static unsigned int sas_ata_qc_issue(str
 	task->scatter = qc->__sg;
 	task->ata_task.retry_count = 1;
 	task->task_state_flags = SAS_TASK_STATE_PENDING;
+	qc->lldd_task = task;
 
 	switch (qc->tf.protocol) {
 	case ATA_PROT_NCQ:
@@ -237,8 +245,24 @@ static void sas_ata_post_internal(struct
 	if (qc->flags & ATA_QCFLAG_FAILED)
 		qc->err_mask |= AC_ERR_OTHER;
 
-	if (qc->err_mask)
-		SAS_DPRINTK("%s: Failure; reset phy!\n", __FUNCTION__);
+	if (qc->err_mask) {
+		/*
+		 * Find the sas_task and kill it.  By this point,
+		 * libata has decided to kill the qc, so we needn't
+		 * bother with sas_ata_task_done.  But we still
+		 * ought to abort the task.
+		 */
+		struct sas_task *task = qc->lldd_task;
+		struct domain_device *dev = qc->ap->private_data;
+
+		qc->lldd_task = NULL;
+		if (task) {
+			task->uldd_task = NULL;
+			__sas_task_abort(task);
+		}
+
+		sas_phy_reset(dev->port->phy, 1);
+	}
 }
 
 static void sas_ata_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 22aa69e..fe98957 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -452,6 +452,7 @@ struct ata_queued_cmd {
 	ata_qc_cb_t		complete_fn;
 
 	void			*private_data;
+	void			*lldd_task;
 };
 
 struct ata_port_stats {


[PATCH 03/12] sas_ata: sas_ata_qc_issue should return AC_ERR_*

2007-01-30 Thread Darrick J. Wong

The sas_ata_qc_issue function was incorrectly written to return error
codes such as -ENOMEM.  Since libata ORs the return value into
qc->err_mask, it is necessary to make my code return one of the
AC_ERR_ codes instead.  For now, use AC_ERR_SYSTEM because an error
here means that the OS couldn't send the command to the controller.

If anybody has a suggestion for a better AC_ERR_ code to use, please
suggest it.
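
To make the problem concrete, here is a small standalone sketch of what happens when a negative errno is OR'd into an unsigned error mask. The `DEMO_AC_ERR_SYSTEM` value and both helper functions are stand-ins invented for this illustration; the real AC_ERR_* definitions live in include/linux/libata.h.

```c
#include <errno.h>

/* Illustrative stand-in for a single-bit AC_ERR_* flag. */
#define DEMO_AC_ERR_SYSTEM (1u << 6)

/* Hypothetical helper: fold an issue function's return value into an
 * error mask, as the description above says libata does with
 * qc->err_mask. */
static unsigned int fold_issue_result(unsigned int err_mask, unsigned int ret)
{
	return err_mask | ret;
}

/* Count the set bits, to show how many unrelated error flags a
 * negative errno switches on at once. */
static int count_bits(unsigned int v)
{
	int n = 0;

	while (v) {
		n += (int)(v & 1u);
		v >>= 1;
	}
	return n;
}
```

Folding in `DEMO_AC_ERR_SYSTEM` sets exactly one bit, whereas folding in `(unsigned int)-ENOMEM` sets most of the mask, so every error class would appear to have fired at once.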

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c |   10 --
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 0bb1a14..46e1dbe 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -123,7 +123,7 @@ static void sas_ata_task_done(struct sas
 
 static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 {
-	int res = -ENOMEM;
+	int res;
 	struct sas_task *task;
 	struct domain_device *dev = qc->ap->private_data;
 	struct sas_ha_struct *sas_ha = dev->port->ha;
@@ -135,7 +135,7 @@ static unsigned int sas_ata_qc_issue(str
 
 	task = sas_alloc_task(GFP_ATOMIC);
 	if (!task)
-		goto out;
+		return AC_ERR_SYSTEM;
 	task->dev = dev;
 	task->task_proto = SAS_PROTOCOL_STP;
 	task->task_done = sas_ata_task_done;
@@ -187,12 +187,10 @@ static unsigned int sas_ata_qc_issue(str
 		SAS_DPRINTK("lldd_execute_task returned: %d\n", res);
 
 		sas_free_task(task);
-		if (res == -SAS_QUEUE_FULL)
-			return -ENOMEM;
+		return AC_ERR_SYSTEM;
 	}
 
-out:
-	return res;
+	return 0;
 }
 
 static u8 sas_ata_check_status(struct ata_port *ap)


[PATCH 01/12] sas_ata: Require CONFIG_ATA in Kconfig

2007-01-30 Thread Darrick J. Wong

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/Kconfig |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/libsas/Kconfig b/drivers/scsi/libsas/Kconfig
index aafdc92..b64e391 100644
--- a/drivers/scsi/libsas/Kconfig
+++ b/drivers/scsi/libsas/Kconfig
@@ -24,7 +24,7 @@ #
 
 config SCSI_SAS_LIBSAS
 	tristate "SAS Domain Transport Attributes"
-	depends on SCSI
+	depends on SCSI && ATA
 	select SCSI_SAS_ATTRS
 	help
 	  This provides transport specific helpers for SAS drivers which


[PATCH 02/12] sas_ata: Satisfy libata qc function locking requirements

2007-01-30 Thread Darrick J. Wong

ata_qc_complete and ata_sas_queuecmd require that the port lock be held
when they are called.  sas_ata doesn't do this, leading to BUG messages
about newly allocated qc tags already being in use.  This patch
fixes the locking, which should clean up the rest of those messages.

So far I've tested this against an IBM x206m with two SATA disks with no
BUG messages and no other signs of things going wrong, and the machine
finally passed the pounder stress test.
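
The calling convention can be pictured with a deliberately simplified userspace model. Everything below is hypothetical: a boolean stands in for the port spinlock, a bitmap stands in for libata's qc tag tracking, and none of the `demo_*` names exist in libata or libsas. The point is only the shape of the fix, namely that the caller takes the lock around the function that manipulates the tag state.

```c
#include <stdbool.h>

/* Toy port: a flag in place of the spinlock, plus a tag bitmap. */
struct demo_port {
	bool lock_held;
	unsigned int tags;	/* bit n set means qc tag n is in use */
};

static void demo_lock(struct demo_port *ap)
{
	ap->lock_held = true;
}

static void demo_unlock(struct demo_port *ap)
{
	ap->lock_held = false;
}

/* Stand-in for a function that requires the port lock. Returns 0 on
 * success, -1 if called without the lock or for a tag not in use. */
static int demo_qc_complete_locked(struct demo_port *ap, unsigned int tag)
{
	if (!ap->lock_held)
		return -1;
	if (!(ap->tags & (1u << tag)))
		return -1;

	ap->tags &= ~(1u << tag);
	return 0;
}

/* Caller in the style of the patched code: take the lock, call the
 * lock-requiring function, release the lock. */
static int demo_complete_with_lock(struct demo_port *ap, unsigned int tag)
{
	int ret;

	demo_lock(ap);
	ret = demo_qc_complete_locked(ap, tag);
	demo_unlock(ap);
	return ret;
}
```

A single-threaded model like this cannot reproduce the race itself; it only makes the contract explicit, in the way the real code relies on spin_lock_irqsave() and spin_unlock_irqrestore() around the libata calls.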

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_ata.c   |4 
 drivers/scsi/libsas/sas_scsi_host.c |4 
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index de42b5b..0bb1a14 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -92,7 +92,9 @@ static void sas_ata_task_done(struct sas
 	struct task_status_struct *stat = &task->task_status;
 	struct ata_task_resp *resp = (struct ata_task_resp *)stat->buf;
 	enum ata_completion_errors ac;
+	unsigned long flags;
 
+	spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
 	if (stat->stat == SAS_PROTO_RESPONSE) {
 		ata_tf_from_fis(resp->ending_fis, &dev->sata_dev.tf);
 		qc->err_mask |= ac_err_mask(dev->sata_dev.tf.command);
@@ -113,6 +115,8 @@ static void sas_ata_task_done(struct sas
 	}
 
 	ata_qc_complete(qc);
+	spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
+
 	list_del_init(&task->list);
 	sas_free_task(task);
 }
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 2cd478a..fee9c10 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -213,8 +213,12 @@ int sas_queuecommand(struct scsi_cmnd *c
 		struct sas_task *task;
 
 		if (dev_is_sata(dev)) {
+			unsigned long flags;
+
+			spin_lock_irqsave(dev->sata_dev.ap->lock, flags);
 			res = ata_sas_queuecmd(cmd, scsi_done,
 					       dev->sata_dev.ap);
+			spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags);
 			goto out;
 		}
 


[PATCH 00/12] Roll-up of sas_ata patches

2007-01-30 Thread Darrick J. Wong

Hi all,

This is a roll-up of all of my ATA-related uncommitted patches against
libsas and aic94xx to date.  Per James Bottomley's request, I'm pushing
these patches out for further review in aic94xx-sas.  The big changes in
this patch set are a lot of bug and locking fixes, the conversion of the
EH routines to interact with the SAS EH strategy routines, and of course
the separation of the SATL code into a separate module.

These patches should apply in number order cleanly against 2.6.20-rc6 +
scsi_misc + scsi-rc-fixes + aic94xx-sas.  They've been fairly well tested
on a bunch of SATA disks in an x206m, though the ATAPI support is not so
well tested.  However, I have run these patches under other loads for a while.
Hopefully these patches are ready for more widespread testing in
scsi-misc, and thank you for any comments or feedback that you provide.

(Apologies for any stgit mail misconfiguration on my part.)

--D