Re: AIC7xxx on 2.6.18

2007-02-03 Thread James Bottomley
On Fri, 2007-02-02 at 19:42 -0500, Wakko Warner wrote:
> [   40.154122] ACPI: PCI Interrupt :05:01.1[B] -> GSI 17 (level, low) -> IRQ 22
> [   40.158190] scsi4: PCI error Interrupt at seqaddr = 0x1bb
> [   40.158261] scsi4: Signaled a Target Abort

Well, this is the source of the problem.  It means the driver detected
an error in the PCI system.  I'm afraid I don't know what a PCI target
error is, but I think it means something is wrong with the PCI bus in
your system.  There's also a screaming interrupt, because the first 500
interrupts will be ignored before it looks at the bus error register.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AIC7xxx on 2.6.18

2007-02-03 Thread Wakko Warner
Please keep me in CC, I'm not on the list.

Mark Rustad wrote:
> On Feb 2, 2007, at 6:42 PM, Wakko Warner wrote:
> >The PC is a Supermicro x5da8 with an onboard dual channel AHA-39320 u320
> >controller.  I have a dual channel AHA-39160 u160 and a dual channel
> >AHA-2940U/UW (ch0/internal is wide/narrow, ch1/external is narrow).
> 
> I have used an x6-class Supermicro motherboard with the Adaptec u320  
> controller and I had problems hot-swapping drives with 2.6.18. It  

This wasn't about hot swapping in my case.  On the hot swapping thing, I've
never successfully done this with a 2.6 kernel.

> seemed that the bus reset that the backplane processor generated  
> caused trouble for the driver, killing the SCSI bus. 2.6.16 and  
> 2.6.17 locked up the kernel in the case of hot-swapping drives.

I have a SCSI box that supports hot plug.  I had a drive failing and
decided to just replace it; that took down the entire bus and the RAID array
with it.  Fortunately, nothing was lost.  This was with the dual u160 card.

> I switched to 2.6.19.2 and things are better. I did find that a card  
> dump is produced when hot-inserting a drive, so it is way noisier  
> than I think it should be, but it continues to operate and life goes  
> on, which is much better behavior than 2.6.16, 2.6.17 or 2.6.18.

I've never bothered with the 2.6.x.x kernels.  Yet =)

> >I thought it was because I had option roms turned off, but when I turned
> >them on, it still has problems.  What's odd is the fact that if I
> >boot with init=/bin/sh, modprobe aic7xxx, it works fine and I can
> >exec init and it works fine.
> 
> I don't know what is up with that, but based on what I have seen I  
> would recommend using 2.6.19.x instead of 2.6.18 for systems using  
> the aic79xx driver.

As stated in the email, it was the aic7xxx driver.  I've never had problems
with the u320 driver that I recall since I've had this machine.  It appears
that when I load aic7xxx, it finds both channels of the u160 card and all
its devices, then when it hits the 2940u/uw card, it loads the first
channel, all devices and crashes before it hits the 2nd channel.  It looked
like it had problems with the Plextor cdrw that's on ID2.

But the odd thing is, if I boot to /bin/sh, insmod aic7xxx, and exec init,
everything's fine.

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
 Got Gas???


Re: [PATCH 00/12] Roll-up of sas_ata patches

2007-02-03 Thread James Bottomley
On Tue, 2007-01-30 at 04:15 -0500, Darrick J. Wong wrote:
> This is a roll-up of all of my ATA related uncommitted patches against
> libsas and aic94xx to date.  Per James Bottomley's request, I'm pushing
> these patches out for further review in aic94xx-sas.  The big changes in
> this patch set are a lot of bug and locking fixes, the conversion of the
> EH routines to interact with the SAS EH strategy routines, and of course
> the separation of the SATL code into a separate module.
> 
> These patches should apply in number order cleanly against 2.6.20-rc6 +
> scsi_misc + scsi-rc-fixes + aic94xx-sas.  They've been fairly well tested
> on a bunch of SATA disks in a x206m, though the ATAPI support is not so
> well tested.  However, I have run these patches in other loads for a while.
> Hopefully these patches are ready for more widespread testing in
> scsi-misc, and thank you for any comments or feedback that you provide.
> 
> (Apologies for any stgit mail misconfiguration on my part.)

There's a problem somewhere with your error handler changes (which I
picked up thanks to the problems with the V28 firmware).  What I see
without your changes is that for a directly attached SATA device, when
the firmware begins its death spiral, the commands all return and
eventually send I/O errors to the filesystem.  With your patch series
applied, it just loops forever giving messages like:

Feb  3 12:07:06 localhost kernel: aic94xx: escb_tasklet_complete: phy5: LINK_RESET_ERROR
Feb  3 12:07:06 localhost kernel: aic94xx: phy5: Receive FIS timeout
Feb  3 12:07:06 localhost kernel: aic94xx: phy5: retries:0 performing link reset seq
Feb  3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: aic94xx: control_phy_tasklet_complete: phy5, lrate:0x8, proto:0xe
Feb  3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: sas: Enter sas_scsi_recover_host
Feb  3 12:07:06 localhost kernel: sas: --- Exit sas_scsi_recover_host


James




Re: [PATCH] scsi: Update Aic94xx SAS/SATA Linux open source device driver for new sequence firmware.

2007-02-03 Thread James Bottomley
On Tue, 2007-01-30 at 15:31 -0800, Wu, Gilbert wrote:
> Subject:  [PATCH] scsi: Update Aic94xx SAS/SATA Linux open source device
> driver for new sequence firmware.

I put the patch, which seems fine, into scsi-misc-2.6; however, it looks
like the firmware isn't fine.  With the V28 firmware I get instability
on my directly attached SATA disk.  It begins about three minutes into a
small exercise of the filesystem on the disk as:

Feb  3 16:13:50 localhost kernel: sas: command 0xf7aca994, task 0x, gone: EH_RESET_TIMER

and then times out commands and eventually flips ext3 into read only
mode because of the errors.

If I run the same tests with the V17/10c6 firmware, everything runs fine
and to completion.

James




Re: [Bugme-new] [Bug 7919] New: Tape dies if wrong block size used

2007-02-03 Thread Kai Makisara
On Sat, 3 Feb 2007, James Bottomley wrote:

> On Sat, 2007-02-03 at 13:21 +0200, Kai Makisara wrote:
> > This patch may also fix the bug 7900.
> > 
> > The patch compiles and is lightly tested.
> 
> We can give it a spin in scsi-misc ... do you want me to hold off from
> sending it upstream with the scsi-misc tree when 2.6.20 is declared?
> 
You can send it upstream after 2.6.20 is out. I am actually very happy 
with the patch. Conceptually it is very simple and based on mechanisms 
existing in the driver. In addition to fixing the bug in this report, it 
removes the last difference in user space semantics between direct i/o and 
using the driver buffer. (No documentation change needed because 
Documentation/scsi/st.txt has not mentioned this difference ;-)

-- 
Kai


Re: [PATCH 12/12] sas_ata: Make this a module separate from libsas

2007-02-03 Thread James Bottomley
On Tue, 2007-01-30 at 01:19 -0800, Darrick J. Wong wrote:
> Break out sas_ata as a free-standing module that provides a SATA
> Translation Layer (SATL) for libsas.  This patch requires the libsas
> SATL registration patch; the changes to sas_ata itself are rather
> minor.

Right at the moment, this doesn't work.  The dependency of SAS_ATA_AVAILABLE
on SCSI_SAS_SATL forces libsas to require sas_ata if you select it as a
module (i.e. they're not truly independent).

How about this solution to untangle them?

James

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 009fd2b..73a737f 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -46,7 +46,16 @@
 
 
 static DEFINE_SPINLOCK(satl_ops_lock);
-static struct satl_operations *satl_ops;
+static struct satl_operations *satl_ops = NULL;
+
+
+static inline int dev_is_sata(struct domain_device *dev)
+{
+   return (satl_ops
+   && (dev->rphy->identify.target_port_protocols
+   & SAS_PROTOCOL_SATA));
+}
+
 
 static void sas_scsi_task_done(struct sas_task *task)
 {
diff --git a/include/scsi/sas_ata.h b/include/scsi/sas_ata.h
index f1d90f7..06a9be5 100644
--- a/include/scsi/sas_ata.h
+++ b/include/scsi/sas_ata.h
@@ -49,25 +49,4 @@ struct satl_operations {
 int sas_register_satl(struct satl_operations *satl_ops);
 int sas_unregister_satl(struct satl_operations *satl_ops);
 
-#ifdef CONFIG_SCSI_SAS_SATL_MODULE
-# define SAS_ATA_AVAILABLE
-#endif
-
-#ifdef CONFIG_SCSI_SAS_SATL
-# define SAS_ATA_AVAILABLE
-#endif
-
-#ifdef SAS_ATA_AVAILABLE
-
-static inline int dev_is_sata(struct domain_device *dev)
-{
-   return (dev->rphy->identify.target_port_protocols & SAS_PROTOCOL_SATA);
-}
-
-#else
-
-#define dev_is_sata(x) (0)
-
-#endif /* SAS_ATA_AVAILABLE */
-
 #endif /* _SAS_ATA_H_ */
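
For readers following along, here is a minimal standalone sketch of the idea
in the patch above: the compile-time #ifdef is replaced by a run-time check
on a registered operations pointer.  This is not the libsas code.  The names
translation_ops, device_info, register_ops, unregister_ops, device_is_sata
and the PROTO_SATA bit value are all hypothetical stand-ins, and the locking
that satl_ops_lock provides in the real driver is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit only; not the real SAS_PROTOCOL_SATA value. */
#define PROTO_SATA 0x1u

/* Hypothetical stand-in for struct satl_operations. */
struct translation_ops {
    int (*queue)(void *cmd);
};

/* Hypothetical stand-in for the identify data of a domain device. */
struct device_info {
    unsigned int target_port_protocols;
};

/* Stays NULL until the optional translation module registers itself. */
static struct translation_ops *registered_ops = NULL;

/* Returns 0 on success, -1 if another set of ops is already registered. */
static int register_ops(struct translation_ops *ops)
{
    assert(ops != NULL);
    if (registered_ops != NULL)
        return -1;
    registered_ops = ops;
    return 0;
}

/* Clears the pointer only if the caller is the current owner. */
static void unregister_ops(struct translation_ops *ops)
{
    if (registered_ops == ops)
        registered_ops = NULL;
}

/*
 * Run-time replacement for the #ifdef: a device only counts as SATA when
 * the translation layer is actually present AND the protocol bit is set.
 */
static int device_is_sata(const struct device_info *dev)
{
    assert(dev != NULL);
    return registered_ops != NULL &&
           (dev->target_port_protocols & PROTO_SATA) != 0;
}
```

With this shape the core code has no build-time knowledge of whether the
translation module exists: before registration every device is reported as
non-SATA, and after unregistration it is again.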




[GIT PATCH] SCSI final bug fixes for 2.6.20-rc7

2007-02-03 Thread James Bottomley
There are four small bug fixes in this:  a set of qla4xxx driver bug fixes,
a fix for uninitialised variables in sd during initial hotplug, a tape fix
for BUG 7864 and an async scan fix for the "none" scanning type with
RAID devices.

The fixes are available here:

master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git

The short changelog is:

David C Somayajulu (1):
   qla4xxx: bug fixes

Kai Makisara (1):
   st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect

Matthew Wilcox (1):
   Fix scsi_add_device() for async scanning

Nagendra Singh Tomar (1):
   sd: udev accessing an uninitialized scsi_disk field results in a crash

and the diffstat:

 qla4xxx/ql4_def.h     |    1
 qla4xxx/ql4_glbl.h    |    1
 qla4xxx/ql4_init.c    |   18 +++---
 qla4xxx/ql4_isr.c     |    4 +--
 qla4xxx/ql4_mbx.c     |   35 ---
 qla4xxx/ql4_os.c      |   64 ++
 qla4xxx/ql4_version.h |    2 -
 scsi_scan.c           |    6
 sd.c                  |   20 +++
 st.c                  |   19 --
 10 files changed, 100 insertions(+), 70 deletions(-)

James




Re: [Bugme-new] [Bug 7919] New: Tape dies if wrong block size used

2007-02-03 Thread James Bottomley
On Sat, 2007-02-03 at 13:21 +0200, Kai Makisara wrote:
> This patch may also fix the bug 7900.
> 
> The patch compiles and is lightly tested.

We can give it a spin in scsi-misc ... do you want me to hold off from
sending it upstream with the scsi-misc tree when 2.6.20 is declared?

James



Re: [Bugme-new] [Bug 7900] New: Kernel OOPS when using tape drive (compiler bug?)

2007-02-03 Thread Kai Makisara
On Mon, 29 Jan 2007, Andrew Morton wrote:

> On Mon, 29 Jan 2007 17:39:25 -0800
> [EMAIL PROTECTED] wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=7900
> > 
> >Summary: Kernel OOPS when using tape drive (compiler bug?)
> > Kernel Version: 2.6.20-rc5
> > Status: NEW
> >   Severity: high
> >  Owner: [EMAIL PROTECTED]
> >  Submitter: [EMAIL PROTECTED]
> > 
> > 
> > Most recent kernel where this bug did *NOT* occur:
> > Distribution: OpenSUSE 10.2 x86
> > Hardware Environment: 
> > SCSI controller "LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual
> > Ultra320 SCSI (rev 07)", tape streamer Datastor LTO1 (don't know the exact 
> > model)
> > 
> > Software Environment: OpenSUSE 10.2 x86, mondoarchive, mondorescue
> > 
> > Problem Description:
> > Attempt to restore data from a tape drive results in the following Oops and
> > subsequent functionality loss of the streamer. Writing to the tape, however,
> > worked fine. Or at least mondoarchive hasn't complained.
> > 
I don't know how mondoarchive uses the tape. The bug 7900 looked somewhat 
similar. It might be useful to try the patch provided for that bug (in 
Bugzilla and linux-scsi).

-- 
Kai


Re: [Bugme-new] [Bug 7919] New: Tape dies if wrong block size used

2007-02-03 Thread Kai Makisara
On Thu, 1 Feb 2007, Andrew Morton wrote:

> On Thu, 1 Feb 2007 15:34:29 -0800
> [EMAIL PROTECTED] wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=7919
> > 
> >Summary: Tape dies if wrong block size used
> > Kernel Version: 2.6.20-rc5
> > Status: NEW
> >   Severity: normal
> >  Owner: [EMAIL PROTECTED]
> >  Submitter: [EMAIL PROTECTED]
> > 
> > 
> > Most recent kernel where this bug did *NOT* occur: 2.6.17.14
> > 
> > Other Kernels Tested and Results:
> > 
> > OK 2.6.15.7
> > OK 2.6.16.37 
> > OK 2.6.17.14 
> > BAD 2.6.18.6
> > BAD 2.6.18-1.2869.fc6
> > BAD 2.6.19.2 +
> > BAD 2.6.20-rc5
> > 
> > NOTE: 2.6.18-1.2869.fc6 is a Fedora modified kernel, all others are from 
> > kernel.org
> > 
...
> > Steps to reproduce: 
> > Get a Adaptec AHA-2940U/UW/D / AIC-7881U card and a tape drive,
> > install a recent kernel
> > set the tape block size - mt setblk 4096
> > read from or write to tape using wrong block size - tar -b 7 -cvf /dev/tape foo
> >
Write does not trigger this bug because, in fixed block mode, the driver 
refuses writes that are not a multiple of the block size. Read does trigger 
it on my system.

The bug is not associated with any specific HBA. st tries to do direct i/o 
in fixed block mode with reads that are not a multiple of tape block size. 

The patch in this message fixes the st problem by switching to using the 
driver buffer up to the next close of the device file in fixed block mode 
if the user asks for a read like this.
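
The fallback described above can be sketched as a small standalone model.
This is not the st driver code: only the field names try_dio and try_dio_now
are taken from the patch, while struct tape_model, model_open and
model_read_uses_dio are hypothetical names, and the rule for when the switch
happens is an assumption based on the description in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the per-device state (hypothetical struct). */
struct tape_model {
    size_t block_size;  /* 0 means variable block mode */
    int try_dio;        /* configured preference for direct i/o */
    int try_dio_now;    /* effective setting, valid until the next open */
};

/* On open, the effective setting is reset from the configured one. */
static void model_open(struct tape_model *t)
{
    assert(t != NULL);
    t->try_dio_now = t->try_dio;
}

/*
 * Returns 1 if this read would use direct i/o and 0 if it would go through
 * the driver buffer.  In fixed block mode, a read that is not a multiple of
 * the block size turns direct i/o off, and it stays off until model_open()
 * runs again.
 */
static int model_read_uses_dio(struct tape_model *t, size_t count)
{
    assert(t != NULL);
    if (t->block_size != 0 && count % t->block_size != 0)
        t->try_dio_now = 0;
    return t->try_dio_now;
}
```

With the reproduction steps from the report, the block size is 4096 and
tar -b 7 transfers 7 * 512 = 3584 bytes at a time, which is not a multiple of
4096.  In this model that first read switches the device to the driver
buffer, and later reads stay buffered until the device file is reopened.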

I don't know why the bug has surfaced only after 2.6.17 although the st 
problem is old. There may be another bug in the block subsystem and this 
patch works around it. However, the patch fixes a problem in st and in 
this way it is a valid fix.

This patch may also fix the bug 7900.

The patch compiles and is lightly tested.

Signed-off-by: Kai Makisara <[EMAIL PROTECTED]>

--- linux-2.6/drivers/scsi/st.c 2006-12-09 13:29:31.0 +0200
+++ linux-2.6.20-rc7-km/drivers/scsi/st.c   2007-02-03 12:52:05.0 +0200
@@ -9,7 +9,7 @@
Steve Hirsch, Andreas Koppenh"ofer, Michael Leodolter, Eyal Lebedinsky,
Michael Schaefer, J"org Weule, and Eric Youngdale.
 
-   Copyright 1992 - 2006 Kai Makisara
+   Copyright 1992 - 2007 Kai Makisara
email [EMAIL PROTECTED]
 
Some small formal changes - aeb, 950809
@@ -17,7 +17,7 @@
Last modified: 18-JAN-1998 Richard Gooch <[EMAIL PROTECTED]> Devfs support
  */
 
-static const char *verstr = "20061107";
+static const char *verstr = "20070203";
 
 #include 
 
@@ -1168,6 +1168,7 @@ static int st_open(struct inode *inode, 
STps = &(STp->ps[i]);
STps->rw = ST_IDLE;
}
+   STp->try_dio_now = STp->try_dio;
STp->recover_count = 0;
DEB( STp->nbr_waits = STp->nbr_finished = 0;
 STp->nbr_requests = STp->nbr_dio = STp->nbr_pages = 
STp->nbr_combinable = 0; )
@@ -1400,9 +1401,9 @@ static int setup_buffering(struct scsi_t
struct st_buffer *STbp = STp->buffer;
 
if (is_read)
-   i = STp->try_dio && try_rdio;
+   i = STp->try_dio_now && try_rdio;
else
-   i = STp->try_dio && try_wdio;
+   i = STp->try_dio_now && try_wdio;
 
if (i && ((unsigned long)buf & queue_dma_alignment(
STp->device->request_queue)) == 0) {
@@ -1599,7 +1600,7 @@ st_write(struct file *filp, const char _
STm->do_async_writes && STps->eof < ST_EOM_OK;
 
if (STp->block_size != 0 && STm->do_buffer_writes &&
-   !(STp->try_dio && try_wdio) && STps->eof < ST_EOM_OK &&
+   !(STp->try_dio_now && try_wdio) && STps->eof < ST_EOM_OK &&
STbp->buffer_bytes < STbp->buffer_size) {
STp->dirty = 1;
/* Don't write a buffer that is not full enough. */
@@ -1769,7 +1770,7 @@ static long read_tape(struct scsi_tape *
if (STp->block_size == 0)
blks = bytes = count;
else {
-   if (!(STp->try_dio && try_rdio) && STm->do_read_ahead) {
+   if (!(STp->try_dio_now && try_rdio) && STm->do_read_ahead) {
blks = (STp->buffer)->buffer_blocks;
bytes = blks * STp->block_size;
} else {
@@ -1948,10 +1949,12 @@ st_read(str