from:"Andrew Vasquez"

Re: [PATCH -mmotm] scsi: fix the wrong position of the comment

2013-03-10 Thread Andrew Vasquez

On Sun, 10 Mar 2013, James Bottomley wrote:

> On Sun, 2013-03-10 at 00:57 -0800, Andrew Morton wrote:
> > On Sun, 10 Mar 2013 08:22:47 + James Bottomley 
> >  wrote:
> >
> > > [missing SCSI cc added]
> > > On Sun, 2013-03-10 at 17:09 +0900, Akinobu Mita wrote:
> > > > This fixes the wrong position of the comment introduced by
> > > > scsi-rename-random32-to-prandom_u32.patch in the -mm tree.
> > > >
> > > > Signed-off-by: Akinobu Mita 
> > > > Cc: "James E.J. Bottomley" 
> > > > Cc: Andrew Vasquez 
> > > > ---
> > > >  drivers/scsi/qla2xxx/qla_attr.c | 6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> > > > index 04bf7b8..e44d47e 100644
> > > > --- a/drivers/scsi/qla2xxx/qla_attr.c
> > > > +++ b/drivers/scsi/qla2xxx/qla_attr.c
> > > > @@ -1939,13 +1939,13 @@ qla24xx_vport_delete(struct fc_vport *fc_vport)
> > > > }
> > > >
> > > > /* No pending activities shall be there on the vha now */
> > > > -   if (ql2xextended_error_logging & ql_dbg_user)
> > > > -   msleep(prandom_u32() % 10);
> > > > +   if (ql2xextended_error_logging & ql_dbg_user) {
> > > > /*
> > > >  * Just to see if something falls on the net we have placed
> > > >  * below
> > > >  */
> > > > -
> > > > +   msleep(prandom_u32() % 10);
> > > > +   }
> > >
> > > I don't give a toss if it's random or prandom: Andrew: get rid of it; we
> > > do not sleep in kernel for random intervals whatever the provocation ...
> > > if this is supposed to be a warning or error condition then print
> > > something.
> >
> > That msleep was added by
> >
> > commit feafb7b1714cf599a6d0fed45801ab3f66046cbd
> > Author: Arun Easi 
> > AuthorDate: Fri Sep 3 14:57:00 2010 -0700
> > Commit: James Bottomley 
> > CommitDate: Sun Sep 5 15:13:12 2010 -0300
> >
> > [SCSI] qla2xxx: Fix vport delete issues
>
> Sorry, I didn't notice multiple Andrews on the cc list.  I meant Andrew
> Vasquez (or other member of the qla team) remove this, please (and
> preferably do something correct).
>

James,

We'll take a look at this, yes.  Adding Giri and Co. to the CC.

Thanks, AV
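
For illustration only, a minimal user-space sketch of the alternative James suggests above: when the debug bit is set and work is still outstanding, print something instead of sleeping for a random interval. The mask value, the helper name, and the notion of a "pending" count are hypothetical stand-ins, not the qla2xxx definitions or the eventual fix.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's user-debug mask bit. */
#define QL_DBG_USER_SKETCH 0x02000000u

/*
 * Report outstanding work when user-level debug logging is enabled.
 * Returns 1 if a message was printed, 0 otherwise. Nothing sleeps here.
 */
static int report_pending(unsigned int logging_mask, int pending)
{
	if (!(logging_mask & QL_DBG_USER_SKETCH))
		return 0;	/* debug logging not requested */
	if (pending <= 0)
		return 0;	/* nothing outstanding, nothing to report */

	fprintf(stderr, "vport delete: %d activities still pending\n", pending);
	return 1;
}
```

A caller would invoke report_pending(mask, count) at the point where the msleep() currently sits; whether a message, a warning, or a proper wait-for-completion is the right replacement is left to the qla team, as in the thread.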



This message and any attached documents contain information from QLogic 
Corporation or its wholly-owned subsidiaries that may be confidential. If you 
are not the intended recipient, you may not read, copy, distribute, or use this 
information. If you have received this transmission in error, please notify the 
sender immediately by reply e-mail and then delete this message.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6 patch] scsi/qla4xxx/ql4_isr.c: remove dead code

2008-02-19 Thread Andrew Vasquez

On Tue, 19 Feb 2008, James Bottomley wrote:

> On Tue, 2008-02-19 at 18:35 -0800, Andrew Vasquez wrote:
> > On Tue, 19 Feb 2008, James Bottomley wrote:
> > 
> > > On Tue, 2008-02-19 at 21:29 +0200, Adrian Bunk wrote:
> > > > This patch removes dead code spotted by the Coverity checker.
> > > > 
> > > > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> > > > 
> > > > ---
> > > > 
> > > >  drivers/scsi/qla4xxx/ql4_isr.c |   18 +-
> > > >  1 file changed, 1 insertion(+), 17 deletions(-)
> > > > 
> > > > --- linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c.old 2008-02-19 20:29:16.0 +0200
> > > > +++ linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c 2008-02-19 20:30:37.0 +0200
> > > > @@ -91,38 +91,22 @@ static void qla4xxx_status_entry(struct 
> > > > if (scsi_status == 0) {
> > > > cmd->result = DID_OK << 16;
> > > > break;
> > > > }
> > > >  
> > > > if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_OVER) {
> > > > cmd->result = DID_ERROR << 16;
> > > > break;
> > > > }
> > > >  
> > > > -   if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER) {
> > > > +   if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
> > > > scsi_set_resid(cmd, residual);
> > > > -   if (!scsi_status && ((scsi_bufflen(cmd) - residual) <
> > > > -   cmd->underflow)) {
> > > > -
> > > > -   cmd->result = DID_ERROR << 16;
> > > > -
> > > > -   DEBUG2(printk("scsi%ld:%d:%d:%d: %s: "
> > > > -   "Mid-layer Data underrun0, "
> > > > -   "xferlen = 0x%x, "
> > > > -   "residual = 0x%x\n", ha->host_no,
> > > > -   cmd->device->channel,
> > > > -   cmd->device->id,
> > > > -   cmd->device->lun, __func__,
> > > > -   scsi_bufflen(cmd), residual));
> > > > -   break;
> > > > -   }
> > > > -   }
> > > 
> > > This code doesn't look dead to me, it looks to be enforcing
> > > cmd->underrun if set ... what makes the coverity checker think it can
> > > never be executed?
> > 
> > Hmm, guess it's the earlier 'if (scsi_status == 0)' check a few lines
> > up...  Dave S., can you take a look at this...  Thanks, av
> 
> Ah, so the !scsi_status is wrong it was supposed to be scsi_status !=
> 0 ... and even then it can just be dropped.

My guess is that the check should have been written as:

...
if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
scsi_set_resid(cmd, residual);
if ((scsi_bufflen(cmd) - residual) < cmd->underflow) {
...

It looks to be a logic-error while porting from qla2xxx, where
scsi_status during CS_COMPLETE is the full 16-bit status (high-byte is
transport, low-byte SCSI status) from the FCP_RSP frame (not so
in iSCSI, where it's just the SCSI-status) and the residual check
in qla_isr.c::qla2x00_status_entry() looks like:

if (!lscsi_status &&
((unsigned)(scsi_bufflen(cp) - resid) <
 cp->underflow)) {
...

I'll defer to Dave S. for verification.
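
To make the dead-code diagnosis concrete, here is a small user-space sketch (not driver code). It assumes, as described above, that the qla2xxx status word carries transport bits in the high byte and the SCSI status in the low byte, while the qla4xxx value is the bare SCSI status. The helper names and the 0x00ff mask are illustrative assumptions, not the drivers' definitions.

```c
#include <stdint.h>

/* Illustrative mask: low byte of the 16-bit FC status word (assumption). */
#define SCSI_STATUS_MASK_SKETCH 0x00ffu

/*
 * qla2xxx-style check: the 16-bit status may be non-zero only in the
 * high (transport) byte, so testing the low byte for zero is reachable.
 */
static int fc_underflow_error(uint16_t scsi_status, unsigned int bufflen,
			      unsigned int resid, unsigned int underflow)
{
	uint16_t lscsi_status = scsi_status & SCSI_STATUS_MASK_SKETCH;

	return !lscsi_status && (bufflen - resid) < underflow;
}

/*
 * qla4xxx-style check as originally written: a zero status has already
 * taken the early "OK" exit, so !scsi_status can never be true here.
 */
static int iscsi_underflow_error_as_written(uint8_t scsi_status,
					    unsigned int bufflen,
					    unsigned int resid,
					    unsigned int underflow)
{
	if (scsi_status == 0)
		return 0;	/* mirrors the earlier DID_OK early exit */

	return !scsi_status && (bufflen - resid) < underflow;	/* always 0 */
}

/* Check as guessed above: drop the status test, compare lengths only. */
static int iscsi_underflow_error_proposed(uint8_t scsi_status,
					  unsigned int bufflen,
					  unsigned int resid,
					  unsigned int underflow)
{
	if (scsi_status == 0)
		return 0;	/* same early exit as before */

	return (bufflen - resid) < underflow;
}
```

With a non-zero iSCSI status and a full residual, the as-written helper never reports an underflow, while the proposed form does; that is the behaviour Coverity flagged as dead code.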

Re: [2.6 patch] scsi/qla4xxx/ql4_isr.c: remove dead code

2008-02-19 Thread Andrew Vasquez

On Tue, 19 Feb 2008, James Bottomley wrote:

> On Tue, 2008-02-19 at 21:29 +0200, Adrian Bunk wrote:
> > This patch removes dead code spotted by the Coverity checker.
> > 
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> > 
> > ---
> > 
> >  drivers/scsi/qla4xxx/ql4_isr.c |   18 +-
> >  1 file changed, 1 insertion(+), 17 deletions(-)
> > 
> > --- linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c.old 2008-02-19 20:29:16.0 +0200
> > +++ linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c 2008-02-19 20:30:37.0 +0200
> > @@ -91,38 +91,22 @@ static void qla4xxx_status_entry(struct 
> > if (scsi_status == 0) {
> > cmd->result = DID_OK << 16;
> > break;
> > }
> >  
> > if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_OVER) {
> > cmd->result = DID_ERROR << 16;
> > break;
> > }
> >  
> > -   if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER) {
> > +   if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
> > scsi_set_resid(cmd, residual);
> > -   if (!scsi_status && ((scsi_bufflen(cmd) - residual) <
> > -   cmd->underflow)) {
> > -
> > -   cmd->result = DID_ERROR << 16;
> > -
> > -   DEBUG2(printk("scsi%ld:%d:%d:%d: %s: "
> > -   "Mid-layer Data underrun0, "
> > -   "xferlen = 0x%x, "
> > -   "residual = 0x%x\n", ha->host_no,
> > -   cmd->device->channel,
> > -   cmd->device->id,
> > -   cmd->device->lun, __func__,
> > -   scsi_bufflen(cmd), residual));
> > -   break;
> > -   }
> > -   }
> 
> This code doesn't look dead to me, it looks to be enforcing
> cmd->underrun if set ... what makes the coverity checker think it can
> never be executed?

Hmm, guess it's the earlier 'if (scsi_status == 0)' check a few lines
up...  Dave S., can you take a look at this...  Thanks, av

Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez

On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> > and send the resultant kernel logs?
> 
> Here's the output to the console (if there are other logs you need,
> let me know). I'll try the patch next, and sorry, hadn't realized
> merges were still coming in under 2.6.24 in Linus' tree... 
> 
> QLogic Fibre Channel HBA Driver
> ACPI: PCI Interrupt :40:01.0[A] -> GSI 38 (level, low) -> IRQ 58
> qla2xxx :40:01.0: Found an ISP2312, irq 58, iobase 0xc000a0041000
> qla2xxx :40:01.0: Configuring PCI space...
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids da1c.
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1f61c.
> qla2xxx :40:01.0: Configure NVRAM parameters...
> qla2xxx :40:01.0: Verifying loaded RISC code...
> scsi(14):  Load RISC code 
> scsi(14): Verifying Checksum of loaded RISC code.
> scsi(14): Checksum OK, start firmware.
> qla2xxx :40:01.0: Allocated (412 KB) for firmware dump...
> scsi(14): Issue init firmware.
> qla2x00_mailbox_command(14):  FAILED. mbx0=4001, mbx1=0, mbx2=ba8a, cmd=48

Ok, this is what I would have expected with Linus' tree prior to
the fix.  I just double-checked: the fix in question has yet to make
its way to Linus' tree.  It's currently in scsi-misc-2.6:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=a571fdf7caa010e17f6a70c0c52e0992e87af7db

which should filter up to linux-2.6.git during Linus' next pull.

thanks, av

Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez

On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> > Could you load the (default 2.6.24) driver with
> > ql2xextended_error_logging module parameter set:
> > 
> > # insmod qla2xxx ql2xextended_error_logging=1
> > 
> > and send the resultant kernel logs?
> 
> Could you try the patch referenced here:
> 
> qla2xxx: Correct issue where incorrect init-fw mailbox command was used on non-NPIV capable ISPs.
> http://article.gmane.org/gmane.linux.scsi/38240


BTW:  the regression in question is not present in vanilla 2.6.24.
Instead it was introduced early on in the 2.6.25 merge-window.  Linus'
tree currently has the patch referenced above as well.

--
av

Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez

On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> On Tue, 05 Feb 2008, Alan D. Brunelle wrote:
> 
> > commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> > Merge: 50d9a12... 23c3e29...
> > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > Date:   Fri Jan 25 17:19:08 2008 -0800
> > 
> > Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
> > 
> > > * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
> > 
> > I believe a regression was introduced. I'm running on a 4-way IA64,
> > with straight 2.6.24 and 2 dual-port cards:
> > 
> > 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > 
> > the adapters failed initialization. In particular, I narrowed it down
> > to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> > function. I went and removed the qla2x00-related parts of this (large-ish)
> > merge, and the 4 ports initialized just fine.
> 
> Could you load the (default 2.6.24) driver with
> ql2xextended_error_logging module parameter set:
> 
>   # insmod qla2xxx ql2xextended_error_logging=1
> 
> and send the resultant kernel logs?

Could you try the patch referenced here:

qla2xxx: Correct issue where incorrect init-fw mailbox command was used on non-NPIV capable ISPs.
http://article.gmane.org/gmane.linux.scsi/38240

Thanks, av

---

qla2xxx: Correct issue where incorrect init-fw mailbox command was used on non-NPIV capable ISPs.

BIT_2 of the firmware attributes is only valid on FW-interface-2
type HBAs.  Code in commit
c48339decceec8e011498b0fc4c7c7d8b2ea06c1 would cause the
incorrect initialize-firmware mailbox command to be issued for
non-NPIV capable ISPs.  Correct this by reverting to previously
used (and correct) pre-condition 'if' check.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_mbx.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index 0c10c0b..99d29ff 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -980,7 +980,7 @@ qla2x00_init_firmware(scsi_qla_host_t *ha, uint16_t size)
DEBUG11(printk("qla2x00_init_firmware(%ld): entered.\n",
ha->host_no));
 
-   if (ha->fw_attributes & BIT_2)
+   if (ha->flags.npiv_supported)
mcp->mb[0] = MBC_MID_INITIALIZE_FIRMWARE;
else
mcp->mb[0] = MBC_INITIALIZE_FIRMWARE;
-- 
1.5.4.rc5.5.gab98
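
As a rough illustration of the reasoning in the commit message above, the following user-space sketch keys the mailbox opcode off a derived capability flag rather than the raw attribute bit. Everything here is an assumption for illustration: the struct, the fw_interface_2 field, the way npiv_supported is derived, and the opcode values (0x48 is chosen to line up with the cmd=48 seen in the failure log reported in this thread). None of it is copied from the driver.

```c
#include <stdint.h>

/* Illustrative opcode and bit values (assumptions, not driver headers). */
#define MBC_MID_INITIALIZE_FIRMWARE_SKETCH 0x48u
#define MBC_INITIALIZE_FIRMWARE_SKETCH     0x60u
#define BIT_2_SKETCH                       (1u << 2)

/* Hypothetical, trimmed-down view of the per-HBA state. */
struct hba_sketch {
	uint16_t fw_attributes;	/* raw attribute word reported by firmware */
	int fw_interface_2;	/* non-zero on HBA types where BIT_2 is defined */
	int npiv_supported;	/* derived capability flag */
};

/*
 * Trust BIT_2 only on HBAs where that bit is meaningful; elsewhere it
 * may be set for unrelated reasons and must not imply NPIV support.
 */
static void derive_npiv(struct hba_sketch *ha)
{
	ha->npiv_supported = ha->fw_interface_2 &&
			     (ha->fw_attributes & BIT_2_SKETCH) != 0;
}

/* Pick the init-firmware opcode from the capability flag only. */
static uint16_t init_fw_opcode(const struct hba_sketch *ha)
{
	return ha->npiv_supported ? MBC_MID_INITIALIZE_FIRMWARE_SKETCH
				  : MBC_INITIALIZE_FIRMWARE_SKETCH;
}
```

An HBA that happens to have BIT_2 set but is not of the FW-interface-2 type then gets the plain initialize-firmware command, which is the behaviour the one-line patch restores.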


Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez

On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> Merge: 50d9a12... 23c3e29...
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Fri Jan 25 17:19:08 2008 -0800
> 
> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
> 
> * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
> 
> I believe a regression was introduced. I'm running on a 4-way IA64,
> with straight 2.6.24 and 2 dual-port cards:
> 
> 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> 
> the adapters failed initialization. In particular, I narrowed it down
> to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> function. I went and removed the qla2x00-related parts of this (large-ish)
> merge, and the 4 ports initialized just fine.

Could you load the (default 2.6.24) driver with
ql2xextended_error_logging module parameter set:

# insmod qla2xxx ql2xextended_error_logging=1

and send the resultant kernel logs?

> Specifically, reverting the "patch" below enabled the devices to initialize properly.
> 
> If need be, I'm certainly willing to help narrow down to the specific part in
> this patch...

That's a rather large patch... :(   Any chance you could git-bisect?
Also, could you send your .config file you are using?

Thanks,
Andrew Vasquez

Re: [PATCH 2/3] pci: Remove users of pci_enable_device_bars()

2008-01-31 Thread Andrew Vasquez

On Tue, 08 Jan 2008, Benjamin Herrenschmidt wrote:

> On Mon, 2008-01-07 at 11:42 -0800, Andrew Vasquez wrote:
> > That's fine.  I take it these patches will be funneled via
> > gregkh/pci-2.6.git.  There's some qla2xxx updates which are queued for
> > post-2.6.24 consumption in jejb/scsi-misc-2.6.git which don't appear
> > to have any conflicts.
> > 
> > I do though have a series of patches which I'll hold off on submitting
> > to linux-scsi until these PCI changes are merged, as there's some
> > minor conflicts merging the three branches.  James B, will that be
> > fine?
> > 
> > The patches themselves appear to be working fine within several of our
> > test rings.  I'll hold off on some of the cleanup until post-merge time...
> 
> Thanks for testing ! They should be queued with Greg indeed.

Just curious: this patch-set hasn't made it upstream to linux-2.6.git.
Will the work be deferred to 2.6.26?

--
av

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Luck, Tony wrote:

> > > Will try this next.
> >
> > Should work even better since it avoids a lock and copy, but please do
> > test if you have the time.
> 
> That one works too (survived two full builds at "make -j32" on the 16-way
> system).
> 
> Thanks for the quick turnaround.

Jens, this patch appears to work well on my 16-way box...  Compile
tests appear to pass (quickly).  I'll be sure to let you know if
anything else 'block' related pops up...

thanks, av

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Jens Axboe wrote:

> Great, thanks for confirming. It does look like a clear bug in cciss, it
> just got exposed now that it uses proper end request handling. We never
> need to clear ->data_len, since for blk_fs_request() it will be cleared
> on init. So just setting a residual count there for blk_fs_request()
> like cciss does is fine.
> 
> Anyway, it's in my pending queue for Linus.
> 


Hmm, probably not related to the block changes in your tree, but I'm
seeing yet another problem after working the machine (compile jobs):

[   61.423922] BUG: spinlock recursion on CPU#2, kjournald/2317
[   61.427843]  lock: 81042c5a4988, .magic: dead4ead, .owner: 
kjournald/2317, .owner_cpu: 2
[   61.427843] Pid: 2317, comm: kjournald Not tainted 2.6.24 #45
[   61.427843] 
[   61.427843] Call Trace:
[   61.427843]  [] _raw_spin_lock+0xe9/0x12a
[   61.427843]  [] as_merged_requests+0xfe/0x115
[   61.427843]  [] elv_merge_requests+0x1f/0x45
[   61.427843]  [] attempt_merge+0x281/0x347
[   61.427843]  [] __make_request+0x1e6/0x598
[   61.427843]  [] generic_make_request+0x1c8/0x276
[   61.427843]  [] submit_bio+0x61/0xdb
[   61.427843]  [] submit_bh+0xe2/0x118
[   61.427843]  [] journal_do_submit_data+0x28/0x39
[   61.427843]  [] 
journal_commit_transaction+0xdbe/0x1394
[   61.427843]  [] lock_timer_base+0x26/0x4e
[   61.427843]  [] kjournald+0x104/0x373
[   61.427843]  [] autoremove_wake_function+0x0/0x2e
[   61.427843]  [] kjournald+0x0/0x373
[   61.427843]  [] kthread+0x3d/0x61
[   61.427843]  [] child_rip+0xa/0x12
[   61.427843]  [] kthread+0x0/0x61
[   61.427843]  [] child_rip+0x0/0x12
[   61.427843] 
[  124.555789] BUG: soft lockup - CPU#6 stuck for 61s! [as:7191]
[  124.555789] CPU 6:
[  124.555789] Modules linked in:
[  124.555789] Pid: 7191, comm: as Not tainted 2.6.24 #45
[  124.555789] RIP: 0010:[]  [] 
_raw_spin_lock+0xa5/0x12a
[  124.555789] RSP: 0018:81042b50be18  EFLAGS: 0246
[  124.555789] RAX:  RBX: 0ef44415 RCX: 
2b543897
[  124.555789] RDX: 009c RSI: 81042c87b868 RDI: 
0001
[  124.555789] RBP: 2b0273879000 R08: 7fff38107000 R09: 
805739c4
[  124.555789] R10: 81042f33ed78 R11: 81042f33ed78 R12: 
81042b50f3c0
[  124.555789] R13: 0010 R14: 8000 R15: 

[  124.555789] FS:  2b02736f3ef0() GS:81042f98a560() 
knlGS:
[  124.555789] CS:  0010 DS:  ES:  CR0: 8005003b
[  124.555789] CR2: 0085d018 CR3: 00042cb0e000 CR4: 
06e0
[  124.555789] DR0:  DR1:  DR2: 

[  124.555789] DR3:  DR6: 0ff0 DR7: 
0400
[  124.555789] 
[  124.555789] Call Trace:
[  124.555789]  [] _raw_spin_lock+0xb3/0x12a
[  124.555789]  [] flush_tlb_others+0x4b/0xa8
[  124.555789]  [] flush_tlb_mm+0x4a/0x99
[  124.555789]  [] unmap_region+0x10a/0x141
[  124.555789]  [] do_munmap+0x1fd/0x2b9
[  124.555789]  [] __down_write_nested+0xa0/0xb0
[  124.555789]  [] sys_munmap+0x3b/0x57
[  124.555789]  [] system_call+0x7e/0x83
[  124.555789] 
[  124.555789] BUG: soft lockup - CPU#14 stuck for 61s! [cc1:7190]
[  124.555789] CPU 14:
[  124.555789] Modules linked in:
[  124.555789] Pid: 7190, comm: cc1 Not tainted 2.6.24 #45
[  124.555789] RIP: 0010:[]  [] 
flush_tlb_others+0x75/0xa8
[  124.555789] RSP: 0018:81042b965e48  EFLAGS: 0202
[  124.555789] RAX: 0010 RBX: 0006 RCX: 
0003
[  124.555789] RDX: 0010 RSI: 81042b965df8 RDI: 
0002
[  124.555789] RBP: 810011cc2658 R08: 2aea0f72d000 R09: 
80573dc4
[  124.555789] R10: 81042e65b7b8 R11: 81042e65b7b8 R12: 
80630640
[  124.555789] R13:  R14: 80264d07 R15: 
81042e0dd960
[  124.555789] FS:  2aea0fafb6f0() GS:81042fb01cc0() 
knlGS:
[  124.555789] CS:  0010 DS:  ES:  CR0: 80050033
[  124.555789] CR2: 2aea117fe000 CR3: 00042b537000 CR4: 
06e0
[  124.555789] DR0:  DR1:  DR2: 

[  124.555789] DR3:  DR6: 0ff0 DR7: 
0400
[  124.555789] 
[  124.555789] Call Trace:
[  124.555789]  [] flush_tlb_others+0x69/0xa8
[  124.555789]  [] flush_tlb_mm+0x4a/0x99
[

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Miller, Mike (OS Dev) wrote:

> Jens wrote:
> 
> > -Original Message-
> > From: Jens Axboe [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, January 29, 2008 12:54 PM
> > To: Andrew Vasquez
> > Cc: Linux Kernel Mailing List; Miller, Mike (OS Dev);
> > [EMAIL PROTECTED]; [EMAIL PROTECTED]
> > Subject: Re: kernel BUG at drivers/block/cciss.c:1260! (with
> > recent linux-2.6 tree)
> >
> > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > >
> > > > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > > > >
> > > > > > > Here the final snippet that was logged:
> > > > > > >
> > > > > > > [   12.724997] input: USB HID v1.01 Mouse [HP
> > Virtual Keyboard] on usb-:01:04.4-1
> > > > > > > [   12.728971] usbcore: registered new interface
> > driver usbhid
> > > > > > > [   12.732866] drivers/hid/usbhid/hid-core.c:
> > v2.6:USB HID core driver
> > > > > > > [   12.741172] TCP cubic registered
> > > > > > > [   12.744506] NET: Registered protocol family 1
> > > > > > > [   12.744884] NET: Registered protocol family 17
> > > > > > > [   12.749217] Freeing unused kernel memory: 228k freed
> > > > > > > [   12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > > > > > > [   12.888929]
> > > > > > > [   12.888930] sector 651061426900570, nr/cnr 0/0
> > > > > > > [   12.892895] bio 81042f130730, biotail
> > 81042f130730, buffer , data
> > , len 0
> > > > > > > [   12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00
> > 00 00 00 00 00 00
> > > > > >
> > > > > > Ah ok, I see the problem... cciss is overriding the
> > data_len for
> > > > > > BLOCK_PC requests, hence it does not complete them properly.
> > > > > > Hmm. Does this work?
> > > > > >
> > > > > > diff --git a/drivers/block/cciss.c
> > b/drivers/block/cciss.c index
> > > > > > ef50068..b6fa52e 100644
> > > > > > --- a/drivers/block/cciss.c
> > > > > > +++ b/drivers/block/cciss.c
> > > > > > @@ -2524,7 +2524,6 @@ after_error_processing:
> > > > > > resend_cciss_cmd(h, cmd);
> > > > > > return;
> > > > > > }
> > > > > > -   cmd->rq->data_len = 0;
> > > > > > cmd->rq->completion_data = cmd;
> > > > > > blk_complete_request(cmd->rq);  }
> > > > >
> > > > >
> > > > > Things look good so far -- with the patch above I can
> > finally boot
> > > > > the machine.
> > > >
> > > > Cool, sorry about that. Will get that applied asap. So after this
> > > > patch was applied, you didn't see any debug messages from
> > > > blk_dump_rq_flags() anymore, right?
> > >
> > > That's correct.  I've yet to see any additional debug-messages from
> > > blk_dump_rq_flags().
> >
> > Great, thanks for confirming. It does look like a clear bug
> > in cciss, it just got exposed now that it uses proper end
> > request handling. We never need to clear ->data_len, since
> > for blk_fs_request() it will be cleared on init. So just
> > setting a residual count there for blk_fs_request() like
> > cciss does is fine.
> 
> Just so I'm clear: just removing the one line is enough to resolve the 
> problem?

That's correct.  The only other change to cciss.c in my tree is where
the BUG() call was replaced with a call to blk_dump_rq_flags():

@@ -1257,7 +1257,8 @@ static void cciss_softirq_done(struct request *rq)
 #endif /* CCISS_DEBUG */
 
 	if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
-		BUG();
+		blk_dump_rq_flags(rq, "cciss rq");
+//		BUG();
 
 	spin_lock_irqsave(&h->lock, flags);
 	cmd_free(h, cmd, 1);
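
For anyone following along, here is a minimal userspace sketch of why
removing that one line mattered.  It is a model under stated assumptions,
not the kernel code: it assumes the completion call sizes a BLOCK_PC
request from data_len and reports the request as unfinished while bytes
remain pending.  All model_* names are made up for illustration.

```c
/*
 * Userspace model only; this is NOT the kernel implementation.
 * Every model_* name is hypothetical and exists just for this sketch.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_request {
	bool is_fs;             /* true for filesystem I/O, false for BLOCK_PC */
	unsigned int fs_bytes;  /* size of a filesystem request */
	unsigned int data_len;  /* size of a BLOCK_PC request */
	unsigned int pending;   /* bytes not yet completed */
};

/* Assumed sizing rule: BLOCK_PC requests are sized from data_len. */
unsigned int model_rq_bytes(const struct model_request *rq)
{
	assert(rq != NULL);
	return rq->is_fs ? rq->fs_bytes : rq->data_len;
}

/* Complete nr_bytes; return true while the request still has bytes pending. */
bool model_end_request(struct model_request *rq, unsigned int nr_bytes)
{
	assert(rq != NULL);
	if (nr_bytes >= rq->pending) {
		rq->pending = 0;
		return false;
	}
	rq->pending -= nr_bytes;
	return true;
}

/*
 * Model of the driver's completion step.  With clear_len set it zeroes
 * data_len first, as the removed line did.  Returns true when the
 * request is left incomplete (the case that reached BUG()).
 */
bool model_driver_complete(struct model_request *rq, bool clear_len)
{
	assert(rq != NULL);
	if (clear_len)
		rq->data_len = 0;
	return model_end_request(rq, model_rq_bytes(rq));
}
```

In this model, a BLOCK_PC request whose data_len is cleared first gets
completed for 0 bytes and stays pending, which corresponds to the path
that used to hit BUG().  A filesystem request is unaffected because its
size does not come from data_len here.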

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Jens Axboe wrote:

> On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > 
> > > > Here the final snippet that was logged:
> > > > 
> > > > [   12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on 
> > > > usb-:01:04.4-1
> > > > [   12.728971] usbcore: registered new interface driver usbhid
> > > > [   12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> > > > [   12.741172] TCP cubic registered
> > > > [   12.744506] NET: Registered protocol family 1
> > > > [   12.744884] NET: Registered protocol family 17
> > > > [   12.749217] Freeing unused kernel memory: 228k freed
> > > > [   12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > > > [   12.888929] 
> > > > [   12.888930] sector 651061426900570, nr/cnr 0/0
> > > > [   12.892895] bio 81042f130730, biotail 81042f130730, buffer 
> > > > , data , len 0
> > > > [   12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00 
> > > 
> > > Ah ok, I see the problem... cciss is overriding the data_len for
> > > BLOCK_PC requests, hence it does not complete them properly. Hmm. Does
> > > this work?
> > > 
> > > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> > > index ef50068..b6fa52e 100644
> > > --- a/drivers/block/cciss.c
> > > +++ b/drivers/block/cciss.c
> > > @@ -2524,7 +2524,6 @@ after_error_processing:
> > >   resend_cciss_cmd(h, cmd);
> > >   return;
> > >   }
> > > - cmd->rq->data_len = 0;
> > >   cmd->rq->completion_data = cmd;
> > >   blk_complete_request(cmd->rq);
> > >  }
> > 
> > 
> > Things look good so far -- with the patch above I can finally boot the
> > machine.
> 
> Cool, sorry about that. Will get that applied asap. So after this patch
> was applied, you didn't see any debug messages from blk_dump_rq_flags()
> anymore, right?

That's correct.  I've yet to see any additional debug-messages from
blk_dump_rq_flags().

--
av

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Jens Axboe wrote:

> > Here the final snippet that was logged:
> > 
> > [   12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on 
> > usb-:01:04.4-1
> > [   12.728971] usbcore: registered new interface driver usbhid
> > [   12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> > [   12.741172] TCP cubic registered
> > [   12.744506] NET: Registered protocol family 1
> > [   12.744884] NET: Registered protocol family 17
> > [   12.749217] Freeing unused kernel memory: 228k freed
> > [   12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > [   12.888929] 
> > [   12.888930] sector 651061426900570, nr/cnr 0/0
> > [   12.892895] bio 81042f130730, biotail 81042f130730, buffer 
> > , data , len 0
> > [   12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00 
> 
> Ah ok, I see the problem... cciss is overriding the data_len for
> BLOCK_PC requests, hence it does not complete them properly. Hmm. Does
> this work?
> 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index ef50068..b6fa52e 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -2524,7 +2524,6 @@ after_error_processing:
>   resend_cciss_cmd(h, cmd);
>   return;
>   }
> - cmd->rq->data_len = 0;
>   cmd->rq->completion_data = cmd;
>   blk_complete_request(cmd->rq);
>  }


Things look good so far -- with the patch above I can finally boot the
machine.

Thanks, av

Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

On Tue, 29 Jan 2008, Jens Axboe wrote:

> Andrew, can you try with this applied?
> 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index ef50068..bd7b352 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -1257,7 +1257,7 @@ static void cciss_softirq_done(struct request *rq)
>  #endif   /* CCISS_DEBUG */
>  
>   if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
> - BUG();
> + blk_dump_rq_flags(rq, "cciss rq");
>  
>   spin_lock_irqsave(&h->lock, flags);
>   cmd_free(h, cmd, 1);

Here's the final snippet that was logged:

[   12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on 
usb-:01:04.4-1
[   12.728971] usbcore: registered new interface driver usbhid
[   12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
[   12.741172] TCP cubic registered
[   12.744506] NET: Registered protocol family 1
[   12.744884] NET: Registered protocol family 17
[   12.749217] Freeing unused kernel memory: 228k freed
[   12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
[   12.888929] 
[   12.888930] sector 651061426900570, nr/cnr 0/0
[   12.892895] bio 81042f130730, biotail 81042f130730, buffer 
, data , len 0
[   12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00 

kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Andrew Vasquez

Hitting a consistent BUG() with recent Linus' linux-2.6.git:

[   12.941428] [ cut here ]
[   12.944874] kernel BUG at drivers/block/cciss.c:1260!
[   12.944874] invalid opcode:  [1] SMP 
[   12.944874] CPU 0 
[   12.944874] Modules linked in:
[   12.944874] Pid: 0, comm: swapper Not tainted 2.6.24 #43
[   12.944874] RIP: 0010:[8039e43d]  [8039e43d] 
cciss_softirq_done+0xbc/0x1bf
[   12.944874] RSP: 0018:8063aed0  EFLAGS: 00010202
[   12.944874] RAX: 0001 RBX: 8100cf800010 RCX: 
81042f1253b0
[   12.944874] RDX: 81042de398f0 RSI: 81042de398f0 RDI: 
0001
[   12.944874] RBP: 81042daa R08: 81042f1253b0 R09: 
0001
[   12.944874] R10: 00fe R11:  R12: 
0002
[   12.944874] R13: 0001 R14: 8100cf80 R15: 
81042de398f0
[   12.944874] FS:  () GS:805bb000() 
knlGS:
[   12.944874] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[   12.944874] CR2: 2afed7eea340 CR3: 00042dbba000 CR4: 
06e0
[   12.944874] DR0:  DR1:  DR2: 

[   12.944874] DR3:  DR6: 0ff0 DR7: 
0400
[   12.944874] Process swapper (pid: 0, threadinfo 805f4000, 
task 805624a0)
[   12.944874] Stack:   8063af10 
0001 80632d60
[   12.944874]   000a 805bb900 
8032038f
[   12.944874]  8063af10 8063af10 805bb940 
802346b4
[   12.944874] Call Trace:
[   12.944874]  <IRQ>  [8032038f] blk_done_softirq+0x69/0x78
[   12.944874]  [802346b4] __do_softirq+0x6f/0xd8
[   12.944874]  [8020c45c] call_softirq+0x1c/0x30
[   12.944874]  [8020e347] do_softirq+0x30/0x80
[   12.944874]  [8020e409] do_IRQ+0x72/0xd9
[   12.944874]  [8020a50a] mwait_idle+0x0/0x46
[   12.944874]  [8020a3da] default_idle+0x0/0x3d
[   12.944874]  [8020b7e1] ret_from_intr+0x0/0xa
[   12.944874]  <EOI>  [8020a54c] mwait_idle+0x42/0x46
[   12.944874]  [8020a481] cpu_idle+0x6a/0xae
[   12.944874] 
[   12.944874] 
[   12.944874] Code: 0f 0b eb fe 48 8d 85 d8 c0 00 00 48 89 04 24 48 89 
c7 e8 e5 
[   12.944874] RIP  [8039e43d] cciss_softirq_done+0xbc/0x1bf
[   12.944874]  RSP 8063aed0
[   12.944903] ---[ end trace e9c631603f90d22f ]---

The code in question is in drivers/block/cciss.c:cciss_softirq_done():

	...
	if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
		BUG();

This appears to be a result of a recent merge:

commit f0f0052069989b80d2a3e50c9cd2f2a650bc1aea
Refs: v2.6.24-1949-gf0f0052
Merge: 68fbda7... a65b586...
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Tue Jan 29 08:51:32 2008 +1100

Merge branch 'blk-end-request' of 
git://git.kernel.dk/linux-2.6-block

Here's the commit which added the blk_end_request() BUG() on:

commit 3daeea29f9348263e0dda89a565074390475bdf8
Refs: v2.6.24-1743-g3daeea2
Author: Kiyoshi Ueda <[EMAIL PROTECTED]>
Date:   Tue Dec 11 17:50:03 2007 -0500

blk_end_request: changing cciss (take 4)

This patch converts cciss to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

cciss is a little bit different from "normal" drivers.
cciss directly calls bio_endio() and disk_stat_add()
when completing request.  But those can be replaced with
__end_that_request_first().
After the replacement, request completion procedures of
those drivers become like the following:
o end_that_request_first()
o add_disk_randomness()
o end_that_request_last()
This can be converted to blk_end_request() by following
the rule (a) mentioned in the patch subject
"[PATCH 01/30] blk_end_request: add new request completion 
interface".

Cc: Mike Miller <[EMAIL PROTECTED]>
Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>

--
av

Re: [PATCH 2/3] pci: Remove users of pci_enable_device_bars()

2008-01-07 Thread Andrew Vasquez

On Thu, 20 Dec 2007, Benjamin Herrenschmidt wrote:

> This patch converts users of pci_enable_device_bars() to the new
> pci_enable_device_{io,mem} interface.
> 
> The new API fits nicely, except maybe for the QLA case where a bit of
> code re-organization might be a good idea but I prefer sticking to the
> simple patch as I don't have hardware to test on.

That's fine.  I take it these patches will be funneled via
gregkh/pci-2.6.git.  There are some qla2xxx updates queued for
post-2.6.24 consumption in jejb/scsi-misc-2.6.git which don't appear
to have any conflicts.

I do, though, have a series of patches which I'll hold off on submitting
to linux-scsi until these PCI changes are merged, as there are some
minor conflicts merging the three branches.  James B, will that be
fine?

The patches themselves appear to be working fine within several of our
test rings.  I'll hold off on some of the cleanup until post-merge time...

Thanks,
Andrew Vasquez

> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> ---
> 
>  drivers/ata/pata_cs5520.c   |2 +-
>  drivers/i2c/busses/scx200_acb.c |2 +-
>  drivers/ide/pci/cs5520.c|   10 --
>  drivers/ide/setup-pci.c |6 --
>  drivers/scsi/lpfc/lpfc_init.c   |3 +--
>  drivers/scsi/qla2xxx/qla_os.c   |   12 +---

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>

Re: [PATCH] drivers/scsi/: Spelling fixes

2007-12-17 Thread Andrew Vasquez

On Mon, 17 Dec 2007, Joe Perches wrote:

> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/NCR53C9x.h   |2 +-
>  drivers/scsi/aic7xxx/aic79xx_inline.h |2 +-
>  drivers/scsi/aic7xxx/aic79xx_osm.c|2 +-
>  drivers/scsi/aic7xxx/aic79xx_pci.c|4 ++--
>  drivers/scsi/aic7xxx/aic7xxx_inline.h |2 +-
>  drivers/scsi/aic7xxx/aic7xxx_osm.c|2 +-
>  drivers/scsi/ipr.c|2 +-
>  drivers/scsi/ips.c|2 +-
>  drivers/scsi/iscsi_tcp.c  |4 ++--
>  drivers/scsi/lpfc/lpfc.h  |2 +-
>  drivers/scsi/lpfc/lpfc_mbox.c |2 +-
>  drivers/scsi/megaraid/megaraid_mbox.c |   10 +-
>  drivers/scsi/psi240i.c|2 +-
>  drivers/scsi/qla2xxx/qla_gs.c |2 +-

qla2xxx bits:

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>

Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-09 Thread Andrew Vasquez

On Tue, 09 Oct 2007, James Smart wrote:

>  Why do you prefer request_firmware() vs something over sysfs ?
> 
>  Does environments like the kdump kernel also have access to data needed
>  by request_firmware() ?

There's already much in the way of automation and infrastructure
present in supporting the request_firmware() interfaces (perhaps not
the best of names) which can provide for a level of flexibility beyond
a basic 'soft_port_name' interface.

Though I don't see why both can't coexist cleanly -- I take it the use
case you are considering is: software recognizes no valid WWPN
available, query via request_firmware() fails, software halts
initialization (rather than fail), and awaits the admin to poke
'0x123456.. > /sys/.../fc_host/soft_port_name', causing a ping to the
driver and continuation of initialization with requested portname?

Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Andrew Vasquez

On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> On Mon, Oct 08, 2007 at 03:48:32PM -0700, Andrew Vasquez wrote:
> 
> > So how about factoring that out to a transport-level interface.  How
> > about something along the lines of the following patch, whereby the
> > software driver upon detecting no valid WWPN, makes an upcall to each
> > interface's 'request_wwn()'.  The data passed in from shost_gendev
> > should be enough for some helper script to cull relevant device bits
> > and perhaps offer some level of persistence...  Off base?
> 
> Hrm... jejb made a remark that it might be better to pass the
> scsi_host's device into request_firmware() as your example does, so I'll
> pitch in a patch to do likewise with libsas--the scsi_host knows the
> actual device it's coming from, and userland can sort that all out later
> anyway via DEVPATH.
> 
> I suppose one could also have multiple scsi_hosts per PCI device, which
> means that my first patch would stumble horribly in more than a few
> cases.

This is done already in the FC case -- NPIV.  Though with that
interface, the administrator is already responsible for assigning
proper WWNN/WWPN during creation.

> > Darrick, forgive the FC example, I don't do SAS...
> 
> That's ok, I don't do FC. :)  Looks mostly good to me...

--
av

Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Andrew Vasquez

On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> If the aic94xx chip doesn't have a SAS address in the chip's flash memory,
> use the request_firmware() interface to get one from userspace.  This
> way, there's no debate as to who or how an address gets generated--it's
> totally up to the administrator to provide it if the card doesn't have one.

So how about factoring that out to a transport-level interface.  How
about something along the lines of the following patch, whereby the
software driver upon detecting no valid WWPN, makes an upcall to each
interface's 'request_wwn()'.  The data passed in from shost_gendev
should be enough for some helper script to cull relevant device bits
and perhaps offer some level of persistence...  Off base?

Darrick, forgive the FC example, I don't do SAS...

--
av

--

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 7a7cfe5..5e0d953 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -35,6 +35,7 @@
 #include <linux/netlink.h>
 #include <net/netlink.h>
 #include <scsi/scsi_netlink_fc.h>
+#include <linux/firmware.h>
 #include "scsi_priv.h"
 #include "scsi_transport_fc_internal.h"
 
@@ -3251,6 +3252,30 @@ fc_vport_sched_delete(struct work_struct *work)
vport->channel, stat);
 }
 
+int
+fc_request_wwn(struct Scsi_Host *shost, u64 *wwn)
+{
+   const struct firmware *fw;
+   int stat;
+
+   stat = request_firmware(&fw, "fc_addr", &shost->shost_gendev);
+   if (stat)
+   return stat;
+
+   if (fw->size < 16) {
+   stat = -EINVAL;
+   goto out;
+   }
+
+   stat = fc_parse_wwn(fw->data, wwn);
+   if (stat)
+   return stat;
+
+out:
+   release_firmware(fw);
+   return stat;
+}
+EXPORT_SYMBOL(fc_request_wwn);
 
 /* Original Author:  Martin Hicks */
 MODULE_AUTHOR("James Smart");
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index e466d88..e80c36c 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -734,4 +734,6 @@ void fc_host_post_vendor_event(struct Scsi_Host *shost, u32 event_number,
 */
 int fc_vport_terminate(struct fc_vport *vport);
 
+int fc_request_wwn(struct Scsi_Host *, u64 *);
+
 #endif /* SCSI_TRANSPORT_FC_H */
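
The patch above rejects a firmware blob shorter than 16 bytes before handing it to fc_parse_wwn(). Assuming the blob carries a 64-bit WWN as 16 ASCII hex digits, the length check and parse can be sketched in userspace as below. The name parse_wwn_hex() and its -1 error return are hypothetical and chosen only for illustration; this is not the kernel's fc_parse_wwn().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace sketch, not the kernel's fc_parse_wwn().
 * Parses the first 16 hex digits of buf into a 64-bit name.
 * Returns 0 on success, -1 if buf is too short or not hex.
 */
int parse_wwn_hex(const char *buf, size_t len, uint64_t *wwn)
{
	uint64_t value = 0;
	size_t i;

	if (len < 16)
		return -1;	/* too short to hold a 64-bit name */

	for (i = 0; i < 16; i++) {
		char c = buf[i];
		unsigned int nibble;

		if (c >= '0' && c <= '9')
			nibble = (unsigned int)(c - '0');
		else if (c >= 'a' && c <= 'f')
			nibble = (unsigned int)(c - 'a') + 10;
		else if (c >= 'A' && c <= 'F')
			nibble = (unsigned int)(c - 'A') + 10;
		else
			return -1;	/* not a hex digit */

		/* Shift in one nibble at a time, most significant first. */
		value = (value << 4) | nibble;
	}

	*wwn = value;
	return 0;
}
```

In a kernel version of this flow, every path taken after a successful request_firmware() would also need to reach release_firmware(), including the path where parsing fails.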

Re: [3/3] 2.6.23-rc2: known regressions v2

2007-08-08 Thread Andrew Vasquez

On Wed, 08 Aug 2007, Michal Piotrowski wrote:

> Here is a list of some known regressions in 2.6.23-rc2.
> 
> Feel free to add new regressions/remove fixed etc.
> http://kernelnewbies.org/known_regressions
...
> SCSI
> 
> Subject : unable to handle kernel NULL pointer dereference in 
> qla2x00_read_nvram_data
> References  : http://lkml.org/lkml/2007/8/6/506
> Last known good : ?
> Submitter   : Zhang, Yanmin <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

Already addressed in Linus' latest GIT tree:

http://article.gmane.org/gmane.linux.scsi/33472/

[PATCH] qla2xxx: allocate enough space for the full PCI descriptor.

2007-08-01 Thread Andrew Vasquez

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

On Thu, 26 Jul 2007, Andrew Vasquez wrote:

> On Thu, 26 Jul 2007, Andrew Patterson wrote:
> 
> > On Thu, 2007-07-26 at 15:36 +0200, Ulrich Windl wrote:
> > > Hi,
> > > 
> > > <6>QLogic Fibre Channel HBA Driver
> > > <6>GSI 49 (level, low) -> CPU 3 (0x0300) vector 51
> > > <6>ACPI: PCI Interrupt :0f:01.0[A] -> GSI 49 (level, low) -> IRQ 51
> > > <6>qla2xxx :0f:01.0: Found an ISP2422, irq 51, iobase 0xc000b004
> > > [...]
> > > <6>qla2xxx :0f:01.0: LOOP UP detected (4 Gbps).
> > > <6>qla2xxx :0f:01.0: Topology - (F_Port), Host Loop address 0x0
> > > <6>scsi0 : qla2xxx
> > > <6>qla2xxx :0f:01.0:
> > > <4> QLogic Fibre Channel HBA Driver: 8.01.07-k3
> > > <4>  QLogic HP AB378-60001 -
> > > <4>  ISP2422: PCI-X Mode 2 (133 MH4.00.26 [IP]  @ :0f:01.0 hdma+, host#=0, fw=4.00.26 [IP]
> 
> The 33/66/100/133 values refer to the bus-clock speed at which the
> card is operating.  As is seen here (although a bit truncated --
> separate issue, I'll try to see if I can reproduce this on one of my
> HPQ rigs),

Ok, so what's happening here is the buffer passed in (pci_info)
does not have enough bytes allocated (off by 3).

James, please apply...

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 93c0c7e..acca898 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1564,7 +1564,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
struct Scsi_Host *host;
scsi_qla_host_t *ha;
unsigned long   flags = 0;
-   char pci_info[20];
+   char pci_info[30];
char fw_str[30];
struct scsi_host_template *sht;
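
The "off by 3" figure can be checked by counting characters. The sketch below assumes the full descriptor reads "PCI-X Mode 2 (133 MHz)"; the trailing " MHz)" is inferred from the truncated log line quoted above and is not shown in the patch itself. The helper name pci_info_bytes_needed() is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: bytes needed to hold the descriptor,
 * including the terminating NUL. The " MHz)" suffix is an
 * assumption taken from the truncated log output.
 */
size_t pci_info_bytes_needed(const char *mode, const char *clock_mhz)
{
	char buf[64];
	/* snprintf returns the length of the formatted string, without the NUL. */
	int n = snprintf(buf, sizeof(buf), "PCI-X %s (%s MHz)", mode, clock_mhz);

	if (n < 0)
		return 0;
	return (size_t)n + 1;
}
```

Under that assumption, pci_info_bytes_needed("Mode 2", "133") comes to 23 bytes: 22 characters plus the NUL, which is 3 more than the original char pci_info[20] and fits within the patched char pci_info[30].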
 

Re: [PATCH 64] drivers/scsi/qla2xxx/qla_init.c: mostly kmalloc + memset conversion to k[cz]alloc

2007-07-31 Thread Andrew Vasquez

On Tue, 31 Jul 2007, Mariusz Kozlowski wrote:

> Signed-off-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
> 
>  drivers/scsi/qla2xxx/qla_init.c | 107445 -> 107327 (-118 bytes)
>  drivers/scsi/qla2xxx/qla_init.o | 237540 -> 237424 (-116 bytes)
> 
>  drivers/scsi/qla2xxx/qla_init.c |   14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>

Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)

2007-07-31 Thread Andrew Vasquez

> On Fri, 27 Jul 2007, Andrew Patterson wrote:
> 
> > On Thu, 2007-07-26 at 23:23 -0700, Andrew Vasquez wrote:
> >
> > > The 33/66/100/133 values refer to the bus-clock speed at which the
> > > card is operating.  As is seen here (although a bit truncated --
> > > separate issue, I'll try to see if I can reproduce this on one of my
> > > HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
> > > (bus clock) slot.  When operating under this mode, each data-phase
> > > between two devices is divided into 2 sub-phases, effectively doubling
> > > the transfer-data-rate to 266Mhz.
> > 
> > I guess the proper terminology would be 266 MT/s (Mega
> > > Transfers/second). Looking through the PCI-SIG PCI-X 2.0 marketing
> > blurbs, they use MHz a lot when referring to MT/S. So I would still
> > consider this to be a minor bug.  The user wants to know the transfer
> > rate, not the actual frequency of the bus.  Maybe just print out the
> > mode used instead, e.g., "PCI-X 266"?

Given PCI-X Mode-2 can run at different bus-clock speeds, how about
this as an alternative?

PCI-X 266 (133Mhz)

it's a bit more descriptive than

PCI-X Mode 2 (133Mhz)

then again, I don't want to beat this thing to death...

---

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index c488996..26f7e54 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -283,9 +283,9 @@ qla24xx_pci_info_str(struct scsi_qla_host *ha, char *str)
} else {
strcat(str, "-X ");
if (pci_bus & BIT_2)
-   strcat(str, "Mode 2");
+   strcat(str, "266");
else
-   strcat(str, "Mode 1");
+   strcat(str, "133");
strcat(str, " (");
strcat(str, pci_bus_modes[pci_bus & ~BIT_2]);
}
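
As a rough illustration of the proposed label, the sketch below derives the transfer rate from the bus clock using the doubling described earlier in the thread (two sub-phases per data phase in Mode 2). This differs from the patch, which appends the fixed strings "266" and "133". The helper pcix_label() and its parameters are hypothetical, and it does not attempt to cover other PCI-X 2.0 rates.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "PCI-X <rate> (<clock> MHz)" into out.
 * mode2 != 0 means each data phase is split into two sub-phases,
 * so the transfer rate is twice the bus clock.
 * Returns the snprintf() result (length of the full label).
 */
int pcix_label(char *out, size_t size, int mode2, unsigned int bus_clock_mhz)
{
	unsigned int rate = mode2 ? 2 * bus_clock_mhz : bus_clock_mhz;

	return snprintf(out, size, "PCI-X %u (%u MHz)", rate, bus_clock_mhz);
}
```

With mode2 set and a 133 MHz bus clock this yields the "PCI-X 266 (133 MHz)" form suggested above.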

Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)

2007-07-27 Thread Andrew Vasquez

On Fri, 27 Jul 2007, Andrew Patterson wrote:

> On Thu, 2007-07-26 at 23:23 -0700, Andrew Vasquez wrote:
>
> > The 33/66/100/133 values refer to the bus-clock speed at which the
> > card is operating.  As is seen here (although a bit truncated --
> > separate issue, I'll try to see if I can reproduce this on one of my
> > HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
> > (bus clock) slot.  When operating under this mode, each data-phase
> > between two devices is divided into 2 sub-phases, effectively doubling
> > the transfer-data-rate to 266Mhz.
> 
> I guess the proper terminology would be 266 MT/s (Mega
> Transfers/second). Looking through the PCI-SIG PCI-X 2.0 marketing
> blurbs, they use MHz a lot when referring to MT/S. So I would still
> consider this to be a minor bug.  The user wants to know the transfer
> rate, not the actual frequency of the bus.  Maybe just print out the
> mode used instead, e.g., "PCI-X 266"?

That sounds reasonable.  I'll spin some patches today after I verify
all the bus-bits with the PUI group.

Thanks,
Andrew Vasquez

Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)

2007-07-27 Thread Andrew Vasquez

On Thu, 26 Jul 2007, Andrew Patterson wrote:

> On Thu, 2007-07-26 at 15:36 +0200, Ulrich Windl wrote:
> > Hi,
> > 
> > I have a question: The Qlogic ISP2422 chip is said to handle PCI-X 266MHz.
> > So does the HP Itanium2 server rx6600. Basically that was the reason to
> > select that server. The FC-HBA is in a 266 MHz capable slot. However when
> > booting SLES10 SP1 for IA64, the logs say:

There's a mixup here in terminology...  The QLA2460 card which you
have does in fact support 'PCI-X 266'...

> > <6>QLogic Fibre Channel HBA Driver
> > <6>GSI 49 (level, low) -> CPU 3 (0x0300) vector 51
> > <6>ACPI: PCI Interrupt :0f:01.0[A] -> GSI 49 (level, low) -> IRQ 51
> > <6>qla2xxx :0f:01.0: Found an ISP2422, irq 51, iobase 0xc000b004
> > [...]
> > <6>qla2xxx :0f:01.0: LOOP UP detected (4 Gbps).
> > <6>qla2xxx :0f:01.0: Topology - (F_Port), Host Loop address 0x0
> > <6>scsi0 : qla2xxx
> > <6>qla2xxx :0f:01.0:
> > <4> QLogic Fibre Channel HBA Driver: 8.01.07-k3
> > <4>  QLogic HP AB378-60001 -
> > <4>  ISP2422: PCI-X Mode 2 (133 MH4.00.26 [IP]  @ :0f:01.0 hdma+, host#=0, fw=4.00.26 [IP]

The 33/66/100/133 values refer to the bus-clock speed at which the
card is operating.  As is seen here (although a bit truncated --
separate issue, I'll try to see if I can reproduce this on one of my
HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
(bus clock) slot.  When operating under this mode, each data-phase
between two devices is divided into 2 sub-phases, effectively doubling
the transfer-data-rate to 266Mhz.

> > <5>  Vendor: HP  Model: HSV200  Rev: 6100
> > <5>  Type:   RAID   ANSI SCSI revision: 02
> > <5> 0:0:0:0: Attached scsi generic sg0 type 12
> > 
> > Now does Linux support the speed of 266 MHz, and is it just displayed
> > incorrectly, or doesn't Linux support the speed of 266MHz yet?
> 
> This is a bug in the driver.  The lookup table only goes to 133 MHz.
> 
> static char *pci_bus_modes[] = {
> "33", "66", "100", "133",
> 
> The same problem exists in the scsi_misc tree.

Regards,
Andrew Vasquez

Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)

2007-07-27 Thread Andrew Vasquez

On Fri, 27 Jul 2007, Andrew Patterson wrote:

> On Thu, 2007-07-26 at 23:23 -0700, Andrew Vasquez wrote:
> 
> > The 33/66/100/133 values refer to the bus-clock speed at which the
> > card is operating.  As is seen here (although a bit truncated --
> > separate issue, I'll try to see if I can reproduce this on one of my
> > HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
> > (bus clock) slot.  When operating under this mode, each data-phase
> > between two devices is divided into 2 sub-phases, effectively doubling
> > the transfer-data-rate to 266MHz.
> 
> I guess the proper terminology would be 266 MT/s (Mega
> Transfers/second). Looking through the PCI-SIG PCI-X 2.0 marketing
> blurbs, they use MHz a lot when referring to MT/s. So I would still
> consider this to be a minor bug.  The user wants to know the transfer
> rate, not the actual frequency of the bus.  Maybe just print out the
> mode used instead, e.g., PCI-X 266?

That sounds reasonable.  I'll spin some patches today after I verify
all the bus-bits with the PUI group.
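
For illustration only, here is a minimal user-space sketch of the labelling
idea discussed above.  It is not the qla2xxx code: the names pcix_rate_mts()
and pcix_label() are hypothetical, and it assumes only what this thread
states, namely that Mode 2 splits each data phase into two sub-phases while
Mode 1 does not.  Faster PCI-X variants are not modelled.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from qla2xxx: effective rate in MT/s. */
static int pcix_rate_mts(int bus_clock_mhz, int mode)
{
	/* Assumption from the thread: Mode 2 uses two data sub-phases
	 * per data phase, Mode 1 uses one. */
	int subphases = (mode == 2) ? 2 : 1;

	return bus_clock_mhz * subphases;
}

/* Write a label such as "PCI-X 266" into buf; returns snprintf()'s result. */
static int pcix_label(char *buf, size_t len, int bus_clock_mhz, int mode)
{
	return snprintf(buf, len, "PCI-X %d",
			pcix_rate_mts(bus_clock_mhz, mode));
}
```

With a 133MHz bus clock in Mode 2 the label names the transfer rate rather
than the clock, which is what the user asked about in the first place.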

Thanks,
Andrew Vasquez

Re: [PATCH] quiet down swiotlb warnings

2007-06-01 Thread Andrew Vasquez

On Fri, 01 Jun 2007, Andi Kleen wrote:

> On Fri, Jun 01, 2007 at 03:38:57PM -0400, Rik van Riel wrote:
> > Andi Kleen wrote:
> > 
> > >An pci_map_sg failing typically leads to an IO error and we've
> > >always printk'ed those. Otherwise people will wonder why they
> > >get EIO.
> > 
> > In some situations.  In this case the qla2xxx driver uses
> > the pci_map_sg() failure as a throttling mechanism and
> 
> First WTF does it need swiotlb anyways? QA hardware should
> be definitely DAC capable, shouldn't it?

Yes, the card can support 64bit DMA transfers, but in this case the
'required' DMA mask returned from dma_get_required_mask() states that a
32bit mask would suffice.

Here's a snippet from the bugzilla report
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=219216):

QLogic Fibre Channel HBA Driver
PCI: Enabling device :1f:00.0 (0140 -> 0143)
ACPI: PCI Interrupt :1f:00.0[A] -> GSI 16 (level, low) -> IRQ 16
qla2xxx :1f:00.0: Found an ISP2432, irq 16, iobase 0xc202
*** qla2x00_config_dma_addressing: required_mask set to 7fff.
*** qla2x00_config_dma_addressing: required_mask has no high-dword bits set.
*** qla2x00_config_dma_addressing: set consistent 64bit mask returned 0.
*** qla2x00_config_dma_addressing: defaulting to 32bit mask/consistent-mask.
qla2xxx :1f:00.0: Configuring PCI space...

Which tells me that a 32bit DMA mask is being set for dma_set_mask()
and pci_set_consistent_dma_mask(), since dma_get_required_mask() is
returning 7fff -- no upper-dword bits set...
...
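
As a rough illustration of the decision those log lines describe, here is a
small user-space model.  It is a sketch, not the driver's
qla2x00_config_dma_addressing(): choose_dma_mask() and the two MASK_*
constants are hypothetical names, and the only rule encoded is the one
stated above (fall back to a 32bit mask when the required mask has no
upper-dword bits set).

```c
#include <stdint.h>

#define MASK_32BIT 0x00000000ffffffffULL
#define MASK_64BIT 0xffffffffffffffffULL

/* Hypothetical stand-in for the logged decision: pick a 64-bit mask only
 * when the device can do 64-bit DMA *and* the platform's required mask
 * has at least one bit set in the upper dword. */
static uint64_t choose_dma_mask(uint64_t required_mask, int dev_supports_64bit)
{
	if (dev_supports_64bit && (required_mask >> 32) != 0)
		return MASK_64BIT;
	return MASK_32BIT;
}
```

So a 64bit-capable card still ends up with a 32bit mask on a box whose
required mask fits in the low dword, and that is how swiotlb gets involved.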

> > printing out all the warnings will actually slow down the
> > system.
> 
> Another reason is that there is a lot of code that
> still doesn't check the return values and when that
> happens you might get data corruption too.
> 
> > 
> > Andi, what do you propose as a solution?
> 
> A different interface; like I wrote in my earlier mail.
> 
> Another possibility would be to have a blocking interface
> to swiotlb that won't fail. That would be the better solution
> long term, but i was told it is hard to fit into some current
> driver interfaces.

Re: [PATCH] quiet down swiotlb warnings

2007-06-01 Thread Andrew Vasquez

On Fri, 01 Jun 2007, Andi Kleen wrote:

> Rik van Riel <[EMAIL PROTECTED]> writes:
> 
> > It turns out that the qla2xxx driver sometimes fills up the iotlb
> > on purpose and throttles itself when pci_map_sg() fails.  In the
> > case of a driver that expects and handles pci_map_sg() failures,
> > we should not spam the user's console with swiotlb full messages.
> 
> Why does it do that? Could we supply a better interface
> for whatever it is trying to do here?

The driver only calls pci_map_sg() once it has ensured that all local
driver resources are available to submit an I/O to the hardware.

> > -   printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
> > -  "device %s\n", size, dev ? dev->bus_id : "?");
> > +   if (++warnings < 5)
> > +   printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
> > +  "device %s\n", size, dev ? dev->bus_id : "?");
> 
> Bad idea imho. swiotlb mappings should always lead to printk by default
> because it is pretty dangerous.

Why?  It's just another resource which is consumed -- the qla2xxx
driver is the final consumer before I/O is submitted out on the wire.
The mappings are held for the shortest time required and, as such, are
released as soon as the I/O completes.
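
To make that ordering concrete, here is a toy user-space model of the
submit path described above.  Everything in it is hypothetical (struct
map_pool stands in for the swiotlb pool, struct hba for the driver's own
resources); it only illustrates the sequence: check local resources first,
map last, treat a mapping failure as "busy, retry later", and unmap on
completion.

```c
enum queue_status { QUEUE_OK = 0, QUEUE_BUSY = 1 };

/* Hypothetical model of a bounded DMA mapping pool. */
struct map_pool { int free_slots; };

/* Hypothetical model of the driver's own per-HBA resources. */
struct hba { int free_req_entries; struct map_pool *pool; int in_flight; };

/* Returns the number of mapped segments, or 0 when the pool is exhausted. */
static int pool_map(struct map_pool *p, int nseg)
{
	if (nseg > p->free_slots)
		return 0;
	p->free_slots -= nseg;
	return nseg;
}

static void pool_unmap(struct map_pool *p, int nseg)
{
	p->free_slots += nseg;
}

static enum queue_status queue_io(struct hba *ha, int nseg)
{
	/* 1. Local driver resources are checked before touching the pool. */
	if (ha->free_req_entries == 0)
		return QUEUE_BUSY;

	/* 2. Map last; a failure only asks the caller to retry later. */
	if (pool_map(ha->pool, nseg) == 0)
		return QUEUE_BUSY;

	ha->free_req_entries--;
	ha->in_flight++;
	return QUEUE_OK;
}

static void complete_io(struct hba *ha, int nseg)
{
	/* 3. The mapping is released as soon as the I/O completes. */
	pool_unmap(ha->pool, nseg);
	ha->free_req_entries++;
	ha->in_flight--;
}
```

In this model a full pool never produces an I/O error; the request is
simply pushed back and succeeds once an earlier I/O has completed.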

> One possible solution for this I could think of would be to define a
> new pci_map_sg_couldfail() or similar that doesn't warn and use a weak
> fallback just calling pci_map_sg on other IOMMU implementations. 

Re: [PATCH 0/2] PCI-X/PCI-Express read control interfaces

2007-05-15 Thread Andrew Vasquez

On Tue, 15 May 2007, Andrew Morton wrote:

> On Tue, 15 May 2007 13:50:27 +0200
> "Peter Oruba" <[EMAIL PROTECTED]> wrote:
> 
> > This patch set introduces a PCI-X / PCI-Express read byte count control 
> > interface. Instead of letting every driver to directly read/write to PCI 
> > config space for that, an interface is provided. The interface functions 
> > then 
> > can be used for quirks since some PCI bridges require that read byte count 
> > values are set by the BIOS and left unchanged by device drivers.
> 
> Some of the patches were wordwrapped, which I fixed.
> 
> The way we would merge a feature like this is
> 
> - get maintainers to review-and-ack the change

This is definitely good cleanup, and I ACK the QLogic changes.

I do, though, have some questions on call prerequisites given the
driver changes, most of which take the form of:

> diff -uprN -X linux-2.6.22-rc1.orig/Documentation/dontdiff linux-2.6.22-rc1.orig/drivers/infiniband/hw/mthca/mthca_main.c linux-2.6.22-rc1/drivers/infiniband/hw/mthca/mthca_main.c
> --- linux-2.6.22-rc1.orig/drivers/infiniband/hw/mthca/mthca_main.c 2007-05-14 11:29:29.358547000 +0200
> +++ linux-2.6.22-rc1/drivers/infiniband/hw/mthca/mthca_main.c 2007-05-15 10:55:24.954074000 +0200
> @@ -137,45 +137,27 @@ static const char mthca_version[] __devi
>  
>  static int mthca_tune_pci(struct mthca_dev *mdev)
>  {
> - int cap;
> - u16 val;
> -
>   if (!tune_pci)
>   return 0;
>  
>   /* First try to max out Read Byte Count */
> - cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_PCIX);
> - if (cap) {
> - if (pci_read_config_word(mdev->pdev, cap + PCI_X_CMD, &val)) {
> - mthca_err(mdev, "Couldn't read PCI-X command register, "
> -   "aborting.\n");
> - return -ENODEV;
> - }
> - val = (val & ~PCI_X_CMD_MAX_READ) | (3 << 2);
> - if (pci_write_config_word(mdev->pdev, cap + PCI_X_CMD, val)) {
> - mthca_err(mdev, "Couldn't write PCI-X command register, "
> -   "aborting.\n");
> + if (pci_find_capability(mdev->pdev, PCI_CAP_ID_PCIX)) {
> + if (pcix_set_mmrbc(mdev->pdev, pcix_get_max_mmrbc(mdev->pdev))) {
> + mthca_err(mdev, "Couldn't set PCI-X max read count, "
> + "aborting.\n");
...
> - cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_EXP);
> - if (cap) {
> - if (pci_read_config_word(mdev->pdev, cap + PCI_EXP_DEVCTL, &val)) {
> - mthca_err(mdev, "Couldn't read PCI Express device control "
> -   "register, aborting.\n");
> - return -ENODEV;
> - }
> - val = (val & ~PCI_EXP_DEVCTL_READRQ) | (5 << 12);
> - if (pci_write_config_word(mdev->pdev, cap + PCI_EXP_DEVCTL, val)) {
> - mthca_err(mdev, "Couldn't write PCI Express device control "
> -   "register, aborting.\n");
> + if (pci_find_capability(mdev->pdev, PCI_CAP_ID_EXP)) {
> + if (pcie_set_readrq(mdev->pdev, 4096)) {
> + mthca_err(mdev, "Couldn't write PCI Express read request, "
> + "aborting.\n");


In general, the pattern is: if a PCI-[Xe] capability structure exists, call
set-mmrbc()/readrq().  Yet each of those pre-condition checks is already
present in pcix_set_mmrbc() and pcie_set_readrq().

At least for the qla2xxx case, the patch could easily distill down from:

...
/* PCIe -- adjust Maximum Read Request Size (2048). */
pcie_dctl_reg = pci_find_capability(ha->pdev, PCI_CAP_ID_EXP);
if (pcie_dctl_reg)
	if (pcie_set_readrq(ha->pdev, 2048))
		DEBUG2(printk("Couldn't write PCI Express read request\n"));

to:

...
pcie_set_readrq(ha->pdev, 2048);
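
To show why the caller-side check is redundant, here is a user-space toy
model of a setter that does its own capability check.  It is a sketch under
stated assumptions, not the proposed kernel interface: struct fake_dev,
fake_set_readrq() and pcie_readrq_to_field() are hypothetical names.  The
field encoding (128 bytes shifted left by the field value, stored in bits
14:12 of Device Control) matches the "(5 << 12)" for 4096 in the mthca
hunk quoted above.

```c
/* Encode a PCIe read request size in bytes (power of two, 128..4096) as
 * the 3-bit field value; returns -1 for an invalid size. */
static int pcie_readrq_to_field(int bytes)
{
	int field = 0;

	if (bytes < 128 || bytes > 4096 || (bytes & (bytes - 1)) != 0)
		return -1;
	while ((128 << field) != bytes)
		field++;
	return field;
}

/* Hypothetical device model: only the two members used below matter. */
struct fake_dev { int has_pcie_cap; unsigned int devctl; };

#define DEVCTL_READRQ_MASK  0x7000u
#define DEVCTL_READRQ_SHIFT 12

/* The setter checks the capability itself, so callers need no
 * pci_find_capability() guard.  Returns 0 on success, -1 on failure. */
static int fake_set_readrq(struct fake_dev *dev, int bytes)
{
	int field = pcie_readrq_to_field(bytes);

	if (!dev->has_pcie_cap || field < 0)
		return -1;
	dev->devctl = (dev->devctl & ~DEVCTL_READRQ_MASK) |
		      ((unsigned int)field << DEVCTL_READRQ_SHIFT);
	return 0;
}
```

A caller on a device without the capability just gets an error back and
the register is left alone, which is all the outer "if" was buying.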


Whatever the decision, I can fold this into my next patchset for
qla2xxx and submit.

Re: Major qla2xxx regression on sparc64

2007-04-19 Thread Andrew Vasquez

On Wed, 18 Apr 2007, David Miller wrote:

> > On Wed, 18 Apr 2007, Christoph Hellwig wrote:
> > 
> > > I don't think a module option is a good idea at this point.  The problem
> > > is you broke some so far perfectly working setups, which is not okay.
> > > The only first step can be printing a really big warning.  After this
> > > has been in for a while (at least half a year) we can make it a non-default
> > > option or turn it off completely in case the warning never triggered in
> > > practice.
> > > 
> > > The only reasonable thing for 2.6.21 is to put in David's patch, possibly
> > > with an even more drastic warning when the rom is invalid and there's
> > > no prom-fallback available.
> > > 
> > > Note that I expect Sun put in the invalid ROM intentionally, as we have
> > > similar cases with other cards that have totally messed up ROMs in
> > > Sun-branded versions.  Personally I think that's an utterly bad decision
> > > from Sun's side, but we'll have to live with this.
> > 
> > Fine.  I'll rework an alternate patch for the 2.6.22 timeframe...
> 
> We need to fix things now for 2.6.21 and the 2.6.x -stable branches
> because users have unusable systems currently.

Yes, and I'm fine with the original patch you provided which reverts
the change and adds the firmware-upcalls to retrieve the wwpn/wwnn.

> If it's just a time issue I can work on and push the patch, especially
> since I have the means to test things here.

I'll start with the final 2.6.21 and modify it to add the *flashing*
light warning and some additional bits based on other archs I can test
with embedded ISPs.  Thanks again for the SPARC tips.

Regards,
Andrew Vasquez

Re: Major qla2xxx regression on sparc64

2007-04-18 Thread Andrew Vasquez

On Wed, 18 Apr 2007, Christoph Hellwig wrote:

> I don't think a module option is a good idea at this point.  The problem
> is you broke some so far perfectly working setups, which is not okay.
> The only first step can be printing a really big warning.  After this
> has been in for a while (at least half a year) we can make it a non-default
> option or turn it off completely in case the warning never triggered in
> practice.
> 
> The only reasonable thing for 2.6.21 is to put in David's patch, possibly
> with an even more drastic warning when the rom is invalid and there's
> no prom-fallback available.
> 
> Note that I expect Sun put in the invalid ROM intentionally, as we have
> similar cases with other cards that have totally messed up ROMs in
> Sun-branded versions.  Personally I think that's an utterly bad decision
> from Sun's side, but we'll have to live with this.

Fine.  I'll rework an alternate patch for the 2.6.22 timeframe...


Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:47:05 -0700
> 
> > Dave, according to your earlier emails, the qla2xxx driver worked
> > 'fine' in driver versions before commit
> > 7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
> > you would have seen the warning messages:
> > 
> > ...
> > qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
> > "invalid -- WWPN) defaults.\n");
> 
> I have in fact seen the message several times and that message gives
> me no reason to believe something needs to be fixed.
> 
> It should have said "PLEASE REPORT THIS to [EMAIL PROTECTED]" or
> something similar to indicate the severity better.
> 
> "An invalid WWPN, what's that?" said the user. :)
> 
> How about "FC IDs may conflict and cause miscommunication!  Please
> report to driver author so this can be fixed!" or similar?

That verbiage sounds fine -- so would you consider the previous patch
I submitted (with module parameter) along with the wording above?

I'm in transit for a redeye to NY so I won't be able to modify the
patch.  If you would be amenable to the above, Seokmann, could you
rework the patch?

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:28:51 -0700
> 
> > Sorry, but let's be realistic, this type of warning would have
> > *NEVER* been addressed if we kept the status quo
> 
> Wrong.  I watch the logs all the time and would have sent you a fix to
> use the Sparc firmware info as soon as I saw the kernel log message.

Dave, according to your earlier emails, the qla2xxx driver worked
'fine' in driver versions before commit
7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
you would have seen the warning messages:

...
qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
"invalid -- WWPN) defaults.\n");

> Anyone who has worked with me over the last 15 years will let you know
> emphatically that this is true.
> 
> AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT!

I understand that, and recognize your contribution, that was never in
question.

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 15:25:17 -0700
> 
> > Fine, I'll agree that whacking users (and
> > I'll wager the outliers) with a 2x4 was a bit extreme,
> 
> And that, right there, is basically the end of the conversation.
> 
> You don't do this to users, ever.
> Put a big loud kernel log message in there when this situation
> presents itself, use as many capital letters and scary language that
> you wish.  Let them know that if things explode they get to keep the
> pieces.
> 
> But at least try to give them something that works when you know that
> you can.
>
> You don't need to make someone's system unbootable in order to make
> them aware of a potential problem.  It's very anti-social to approach

Sorry, but let's be realistic: this type of warning would have *NEVER*
been addressed if we had kept the status quo -- your modifications to read
the wwpn/wwnn would never have been submitted, and everybody would have
kept going on blissfully ignorant of the issue.  Changes such as these
are a common Linux upstream idiom...

So, meeting in the middle, with the NVRAM bits restored along with
some ability for the user to *knowingly* recognize the problem, I take
it, is not going to work for you?

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 14:10:49 -0700
> 
> > Ok, how about the following patch based on the one you posted which
> > adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and
> > also adds the module-parameter override I mentioned above.
> > 
> > Perhaps the module-parameter should be set to non-zero in the case of
> > SPARC, to take care of your system configurations?
> 
> I think it should default to non-zero always, in fact the option
> is completely pointless.
> 
> The guy who hits this had a system which worked previously, and you're
> explicitly breaking it.  That's wrong.

Sorry, 'it' didn't work...  'It' *never* did.

> How can you not see that this quality of implementation decision
> you're making stinks?

You're defending a position which itself left users with a false sense
of security and comfort.  This is a *real* problem from an enterprise
perspective where FC reigns.  Fine, I'll agree that whacking users (and
I'll wager the outliers) with a 2x4 was a bit extreme, but I'd much
rather handle those users on a case-by-case basis, either by:

* If dealing with a PCI card, directing the user to support staff at
  QLogic to resolve the NVRAM issues.

* If it's some on-board ISP with no NVRAM, as was your SPARC case,
  adding *proper* code to retrieve the data from some secondary
  persistent store.

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, Andrew Vasquez wrote:

> On Mon, 16 Apr 2007, David Miller wrote:
> 
> > They DON'T
> > CARE, they want their systems to work and if you don't give them that
> > you're not being a good driver maintainer.
> 
> Let's push aside attitudes and unrealistic statistics.  Could we
> perhaps agree to re-add the use of doctored NVRAM (and thus
> non-random WWPN/WWNN) when NVRAM is corrupted or non-present, with a
> module-parameter (which defaults to 0) which indicates the user
> *really* knows what she is doing and recognizes WWPN collisions may
> occur?  A non-zero parameter value indicates doctored values will
> be used; a zero value (the default) fails initialization.  In both cases
> a big FAT warning is issued.
> 
> > You BROKE things, therefore you must FIX it.
> >
> > Now I'm happy to code up the sparc OFW property bits but your attitude
> > and perspective on this absolutely has to change and the old fallback
> > code still has to go back in there, possible FC ID collisions or not.
> 
> That would be great,  I'd like to insure the balance is maintained for
> *all* our users.

Ok, how about the following patch, based on the one you posted, which
adds the code to retrieve the WWPN/WWNN from firmware on SPARC and
also adds the module-parameter override I mentioned above?

Perhaps the module-parameter should be set to non-zero in the case of
SPARC, to take care of your system configurations?

Regards,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index 74544ae..b26090d 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -62,6 +62,7 @@ extern int ql2xfdmienable;
 extern int ql2xallocfwdump;
 extern int ql2xextended_error_logging;
 extern int ql2xqfullrampup;
+extern int ql2xoverrideinvalidnvram;
 
 extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *);
 
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 98c01cd..fa5df97 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -11,6 +11,11 @@
 
 #include "qla_devtbl.h"
 
+#ifdef CONFIG_SPARC
+#include <asm/prom.h>
+#include <asm/pbm.h>
+#endif
+
 /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */
 #ifndef EXT_IS_LUN_BIT_SET
 #define EXT_IS_LUN_BIT_SET(P,L) \
@@ -1393,6 +1398,42 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de
 	}
 }
 
+/* On sparc systems, obtain port and node WWN from firmware
+ * properties.
+ */
+static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv)
+{
+#ifdef CONFIG_SPARC
+	struct pci_dev *pdev = ha->pdev;
+	struct pcidev_cookie *pcp = pdev->sysdata;
+	struct device_node *dp = pcp->prom_node;
+	u8 *val;
+	int len;
+
+	val = of_get_property(dp, "port-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->port_name, val, WWN_SIZE);
+
+	val = of_get_property(dp, "node-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->node_name, val, WWN_SIZE);
+#endif
+}
+
+static inline int
+qla2x00_override_invalid_nvram(scsi_qla_host_t *ha)
+{
+	if (!ql2xoverrideinvalidnvram) {
+		qla_printk(KERN_WARNING, ha,
+		    "Reload the driver with the ql2xoverrideinvalidnvram \n");
+		qla_printk(KERN_WARNING, ha,
+		    " module parameter set to a non-zero value to ignore \n");
+		qla_printk(KERN_WARNING, ha,
+		    " this warning.\n");
+	}
+	return ql2xoverrideinvalidnvram;
+}
+
 /*
 * NVRAM configuration for ISP 2xxx
 *
@@ -1440,7 +1481,57 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
 		qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: "
 		    "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0],
 		    nv->nvram_version);
-		return QLA_FUNCTION_FAILED;
+		if (!qla2x00_override_invalid_nvram(ha))
+			return QLA_FUNCTION_FAILED;
+		qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
+		    "invalid -- WWPN) defaults.\n");
+
+		/*
+		 * Set default initialization control block.
+		 */
+		memset(nv, 0, ha->nvram_size);
+		nv->parameter_block_version = ICB_VERSION;
+
+		if (IS_QLA23XX(ha)) {
+			nv->firmware_options[0] = BIT_2 | BIT_1;
+			nv->firmware_options[1] = BIT_7 | BIT_5;
+			nv->add_firmware_options[0] = BIT_5;
+			nv->add_firmware_options[1] = BIT_5 | BIT_4;
+			nv->frame_payload_size = __constant_cpu_to_le16(2048);
+			nv->special_options[1] = BIT_7;
+		} else if (IS_QLA2200(ha)) {
+			nv->firmware_options[0] = BIT_2 | BIT_1;
+			nv->firmware_options[1] = BIT_7 | BIT_5;
+			nv->add_firmware_options[0

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 09:37:12 -0700
> 
> > On Mon, 16 Apr 2007, David Miller wrote:
> > 
> > > But even if that fails, I think the fallback code should be put back,
> > > since it obviously was used by at least one system and it's probable
> > > that there are some other applications of using this qla2xxx chip that
> > > will have an empty NVRAM too.
> > 
> > Then they should really get their NVRAM corrected, if in fact their
> > NVRAMs are cleared.
> > 
> > > I can understand the apprehension in using a fixed port_name[] value,
> > > since it could conflict with other FC controllers on the mesh, but if
> > > that is so important just choose some random value that is a valid FC
> > > ID or use some characteristic ID that can be used to compose part of
> > > the port WWN in order to give it at least some uniqueness.
> > 
> > Look, there's a fine balance here that we must strike -- the solution
> > that you're proposing implies that there's some 'random' bit-space
> > within the IEEE NAA with which one can safely encode without stomping
> > on any valid OUI.
> 
> The fact is that your driver was significantly more robust
> previously, and now it's so less robust that it now fails for
> people.
> 
> That's totally unacceptable.
> 
> Just like the sparc64 systsems, others depending upon this fallback
> behavior the qla2xxx driver had are going to break and they are not
> going to be able to just go and fix their hardware and re-flash the
> NVRAM.
> 
> Every user on the planet is going to be 1,000 times more happy with a
> big fat warning in their kernel log saying that things might not go
> right, but the driver is going to try anyways, rather than a complete
> non-attempt to make things work.
> 
> You replaced a possible failure with a guarenteed one.
> 
> %99.999 of people are never going to run into a FC ID collision.
> They have an onboard FC controller and a disk or two.

Sorry, that may be true in a SATA/SCSI environment, but in the case of
FC that expectation is unrealistic.  There are thousands of FC
installations where several thousand endpoints (including initiators
and targets) are all interconnected.  Let's use your case: just
connect two sparc machines within the same fabric to your storage and,
with the old code, there's still a problem.
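
To make the two-machine case concrete, here is a minimal userspace
sketch, not driver code.  It assumes every host with an empty NVRAM is
handed the same fixed fallback port name; the fallback value and the
helper names (pick_wwpn, fabric_has_collision) are hypothetical and
exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WWN_SIZE 8

/* Hypothetical fixed fallback name; NOT the value the driver used. */
static const uint8_t fallback_wwpn[WWN_SIZE] = {
	0x21, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01
};

/* Pick a port name: the NVRAM value if present, else the fixed fallback. */
static void pick_wwpn(uint8_t out[WWN_SIZE], const uint8_t *nvram_wwpn)
{
	memcpy(out, nvram_wwpn ? nvram_wwpn : fallback_wwpn, WWN_SIZE);
}

/* Return true if any two ports in the fabric share a WWPN. */
static bool fabric_has_collision(uint8_t ports[][WWN_SIZE], size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (memcmp(ports[i], ports[j], WWN_SIZE) == 0)
				return true;
	return false;
}
```

Two hosts that both call pick_wwpn() with a NULL NVRAM name end up with
identical port names, so fabric_has_collision() reports a clash as soon
as they share a fabric.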

> They DON'T
> CARE, they want their systems to work and if you don't give them that
> you're not being a good driver maintainer.

Let's push aside attitudes and unrealistic statistics.  Could we
perhaps agree to re-add the use of doctored NVRAM (and thus
non-random WWPN/WWNN) when NVRAM is corrupted or non-present, guarded
by a module-parameter (which defaults to 0) which indicates the user
*really* knows what she is doing and recognizes WWPN collisions may
occur?  A non-zero parameter value indicates doctored values will be
used; a zero value (the default) fails initialization.  In both cases
a big FAT warning is issued.
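
The proposal above boils down to a small decision table.  The sketch
below is a standalone model of it, not the driver's code; the enum,
struct and function names are made up for illustration, and
override_param stands in for the proposed module parameter.

```c
#include <stdbool.h>

enum nvram_action {
	NVRAM_USE_STORED,	/* NVRAM is consistent: use it as-is */
	NVRAM_USE_DEFAULTS,	/* fall back to doctored defaults */
	NVRAM_FAIL_INIT		/* refuse to initialize the adapter */
};

struct nvram_decision {
	enum nvram_action action;
	bool warn;		/* whether the big warning is logged */
};

/* Decide what to do given NVRAM validity and the override parameter. */
static struct nvram_decision nvram_decide(bool nvram_valid, int override_param)
{
	struct nvram_decision d = { NVRAM_USE_STORED, false };

	if (nvram_valid)
		return d;

	/* Invalid NVRAM always warns, whichever branch is taken. */
	d.warn = true;
	d.action = override_param ? NVRAM_USE_DEFAULTS : NVRAM_FAIL_INIT;
	return d;
}
```

With the default of 0, an invalid NVRAM yields NVRAM_FAIL_INIT; any
non-zero value yields NVRAM_USE_DEFAULTS.  Both paths set warn.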

> You BROKE things, therefore you must FIX it.
>
> Now I'm happy to code up the sparc OFW property bits but your attitude
> and perspective on this absolutely has to change and the old fallback
> code still has to go back in there, possible FC ID collisions or not.

That would be great; I'd like to ensure the balance is maintained for
*all* our users.



Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> Sparc64 systems which have an on-board qla2xxx chip (such as
> SunBlade-1000 and SunBlade-2000, there are probably some other systems
> like this too) do not have any NVRAM information present, in fact the
> NVRAM is basically all 0's from what I can tell.
> 
> This always worked just fine since the code would previously just use
> a bunch of defaults when an inconsistent NVRAM was detected.
> 
> But the changeset below at the end of this email broke this and now
> I'm seeing bug reports from sparc64 users and I was just able to
> reproduce the problem myself just today as well.  I verified that
> reverting the patch below gets things working again.
> 
> Emanuele, you can feed the patch below to "patch -p1 -R" to get that
> working again so we can move on to the other sparc64 bug we're looking
> into :-)

I sent Emanuele the attached patch during the weekend...

> The failure mode isn't nice, it actually ends up crashing with an OOPS
> in qla2xxx_init_host_attr() because ha->node_name is NULL, it's
> supposed to be initialized by functions like qla2x00_nvram_config()

No, it's not very nice...

> Can we revert the patch below or do something similar to get things
> working again on sparc64?
> 
> The most important thing which qla2x00_nvram_config() seems to want to
> get is the WWN port_name and node_name.  These are provided in the OFW
> device tree so we could pluck them out of there with something like:
> 
> #ifdef CONFIG_SPARC
> #include <asm/prom.h>
> #include <asm/pbm.h>
> #endif
> 
> ...
> 
> #ifdef CONFIG_SPARC
>   struct pcidev_cookie *pcp = pdev->sysdata;
>   u8 *port_name, *node_name;
> 
>   port_name = of_get_property(pcp->prom_node, "port-wwn", NULL);
>   node_name = of_get_property(pcp->prom_node, "node-wwn", NULL);
> #endif
> Those will hold a pointer to the property values or NULL if the
> property does not exist.  This is private data, so you should make
> copies of them into your local data structure and not use references
> to them.
> 
> I don't see any OFW properties present that could be used to fill in
> the rest of the NVRAM parameters, so we'd need to use the defaults
> that the code before the change was using.

I'd be more inclined to do something like the above, rather than:

> But even if that fails, I think the fallback code should be put back,
> since it obviously was used by at least one system and it's probable
> that there are some other applications of using this qla2xxx chip that
> will have an empty NVRAM too.

Then they should really get their NVRAM corrected, if in fact their
NVRAMs are cleared.

> I can understand the apprehension in using a fixed port_name[] value,
> since it could conflict with other FC controllers on the mesh, but if
> that is so important just choose some random value that is a valid FC
> ID or use some characteristic ID that can be used to compose part of
> the port WWN in order to give it at least some uniqueness.

Look, there's a fine balance here that we must strike -- the solution
that you're proposing implies that there's some 'random' bit-space
within the IEEE NAA with which one can safely encode without stomping
on any valid OUI.
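
For reference, a rough sketch of where the IEEE-assigned OUI sits
inside an 8-byte WWN, which is why arbitrary bits cannot simply be
dropped in.  It assumes the common NAA layouts: type 1 and 2 carry the
OUI in bytes 2-4, type 5 carries it in the 24 bits following the NAA
nibble.  The function names are hypothetical and the sample WWNs in
the note below are illustrative only.

```c
#include <stdint.h>

#define WWN_SIZE 8

/* The NAA type is the top four bits of the first byte. */
static unsigned int wwn_naa(const uint8_t wwn[WWN_SIZE])
{
	return wwn[0] >> 4;
}

/*
 * Extract the 24-bit OUI for the NAA formats handled here.
 * Returns 0 on success, -1 for any other NAA type.
 */
static int wwn_oui(const uint8_t wwn[WWN_SIZE], uint32_t *oui)
{
	switch (wwn_naa(wwn)) {
	case 1:	/* IEEE 48-bit: 12 zero bits, then OUI */
	case 2:	/* IEEE extended: 12 vendor bits, then OUI */
		*oui = ((uint32_t)wwn[2] << 16) |
		       ((uint32_t)wwn[3] << 8) | wwn[4];
		return 0;
	case 5:	/* IEEE registered: OUI follows the NAA nibble */
		*oui = ((uint32_t)(wwn[0] & 0x0f) << 20) |
		       ((uint32_t)wwn[1] << 12) |
		       ((uint32_t)wwn[2] << 4) | (wwn[3] >> 4);
		return 0;
	default:
		return -1;
	}
}
```

For 21:00:00:e0:8b:01:02:03 this reports NAA 2 with OUI 00:e0:8b; for
50:06:0b:00:00:c2:62:00 it reports NAA 5 with OUI 00:60:b0.  An
all-zero name, as read from a blank NVRAM, has NAA 0 and is rejected.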

From 9ee6de3bbaa03390b83226e7bb84c49566a583b3 Mon Sep 17 00:00:00 2001
From: Andrew Vasquez <[EMAIL PROTECTED]>
Date: Wed, 11 Apr 2007 16:02:06 -0700
Subject: [PATCH] qla2xxx: Error-out during probe() if we're unable to complete 
HBA initialization.

Remove a stale check against ha->device_flags
(DFLG_NO_CABLE) as topology scanning is performed within the
DPC-thread context.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_os.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index b78919a..0a36912 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1577,9 +1577,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto probe_failed;
 	}
 
-	if (qla2x00_initialize_adapter(ha) &&
-	    !(ha->device_flags & DFLG_NO_CABLE)) {
-
+	if (qla2x00_initialize_adapter(ha)) {
 		qla_printk(KERN_WARNING, ha,
 		    "Failed to initialize adapter\n");
 
-- 
1.5.1.1.107.g7a159

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 15:25:17 -0700
>
> > Fine, I'll agree that wacking-users (and
> > I'll wager the outliers) with a 2x4 was a bit extreme,
>
> And that, right there, is basically the end of the conversation.
>
> You don't do this to users, ever.
> Put a big loud kernel log message in there when this situation
> presents itself, use as many capital letters and scary language that
> you wish.  Let them know that if things explode they get to keep the
> pieces.
>
> But at least try to give them something that works when you know that
> you can.
>
> You don't need to make someone's system unbootable in order to make
> them aware of a potential problem.  It's very anti-social to approach

Sorry, but let's be realistic, this type of warning would have *NEVER*
been addressed if we kept the status quo -- your modifications to read
the wwpn/wwnn would have never been submitted, everybody would have
kept going on blissfully ignorant of the issue.  Changes such as these
are a common Linux upstream idiom...

So, meeting in the middle, with the NVRAM bits restored along with
some ability for the user to *knowingly* recognize the problem, I take
it, is not going to work for you?

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:28:51 -0700
>
> > Sorry, but let's be realistic, this type of warning would have
> > *NEVER* been addressed if we kept the status quo
>
> Wrong.  I watch the logs all the time and would have sent you a fix to
> use the Sparc firmware info as soon as I saw the kernel log message.

Dave, according to your earlier emails, the qla2xxx driver worked
'fine' in driver versions before commit
7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
you would have seen the warning messages:

	...
	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
	    "invalid -- WWPN) defaults.\n");

> Anyone who has worked with me over the last 15 years will let you know
> emphatically that this is true.
>
> AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT!

I understand that, and recognize your contribution, that was never in
question.

Re: Major qla2xxx regression on sparc64

2007-04-16 Thread Andrew Vasquez

On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:47:05 -0700
>
> > Dave, according to your earlier emails, the qla2xxx driver worked
> > 'fine' in driver versions before commit
> > 7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
> > you would have seen the warning messages:
> >
> > 	...
> > 	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
> > 	    "invalid -- WWPN) defaults.\n");
>
> I have in fact seen the message several times and that message gives
> me no reason to believe something needs to be fixed.
>
> It should have said "PLEASE REPORT THIS to [EMAIL PROTECTED]" or
> something similar to indicate the severity better.
>
> "An invalid WWPN, what's that?" said the user. :)
>
> How about "FC IDs may conflict and cause miscommunication!  Please
> report to driver author so this can be fixed!" or similar?

That verbiage sounds fine -- so would you consider the previous patch
I submitted (with module parameter) along with the wording above?

I'm in transit for a redeye to NY so I won't be able to modify the
patch.  If you would be amenable to the above, Seokmann, could you
rework the patch?

Re: qla2xxx BUG: workqueue leaked lock or atomic

2007-02-27 Thread Andrew Vasquez

On Tue, 27 Feb 2007, Andre Noll wrote:

> On 10:26, Andrew Vasquez wrote:
> > You are loading some stale firmware that's left over on the card --
> > I'm not even sure what 4.00.70 is, as the latest release firmware is
> > 4.00.27.
> 
> That's the firmware which came with the card. Anyway, I just upgraded
> the firmware, but the bug remains. The backtrace differs a bit though
> as now the tg3 network driver seems to be involved as well.
> 
> Thanks for your help
> Andre
...
> [   68.532665] BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()
> [   68.532784] 
> [   68.532785] Call Trace:
> [   68.532979][] trace_hardirqs_on+0xd7/0x180
> [   68.533168]  [] _spin_unlock_irq+0x2b/0x40
> [   68.533295]  [] 
> :qla2xxx:qla2x00_process_completed_request+0x137/0x1d0
> [   68.533457]  [] :qla2xxx:qla2x00_status_entry+0x82/0xa40
> [   68.533577]  [] __lock_acquire+0xcdf/0xd90
> [   68.533693]  [] _spin_unlock_irqrestore+0x42/0x60
> [   68.533816]  [] :qla2xxx:qla24xx_intr_handler+0x4e/0x2b0
> [   68.533942]  [] 
> :qla2xxx:qla24xx_process_response_queue+0xc1/0x1c0
> [   68.534102]  [] :qla2xxx:qla24xx_intr_handler+0x1d4/0x2b0

Ok, since 2.6.20 there has been a patch added to qla2xxx which drops
the spin_unlock_irq() call while attempting to ramp-up the queue-depth:

commit befede3dabd204e9c546cbfbe391b29286c57da2
Author: Seokmann Ju <[EMAIL PROTECTED]>
Date:   Tue Jan 9 11:37:52 2007 -0800

    [SCSI] qla2xxx: correct locking while call starget_for_each_device()

    Removed spin_unlock_irq()/spin_lock_irq() pairs surrounding
    starget_for_each_device() calls.
    As Matthew W. pointed out, starget_for_each_device() can be called
    under a spinlock being held.
    The change has been tested and verified on qla2xxx.ko module.
    Thanks Matthew W. and Hisashi H. for help.

    Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
    Signed-off-by: Seokmann Ju <[EMAIL PROTECTED]>
    Signed-off-by: James Bottomley <[EMAIL PROTECTED]>

http://marc.theaimsgroup.com/?l=linux-scsi&m=116837234904583&w=2

Could you try the latest 2.6.21-rc which contains the correction?
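
As a rough illustration of why the unlock/lock pair trips lockdep in
the trace above, here is a userspace model of the interrupt-enable
flag.  It is a sketch only: the model_* helpers and handler_* functions
are invented for this example and merely mimic the save/restore versus
unconditional-enable behaviour of the kernel primitives.

```c
#include <stdbool.h>

static bool irqs_enabled = true;	/* modeled CPU interrupt-enable flag */

/* Like spin_lock_irqsave(): remember the old state, then disable. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Like spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Like spin_lock_irq()/spin_unlock_irq(): unconditional off/on. */
static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Old pattern: drop the lock around the callback inside the handler. */
static bool handler_old(void)
{
	bool flags, irqs_on_in_callback;

	irqs_enabled = false;		/* hard-IRQ context on entry */
	flags = model_lock_irqsave();
	model_unlock_irq();		/* turns interrupts ON in the handler */
	irqs_on_in_callback = irqs_enabled;
	model_lock_irq();
	model_unlock_irqrestore(flags);
	return irqs_on_in_callback;
}

/* New pattern: keep the lock held across the callback. */
static bool handler_new(void)
{
	bool flags, irqs_on_in_callback;

	irqs_enabled = false;		/* hard-IRQ context on entry */
	flags = model_lock_irqsave();
	irqs_on_in_callback = irqs_enabled;
	model_unlock_irqrestore(flags);
	return irqs_on_in_callback;
}
```

In this model handler_old() observes interrupts enabled in the middle
of the handler, while handler_new() does not.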

Regards,
Andrew Vasquez

Re: qla2xxx BUG: workqueue leaked lock or atomic

2007-02-26 Thread Andrew Vasquez

On Mon, 26 Feb 2007, Andre Noll wrote:

> On linux-2.6.20.1, we're seeing hard lockups with 2 raid systems
> connected to a qla2xxx card and used as a single volume via lvm.
> The system seems to lock up only if data gets written to both raid
> systems at the same time.
> 
> On a standard kernel nothing makes it to the log, the system just
> freezes. So we tried a lockdep kernel which reports two BUGs during
> boot, see below.
> 
> Could this be related to our problem?

Before we proceed further, could you retrieve the latest firmware
release for 24xx type HBAs:

> [   64.151096] QLogic Fibre Channel HBA Driver
> [   64.151405] ACPI: PCI Interrupt :05:08.0[A] -> GSI 32 (level, low) -> 
> IRQ 32
> [   64.151821] qla2xxx :05:08.0: Found an ISP2422, irq 32, iobase 
> 0xc2006000
> [   64.152231] qla2xxx :05:08.0: Configuring PCI space...
> [   64.152498] qla2xxx :05:08.0: Configure NVRAM parameters...
> [   64.159088] qla2xxx :05:08.0: Verifying loaded RISC code...
> [   74.169623] qla2xxx :05:08.0: Firmware image unavailable.
> [   74.169737] qla2xxx :05:08.0: Firmware images can be retrieved from: 
> ftp://ftp.qlogic.com/outgoing/linux/firmware/.
> [   74.169902] qla2xxx :05:08.0: Attempting to load (potentially 
> outdated) firmware from flash.
> [   74.760935] qla2xxx :05:08.0: Allocated (64 KB) for EFT...
> [   74.761186] qla2xxx :05:08.0: Allocated (1413 KB) for firmware dump...
> [   74.776988] scsi0 : qla2xxx
> [   74.961451] qla2xxx :05:08.0: 
> [   74.961452]  QLogic Fibre Channel HBA Driver: 8.01.07-k4
> [   74.961453]   QLogic HP AE369-60001 - QLA2340
> [   74.961454]   ISP2422: PCI-X Mode 1 (133 MHz) @ :05:08.0 hdma+, 
> host#=0, fw=4.00.70 [IP] 

You are loading some stale firmware that's left over on the card --
I'm not even sure what 4.00.70 is, as the latest release firmware is
4.00.27.  You can retrieve the image here:

ftp://ftp.qlogic.com/outgoing/linux/firmware/ql2400_fw.bin

Let's start there... before we move on to this:

> [   75.778656] BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()
> [   75.778771] 
> [   75.778772] Call Trace:
> [   75.778967]  <IRQ>  [8024b877] trace_hardirqs_on+0xd7/0x180
> [   75.779154]  [8052bc1b] _spin_unlock_irq+0x2b/0x40
> [   75.779271]  [804605d7] qla2x00_process_completed_request+0x137/0x1d0
> [   75.779424]  [804606f2] qla2x00_status_entry+0x82/0xa40
> [   75.779541]  [8024b17f] __lock_acquire+0xcdf/0xd90
> [   75.779657]  [8052bcb2] _spin_unlock_irqrestore+0x42/0x60
> [   75.779775]  [8046228e] qla24xx_intr_handler+0x4e/0x2b0
> [   75.779892]  [804613e1] qla24xx_process_response_queue+0xc1/0x1c0
> [   75.780012]  [80462414] qla24xx_intr_handler+0x1d4/0x2b0
> [   75.780131]  [8025e950] handle_IRQ_event+0x20/0x60

Hmm

Regards,
Andrew Vasquez

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez

On Fri, 02 Feb 2007, Randy Dunlap wrote:

> On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote:
> 
> > On Fri, 2 Feb 2007 12:56:30 -0800
> > Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> > 
> > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats 
> > > > > limit=2m passes=100 pattern=iot dlimit=2048
> > 
> > What is this mysterious dt command, btw?
> 
> I expect that it's the one here:
> http://www.scsifaq.org/RMiller_Tools/index.html

yep, that's the one.

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez

On Thu, 01 Feb 2007, Andrew Morton wrote:

> On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> > Basically what is happening from the FC side is the initiator executes
> > a simple dt test:
> > 
> > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats 
> > limit=2m passes=100 pattern=iot dlimit=2048
> > 
> > against a single lun (a very basic Windows target mode driver).
> > During the test a port-enable, port-disable script is running against
> > the switch's port that is connected to the target (this occurs every
> > sixty seconds, for a disabled duration of 2 seconds).  Additionally,
> > the target itself is set to LOGO (logout) or drop off the topology
> > every 30 seconds.
> 
> I don't understand what effect the port-enable/port-disable has upon the
> system.  Will it cause I/O errors, or what?

No I/O errors should make their way to the upper layers (block/FS).
The system *should* be shielded from the fibre-channel fabric events.
I just wanted to explain what the (basic sanity) test did.

> > This test runs fine up to 2.6.19.
> 
> One thing we did in there was to give direct-io-against-blockdevs some
> special-case bio-preparation code.  Perhaps this is tickling a bug somehow.
> 
> We can revert that change like this:
> 
> 
> diff -puN fs/block_dev.c~a fs/block_dev.c
> --- a/fs/block_dev.c~a
> +++ a/fs/block_dev.c
> @@ -196,8 +196,47 @@ static void blk_unget_page(struct page *
>   pvec->page[--pvec->idx] = page;
>  }
>  
> +static int
> +blkdev_get_blocks(struct inode *inode, sector_t iblock,
> + struct buffer_head *bh, int create)
...

Hmm, with this patch we've noted two main differences:

1) I/O throughput with the basic 'dd' command used (above) is back to
   60 MB/s, rather than the appalling 20-22 MB/s we were seeing with
   2.6.20-rcX.

2) No panics -- so far with 2+ hours of testing.  With our vanilla
   system of 2.6.20-rc7, the test could trigger the panic within 15 to
   20 minutes.

We'll let this run over the weekend -- I'll certainly let you know if
anything has changed (failures).

--
Andrew Vasquez

[BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-01-22 Thread Andrew Vasquez

# bad: [029530f810dd5147f7e59b939eb22cfbe0beea12] one more EXPORT_UNUSED_SYMBOL removal
git-bisect bad 029530f810dd5147f7e59b939eb22cfbe0beea12


Not sure how much help it is, but while trying to instrument
as-iosched.c, we can see that the failure has a fairly stable
signature:

[15280.813479] as_dispatch_request: ad=810038704da8 ; reads=0 ; writes=1 ; dir=0 ;fifo_list[async]=81003f4071a0 ad->new_batch=0 ad->change_batch=0.
[15280.827032] as_dispatch_request: q=81003fdae050 ; #_reqs=128 ; lmerge=81003f4071a0 ;.

which means:

static int as_dispatch_request(request_queue_t *q, int force)
{
...
	const int writes = !list_empty(&ad->fifo_list[REQ_ASYNC]);

'writes' is set, the batch_data_dir has not changed
(ad->batch_data_dir != REQ_SYNC) in the following segment:

...
	if (writes) {
dispatch_writes:
		BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));

		if (ad->batch_data_dir == REQ_SYNC) {
			ad->changed_batch = 1;

			/*
			 * new_batch might be 1 when the queue runs out of
			 * reads. A subsequent submission of a write might
			 * cause a change of batch before the read is finished.
			 */
			ad->new_batch = 0;
		}
		ad->batch_data_dir = REQ_ASYNC;
		ad->current_write_count = ad->write_batch_count;
		ad->write_batch_idled = 0;
		rq = ad->next_rq[ad->batch_data_dir];
		goto dispatch_request;
	}

ad->next_rq[ad->batch_data_dir] is NULL, and is then passed down to
as_move_to_dispatch() where the first dereference of rq:

static void as_move_to_dispatch(struct as_data *ad, struct request *rq)
{
	const int data_dir = rq_is_sync(rq);

borks the machine.  What's odd (perhaps it's just our rudimentary
understanding of AS) is that there are segments of code where
ad->next_rq[REQ_ASYNC] is checked against NULL (in the 'writes' case it
is not).
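
To make the failure mode concrete, here is a small user-space sketch of the
path described above. It is a model only: the struct and function names
(model_as_data, model_dispatch_writes(), model_move_to_dispatch()) are
hypothetical, the enum values are assumed from the dir=0 in the log, and the
NULL guard is there purely to show where a check would sit. It is not the
scheduler's code and not a proposed fix; the interesting question remains
why next_rq[REQ_ASYNC] is NULL while the write FIFO is non-empty.

```c
#include <stddef.h>

/* Values assumed for the example; dir=0 in the log suggests REQ_ASYNC is 0. */
enum { REQ_ASYNC = 0, REQ_SYNC = 1 };

struct model_request {
    int is_sync;
};

struct model_as_data {
    int batch_data_dir;
    struct model_request *next_rq[2];
};

/* Stand-in for as_move_to_dispatch(): its first act is to dereference rq. */
static int model_move_to_dispatch(struct model_as_data *ad,
                                  struct model_request *rq)
{
    const int data_dir = rq->is_sync;   /* would fault if rq were NULL */

    ad->next_rq[data_dir] = NULL;       /* request leaves the scheduler */
    return 1;
}

/* Models the 'writes' branch: switch the batch to async, pick next_rq. */
static int model_dispatch_writes(struct model_as_data *ad)
{
    struct model_request *rq;

    ad->batch_data_dir = REQ_ASYNC;
    rq = ad->next_rq[ad->batch_data_dir];
    if (rq == NULL)                     /* hypothetical guard, not in the quoted code */
        return 0;                       /* nothing dispatched */
    return model_move_to_dispatch(ad, rq);
}
```

With next_rq[REQ_ASYNC] left NULL the model returns 0 instead of faulting;
with a request installed it dispatches and returns 1.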

Anyway, any ideas or hints?  Attached is the .config used.

Thanks,
Andrew Vasquez
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.19
# Fri Jan 19 16:53:19 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
# CONFIG_IKCONFIG is not set
# CONFIG_CPUSETS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_KMOD is not set
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
# CONFIG_MK8 is not set
CONFIG_MPSC=y
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=128
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_INTERNODE_CACHE_BYTES=128
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
# CO

[BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-01-22 Thread Andrew Vasquez

 addresses
git-bisect good 69688262fb94e92a32f188b79c02dc32016d4927
# bad: [5faad620264290b17e80a8b0996b039ea0d5ac73] Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block
git-bisect bad 5faad620264290b17e80a8b0996b039ea0d5ac73
# bad: [3161986224a3faa8ccca3e665b7404d81e7ee3cf] fbdev: remove references to non-existent fbmon_valid_timings()
git-bisect bad 3161986224a3faa8ccca3e665b7404d81e7ee3cf
# bad: [c954e2a5d1c9662991a41282297ddebcadee0578] knfsd: nfsd4: make verify and nverify wrappers
git-bisect bad c954e2a5d1c9662991a41282297ddebcadee0578
# bad: [021d3a72459191a76e8e482ee4937ba6bc9fd712] knfsd: nfsd4: handling more nfsd_cross_mnt errors in nfsd4 readdir
git-bisect bad 021d3a72459191a76e8e482ee4937ba6bc9fd712
# bad: [b797b5beac966df5c5d96c0d39fe366f57135343] knfsd: svcrpc: fix gss krb5i memory leak
git-bisect bad b797b5beac966df5c5d96c0d39fe366f57135343
# bad: [b21a323710e77a27b2f66af901bd3640c30aba6e] remove the broken BLK_DEV_SWIM_IOP driver
git-bisect bad b21a323710e77a27b2f66af901bd3640c30aba6e
# bad: [029530f810dd5147f7e59b939eb22cfbe0beea12] one more EXPORT_UNUSED_SYMBOL removal
git-bisect bad 029530f810dd5147f7e59b939eb22cfbe0beea12


Re: [-mm patch] make qla2x00_reg_remote_port() static

2006-11-27 Thread Andrew Vasquez

On Fri, 24 Nov 2006, Adrian Bunk wrote:

> On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> >...
> > Changes since 2.6.19-rc5-mm2:
> >...
> >  git-scsi-misc.patch
> >...
> >  git trees
> >...
> 
> qla2x00_reg_remote_port() can now become static.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>

Re: [PATCH 2.6.13] Warning in the qla2xxx driver

2005-09-08 Thread Andrew Vasquez

On Thu, 01 Sep 2005, Daniel Walker wrote:

> Remove possible uninitialized "sg" field warning in the qla24xx driver
> 
> Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>
> 
> Index: linux-2.6.13/drivers/scsi/qla2xxx/qla_iocb.c
> ===
> --- linux-2.6.13.orig/drivers/scsi/qla2xxx/qla_iocb.c 2005-08-28 23:41:01.0 +
> +++ linux-2.6.13/drivers/scsi/qla2xxx/qla_iocb.c  2005-08-31 18:31:03.0 +
> @@ -744,7 +744,7 @@ qla24xx_start_scsi(srb_t *sp)
>  	uint32_t	index;
>  	uint32_t	handle;
>  	struct cmd_type_7 *cmd_pkt;
> -	struct scatterlist *sg;
> +	struct scatterlist *sg = NULL;
>  	uint16_t	cnt;
>  	uint16_t	req_cnt;
>  	uint16_t	tot_dsds;

This was already addressed in the following patch:

http://marc.theaimsgroup.com/?l=linux-scsi&m=112510857722632&w=2

which was recently pulled by Linus:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=131736d34ebc3251d79ddfd08a5e57a3e86decd4

--
av

Re: 2.6.13-rc7 qla2xxx unaligned accesses

2005-08-26 Thread Andrew Vasquez

On Thu, 25 Aug 2005, Keith Owens wrote:
> On Wed, 24 Aug 2005 11:22:52 -0700, 
> Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >On Wed, 24 Aug 2005, Keith Owens wrote:
> >
> >> 2.6.13-rc7 + kdb on ia64.  The qla2xxx drivers are getting unaligned
> >> accesses at startup.
> >> 
> >> qla2300 :01:02.0: Found an ISP2312, irq 66, iobase 0xc0080f30
> >> qla2300 :01:02.0: Configuring PCI space...
> >> PCI: slot :01:02.0 has incorrect PCI cache line size of 0 bytes, 
> >> correcting to 128
> >> qla2300 :01:02.0: Configure NVRAM parameters...
> >> qla2300 :01:02.0: Verifying loaded RISC code...
> >> qla2300 :01:02.0: Waiting for LIP to complete...
> >> qla2300 :01:02.0: Cable is unplugged...
> >> scsi1 : qla2xxx
> >> kernel unaligned access to 0xe0300667800c, ip=0xa001005cd0b1
> >
> >Yes, I have a fix for this in my patch-queue.  I'll attach it here for
> >reference.  I'll forward onto linux-scsi post 2.6.13.
> >
> >--
> >av
> >
> >---
> >
> >On some platforms the hard-casting of the 8 byte node_name
> >and port_name arrays to an u64 would cause unaligned-access
> >warnings.  Generalize the conversions with consistent
> >shifting of WWN bytes.
> >
> >Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
> >---
> >
> > drivers/scsi/qla2xxx/qla_attr.c |   27 +--
> > 1 files changed, 17 insertions(+), 10 deletions(-)
> >
> >24e16c86578498fd71a3e33bebbd8be7323a03c6
> >diff --git a/drivers/scsi/qla2xxx/qla_attr.c 
> >b/drivers/scsi/qla2xxx/qla_attr.c
> >--- a/drivers/scsi/qla2xxx/qla_attr.c
> >+++ b/drivers/scsi/qla2xxx/qla_attr.c
> >@@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h
> > 
> > /* Host attributes. */
> > 
> >+static u64
> >+wwn_to_u64(uint8_t *wwn)
> >+{
> >+return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
> >+(u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
> >+(u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
> >+(u64)wwn[6] <<  8 | (u64)wwn[7];
> >+}
> >+
> 
> Any reason you defined your own function instead of using the standard
> get_unaligned()?

I was unaware there was even such a helper.  Anyway, the wwn_to_u64()
function adds another benefit -- clarity: we're converting an 8-byte
WWN array to its endian-agnostic 64-bit value.  I suppose we could
make it inline.

--
AV
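
As a side note on the exchange above, the following user-space sketch
contrasts the two approaches. wwn_to_u64() repeats the shift-based
conversion from the quoted patch, using uint64_t in place of the kernel's
u64. wwn_load_native() is a hypothetical stand-in for an alignment-safe
load in host byte order, which is how I understand get_unaligned() to
behave; treat that as an assumption rather than a statement about the
kernel helper.

```c
#include <stdint.h>
#include <string.h>

/*
 * Shift-based conversion as in the quoted patch: byte 0 of the WWN is
 * always the most significant byte, whatever the host byte order, and
 * only single-byte loads are issued, so alignment does not matter.
 */
static uint64_t wwn_to_u64(const uint8_t *wwn)
{
    return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
           (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
           (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
           (uint64_t)wwn[6] <<  8 | (uint64_t)wwn[7];
}

/*
 * Hypothetical alignment-safe load: memcpy avoids the unaligned access,
 * but the result depends on the host's byte order.
 */
static uint64_t wwn_load_native(const uint8_t *wwn)
{
    uint64_t v;

    memcpy(&v, wwn, sizeof(v));
    return v;
}
```

Both functions tolerate a misaligned pointer, but only the shift-based one
yields the same 64-bit value on little-endian and big-endian hosts, which
is the clarity argument made in the reply.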

Re: 2.6.13-rc7 qla2xxx unaligned accesses

2005-08-26 Thread Andrew Vasquez

On Thu, 25 Aug 2005, Keith Owens wrote:
> On Wed, 24 Aug 2005 11:22:52 -0700, 
> Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >On Wed, 24 Aug 2005, Keith Owens wrote:
> >
> >> 2.6.13-rc7 + kdb on ia64.  The qla2xxx drivers are getting unaligned
> >> accesses at startup.
> >> 
> >> qla2300 :01:02.0: Found an ISP2312, irq 66, iobase 0xc0080f30
> >> qla2300 :01:02.0: Configuring PCI space...
> >> PCI: slot :01:02.0 has incorrect PCI cache line size of 0 bytes, 
> >> correcting to 128
> >> qla2300 :01:02.0: Configure NVRAM parameters...
> >> qla2300 :01:02.0: Verifying loaded RISC code...
> >> qla2300 :01:02.0: Waiting for LIP to complete...
> >> qla2300 :01:02.0: Cable is unplugged...
> >> scsi1 : qla2xxx
> >> kernel unaligned access to 0xe0300667800c, ip=0xa001005cd0b1
> >
> >Yes, I have a fix for this in my patch-queue.  I'll attach it here for
> >reference.  I'll forward onto linux-scsi post 2.6.13.
> >
> >--
> >av
> >
> >---
> >
> >On some platforms the hard-casting of the 8 byte node_name
> >and port_name arrays to an u64 would cause unaligned-access
> >warnings.  Generalize the conversions with consistent
> >shifting of WWN bytes.
> >
> >Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
> >---
> >
> > drivers/scsi/qla2xxx/qla_attr.c |   27 +--
> > 1 files changed, 17 insertions(+), 10 deletions(-)
> >
> >24e16c86578498fd71a3e33bebbd8be7323a03c6
> >diff --git a/drivers/scsi/qla2xxx/qla_attr.c 
> >b/drivers/scsi/qla2xxx/qla_attr.c
> >--- a/drivers/scsi/qla2xxx/qla_attr.c
> >+++ b/drivers/scsi/qla2xxx/qla_attr.c
> >@@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h
> > 
> > /* Host attributes. */
> > 
> >+static u64
> >+wwn_to_u64(uint8_t *wwn)
> >+{
> >+   return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
> >+   (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
> >+   (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
> >+   (u64)wwn[6] <<  8 | (u64)wwn[7];
> >+}
> >+
>
> Any reason you defined your own function instead of using the standard
> get_unaligned()?

I was unaware there was even such a helper.  Anyway, the wwn_to_u64()
function adds another benefit -- clarity: we're converting an 8-byte
WWN array to its endian-agnostic 64-bit value.  I suppose we could
make it inline.
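
For illustration only, here is a minimal userspace sketch of the two
approaches discussed above.  wwn_to_u64() follows the patch (marked inline,
as suggested); wwn_load_be64() is a hypothetical stand-in for what a
get_unaligned() plus be64_to_cpu() combination would do, written with
memcpy() and a manual byte swap so it builds outside the kernel.  The names
and the uint64_t/uint8_t types are assumptions for the sketch, not the
driver's code.

```c
#include <stdint.h>
#include <string.h>

/* Shift-based conversion as in the patch: reads one byte at a time, so it
 * needs no particular alignment and yields the same value on any host. */
static inline uint64_t wwn_to_u64(const uint8_t *wwn)
{
	return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
	       (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
	       (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
	       (uint64_t)wwn[6] <<  8 | (uint64_t)wwn[7];
}

/* Hypothetical alternative: copy the bytes into an aligned 64-bit variable
 * (the unaligned-safe load), then swap to host order on little-endian. */
static inline uint64_t wwn_load_be64(const uint8_t *wwn)
{
	const uint16_t probe = 1;
	uint64_t raw;

	memcpy(&raw, wwn, sizeof(raw));
	if (*(const uint8_t *)&probe) {		/* little-endian host */
		raw = ((raw & 0x00000000000000ffULL) << 56) |
		      ((raw & 0x000000000000ff00ULL) << 40) |
		      ((raw & 0x0000000000ff0000ULL) << 24) |
		      ((raw & 0x00000000ff000000ULL) <<  8) |
		      ((raw & 0x000000ff00000000ULL) >>  8) |
		      ((raw & 0x0000ff0000000000ULL) >> 24) |
		      ((raw & 0x00ff000000000000ULL) >> 40) |
		      ((raw & 0xff00000000000000ULL) >> 56);
	}
	return raw;
}
```

Both helpers should return the same value for a WWN stored at an odd
address; the shift version simply makes the big-endian byte order of the
WWN explicit at the call site.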

--
AV

Re: 2.6.13-rc7 qla2xxx unaligned accesses

2005-08-24 Thread Andrew Vasquez

On Wed, 24 Aug 2005, Keith Owens wrote:

> 2.6.13-rc7 + kdb on ia64.  The qla2xxx drivers are getting unaligned
> accesses at startup.
> 
> qla2300 :01:02.0: Found an ISP2312, irq 66, iobase 0xc0080f30
> qla2300 :01:02.0: Configuring PCI space...
> PCI: slot :01:02.0 has incorrect PCI cache line size of 0 bytes, 
> correcting to 128
> qla2300 :01:02.0: Configure NVRAM parameters...
> qla2300 :01:02.0: Verifying loaded RISC code...
> qla2300 :01:02.0: Waiting for LIP to complete...
> qla2300 :01:02.0: Cable is unplugged...
> scsi1 : qla2xxx
> kernel unaligned access to 0xe0300667800c, ip=0xa001005cd0b1

Yes, I have a fix for this in my patch-queue.  I'll attach it here for
reference.  I'll forward onto linux-scsi post 2.6.13.

--
av

---

On some platforms the hard-casting of the 8 byte node_name
and port_name arrays to an u64 would cause unaligned-access
warnings.  Generalize the conversions with consistent
shifting of WWN bytes.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

 drivers/scsi/qla2xxx/qla_attr.c |   27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

24e16c86578498fd71a3e33bebbd8be7323a03c6
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h
 
 /* Host attributes. */
 
+static u64
+wwn_to_u64(uint8_t *wwn)
+{
+   return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
+   (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
+   (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
+   (u64)wwn[6] <<  8 | (u64)wwn[7];
+}
+
 static void
 qla2x00_get_host_port_id(struct Scsi_Host *shost)
 {
@@ -360,16 +369,16 @@ qla2x00_get_starget_node_name(struct scs
struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
scsi_qla_host_t *ha = to_qla_host(host);
fc_port_t *fcport;
-   uint64_t node_name = 0;
+   u64 node_name = 0;
 
list_for_each_entry(fcport, &ha->fcports, list) {
if (starget->id == fcport->os_target_id) {
-   node_name = *(uint64_t *)fcport->node_name;
+   node_name = wwn_to_u64(fcport->node_name);
break;
}
}
 
-   fc_starget_node_name(starget) = be64_to_cpu(node_name);
+   fc_starget_node_name(starget) = node_name;
 }
 
 static void
@@ -378,16 +387,16 @@ qla2x00_get_starget_port_name(struct scs
struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
scsi_qla_host_t *ha = to_qla_host(host);
fc_port_t *fcport;
-   uint64_t port_name = 0;
+   u64 port_name = 0;
 
list_for_each_entry(fcport, &ha->fcports, list) {
if (starget->id == fcport->os_target_id) {
-   port_name = *(uint64_t *)fcport->port_name;
+   port_name = wwn_to_u64(fcport->port_name);
break;
}
}
 
-   fc_starget_port_name(starget) = be64_to_cpu(port_name);
+   fc_starget_port_name(starget) = port_name;
 }
 
 static void
@@ -460,9 +469,7 @@ struct fc_function_template qla2xxx_tran
 void
 qla2x00_init_host_attr(scsi_qla_host_t *ha)
 {
-   fc_host_node_name(ha->host) =
-   be64_to_cpu(*(uint64_t *)ha->init_cb->node_name);
-   fc_host_port_name(ha->host) =
-   be64_to_cpu(*(uint64_t *)ha->init_cb->port_name);
+   fc_host_node_name(ha->host) = wwn_to_u64(ha->init_cb->node_name);
+   fc_host_port_name(ha->host) = wwn_to_u64(ha->init_cb->port_name);
fc_host_supported_classes(ha->host) = FC_COS_CLASS3;
 }

Re: Fix up qla2xxx configuration bogosity

2005-07-28 Thread Andrew Vasquez

On Thu, 28 Jul 2005, James Bottomley wrote:

> On Wed, 2005-07-27 at 22:10 -0700, Andrew Vasquez wrote:
> > Would you also apply the attached patch which adds the appropriate
> > FW_LOADER pre-requisite and a separate entry for ISP24xx support.
> 
> That's what I see reading the code; however, it looks like it's *only*
> the 24xx that needs it (qla24xx_load_risc_hotplug).  The patch below
> pulls in the FW loader for every qlogic fibre driver, not just the
> qla24xx; is there a reason for doing this?

Yes, I've been working on a set of patches which add this
functionality across the board with supported ISP types (21xx, 22xx,
23xx).  I should have some patches for submission in next week's
time-frame.  So rather than adding #if code around the relevant
24xx-specific code in qla2xxx, I chose the fw_loader path for all types.

-- 
Andrew Vasquez

Re: Fix up qla2xxx configuration bogosity

2005-07-27 Thread Andrew Vasquez

Linus,

In looking through your latest git-pull and update of the Kconfig
quirks in qla2xxx:

Fix up qla2xxx configuration bogosity
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=e0aa8afd97536a9d94f82a07b4c4b3f05aef6f82;hp=e4ff4d7f9d85a2bc714307eb9113617182e62845


Would you also apply the attached patch which adds the appropriate
FW_LOADER pre-requisite and a separate entry for ISP24xx support.

Thanks to Adrian Bunk and Jesper Juhl for their efforts in fixing this
quirk.

Regards,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
--- a/drivers/scsi/qla2xxx/Kconfig
+++ b/drivers/scsi/qla2xxx/Kconfig
@@ -7,6 +7,7 @@ config SCSI_QLA21XX
tristate "QLogic ISP2100 host adapter family support"
depends on SCSI_QLA2XXX
 select SCSI_FC_ATTRS
+   select FW_LOADER
---help---
This driver supports the QLogic 21xx (ISP2100) host adapter family.
 
@@ -14,6 +15,7 @@ config SCSI_QLA22XX
tristate "QLogic ISP2200 host adapter family support"
depends on SCSI_QLA2XXX
 select SCSI_FC_ATTRS
+   select FW_LOADER
---help---
This driver supports the QLogic 22xx (ISP2200) host adapter family.
 
@@ -21,6 +23,7 @@ config SCSI_QLA2300
tristate "QLogic ISP2300 host adapter family support"
depends on SCSI_QLA2XXX
 select SCSI_FC_ATTRS
+   select FW_LOADER
---help---
This driver supports the QLogic 2300 (ISP2300 and ISP2312) host
adapter family.
@@ -29,6 +32,7 @@ config SCSI_QLA2322
tristate "QLogic ISP2322 host adapter family support"
depends on SCSI_QLA2XXX
 select SCSI_FC_ATTRS
+   select FW_LOADER
---help---
This driver supports the QLogic 2322 (ISP2322) host adapter family.
 
@@ -36,6 +40,16 @@ config SCSI_QLA6312
tristate "QLogic ISP63xx host adapter family support"
depends on SCSI_QLA2XXX
 select SCSI_FC_ATTRS
+   select FW_LOADER
---help---
This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
adapter family.
+
+config SCSI_QLA24XX
+   tristate "QLogic ISP24xx host adapter family support"
+   depends on SCSI_QLA2XXX
+   select SCSI_FC_ATTRS
+   select FW_LOADER
+   ---help---
+   This driver supports the QLogic 24xx (ISP2422 and ISP2432) host
+   adapter family.
diff --git a/drivers/scsi/qla2xxx/Makefile b/drivers/scsi/qla2xxx/Makefile
--- a/drivers/scsi/qla2xxx/Makefile
+++ b/drivers/scsi/qla2xxx/Makefile
@@ -1,5 +1,4 @@
 EXTRA_CFLAGS += -DUNIQUE_FW_NAME
-EXTRA_CFLAGS += -DCONFIG_SCSI_QLA24XX -DCONFIG_SCSI_QLA24XX_MODULE
 
 qla2xxx-y := qla_os.o qla_init.o qla_mbx.o qla_iocb.o qla_isr.o qla_gs.o \
qla_dbg.o qla_sup.o qla_rscn.o qla_attr.o

Re: Incorrect driver getting loaded for Qlogic FC-HBA

2005-07-27 Thread Andrew Vasquez

On Wed, 27 Jul 2005, Rajat Jain wrote:

> On 7/27/05, Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> > 
> > A similar problem was noted with RHEL4, it seems the modules.pcimap
> > and pci.ids file were correct, but the pcitable file contained entries
> > for all ql[ae]23xx based HBAs to load qla2300.ko.
> > 
> > It's my understanding that this was fixed for RHEL4 U1.  Which distro
> > are you using?  If you are using RHEL, and are still having problems,
> > I'd suggest you file a report with Redhat.
> > 
> > Regards,
> > Andrew Vasquez
> > 
> 
> BINGO! I AM using RHEL 4. So does that mean I can rectify the problem
> by making appropriate changes to "pcitable" file?

I'm trying to get a firm answer from the folks who originally
discovered the problem some time back.  It seems you have two options:

 - during installation of RHEL4 (and not RHEL4U1), load with the
   'noprobe' option:

linux noprobe

   and manually select the appropriate drivers to load.

 - (post installation) modify /etc/modprobe.conf and rename the
   qla2300 entry to qla2322, i.e.:

alias scsi_hostadapter1 qla2322

   modify the modules.pcimap table to load qla2322 for the 2322
   device-id:

qla2300 0x1077  0x2322  ...

   to:

qla2322 0x1077  0x2322  ...
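
To show why that one table entry matters, here is a rough userspace sketch
of a vendor/device lookup over modules.pcimap-style text.  It assumes the
first three columns of each line are module name, vendor ID and device ID,
and it ignores the remaining columns (subvendor, class and so on) that real
tools also consult.  The function name pcimap_lookup() and the first-match
behaviour are hypothetical; this is not how any particular hotplug tool is
implemented.

```c
#include <stdio.h>
#include <string.h>

/* Return 0 and copy the module name into 'module' for the first table line
 * whose vendor and device columns match; return -1 if no line matches.
 * Lines starting with '#' are treated as comments. */
static int pcimap_lookup(const char *table, unsigned int vendor,
			 unsigned int device, char *module, size_t len)
{
	const char *line = table;

	while (*line) {
		char name[64];
		unsigned int v, d;
		const char *end = strchr(line, '\n');

		if (*line != '#' &&
		    sscanf(line, "%63s %x %x", name, &v, &d) == 3 &&
		    v == vendor && d == device) {
			snprintf(module, len, "%s", name);
			return 0;
		}
		if (!end)
			break;
		line = end + 1;		/* advance to the next line */
	}
	return -1;
}
```

With the unmodified entry, a lookup for 0x1077/0x2322 in this sketch would
return qla2300; after the edit above it would return qla2322.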


Beyond that, I'd suggest you log a report with Redhat, as that's the
extent of the workaround knowledge without going to RHEL4U1.

Hope this helps,
Andrew Vasquez

Re: Incorrect driver getting loaded for Qlogic FC-HBA

2005-07-26 Thread Andrew Vasquez

On Tue, 26 Jul 2005, Rajat Jain wrote:
> On 7/26/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > On Mon, Jul 25, 2005 at 11:02:39AM +0900, Rajat Jain wrote:
> > > I'm using Kernel 2.6.9 and am having a Qlogic QLE2362 FC-HBA in my
> > > system. I selected all the Qlogic SCSI drivers while building the
> > > kernel. Now the problem is that every time I reboot, I have to
> > > MANUALLY modprobe the qla2322.ko module in the kernel and only then my
> > > HBA works. By default, the kernel loads qla2300.ko, which is not the
> > > correct driver for the card, and hence the HBA does not work. Here is
> > > the lspci output:
> > 
> > "by default" the kernel does not load any modules.  That's up to the
> > hotplug system, or some other package.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Thanks. I just checked .. that is right. So let me put it this way.
> Whenever I hot-plug my HBA into the system, the driver "qla2300" gets
> loaded, whereas the correct driver is "qla2322". This is evident from
> the output of "modules.pcimap" file and "lspci". The PCI device number
> of HBA is 2322. and in modules.pcimap file, qla2322 is supposed to be
> loaded when this HBA is hot-plugged. But module qla2300 is getting
> loaded.
> 
> Any pointers on where could the problem be? Or how should I approach
> this problem?

A similar problem was noted with RHEL4, it seems the modules.pcimap
and pci.ids file were correct, but the pcitable file contained entries
for all ql[ae]23xx based HBAs to load qla2300.ko.

It's my understanding that this was fixed for RHEL4 U1.  Which distro
are you using?  If you are using RHEL, and are still having problems,
I'd suggest you file a report with Redhat.

Regards,
Andrew Vasquez