Re: [PATCH -mmotm] scsi: fix the wrong position of the comment
On Sun, 10 Mar 2013, James Bottomley wrote:

> On Sun, 2013-03-10 at 00:57 -0800, Andrew Morton wrote:
> > On Sun, 10 Mar 2013 08:22:47 + James Bottomley
> > wrote:
> >
> > > [missing SCSI cc added]
> > > On Sun, 2013-03-10 at 17:09 +0900, Akinobu Mita wrote:
> > > > This fixes the wrong position of the comment introduced by
> > > > scsi-rename-random32-to-prandom_u32.patch in the -mm tree.
> > > >
> > > > Signed-off-by: Akinobu Mita
> > > > Cc: "James E.J. Bottomley"
> > > > Cc: Andrew Vasquez
> > > > ---
> > > >  drivers/scsi/qla2xxx/qla_attr.c | 6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> > > > index 04bf7b8..e44d47e 100644
> > > > --- a/drivers/scsi/qla2xxx/qla_attr.c
> > > > +++ b/drivers/scsi/qla2xxx/qla_attr.c
> > > > @@ -1939,13 +1939,13 @@ qla24xx_vport_delete(struct fc_vport *fc_vport)
> > > >  	}
> > > >
> > > >  	/* No pending activities shall be there on the vha now */
> > > > -	if (ql2xextended_error_logging & ql_dbg_user)
> > > > -		msleep(prandom_u32() % 10);
> > > > +	if (ql2xextended_error_logging & ql_dbg_user) {
> > > >  		/*
> > > >  		 * Just to see if something falls on the net we have placed
> > > >  		 * below
> > > >  		 */
> > > > -
> > > > +		msleep(prandom_u32() % 10);
> > > > +	}
> > >
> > > I don't git a toss if it's random or prandom: Andrew: get rid of it; we
> > > do not sleep in kernel for random intervals whatever the provocation ...
> > > if this is supposed to be a warning or error condition then print
> > > something.
> >
> > That msleep was added by
> >
> > commit feafb7b1714cf599a6d0fed45801ab3f66046cbd
> > Author:     Arun Easi
> > AuthorDate: Fri Sep 3 14:57:00 2010 -0700
> > Commit:     James Bottomley
> > CommitDate: Sun Sep 5 15:13:12 2010 -0300
> >
> >     [SCSI] qla2xxx: Fix vport delete issues
>
> Sorry, I didn't notice multiple Andrews on the cc list. I meant Andrew
> Vasquez (or other member of the qla team) remove this, please (and
> preferably do something correct).
>

James,

We'll take a look at this, yes. Adding Giri and Co. to the CC.

Thanks,
AV

This message and any attached documents contain information from QLogic
Corporation or its wholly-owned subsidiaries that may be confidential. If
you are not the intended recipient, you may not read, copy, distribute, or
use this information. If you have received this transmission in error,
please notify the sender immediately by reply e-mail and then delete this
message.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
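One reading of the review comment above ("if this is supposed to be a warning or error condition then print something") is that the debug-only branch should emit a diagnostic instead of sleeping. The thread does not say what the eventual fix was, so the following is only a user-space sketch under that assumption. `MODEL_DBG_USER` and `model_report_vport_delete()` are hypothetical names, the bit value is not the driver's `ql_dbg_user` value, and `snprintf()` stands in for whatever kernel logging call the driver would use.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debug bit; NOT the driver's ql_dbg_user value. */
#define MODEL_DBG_USER 0x1u

/*
 * Format a diagnostic into buf when user-level debug logging is enabled.
 * Returns the number of characters snprintf() produced, or 0 when the
 * debug bit is clear and nothing is logged. No sleeping is involved.
 */
int model_report_vport_delete(unsigned logging_mask, unsigned vp_idx,
                              char *buf, size_t len)
{
    assert(buf != 0 && len > 0);
    buf[0] = '\0';

    if (!(logging_mask & MODEL_DBG_USER))
        return 0;

    return snprintf(buf, len,
                    "vport %u: delete proceeding, no pending activity expected\n",
                    vp_idx);
}
```

The point of the sketch is only the shape of the change: the condition is kept, the random delay is gone, and the branch leaves a trace that can be read back from the log.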
Re: [2.6 patch] scsi/qla4xxx/ql4_isr.c: remove dead code
On Tue, 19 Feb 2008, James Bottomley wrote:

> On Tue, 2008-02-19 at 18:35 -0800, Andrew Vasquez wrote:
> > On Tue, 19 Feb 2008, James Bottomley wrote:
> >
> > > On Tue, 2008-02-19 at 21:29 +0200, Adrian Bunk wrote:
> > > > This patch removes dead code spotted by the Coverity checker.
> > > >
> > > > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> > > >
> > > > ---
> > > >
> > > >  drivers/scsi/qla4xxx/ql4_isr.c | 18 +-
> > > >  1 file changed, 1 insertion(+), 17 deletions(-)
> > > >
> > > > --- linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c.old	2008-02-19 20:29:16.0 +0200
> > > > +++ linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c	2008-02-19 20:30:37.0 +0200
> > > > @@ -91,38 +91,22 @@ static void qla4xxx_status_entry(struct
> > > >  		if (scsi_status == 0) {
> > > >  			cmd->result = DID_OK << 16;
> > > >  			break;
> > > >  		}
> > > >
> > > >  		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_OVER) {
> > > >  			cmd->result = DID_ERROR << 16;
> > > >  			break;
> > > >  		}
> > > >
> > > > -		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER) {
> > > > +		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
> > > >  			scsi_set_resid(cmd, residual);
> > > > -			if (!scsi_status && ((scsi_bufflen(cmd) - residual) <
> > > > -				cmd->underflow)) {
> > > > -
> > > > -				cmd->result = DID_ERROR << 16;
> > > > -
> > > > -				DEBUG2(printk("scsi%ld:%d:%d:%d: %s: "
> > > > -					"Mid-layer Data underrun0, "
> > > > -					"xferlen = 0x%x, "
> > > > -					"residual = 0x%x\n", ha->host_no,
> > > > -					cmd->device->channel,
> > > > -					cmd->device->id,
> > > > -					cmd->device->lun, __func__,
> > > > -					scsi_bufflen(cmd), residual));
> > > > -				break;
> > > > -			}
> > > > -		}
> > >
> > > This code doesn't look dead to me, it looks to be enforcing
> > > cmd->underrun if set ... what makes the coverity checker think it can
> > > never be executed?
> >
> > Hmm, guess it's the earlier 'if (scsi_status == 0)' check a few lines
> > up... Dave S., can you take a look at this...
> >
> > Thanks, av
>
> Ah, so the !scsi_status is wrong it was supposed to be scsi_status !=
> 0 ... and even then it can just be dropped.

My guess is that the check should have been written as:

	...
	if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
		scsi_set_resid(cmd, residual);
	if ((scsi_bufflen(cmd) - residual) < cmd->underflow) {
	...

It looks to be a logic error introduced while porting from qla2xxx, where
scsi_status during CS_COMPLETE is the full 16-bit status (high byte is
transport, low byte is SCSI status) from the FCP_RSP frame (not so in
iSCSI, where it's just the SCSI status) and the residual check in
qla_isr.c::qla2x00_status_entry() looks like:

	if (!lscsi_status &&
	    ((unsigned)(scsi_bufflen(cp) - resid) < cp->underflow)) {
	...

I'll defer to Dave S. for verification.
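The reasoning in this sub-thread can be checked with a small user-space model. The sketch below is not the driver code: `status_result_as_written()`, `status_result_as_guessed()` and the `MODEL_*` constants are hypothetical names with illustrative values, the over-run branch is left out, and only the control flow quoted in the patch context is reproduced. It shows that once `scsi_status == 0` has taken the early exit, a later `!scsi_status` test can never be true, while the check in the form guessed above (without the status test) can fire.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's result codes; values are illustrative. */
enum model_result {
    MODEL_FALLTHROUGH = -1,  /* none of the modelled branches decided the result */
    MODEL_DID_OK = 0,
    MODEL_DID_ERROR = 7
};

/* Control flow as quoted in the patch context: underflow test guarded by !scsi_status. */
int status_result_as_written(unsigned scsi_status, int residual_under,
                             unsigned bufflen, unsigned residual,
                             unsigned underflow)
{
    assert(residual <= bufflen);

    if (scsi_status == 0)
        return MODEL_DID_OK;        /* early exit: scsi_status is non-zero below */

    if (residual_under) {
        /* scsi_status != 0 here, so !scsi_status is always false: dead branch */
        if (!scsi_status && (bufflen - residual) < underflow)
            return MODEL_DID_ERROR;
    }
    return MODEL_FALLTHROUGH;
}

/* Control flow as guessed in the reply: the underflow test no longer looks at the status. */
int status_result_as_guessed(unsigned scsi_status, int residual_under,
                             unsigned bufflen, unsigned residual,
                             unsigned underflow)
{
    assert(residual <= bufflen);

    if (scsi_status == 0)
        return MODEL_DID_OK;

    (void)residual_under;           /* in the guess this only gates scsi_set_resid() */

    if ((bufflen - residual) < underflow)
        return MODEL_DID_ERROR;
    return MODEL_FALLTHROUGH;
}
```

With a non-zero status, a full residual and a non-zero `underflow`, the first function falls through and the second reports an error, which matches both the Coverity finding and the objection that the check was meant to do something.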
Re: [2.6 patch] scsi/qla4xxx/ql4_isr.c: remove dead code
On Tue, 19 Feb 2008, James Bottomley wrote:

> On Tue, 2008-02-19 at 21:29 +0200, Adrian Bunk wrote:
> > This patch removes dead code spotted by the Coverity checker.
> >
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> >
> > ---
> >
> >  drivers/scsi/qla4xxx/ql4_isr.c | 18 +-
> >  1 file changed, 1 insertion(+), 17 deletions(-)
> >
> > --- linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c.old	2008-02-19 20:29:16.0 +0200
> > +++ linux-2.6/drivers/scsi/qla4xxx/ql4_isr.c	2008-02-19 20:30:37.0 +0200
> > @@ -91,38 +91,22 @@ static void qla4xxx_status_entry(struct
> >  		if (scsi_status == 0) {
> >  			cmd->result = DID_OK << 16;
> >  			break;
> >  		}
> >
> >  		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_OVER) {
> >  			cmd->result = DID_ERROR << 16;
> >  			break;
> >  		}
> >
> > -		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER) {
> > +		if (sts_entry->iscsiFlags & ISCSI_FLAG_RESIDUAL_UNDER)
> >  			scsi_set_resid(cmd, residual);
> > -			if (!scsi_status && ((scsi_bufflen(cmd) - residual) <
> > -				cmd->underflow)) {
> > -
> > -				cmd->result = DID_ERROR << 16;
> > -
> > -				DEBUG2(printk("scsi%ld:%d:%d:%d: %s: "
> > -					"Mid-layer Data underrun0, "
> > -					"xferlen = 0x%x, "
> > -					"residual = 0x%x\n", ha->host_no,
> > -					cmd->device->channel,
> > -					cmd->device->id,
> > -					cmd->device->lun, __func__,
> > -					scsi_bufflen(cmd), residual));
> > -				break;
> > -			}
> > -		}
>
> This code doesn't look dead to me, it looks to be enforcing
> cmd->underrun if set ... what makes the coverity checker think it can
> never be executed?

Hmm, guess it's the earlier 'if (scsi_status == 0)' check a few lines
up... Dave S., can you take a look at this...

Thanks, av
Re: 2.6.24 regression w/ QLA2300
On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> > and send the resultant kernel logs?
>
> Here's the output to the console (if there are other logs you need,
> let me know). I'll try the patch next, and sorry, hadn't realized
> merges were still coming in under 2.6.24 in Linus' tree...
>
> QLogic Fibre Channel HBA Driver
> ACPI: PCI Interrupt :40:01.0[A] -> GSI 38 (level, low) -> IRQ 58
> qla2xxx :40:01.0: Found an ISP2312, irq 58, iobase 0xc000a0041000
> qla2xxx :40:01.0: Configuring PCI space...
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids da1c.
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1f61c.
> qla2xxx :40:01.0: Configure NVRAM parameters...
> qla2xxx :40:01.0: Verifying loaded RISC code...
> scsi(14): Load RISC code
> scsi(14): Verifying Checksum of loaded RISC code.
> scsi(14): Checksum OK, start firmware.
> qla2xxx :40:01.0: Allocated (412 KB) for firmware dump...
> scsi(14): Issue init firmware.
> qla2x00_mailbox_command(14): FAILED. mbx0=4001, mbx1=0, mbx2=ba8a, cmd=48

OK, this is what I would have expected with Linus' tree prior to the
fix. I just double-checked: the fix in question has yet to make its way
to Linus' tree. It's currently in scsi-misc-2.6:

	http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=a571fdf7caa010e17f6a70c0c52e0992e87af7db

which should filter up to linux-2.6.git during Linus' next pull.

thanks, av
Re: 2.6.24 regression w/ QLA2300
On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> > Could you load the (default 2.6.24) driver with the
> > ql2xextended_error_logging module parameter set:
> >
> > 	# insmod qla2xxx ql2xextended_error_logging=1
> >
> > and send the resultant kernel logs?
>
> Could you try the patch referenced here:
>
> 	qla2xxx: Correct issue where incorrect init-fw mailbox command was used on
> 	non-NPIV capable ISPs.
> 	http://article.gmane.org/gmane.linux.scsi/38240

BTW: the regression in question is not present in vanilla 2.6.24.
Instead it was introduced early on in the 2.6.25 merge-window. Linus'
tree currently has the patch referenced above as well.

--
av
Re: 2.6.24 regression w/ QLA2300
On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> On Tue, 05 Feb 2008, Alan D. Brunelle wrote:
>
> > commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> > Merge: 50d9a12... 23c3e29...
> > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > Date:   Fri Jan 25 17:19:08 2008 -0800
> >
> >     Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
> >
> >     * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
> >
> > I believe a regression was introduced. I'm running on a 4-way IA64,
> > with straight 2.6.24 and 2 dual-port cards:
> >
> > 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> >
> > the adapters failed initialization. In particular, I narrowed it down
> > to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> > function. I went and removed the qla2x00-related parts of this (large-ish)
> > merge, and the 4 ports initialized just fine.
>
> Could you load the (default 2.6.24) driver with the
> ql2xextended_error_logging module parameter set:
>
> 	# insmod qla2xxx ql2xextended_error_logging=1
>
> and send the resultant kernel logs?

Could you try the patch referenced here:

	qla2xxx: Correct issue where incorrect init-fw mailbox command was used on
	non-NPIV capable ISPs.
	http://article.gmane.org/gmane.linux.scsi/38240

Thanks, av

---

qla2xxx: Correct issue where incorrect init-fw mailbox command was used on non-NPIV capable ISPs.

BIT_2 of the firmware attributes is only valid on FW-interface-2 type
HBAs. Code in commit c48339decceec8e011498b0fc4c7c7d8b2ea06c1 would
cause the incorrect initialize-firmware mailbox command to be issued
for non-NPIV capable ISPs. Correct this by reverting to previously
used (and correct) pre-condition 'if' check.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_mbx.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index 0c10c0b..99d29ff 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -980,7 +980,7 @@ qla2x00_init_firmware(scsi_qla_host_t *ha, uint16_t size)
 	DEBUG11(printk("qla2x00_init_firmware(%ld): entered.\n",
 	    ha->host_no));
 
-	if (ha->fw_attributes & BIT_2)
+	if (ha->flags.npiv_supported)
 		mcp->mb[0] = MBC_MID_INITIALIZE_FIRMWARE;
 	else
 		mcp->mb[0] = MBC_INITIALIZE_FIRMWARE;
--
1.5.4.rc5.5.gab98
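The effect of the one-line change above can be illustrated with a small user-space model. This is a sketch, not driver code: `struct model_host`, `pick_before_fix()`, `pick_after_fix()` and the `MODEL_*` names are hypothetical, the enum values are symbolic rather than real mailbox opcodes, and the model simply assumes a non-NPIV HBA whose firmware-attribute word happens to have bit 2 set, which is the situation the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BIT_2 (1u << 2)

/* Symbolic commands only; these are NOT the driver's mailbox opcode values. */
enum model_mbc {
    MODEL_INIT_FW,       /* stands in for MBC_INITIALIZE_FIRMWARE */
    MODEL_MID_INIT_FW    /* stands in for MBC_MID_INITIALIZE_FIRMWARE */
};

/* Hypothetical reduced view of the host state that the check looks at. */
struct model_host {
    uint16_t fw_attributes;  /* bit 2 is only meaningful on FW-interface-2 HBAs */
    bool npiv_supported;     /* models ha->flags.npiv_supported */
};

/* Check as it stood before the patch: trusts bit 2 on every HBA type. */
enum model_mbc pick_before_fix(const struct model_host *ha)
{
    assert(ha != 0);
    return (ha->fw_attributes & MODEL_BIT_2) ? MODEL_MID_INIT_FW
                                             : MODEL_INIT_FW;
}

/* Check as restored by the patch: keys off the NPIV capability flag. */
enum model_mbc pick_after_fix(const struct model_host *ha)
{
    assert(ha != 0);
    return ha->npiv_supported ? MODEL_MID_INIT_FW : MODEL_INIT_FW;
}
```

For a host with `fw_attributes` bit 2 set but `npiv_supported` false, the pre-patch check picks the multi-ID command and the restored check picks the plain one, which is consistent with the failing init-firmware mailbox command reported in this thread.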
Re: 2.6.24 regression w/ QLA2300
On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> Merge: 50d9a12... 23c3e29...
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Fri Jan 25 17:19:08 2008 -0800
>
>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
>
>     * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
>
> I believe a regression was introduced. I'm running on a 4-way IA64,
> with straight 2.6.24 and 2 dual-port cards:
>
> 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>
> the adapters failed initialization. In particular, I narrowed it down
> to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> function. I went and removed the qla2x00-related parts of this (large-ish)
> merge, and the 4 ports initialized just fine.

Could you load the (default 2.6.24) driver with the
ql2xextended_error_logging module parameter set:

	# insmod qla2xxx ql2xextended_error_logging=1

and send the resultant kernel logs?

> Specifically, reverting the "patch" below enabled the devices to initialize
> properly.
>
> If need be, I'm certainly willing to help narrow down to the specific part in
> this patch...

That's a rather large patch... :( Any chance you could git-bisect?
Also, could you send the .config file you are using?

Thanks,
Andrew Vasquez
Re: [PATCH 2/3] pci: Remove users of pci_enable_device_bars()
On Tue, 08 Jan 2008, Benjamin Herrenschmidt wrote:

> On Mon, 2008-01-07 at 11:42 -0800, Andrew Vasquez wrote:
> > That's fine. I take it these patches will be funneled via
> > gregkh/pci-2.6.git. There's some qla2xxx updates which are queued for
> > post-2.6.24 consumption in jejb/scsi-misc-2.6.git which don't appear
> > to have any conflicts.
> >
> > I do though have a series of patches which I'll hold off on submitting
> > to linux-scsi until these PCI changes are merged, as there's some
> > minor conflicts merging the three branches. James B, will that be
> > fine?
> >
> > The patches themselves appear to be working fine within several of our
> > test rings. I hold off on some of the cleanup post-merge time...
>
> Thanks for testing ! They should be queued with Greg indeed.

Just curious, this patch-set hasn't made it upstream to linux-2.6.git,
will the work be deferred to 2.6.26??

--
av
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Luck, Tony wrote:

> > > Will try this next.
> >
> > Should work even better since it avoids a lock and copy, but please do
> > test if you have the time.
>
> That one works too (survived two full builds at "make -j32" on the 16-way
> system).
>
> Thanks for the quick turnaround.

Jens, this patch appears to work well on my 16-way box... Compile
tests appear to pass (quickly). I'll be sure to let you know if
anything else 'block' related pops-up...

thanks, av
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Jens Axboe wrote:

> Great, thanks for confirming. It does look like a clear bug in cciss, it
> just got exposed now that it uses proper end request handling. We never
> need to clear ->data_len, since for blk_fs_request() it will be cleared
> on init. So just setting a residual count there for blk_fs_request()
> like cciss does is fine.
>
> Anyway, it's in my pending queue for Linus.
>

Hmm, probably not related to the block changes in your tree, but I'm
seeing yet another problem after working (compile jobs) the machine:

[ 61.423922] BUG: spinlock recursion on CPU#2, kjournald/2317
[ 61.427843] lock: 81042c5a4988, .magic: dead4ead, .owner: kjournald/2317, .owner_cpu: 2
[ 61.427843] Pid: 2317, comm: kjournald Not tainted 2.6.24 #45
[ 61.427843]
[ 61.427843] Call Trace:
[ 61.427843] [] _raw_spin_lock+0xe9/0x12a
[ 61.427843] [] as_merged_requests+0xfe/0x115
[ 61.427843] [] elv_merge_requests+0x1f/0x45
[ 61.427843] [] attempt_merge+0x281/0x347
[ 61.427843] [] __make_request+0x1e6/0x598
[ 61.427843] [] generic_make_request+0x1c8/0x276
[ 61.427843] [] submit_bio+0x61/0xdb
[ 61.427843] [] submit_bh+0xe2/0x118
[ 61.427843] [] journal_do_submit_data+0x28/0x39
[ 61.427843] [] journal_commit_transaction+0xdbe/0x1394
[ 61.427843] [] lock_timer_base+0x26/0x4e
[ 61.427843] [] kjournald+0x104/0x373
[ 61.427843] [] autoremove_wake_function+0x0/0x2e
[ 61.427843] [] kjournald+0x0/0x373
[ 61.427843] [] kthread+0x3d/0x61
[ 61.427843] [] child_rip+0xa/0x12
[ 61.427843] [] kthread+0x0/0x61
[ 61.427843] [] child_rip+0x0/0x12
[ 61.427843]
[ 124.555789] BUG: soft lockup - CPU#6 stuck for 61s! [as:7191]
[ 124.555789] CPU 6:
[ 124.555789] Modules linked in:
[ 124.555789] Pid: 7191, comm: as Not tainted 2.6.24 #45
[ 124.555789] RIP: 0010:[] [] _raw_spin_lock+0xa5/0x12a
[ 124.555789] RSP: 0018:81042b50be18 EFLAGS: 0246
[ 124.555789] RAX: RBX: 0ef44415 RCX: 2b543897
[ 124.555789] RDX: 009c RSI: 81042c87b868 RDI: 0001
[ 124.555789] RBP: 2b0273879000 R08: 7fff38107000 R09: 805739c4
[ 124.555789] R10: 81042f33ed78 R11: 81042f33ed78 R12: 81042b50f3c0
[ 124.555789] R13: 0010 R14: 8000 R15:
[ 124.555789] FS: 2b02736f3ef0() GS:81042f98a560() knlGS:
[ 124.555789] CS: 0010 DS: ES: CR0: 8005003b
[ 124.555789] CR2: 0085d018 CR3: 00042cb0e000 CR4: 06e0
[ 124.555789] DR0: DR1: DR2:
[ 124.555789] DR3: DR6: 0ff0 DR7: 0400
[ 124.555789]
[ 124.555789] Call Trace:
[ 124.555789] [] _raw_spin_lock+0xb3/0x12a
[ 124.555789] [] flush_tlb_others+0x4b/0xa8
[ 124.555789] [] flush_tlb_mm+0x4a/0x99
[ 124.555789] [] unmap_region+0x10a/0x141
[ 124.555789] [] do_munmap+0x1fd/0x2b9
[ 124.555789] [] __down_write_nested+0xa0/0xb0
[ 124.555789] [] sys_munmap+0x3b/0x57
[ 124.555789] [] system_call+0x7e/0x83
[ 124.555789]
[ 124.555789] BUG: soft lockup - CPU#14 stuck for 61s! [cc1:7190]
[ 124.555789] CPU 14:
[ 124.555789] Modules linked in:
[ 124.555789] Pid: 7190, comm: cc1 Not tainted 2.6.24 #45
[ 124.555789] RIP: 0010:[] [] flush_tlb_others+0x75/0xa8
[ 124.555789] RSP: 0018:81042b965e48 EFLAGS: 0202
[ 124.555789] RAX: 0010 RBX: 0006 RCX: 0003
[ 124.555789] RDX: 0010 RSI: 81042b965df8 RDI: 0002
[ 124.555789] RBP: 810011cc2658 R08: 2aea0f72d000 R09: 80573dc4
[ 124.555789] R10: 81042e65b7b8 R11: 81042e65b7b8 R12: 80630640
[ 124.555789] R13: R14: 80264d07 R15: 81042e0dd960
[ 124.555789] FS: 2aea0fafb6f0() GS:81042fb01cc0() knlGS:
[ 124.555789] CS: 0010 DS: ES: CR0: 80050033
[ 124.555789] CR2: 2aea117fe000 CR3: 00042b537000 CR4: 06e0
[ 124.555789] DR0: DR1: DR2:
[ 124.555789] DR3: DR6: 0ff0 DR7: 0400
[ 124.555789]
[ 124.555789] Call Trace:
[ 124.555789] [] flush_tlb_others+0x69/0xa8
[ 124.555789] [] flush_tlb_mm+0x4a/0x99
[
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Miller, Mike (OS Dev) wrote:

> Jens wrote:
>
> > -Original Message-
> > From: Jens Axboe [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, January 29, 2008 12:54 PM
> > To: Andrew Vasquez
> > Cc: Linux Kernel Mailing List; Miller, Mike (OS Dev);
> > [EMAIL PROTECTED]; [EMAIL PROTECTED]
> > Subject: Re: kernel BUG at drivers/block/cciss.c:1260! (with
> > recent linux-2.6 tree)
> >
> > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > >
> > > > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > > > >
> > > > > > > Here the final snippet that was logged:
> > > > > > >
> > > > > > > [ 12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on usb-:01:04.4-1
> > > > > > > [ 12.728971] usbcore: registered new interface driver usbhid
> > > > > > > [ 12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> > > > > > > [ 12.741172] TCP cubic registered
> > > > > > > [ 12.744506] NET: Registered protocol family 1
> > > > > > > [ 12.744884] NET: Registered protocol family 17
> > > > > > > [ 12.749217] Freeing unused kernel memory: 228k freed
> > > > > > > [ 12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > > > > > > [ 12.888929]
> > > > > > > [ 12.888930] sector 651061426900570, nr/cnr 0/0
> > > > > > > [ 12.892895] bio 81042f130730, biotail 81042f130730, buffer , data , len 0
> > > > > > > [ 12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > > > > >
> > > > > > Ah ok, I see the problem... cciss is overriding the data_len for
> > > > > > BLOCK_PC requests, hence it does not complete them properly.
> > > > > > Hmm. Does this work?
> > > > > >
> > > > > > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> > > > > > index ef50068..b6fa52e 100644
> > > > > > --- a/drivers/block/cciss.c
> > > > > > +++ b/drivers/block/cciss.c
> > > > > > @@ -2524,7 +2524,6 @@ after_error_processing:
> > > > > >  		resend_cciss_cmd(h, cmd);
> > > > > >  		return;
> > > > > >  	}
> > > > > > -	cmd->rq->data_len = 0;
> > > > > >  	cmd->rq->completion_data = cmd;
> > > > > >  	blk_complete_request(cmd->rq);
> > > > > >  }
> > > > >
> > > > > Things look good so far -- with the patch above I can finally boot
> > > > > the machine.
> > > >
> > > > Cool, sorry about that. Will get that applied asap. So after this
> > > > patch was applied, you didn't see any debug messages from
> > > > blk_dump_rq_flags() anymore, right?
> > >
> > > That's correct.  I've yet to see any additional debug-messages from
> > > blk_dump_rq_flags().
> >
> > Great, thanks for confirming. It does look like a clear bug
> > in cciss, it just got exposed now that it uses proper end
> > request handling. We never need to clear ->data_len, since
> > for blk_fs_request() it will be cleared on init. So just
> > setting a residual count there for blk_fs_request() like
> > cciss does is fine.
>
> Just so I'm clear: just removing the one line is enough to resolve the
> problem?

That's correct.  The only other change to cciss.c in my tree is where
the BUG() call was replaced with a call to blk_dump_rq_flags():

@@ -1257,7 +1257,8 @@ static void cciss_softirq_done(struct request *rq)
 #endif /* CCISS_DEBUG */

 	if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
-		BUG();
+		blk_dump_rq_flags(rq, "cciss rq");
+//		BUG();

 	spin_lock_irqsave(&h->lock, flags);
 	cmd_free(h, cmd, 1);
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Jens Axboe wrote:

> On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > On Tue, 29 Jan 2008, Jens Axboe wrote:
> >
> > > > Here the final snippet that was logged:
> > > >
> > > > [ 12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on usb-:01:04.4-1
> > > > [ 12.728971] usbcore: registered new interface driver usbhid
> > > > [ 12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> > > > [ 12.741172] TCP cubic registered
> > > > [ 12.744506] NET: Registered protocol family 1
> > > > [ 12.744884] NET: Registered protocol family 17
> > > > [ 12.749217] Freeing unused kernel memory: 228k freed
> > > > [ 12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > > > [ 12.888929]
> > > > [ 12.888930] sector 651061426900570, nr/cnr 0/0
> > > > [ 12.892895] bio 81042f130730, biotail 81042f130730, buffer , data , len 0
> > > > [ 12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > >
> > > Ah ok, I see the problem... cciss is overriding the data_len for
> > > BLOCK_PC requests, hence it does not complete them properly. Hmm. Does
> > > this work?
> > >
> > > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> > > index ef50068..b6fa52e 100644
> > > --- a/drivers/block/cciss.c
> > > +++ b/drivers/block/cciss.c
> > > @@ -2524,7 +2524,6 @@ after_error_processing:
> > >  		resend_cciss_cmd(h, cmd);
> > >  		return;
> > >  	}
> > > -	cmd->rq->data_len = 0;
> > >  	cmd->rq->completion_data = cmd;
> > >  	blk_complete_request(cmd->rq);
> > >  }
> >
> > Things look good so far -- with the patch above I can finally boot the
> > machine.
>
> Cool, sorry about that. Will get that applied asap. So after this patch
> was applied, you didn't see any debug messages from blk_dump_rq_flags()
> anymore, right?

That's correct.  I've yet to see any additional debug-messages from
blk_dump_rq_flags().

--
av
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Jens Axboe wrote:

> > Here the final snippet that was logged:
> >
> > [ 12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on usb-:01:04.4-1
> > [ 12.728971] usbcore: registered new interface driver usbhid
> > [ 12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> > [ 12.741172] TCP cubic registered
> > [ 12.744506] NET: Registered protocol family 1
> > [ 12.744884] NET: Registered protocol family 17
> > [ 12.749217] Freeing unused kernel memory: 228k freed
> > [ 12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > [ 12.888929]
> > [ 12.888930] sector 651061426900570, nr/cnr 0/0
> > [ 12.892895] bio 81042f130730, biotail 81042f130730, buffer , data , len 0
> > [ 12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
>
> Ah ok, I see the problem... cciss is overriding the data_len for
> BLOCK_PC requests, hence it does not complete them properly. Hmm. Does
> this work?
>
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index ef50068..b6fa52e 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -2524,7 +2524,6 @@ after_error_processing:
>  		resend_cciss_cmd(h, cmd);
>  		return;
>  	}
> -	cmd->rq->data_len = 0;
>  	cmd->rq->completion_data = cmd;
>  	blk_complete_request(cmd->rq);
>  }

Things look good so far -- with the patch above I can finally boot the
machine.

Thanks,
av
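
[Editorial sketch] The failure mode described in the quoted explanation can
be modelled in plain userspace C.  This is only a rough model under stated
assumptions, not the block layer's code: `struct model_request`,
`model_rq_bytes()`, `model_end_request()` and `model_complete()` are
hypothetical names invented for illustration.  The model assumes that the
completion size of a BLOCK_PC request is taken from its data length, and
that completing fewer bytes than are still attached leaves the request
pending, which is what the logged "len 0" next to a non-empty bio suggests.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a block request (NOT struct request). */
struct model_request {
	bool is_fs;			/* filesystem request vs. BLOCK_PC passthrough */
	unsigned int hard_sectors;	/* remaining 512-byte sectors (fs requests) */
	unsigned int data_len;		/* remaining bytes (BLOCK_PC requests) */
	unsigned int bio_bytes;		/* bytes still attached to the request */
};

/* Assumed rule: fs requests are sized in sectors, BLOCK_PC ones by data_len. */
static unsigned int model_rq_bytes(const struct model_request *rq)
{
	return rq->is_fs ? rq->hard_sectors << 9 : rq->data_len;
}

/* Returns 0 once everything attached is completed, 1 if data is still pending. */
static int model_end_request(struct model_request *rq, unsigned int nr_bytes)
{
	if (nr_bytes >= rq->bio_bytes)
		rq->bio_bytes = 0;
	else
		rq->bio_bytes -= nr_bytes;
	return rq->bio_bytes != 0;
}

/*
 * Completion path of the model.  clear_len mimics the statement the patch
 * removes (cmd->rq->data_len = 0): once the length is zeroed, the model
 * completes zero bytes and the request is reported as still pending.
 */
static int model_complete(struct model_request *rq, bool clear_len)
{
	if (clear_len)
		rq->data_len = 0;
	return model_end_request(rq, model_rq_bytes(rq));
}
```

In this model a passthrough request carrying 0xfe bytes (the length byte in
the logged cdb) stays pending when `clear_len` is true and completes when it
is false, mirroring the before/after behaviour reported in this thread.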
Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
On Tue, 29 Jan 2008, Jens Axboe wrote:

> Andrew, can you try with this applied?
>
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index ef50068..bd7b352 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -1257,7 +1257,7 @@ static void cciss_softirq_done(struct request *rq)
>  #endif /* CCISS_DEBUG */
>
>  	if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
> -		BUG();
> +		blk_dump_rq_flags(rq, "cciss rq");
>
>  	spin_lock_irqsave(&h->lock, flags);
>  	cmd_free(h, cmd, 1);

Here the final snippet that was logged:

[ 12.724997] input: USB HID v1.01 Mouse [HP Virtual Keyboard] on usb-:01:04.4-1
[ 12.728971] usbcore: registered new interface driver usbhid
[ 12.732866] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
[ 12.741172] TCP cubic registered
[ 12.744506] NET: Registered protocol family 1
[ 12.744884] NET: Registered protocol family 17
[ 12.749217] Freeing unused kernel memory: 228k freed
[ 12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
[ 12.888929]
[ 12.888930] sector 651061426900570, nr/cnr 0/0
[ 12.892895] bio 81042f130730, biotail 81042f130730, buffer , data , len 0
[ 12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)
Hitting a consistent BUG() with recent Linus' linux-2.6.git:

[ 12.941428] [ cut here ]
[ 12.944874] kernel BUG at drivers/block/cciss.c:1260!
[ 12.944874] invalid opcode: [1] SMP
[ 12.944874] CPU 0
[ 12.944874] Modules linked in:
[ 12.944874] Pid: 0, comm: swapper Not tainted 2.6.24 #43
[ 12.944874] RIP: 0010:[8039e43d] [8039e43d] cciss_softirq_done+0xbc/0x1bf
[ 12.944874] RSP: 0018:8063aed0 EFLAGS: 00010202
[ 12.944874] RAX: 0001 RBX: 8100cf800010 RCX: 81042f1253b0
[ 12.944874] RDX: 81042de398f0 RSI: 81042de398f0 RDI: 0001
[ 12.944874] RBP: 81042daa R08: 81042f1253b0 R09: 0001
[ 12.944874] R10: 00fe R11: R12: 0002
[ 12.944874] R13: 0001 R14: 8100cf80 R15: 81042de398f0
[ 12.944874] FS: () GS:805bb000() knlGS:
[ 12.944874] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
[ 12.944874] CR2: 2afed7eea340 CR3: 00042dbba000 CR4: 06e0
[ 12.944874] DR0: DR1: DR2:
[ 12.944874] DR3: DR6: 0ff0 DR7: 0400
[ 12.944874] Process swapper (pid: 0, threadinfo 805f4000, task 805624a0)
[ 12.944874] Stack: 8063af10 0001 80632d60
[ 12.944874] 000a 805bb900 8032038f
[ 12.944874] 8063af10 8063af10 805bb940 802346b4
[ 12.944874] Call Trace:
[ 12.944874] <IRQ> [8032038f] blk_done_softirq+0x69/0x78
[ 12.944874] [802346b4] __do_softirq+0x6f/0xd8
[ 12.944874] [8020c45c] call_softirq+0x1c/0x30
[ 12.944874] [8020e347] do_softirq+0x30/0x80
[ 12.944874] [8020e409] do_IRQ+0x72/0xd9
[ 12.944874] [8020a50a] mwait_idle+0x0/0x46
[ 12.944874] [8020a3da] default_idle+0x0/0x3d
[ 12.944874] [8020b7e1] ret_from_intr+0x0/0xa
[ 12.944874] <EOI> [8020a54c] mwait_idle+0x42/0x46
[ 12.944874] [8020a481] cpu_idle+0x6a/0xae
[ 12.944874]
[ 12.944874]
[ 12.944874] Code: 0f 0b eb fe 48 8d 85 d8 c0 00 00 48 89 04 24 48 89 c7 e8 e5
[ 12.944874] RIP [8039e43d] cciss_softirq_done+0xbc/0x1bf
[ 12.944874] RSP 8063aed0
[ 12.944903] ---[ end trace e9c631603f90d22f ]---

code in question is in drivers/block/cciss.c:cciss_softirq_done():

	...
	if (blk_end_request(rq, (rq->errors == 0) ? 0 : -EIO, blk_rq_bytes(rq)))
		BUG();

And appears to be a result of a recent merge:

	commit f0f0052069989b80d2a3e50c9cd2f2a650bc1aea
	Refs: v2.6.24-1949-gf0f0052
	Merge: 68fbda7... a65b586...
	Author: Linus Torvalds <[EMAIL PROTECTED]>
	Date:   Tue Jan 29 08:51:32 2008 +1100

	    Merge branch 'blk-end-request' of git://git.kernel.dk/linux-2.6-block

Here's the commit which added the blk_end_request() BUG() on:

	commit 3daeea29f9348263e0dda89a565074390475bdf8
	Refs: v2.6.24-1743-g3daeea2
	Author: Kiyoshi Ueda <[EMAIL PROTECTED]>
	Date:   Tue Dec 11 17:50:03 2007 -0500

	    blk_end_request: changing cciss (take 4)

	    This patch converts cciss to use blk_end_request interfaces.
	    Related 'uptodate' arguments are converted to 'error'.

	    cciss is a little bit different from "normal" drivers.
	    cciss directly calls bio_endio() and disk_stat_add()
	    when completing request.  But those can be replaced with
	    __end_that_request_first().
	    After the replacement, request completion procedures of
	    those drivers become like the following:
	        o end_that_request_first()
	        o add_disk_randomness()
	        o end_that_request_last()
	    This can be converted to blk_end_request() by following
	    the rule (a) mentioned in the patch subject
	    "[PATCH 01/30] blk_end_request: add new request completion interface".

	    Cc: Mike Miller <[EMAIL PROTECTED]>
	    Signed-off-by: Kiyoshi Ueda <[EMAIL PROTECTED]>
	    Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>
	    Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>

--
av
Re: [PATCH 2/3] pci: Remove users of pci_enable_device_bars()
On Thu, 20 Dec 2007, Benjamin Herrenschmidt wrote:

> This patch converts users of pci_enable_device_bars() to the new
> pci_enable_device_{io,mem} interface.
>
> The new API fits nicely, except maybe for the QLA case where a bit of
> code re-organization might be a good idea but I prefer sticking to the
> simple patch as I don't have hardware to test on.

That's fine.  I take it these patches will be funneled via
gregkh/pci-2.6.git.  There's some qla2xxx updates which are queued for
post-2.6.24 consumption in jejb/scsi-misc-2.6.git which don't appear
to have any conflicts.

I do though have a series of patches which I'll hold off on submitting
to linux-scsi until these PCI changes are merged, as there's some
minor conflicts merging the three branches.  James B, will that be
fine?

The patches themselves appear to be working fine within several of our
test rings.  I hold off on some of the cleanup post-merge time...

Thanks,
Andrew Vasquez

> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> ---
>
>  drivers/ata/pata_cs5520.c       |    2 +-
>  drivers/i2c/busses/scx200_acb.c |    2 +-
>  drivers/ide/pci/cs5520.c        |   10 --
>  drivers/ide/setup-pci.c         |    6 --
>  drivers/scsi/lpfc/lpfc_init.c   |    3 +--
>  drivers/scsi/qla2xxx/qla_os.c   |   12 +---

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>
Re: [PATCH] drivers/scsi/: Spelling fixes
On Mon, 17 Dec 2007, Joe Perches wrote:

> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/NCR53C9x.h               |  2 +-
>  drivers/scsi/aic7xxx/aic79xx_inline.h |  2 +-
>  drivers/scsi/aic7xxx/aic79xx_osm.c    |  2 +-
>  drivers/scsi/aic7xxx/aic79xx_pci.c    |  4 ++--
>  drivers/scsi/aic7xxx/aic7xxx_inline.h |  2 +-
>  drivers/scsi/aic7xxx/aic7xxx_osm.c    |  2 +-
>  drivers/scsi/ipr.c                    |  2 +-
>  drivers/scsi/ips.c                    |  2 +-
>  drivers/scsi/iscsi_tcp.c              |  4 ++--
>  drivers/scsi/lpfc/lpfc.h              |  2 +-
>  drivers/scsi/lpfc/lpfc_mbox.c         |  2 +-
>  drivers/scsi/megaraid/megaraid_mbox.c | 10 +-
>  drivers/scsi/psi240i.c                |  2 +-
>  drivers/scsi/qla2xxx/qla_gs.c         |  2 +-

qla2xxx bits:

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>
Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one
On Tue, 09 Oct 2007, James Smart wrote:

> Why do you prefer request_firmware() vs something over sysfs ?
>
> Does environments like the kdump kernel also have access to data needed
> by request_firmware() ?

There's already much in the way of automation and infrastructure
present in supporting the request_firmware() interfaces (perhaps not
the best of names) which can provide for a level of flexibility beyond
a basic 'soft_port_name' interface.

Though I don't see why both can't coexist cleanly -- I take it the use
case you are considering is:

  - software recognizes no valid WWPN available,
  - query via request_firmware() fails,
  - software halts initialization (rather than fail), and
  - waits for the admin to poke
    '0x123456.. > /sys/.../fc_host/soft_port_name', causing a ping to
    the driver and continuation of initialization with the requested
    portname?
Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one
On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> On Mon, Oct 08, 2007 at 03:48:32PM -0700, Andrew Vasquez wrote:
>
> > So how about factoring that out to a transport-level interface.  How
> > about something along the lines of the following patch, whereby the
> > software driver upon detecting no valid WWPN, makes an upcall to each
> > interface's 'request_wwn()'.  The data passed in from shost_gendev
> > should be enough for some helper script to cull relevant device bits
> > and perhaps offer some level of persistence...  Off base?
>
> Hrm... jejb made a remark that it might be better to pass the
> scsi_host's device into request_firmware() as your example does, so I'll
> pitch in a patch to do likewise with libsas--the scsi_host knows the
> actual device it's coming from, and userland can sort that all out later
> anyway via DEVPATH.
>
> I suppose one could also have multiple scsi_hosts per PCI device, which
> means that my first patch would stumble horribly in more than a few
> cases.

This is done already in the FC case -- NPIV.  Though with that
interface, the administrator is already responsible for assigning
proper WWNN/WWPN during creation.

> > Darrick, forgive the FC example, I don't do SAS...
>
> That's ok, I don't do FC. :)  Looks mostly good to me...

--
av
Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one
On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> If the aic94xx chip doesn't have a SAS address in the chip's flash memory,
> use the request_firmware() interface to get one from userspace.  This
> way, there's no debate as to who or how an address gets generated--it's
> totally up to the administrator to provide it if the card doesn't have one.

So how about factoring that out to a transport-level interface.  How
about something along the lines of the following patch, whereby the
software driver upon detecting no valid WWPN, makes an upcall to each
interface's 'request_wwn()'.  The data passed in from shost_gendev
should be enough for some helper script to cull relevant device bits
and perhaps offer some level of persistence...  Off base?

Darrick, forgive the FC example, I don't do SAS...

--
av

--

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 7a7cfe5..5e0d953 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -35,6 +35,7 @@
 #include <linux/netlink.h>
 #include <net/netlink.h>
 #include <scsi/scsi_netlink_fc.h>
+#include <linux/firmware.h>
 #include "scsi_priv.h"
 #include "scsi_transport_fc_internal.h"
 
@@ -3251,6 +3252,30 @@ fc_vport_sched_delete(struct work_struct *work)
 			vport->channel, stat);
 }
 
+int
+fc_request_wwn(struct Scsi_Host *shost, u64 *wwn)
+{
+	const struct firmware *fw;
+	int stat;
+
+	stat = request_firmware(&fw, "fc_addr", &shost->shost_gendev);
+	if (stat)
+		return stat;
+
+	if (fw->size < 16) {
+		stat = -EINVAL;
+		goto out;
+	}
+
+	stat = fc_parse_wwn(fw->data, wwn);
+	if (stat)
+		return stat;
+
+out:
+	release_firmware(fw);
+	return stat;
+}
+EXPORT_SYMBOL(fc_request_wwn);
 
 /* Original Author:  Martin Hicks */
 MODULE_AUTHOR("James Smart");
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index e466d88..e80c36c 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -734,4 +734,6 @@ void fc_host_post_vendor_event(struct Scsi_Host *shost, u32 event_number,
 	 */
 int fc_vport_terminate(struct fc_vport *vport);
 
+int fc_request_wwn(struct Scsi_Host *, u64 *);
+
 #endif /* SCSI_TRANSPORT_FC_H */
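For readers following the size check and parse step in fc_request_wwn(), here is a minimal userspace sketch of the same idea. It assumes the blob handed back by the helper script is 16 ASCII hex digits; parse_wwn_hex() and hex_val() are hypothetical names for illustration and do not reproduce the in-tree fc_parse_wwn().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of one hex digit, or -1 if not hex. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical stand-in for the parse step: requires at least 16 bytes
 * (mirroring the fw->size < 16 check) and folds the first 16 hex digits
 * into a 64-bit name.  Returns 0 on success, -1 on short or malformed
 * input; *wwn is only written on success.
 */
int parse_wwn_hex(const char *buf, size_t len, uint64_t *wwn)
{
	uint64_t v = 0;
	size_t i;

	if (len < 16)
		return -1;

	for (i = 0; i < 16; i++) {
		int d = hex_val(buf[i]);

		if (d < 0)
			return -1;
		v = (v << 4) | (uint64_t)d;
	}

	*wwn = v;
	return 0;
}
```

One observation on the posted patch itself: the early "return stat" after fc_parse_wwn() bypasses release_firmware(), so a later revision would probably want that path to fall through to the out: label as well.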
Re: [3/3] 2.6.23-rc2: known regressions v2
On Wed, 08 Aug 2007, Michal Piotrowski wrote:

> Here is a list of some known regressions in 2.6.23-rc2.
>
> Feel free to add new regressions/remove fixed etc.
> http://kernelnewbies.org/known_regressions

...

> SCSI
>
> Subject         : unable to handle kernel NULL pointer dereference in
>                   qla2x00_read_nvram_data
> References      : http://lkml.org/lkml/2007/8/6/506
> Last known good : ?
> Submitter       : Zhang, Yanmin <[EMAIL PROTECTED]>
> Caused-By       : ?
> Handled-By      : ?
> Status          : unknown

Already addressed in Linus' latest GIT tree:

	http://article.gmane.org/gmane.linux.scsi/33472/
[PATCH] qla2xxx: allocate enough space for the full PCI descriptor.
Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

On Thu, 26 Jul 2007, Andrew Vasquez wrote:

> On Thu, 26 Jul 2007, Andrew Patterson wrote:
>
> > On Thu, 2007-07-26 at 15:36 +0200, Ulrich Windl wrote:
> > > Hi,
> > >
> > > <6>QLogic Fibre Channel HBA Driver
> > > <6>GSI 49 (level, low) -> CPU 3 (0x0300) vector 51
> > > <6>ACPI: PCI Interrupt :0f:01.0[A] -> GSI 49 (level, low) -> IRQ 51
> > > <6>qla2xxx :0f:01.0: Found an ISP2422, irq 51, iobase 0xc000b004
> > > [...]
> > > <6>qla2xxx :0f:01.0: LOOP UP detected (4 Gbps).
> > > <6>qla2xxx :0f:01.0: Topology - (F_Port), Host Loop address 0x0
> > > <6>scsi0 : qla2xxx
> > > <6>qla2xxx :0f:01.0:
> > > <4> QLogic Fibre Channel HBA Driver: 8.01.07-k3
> > > <4> QLogic HP AB378-60001 -
> > > <4> ISP2422: PCI-X Mode 2 (133 MH4.00.26 [IP] @ :0f:01.0 hdma+, host#=0,
> > > fw=4.00.26 [IP]
>
> The 33/66/100/133 values refer to the bus-clock speed at which the
> card is operating.  As is seen here (although a bit truncated --
> separate issue, I'll try to see if I can reproduce this on one of my
> HPQ rigs),

Ok, so what's happening here is the buffer passed in (pci_info) does
not have enough bytes allocated (off by 3).

James, please apply...

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 93c0c7e..acca898 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1564,7 +1564,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	struct Scsi_Host *host;
 	scsi_qla_host_t *ha;
 	unsigned long flags = 0;
-	char pci_info[20];
+	char pci_info[30];
 	char fw_str[30];
 	struct scsi_host_template *sht;
 
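The "off by 3" can be checked with a small userspace sketch. It assumes the descriptor has the shape "PCI-X Mode 2 (133 MHz)", which is inferred from the truncated "(133 MH" in the log above rather than taken from the driver source; pcix_info_str() is a hypothetical helper that uses snprintf() so the required length can be compared against the buffer size.

```c
#include <stddef.h>
#include <stdio.h>

/* Bus-clock labels as quoted in the thread (pci_bus_modes[]). */
static const char *const bus_clock_mhz[] = { "33", "66", "100", "133" };

/*
 * Hypothetical helper: format a PCI-X descriptor into dst (capacity cap)
 * and return the number of characters the full string needs, excluding
 * the terminating NUL, exactly as snprintf() reports it.  A return value
 * >= cap means the output was truncated.  Returns -1 for a bad index.
 */
int pcix_info_str(char *dst, size_t cap, int mode2, unsigned int clock_idx)
{
	if (clock_idx >= sizeof(bus_clock_mhz) / sizeof(bus_clock_mhz[0]))
		return -1;

	return snprintf(dst, cap, "PCI-X Mode %d (%s MHz)",
			mode2 ? 2 : 1, bus_clock_mhz[clock_idx]);
}
```

Under that assumption the longest string is 22 characters, so it needs 23 bytes with the terminator: three more than the old char pci_info[20], and comfortably inside the new 30-byte buffer.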
Re: [PATCH 64] drivers/scsi/qla2xxx/qla_init.c: mostly kmalloc + memset conversion to k[cz]alloc
On Tue, 31 Jul 2007, Mariusz Kozlowski wrote:

> Signed-off-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
>
>  drivers/scsi/qla2xxx/qla_init.c | 107445 -> 107327 (-118 bytes)
>  drivers/scsi/qla2xxx/qla_init.o | 237540 -> 237424 (-116 bytes)
>
>  drivers/scsi/qla2xxx/qla_init.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>
Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)
> On Fri, 27 Jul 2007, Andrew Patterson wrote:
>
> > On Thu, 2007-07-26 at 23:23 -0700, Andrew Vasquez wrote:
> >
> > > The 33/66/100/133 values refer to the bus-clock speed at which the
> > > card is operating.  As is seen here (although a bit truncated --
> > > separate issue, I'll try to see if I can reproduce this on one of my
> > > HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
> > > (bus clock) slot.  When operating under this mode, each data-phase
> > > between two devices is divided into 2 sub-phases, effectively doubling
> > > the transfer-data-rate to 266Mhz.
> >
> > I guess the proper terminology would be 266 MT/s (Mega
> > Transfers/second).  Looking through the PCI-SIG PCI-X 2.0 marketing
> > blurbs, they use MHz a lot when referring to MT/s.  So I would still
> > consider this to be a minor bug.  The user wants to know the transfer
> > rate, not the actual frequency of the bus.  Maybe just print out the
> > mode used instead, e.g., "PCI-X 266"?

Given PCI-X Mode-2 can run at different bus-clock speeds, how about
this as an alternative?

	PCI-X 266 (133Mhz)

it's a bit more descriptive than

	PCI-X Mode 2 (133Mhz)

then again, I don't want to beat this thing to death...

---

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index c488996..26f7e54 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -283,9 +283,9 @@ qla24xx_pci_info_str(struct scsi_qla_host *ha, char *str)
 	} else {
 		strcat(str, "-X ");
 		if (pci_bus & BIT_2)
-			strcat(str, "Mode 2");
+			strcat(str, "266");
 		else
-			strcat(str, "Mode 1");
+			strcat(str, "133");
 		strcat(str, " (");
 		strcat(str, pci_bus_modes[pci_bus & ~BIT_2]);
 	}
Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)
On Fri, 27 Jul 2007, Andrew Patterson wrote:

> On Thu, 2007-07-26 at 23:23 -0700, Andrew Vasquez wrote:
>
> > The 33/66/100/133 values refer to the bus-clock speed at which the
> > card is operating.  As is seen here (although a bit truncated --
> > separate issue, I'll try to see if I can reproduce this on one of my
> > HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
> > (bus clock) slot.  When operating under this mode, each data-phase
> > between two devices is divided into 2 sub-phases, effectively doubling
> > the transfer-data-rate to 266Mhz.
>
> I guess the proper terminology would be 266 MT/s (Mega
> Transfers/second).  Looking through the PCI-SIG PCI-X 2.0 marketing
> blurbs, they use MHz a lot when referring to MT/s.  So I would still
> consider this to be a minor bug.  The user wants to know the transfer
> rate, not the actual frequency of the bus.  Maybe just print out the
> mode used instead, e.g., "PCI-X 266"?

That sounds reasonable.  I'll spin some patches today after I verify
all the bus-bits with the PUI group.

Thanks,
Andrew Vasquez
Re: Q: PCI-X @ 266MHz on HP rx6600 (Qlogic 4Gb FC HBA)
On Thu, 26 Jul 2007, Andrew Patterson wrote:

> On Thu, 2007-07-26 at 15:36 +0200, Ulrich Windl wrote:
> > Hi,
> >
> > I have a question: The Qlogic ISP2422 chip is said to handle PCI-X 266MHz.
> > So does the HP Itanium2 server rx6600.  Basically that was the reason to
> > select that server.  The FC-HBA is in a 266 MHz capable slot.  However when
> > booting SLES10 SP1 for IA64, the logs say:

There's a mixup here in terminology...  The QLA2460 card which you
have does in fact support 'PCI-X 266'...

> > <6>QLogic Fibre Channel HBA Driver
> > <6>GSI 49 (level, low) -> CPU 3 (0x0300) vector 51
> > <6>ACPI: PCI Interrupt :0f:01.0[A] -> GSI 49 (level, low) -> IRQ 51
> > <6>qla2xxx :0f:01.0: Found an ISP2422, irq 51, iobase 0xc000b004
> > [...]
> > <6>qla2xxx :0f:01.0: LOOP UP detected (4 Gbps).
> > <6>qla2xxx :0f:01.0: Topology - (F_Port), Host Loop address 0x0
> > <6>scsi0 : qla2xxx
> > <6>qla2xxx :0f:01.0:
> > <4> QLogic Fibre Channel HBA Driver: 8.01.07-k3
> > <4> QLogic HP AB378-60001 -
> > <4> ISP2422: PCI-X Mode 2 (133 MH4.00.26 [IP] @ :0f:01.0 hdma+, host#=0,
> > fw=4.00.26 [IP]

The 33/66/100/133 values refer to the bus-clock speed at which the
card is operating.  As is seen here (although a bit truncated --
separate issue, I'll try to see if I can reproduce this on one of my
HPQ rigs), the card is inserted into a PCI-X Mode-2 capable 133MHz
(bus clock) slot.  When operating under this mode, each data-phase
between two devices is divided into 2 sub-phases, effectively doubling
the transfer-data-rate to 266Mhz.

> > <5> Vendor: HP  Model: HSV200  Rev: 6100
> > <5> Type: RAID ANSI SCSI revision: 02
> > <5> 0:0:0:0: Attached scsi generic sg0 type 12
> >
> > Now does Linux support the speed of 266 MHz, and is it just displayed
> > incorrectly, or doesn't Linux support the speed of 266MHz yet?
>
> This is a bug in the driver.  The lookup table only goes to 133 MHz.
>
> static char *pci_bus_modes[] = {
>	"33", "66", "100", "133",
>
> The same problem exists in the scsi_misc tree.

Regards,
Andrew Vasquez
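The arithmetic behind the "266" figure is small enough to write down. This is only a sketch of the relationship described above (two sub-phases per data phase in Mode 2); pcix_transfers_mt_s() is a hypothetical name, and the result is in mega-transfers per second rather than MHz.

```c
/*
 * Hypothetical helper: effective transfer rate in MT/s for a PCI-X slot,
 * given the bus clock in MHz.  In Mode 2 each data phase is split into
 * two sub-phases, so the transfer rate is twice the bus clock; in Mode 1
 * it equals the bus clock.
 */
unsigned int pcix_transfers_mt_s(unsigned int bus_clock_mhz, int mode2)
{
	return mode2 ? bus_clock_mhz * 2u : bus_clock_mhz;
}
```

For the card in this thread, a 133 MHz bus clock in Mode 2 gives 266 MT/s, which is where the "PCI-X 266" name comes from.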
Re: [PATCH] quiet down swiotlb warnings
On Fri, 01 Jun 2007, Andi Kleen wrote:

> On Fri, Jun 01, 2007 at 03:38:57PM -0400, Rik van Riel wrote:
> > Andi Kleen wrote:
> >
> > >An pci_map_sg failing typically leads to an IO error and we've
> > >always printk'ed those. Otherwise people will wonder why they
> > >get EIO.
> >
> > In some situations.  In this case the qla2xxx driver uses
> > the pci_map_sg() failure as a throttling mechanism and
>
> First WTF does it need swiotlb anyways? QA hardware should
> be definitely DAC capable, shouldn't it?

Yes, the card can support 64bit DMA transfers, but in this case the
'required' DMA mask returned from dma_get_required_mask() states that
a 32bit mask would suffice.  Here's a snippet from the bugzilla report
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=219216):

	QLogic Fibre Channel HBA Driver
	PCI: Enabling device :1f:00.0 (0140 -> 0143)
	ACPI: PCI Interrupt :1f:00.0[A] -> GSI 16 (level, low) -> IRQ 16
	qla2xxx :1f:00.0: Found an ISP2432, irq 16, iobase 0xc202
	*** qla2x00_config_dma_addressing: required_mask set to 7fff.
	*** qla2x00_config_dma_addressing: required_mask has no high-dword bits set.
	*** qla2x00_config_dma_addressing: set consistent 64bit mask returned 0.
	*** qla2x00_config_dma_addressing: defaulting to 32bit mask/consistent-mask.
	qla2xxx :1f:00.0: Configuring PCI space...

Which tells me that a 32bit DMA mask is being set for dma_set_mask()
and pci_set_consistent_dma_mask() since dma_get_required_mask() is
returning back 7fff -- no upper-dword bits set...

...

> > printing out all the warnings will actually slow down the
> > system.
>
> Another reason is that there is a lot of code that
> still doesn't check the return values and when that
> happens you might get data corruption too.
>
> > Andi, what do you propose as a solution?
>
> A different interface; like I wrote in my earlier mail.
>
> Another possibility would be to have a blocking interface
> to swiotlb that won't fail. That would be the better solution
> long term, but i was told it is hard to fit into some current
> driver interfaces.
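The decision described in that log can be sketched in plain C. This is an illustration only, not the driver's code: choose_dma_mask_bits() is a hypothetical helper, required_mask stands in for whatever dma_get_required_mask() returned, and set64_ok stands in for whether setting a 64-bit mask succeeded.

```c
#include <stdint.h>

/*
 * Hypothetical decision helper.  Returns the DMA mask width (in bits)
 * to use: 64 only when a 64-bit mask can be set AND the required mask
 * has bits set in its upper dword; otherwise fall back to 32, matching
 * the "defaulting to 32bit mask/consistent-mask" message in the log.
 */
int choose_dma_mask_bits(uint64_t required_mask, int set64_ok)
{
	uint32_t high_dword = (uint32_t)(required_mask >> 32);

	if (set64_ok && high_dword != 0)
		return 64;

	return 32;
}
```

With a required mask that fits in 32 bits, the sketch picks the 32-bit mask even on 64-bit-capable hardware, which would explain why mappings end up going through swiotlb in the reported configuration.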
Re: [PATCH] quiet down swiotlb warnings
On Fri, 01 Jun 2007, Andi Kleen wrote:

> Rik van Riel <[EMAIL PROTECTED]> writes:
>
> > It turns out that the qla2xxx driver sometimes fills up the iotlb
> > on purpose and throttles itself when pci_map_sg() fails.  In the
> > case of a driver that expects and handles pci_map_sg() failures,
> > we should not spam the user's console with swiotlb full messages.
>
> Why does it do that? Could we supply a better interface
> for whatever it is trying to do here?

The driver only calls pci_map_sg() once it has ensured that all local
driver resources are available to submit an I/O to the hardware.

> > -	printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
> > -	       "device %s\n", size, dev ? dev->bus_id : "?");
> > +	if (++warnings < 5)
> > +		printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
> > +		       "device %s\n", size, dev ? dev->bus_id : "?");
>
> Bad idea imho. swiotlb mappings should always lead to printk by default
> because it is pretty dangerous.

Why?  It's just another resource which is consumed -- the qla2xxx
driver is the final consumer before I/O is submitted out on the wire.
The mappings are held for the shortest time required -- as such, they
are released as soon as the I/O completes.

> One possible solution for this I could think of would be to define a
> new pci_map_sg_couldfail() or similar that doesn't warn and use a weak
> fallback just calling pci_map_sg on other IOMMU implementations.
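As a side note on the quoted `if (++warnings < 5)` hunk: assuming the counter starts at zero, that condition prints four messages, and the counter keeps incrementing on every later failure. A minimal userspace sketch of a capped warning that prints exactly the configured number of times and then stops counting might look like the following. `warn_capped()` and `MAX_WARNINGS` are names made up for this example; they are not from the swiotlb code.

```c
#include <stdio.h>

#define MAX_WARNINGS 5	/* cap chosen for this sketch only */

static int warnings;	/* number of messages printed so far */

/*
 * Print msg to stderr unless the cap has been reached.
 * Returns 1 if the message was printed, 0 if it was suppressed.
 * The counter stops at MAX_WARNINGS, so it cannot overflow.
 */
static int warn_capped(const char *msg)
{
	if (warnings >= MAX_WARNINGS)
		return 0;
	warnings++;
	fprintf(stderr, "%s\n", msg);
	return 1;
}
```

This only models the counting; it does not address the objection in the thread that suppressing the message hides failures from drivers that do not check return values.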
Re: [PATCH 0/2] PCI-X/PCI-Express read control interfaces
On Tue, 15 May 2007, Andrew Morton wrote:

> On Tue, 15 May 2007 13:50:27 +0200
> "Peter Oruba" <[EMAIL PROTECTED]> wrote:
>
> > This patch set introduces a PCI-X / PCI-Express read byte count control
> > interface. Instead of letting every driver to directly read/write to PCI
> > config space for that, an interface is provided. The interface functions
> > then can be used for quirks since some PCI bridges require that read byte
> > count values are set by the BIOS and left unchanged by device drivers.
>
> Some of the patches were wordwrapped, which I fixed.
>
> The way we would merge a feature like this is
>
> - get maintainers to review-and-ack the change

This is definitely good cleanup, and I ACK the QLogic changes.  I do
though have some questions on call prerequisites given the
driver-changes, most in the form of:

> diff -uprN -X linux-2.6.22-rc1.orig/Documentation/dontdiff linux-2.6.22-rc1.orig/drivers/infiniband/hw/mthca/mthca_main.c linux-2.6.22-rc1/drivers/infiniband/hw/mthca/mthca_main.c
> --- linux-2.6.22-rc1.orig/drivers/infiniband/hw/mthca/mthca_main.c	2007-05-14 11:29:29.358547000 +0200
> +++ linux-2.6.22-rc1/drivers/infiniband/hw/mthca/mthca_main.c	2007-05-15 10:55:24.954074000 +0200
> @@ -137,45 +137,27 @@ static const char mthca_version[] __devi
>
>  static int mthca_tune_pci(struct mthca_dev *mdev)
>  {
> -	int cap;
> -	u16 val;
> -
>  	if (!tune_pci)
>  		return 0;
>
>  	/* First try to max out Read Byte Count */
> -	cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_PCIX);
> -	if (cap) {
> -		if (pci_read_config_word(mdev->pdev, cap + PCI_X_CMD, &val)) {
> -			mthca_err(mdev, "Couldn't read PCI-X command register, "
> -				  "aborting.\n");
> -			return -ENODEV;
> -		}
> -		val = (val & ~PCI_X_CMD_MAX_READ) | (3 << 2);
> -		if (pci_write_config_word(mdev->pdev, cap + PCI_X_CMD, val)) {
> -			mthca_err(mdev, "Couldn't write PCI-X command register, "
> -				  "aborting.\n");
> +	if (pci_find_capability(mdev->pdev, PCI_CAP_ID_PCIX)) {
> +		if (pcix_set_mmrbc(mdev->pdev, pcix_get_max_mmrbc(mdev->pdev))) {
> +			mthca_err(mdev, "Couldn't set PCI-X max read count, "
> +				"aborting.\n");
...
> -	cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_EXP);
> -	if (cap) {
> -		if (pci_read_config_word(mdev->pdev, cap + PCI_EXP_DEVCTL, &val)) {
> -			mthca_err(mdev, "Couldn't read PCI Express device control "
> -				  "register, aborting.\n");
> -			return -ENODEV;
> -		}
> -		val = (val & ~PCI_EXP_DEVCTL_READRQ) | (5 << 12);
> -		if (pci_write_config_word(mdev->pdev, cap + PCI_EXP_DEVCTL, val)) {
> -			mthca_err(mdev, "Couldn't write PCI Express device control "
> -				  "register, aborting.\n");
> +	if (pci_find_capability(mdev->pdev, PCI_CAP_ID_EXP)) {
> +		if (pcie_set_readrq(mdev->pdev, 4096)) {
> +			mthca_err(mdev, "Couldn't write PCI Express read request, "
> +				"aborting.\n");

In general, the drivers do set-mmrbc()/readrq() only if the PCI-[Xe]
capability structure exists, yet each of those pre-condition checks is
already present in pcix_set_mmrbc() and pcie_set_readrq().  At least
for the qla2xxx case, the patch could easily distill down from:

	...
	/* PCIe -- adjust Maximum Read Request Size (2048). */
	pcie_dctl_reg = pci_find_capability(ha->pdev, PCI_CAP_ID_EXP);
	if (pcie_dctl_reg)
		if (pcie_set_readrq(ha->pdev, 2048))
			DEBUG2(printk("Couldn't write PCI Express read request\n"));

to:

	...
	pcie_set_readrq(ha->pdev, 2048);

Whatever the decision, I can fold this into my next patchset for
qla2xxx and submit.
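The magic numbers in the removed lines above, `(3 << 2)` for PCI-X and `(5 << 12)` for PCI Express, are the register encodings of a 4096-byte read size. The userspace sketch below shows how a byte count maps to those field values. The helper names are hypothetical and are not the kernel's pcix_set_mmrbc()/pcie_set_readrq(); the field positions are the ones used in the quoted diff (bits 3:2 of the PCI-X command register, bits 14:12 of the PCIe Device Control register).

```c
/*
 * PCIe Max_Read_Request_Size: field value n means 128 << n bytes,
 * valid sizes are powers of two from 128 to 4096.
 * Returns the field shifted into bits 14:12, or -1 for an invalid size.
 */
static int pcie_readrq_field(unsigned int bytes)
{
	unsigned int n = 0;

	if (bytes < 128 || bytes > 4096 || (bytes & (bytes - 1)))
		return -1;
	while ((128u << n) != bytes)
		n++;
	return (int)(n << 12);
}

/*
 * PCI-X Maximum Memory Read Byte Count: field value n means
 * 512 << n bytes, valid sizes are 512, 1024, 2048 and 4096.
 * Returns the field shifted into bits 3:2, or -1 for an invalid size.
 */
static int pcix_mmrbc_field(unsigned int bytes)
{
	unsigned int n = 0;

	if (bytes < 512 || bytes > 4096 || (bytes & (bytes - 1)))
		return -1;
	while ((512u << n) != bytes)
		n++;
	return (int)(n << 2);
}
```

Under this encoding the qla2xxx value of 2048 corresponds to a PCIe field value of 4, one step below the 4096 (field value 5) that the mthca code requests.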
Re: Major qla2xxx regression on sparc64
On Wed, 18 Apr 2007, David Miller wrote:

> > On Wed, 18 Apr 2007, Christoph Hellwig wrote:
> >
> > > I don't think a module option is a good idea at this point.  The problem
> > > is you broke some so far perfectly working setups, which is not okay.
> > > The only first step can be printing a really big warning.  After this
> > > has been in for a while (at least half a year) we can make it a non-default
> > > option or turn it off completely in case the warning never triggered in
> > > practice.
> > >
> > > The only reasonable thing for 2.6.21 is to put in David's patch, possibly
> > > with an even more drastic warning when the rom is invalid and there's
> > > no prom-fallback available.
> > >
> > > Note that I expect Sun put in the invalid ROM intentionally, as we have
> > > similar cases with other cards that have totally messed up ROMs in
> > > Sun-branded versions.  Personally I think that's an utterly bad decision
> > > from Sun's side, but we'll have to live with this.
> >
> > Fine.  I'll rework an alternate patch for the 2.6.22 timeframe...
>
> We need to fix things now for 2.6.21 and the 2.6.x -stable branches
> because users have unusable systems currently.

Yes, and I'm fine with the original patch you provided which reverts
the change and adds the firmware-upcalls to retrieve the wwpn/wwnn.

> If it's just a time issue I can work on and push the patch, especially
> since I have the means to test things here.

I'll start with the final 2.6.21 -- and modify it to add the *flashing*
light warning and some additional bits based on other archs I can test
with embedded ISPs.  Thanks again for the SPARC tips.

Regards,
Andrew Vasquez
Re: Major qla2xxx regression on sparc64
On Wed, 18 Apr 2007, Christoph Hellwig wrote:

> I don't think a module option is a good idea at this point.  The problem
> is you broke some so far perfectly working setups, which is not okay.
> The only first step can be printing a really big warning.  After this
> has been in for a while (at least half a year) we can make it a non-default
> option or turn it off completely in case the warning never triggered in
> practice.
>
> The only reasonable thing for 2.6.21 is to put in David's patch, possibly
> with an even more drastic warning when the rom is invalid and there's
> no prom-fallback available.
>
> Note that I expect Sun put in the invalid ROM intentionally, as we have
> similar cases with other cards that have totally messed up ROMs in
> Sun-branded versions.  Personally I think that's an utterly bad decision
> from Sun's side, but we'll have to live with this.

Fine.  I'll rework an alternate patch for the 2.6.22 timeframe...
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:47:05 -0700
>
> > Dave, according to your earlier emails, the qla2xxx driver worked
> > 'fine' in driver versions before commit
> > 7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
> > you would have seen the warning messages:
> >
> > 	...
> > 	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
> > 	    "invalid -- WWPN) defaults.\n");
>
> I have in fact seen the message several times and that message gives
> me no reason to believe something needs to be fixed.
>
> It should have said "PLEASE REPORT THIS to [EMAIL PROTECTED]" or
> something similar to indicate the severity better.
>
> "An invalid WWPN, what's that?" said the user. :)
>
> How about "FC IDs may conflict and cause miscommunication!  Please
> report to driver author so this can be fixed!" or similar?

That verbiage sounds fine -- so would you consider the previous patch
I submitted (with module parameter) along with the wording above?

I'm in transit for a redeye to NY so I won't be able to modify the
patch.  If you would be amenable to the above, Seokmann, could you
rework the patch?
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:28:51 -0700
>
> > Sorry, but let's be realistic, this type of warning would have
> > *NEVER* been addressed if we kept the status quo
>
> Wrong.  I watch the logs all the time and would have sent you a fix to
> use the Sparc firmware info as soon as I saw the kernel log message.

Dave, according to your earlier emails, the qla2xxx driver worked
'fine' in driver versions before commit
7aef45ac92f49e76d990b51b7ecd714b9a608be1.  If that were the case, then
you would have seen the warning messages:

	...
	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
	    "invalid -- WWPN) defaults.\n");

> Anyone who has worked with me over the last 15 years will let you know
> emphatically that this is true.
>
> AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT!

I understand that, and recognize your contribution, that was never in
question.
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 15:25:17 -0700
>
> > Fine, I'll agree that wacking-users (and
> > I'll wager the outliers) with a 2x4 was a bit extreme,
>
> And that, right there, is basically the end of the conversation.
>
> You don't do this to users, ever.
> Put a big loud kernel log message in there when this situation
> presents itself, use as many capital letters and scary language that
> you wish.  Let them know that if things explode they get to keep the
> pieces.
>
> But at least try to give them something that works when you know that
> you can.
>
> You don't need to make someone's system unbootable in order to make
> them aware of a potential problem.  It's very anti-social to approach

Sorry, but let's be realistic, this type of warning would have
*NEVER* been addressed if we kept the status quo -- your modifications
to read the wwpn/wwnn would have never been submitted, everybody would
have kept going on blissfully ignorant of the issue.  Changes such as
these are a common Linux upstream idiom...

So, meeting in the middle, with the NVRAM bits restored along with
some ability for the user to *knowingly* recognize the problem, I take
it, is not going to work for you?
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 14:10:49 -0700
>
> > Ok, how about the following patch based on the one you posted which
> > adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and
> > also adds the module-parameter override I mentioned above.
> >
> > Perhaps the module-parameter should be set to non-zero in the case of
> > SPARC, to take care of your system configurations?
>
> I think it should default to non-zero always, in fact the option
> is completely pointless.
>
> The guy who hits this had a system which worked previously, and you're
> explicitly breaking it.  That's wrong.

Sorry, 'it' didn't work...  'It' *never* did.

> How can you not see that this quality of implementation decision
> you're making stinks?

You're defending a position which itself left users with a false sense
of security and comfort.  This is a *real* problem from an enterprise
perspective where FC reigns.  Fine, I'll agree that wacking-users (and
I'll wager the outliers) with a 2x4 was a bit extreme, but I'd much
rather handle those users on a case-by-case basis, either by:

* If dealing with a PCI card, directing a user to a support staff at
  QLogic to resolve the NVRAM issues.

* If it's some on-board ISP with no NVRAM, as was your SPARC case,
  then add *proper* codes to retrieve the data from some secondary
  persistent store.
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, Andrew Vasquez wrote:

> On Mon, 16 Apr 2007, David Miller wrote:
> >
> > They DON'T
> > CARE, they want their systems to work and if you don't give them that
> > you're not being a good driver maintainer.
>
> Let's push aside attitudes and unrealistic statistics, could we
> perhaps agree to re-add the use of doctored NVRAM (and thus
> non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a
> module-parameter (which defaults to 0) which indicates the user
> *really* knows what she is doing and recognizes WWPN collisions may
> occur -- a non-zero parameter value indicates doctored values will
> be used, a zero value (the default) fails initialization.  In both
> cases a big FAT warning is issued.
>
> > You BROKE things, therefore you must FIX it.
> >
> > Now I'm happy to code up the sparc OFW property bits but your attitude
> > and perspective on this absolutely has to change and the old fallback
> > code still has to go back in there, possible FC ID collisions or not.
>
> That would be great, I'd like to ensure the balance is maintained for
> *all* our users.

Ok, how about the following patch based on the one you posted which
adds the codes to retrieve the WWPN/WWNN from firmware on SPARC, and
also adds the module-parameter override I mentioned above.

Perhaps the module-parameter should be set to non-zero in the case of
SPARC, to take care of your system configurations?

Regards,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index 74544ae..b26090d 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -62,6 +62,7 @@ extern int ql2xfdmienable;
 extern int ql2xallocfwdump;
 extern int ql2xextended_error_logging;
 extern int ql2xqfullrampup;
+extern int ql2xoverrideinvalidnvram;

 extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *);

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 98c01cd..fa5df97 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -11,6 +11,11 @@

 #include "qla_devtbl.h"

+#ifdef CONFIG_SPARC
+#include
+#include
+#endif
+
 /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */
 #ifndef EXT_IS_LUN_BIT_SET
 #define EXT_IS_LUN_BIT_SET(P,L) \
@@ -1393,6 +1398,42 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de
 	}
 }

+/* On sparc systems, obtain port and node WWN from firmware
+ * properties.
+ */
+static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv)
+{
+#ifdef CONFIG_SPARC
+	struct pci_dev *pdev = ha->pdev;
+	struct pcidev_cookie *pcp = pdev->sysdata;
+	struct device_node *dp = pcp->prom_node;
+	u8 *val;
+	int len;
+
+	val = of_get_property(dp, "port-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->port_name, val, WWN_SIZE);
+
+	val = of_get_property(dp, "node-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->node_name, val, WWN_SIZE);
+#endif
+}
+
+static inline int
+qla2x00_override_invalid_nvram(scsi_qla_host_t *ha)
+{
+	if (!ql2xoverrideinvalidnvram) {
+		qla_printk(KERN_WARNING, ha,
+		    "Reload the driver with the ql2xoverrideinvalidnvram \n");
+		qla_printk(KERN_WARNING, ha,
+		    " module parameter set to a non-zero value to ignore \n");
+		qla_printk(KERN_WARNING, ha,
+		    " this warning.\n");
+	}
+	return ql2xoverrideinvalidnvram;
+}
+
 /*
  * NVRAM configuration for ISP 2xxx
  *
@@ -1440,7 +1481,57 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
 		qla_printk(KERN_WARNING, ha, "Inconsistent NVRAM detected: "
 		    "checksum=0x%x id=%c version=0x%x.\n", chksum, nv->id[0],
 		    nv->nvram_version);
-		return QLA_FUNCTION_FAILED;
+		if (!qla2x00_override_invalid_nvram(ha))
+			return QLA_FUNCTION_FAILED;
+		qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
+		    "invalid -- WWPN) defaults.\n");
+
+		/*
+		 * Set default initialization control block.
+		 */
+		memset(nv, 0, ha->nvram_size);
+		nv->parameter_block_version = ICB_VERSION;
+
+		if (IS_QLA23XX(ha)) {
+			nv->firmware_options[0] = BIT_2 | BIT_1;
+			nv->firmware_options[1] = BIT_7 | BIT_5;
+			nv->add_firmware_options[0] = BIT_5;
+			nv->add_firmware_options[1] = BIT_5 | BIT_4;
+			nv->frame_payload_size = __
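The property handling in qla2xxx_nvram_wwn_from_ofw() above follows a simple pattern: only copy a WWN when the firmware property exists and is long enough, and otherwise leave the destination alone. A standalone userspace sketch of that pattern is given below. `copy_wwn()` is a hypothetical helper written for this illustration, the byte values in any caller would be made up, and no Open Firmware API is involved.

```c
#include <stddef.h>
#include <string.h>

#define WWN_SIZE 8	/* a Fibre Channel world wide name is 8 bytes */

/*
 * Copy a WWN out of a firmware-provided property buffer into storage
 * owned by the caller.  Returns 1 if the copy happened, 0 if the
 * property is missing or too short, in which case dst is untouched.
 */
static int copy_wwn(unsigned char dst[WWN_SIZE],
		    const unsigned char *prop, int len)
{
	if (prop == NULL || len < WWN_SIZE)
		return 0;
	memcpy(dst, prop, WWN_SIZE);
	return 1;
}
```

Returning a status, unlike the void function in the patch, would let a caller tell "WWN taken from firmware" apart from "nothing found, still using whatever was there before"; whether the driver needs that distinction is a design choice this sketch does not settle.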
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 09:37:12 -0700
>
> > On Mon, 16 Apr 2007, David Miller wrote:
> >
> > > But even if that fails, I think the fallback code should be put back,
> > > since it obviously was used by at least one system and it's probable
> > > that there are some other applications of using this qla2xxx chip that
> > > will have an empty NVRAM too.
> >
> > Then they should really get their NVRAM corrected, if in fact their
> > NVRAMs are cleared.
> >
> > > I can understand the apprehension in using a fixed port_name[] value,
> > > since it could conflict with other FC controllers on the mesh, but if
> > > that is so important just choose some random value that is a valid FC
> > > ID or use some characteristic ID that can be used to compose part of
> > > the port WWN in order to give it at least some uniqueness.
> >
> > Look, there's a fine balance here that we must strike -- the solution
> > that you're proposing implies that there's some 'random' bit-space
> > within the IEEE NAA with which one can safely encode without stomping
> > on any valid OUI.
>
> The fact is that your driver was significantly more robust
> previously, and now it's so less robust that it now fails for
> people.
>
> That's totally unacceptable.
>
> Just like the sparc64 systems, others depending upon this fallback
> behavior the qla2xxx driver had are going to break and they are not
> going to be able to just go and fix their hardware and re-flash the
> NVRAM.
>
> Every user on the planet is going to be 1,000 times more happy with a
> big fat warning in their kernel log saying that things might not go
> right, but the driver is going to try anyways, rather than a complete
> non-attempt to make things work.
>
> You replaced a possible failure with a guaranteed one.
>
> %99.999 of people are never going to run into a FC ID collision.
> They have an onboard FC controller and a disk or two.

Sorry, but in a SATA/SCSI environment that may be true, but in the
case of FC that expectation is unrealistic.  There are thousands of FC
installations where there are several thousand endpoints (including
initiators and targets) all interconnected.  Let's use your case --
just connect two sparc machines within the same fabric to your
storage, with the old code, there's still a problem.

> They DON'T
> CARE, they want their systems to work and if you don't give them that
> you're not being a good driver maintainer.

Let's push aside attitudes and unrealistic statistics, could we
perhaps agree to re-add the use of doctored NVRAM (and thus
non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a
module-parameter (which defaults to 0) which indicates the user
*really* knows what she is doing and recognizes WWPN collisions may
occur -- a non-zero parameter value indicates doctored values will
be used, a zero value (the default) fails initialization.  In both
cases a big FAT warning is issued.

> You BROKE things, therefore you must FIX it.
>
> Now I'm happy to code up the sparc OFW property bits but your attitude
> and perspective on this absolutely has to change and the old fallback
> code still has to go back in there, possible FC ID collisions or not.

That would be great, I'd like to ensure the balance is maintained for
*all* our users.
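The module-parameter proposal in the message above boils down to a small decision table: valid NVRAM is used as-is, invalid NVRAM either falls back to doctored defaults or fails initialization depending on the parameter, and a warning is issued whenever the NVRAM is invalid. A userspace sketch of that table follows; the enum, the function names and the plain `int` flags are all invented for this illustration and do not come from the driver.

```c
/* Possible outcomes of NVRAM validation, as proposed in the thread. */
enum nvram_action {
	NVRAM_USE_AS_IS,	/* NVRAM contents are valid */
	NVRAM_USE_DEFAULTS,	/* invalid NVRAM, user opted in to doctored values */
	NVRAM_FAIL_INIT,	/* invalid NVRAM, no opt-in (the proposed default) */
};

/*
 * nvram_valid:    non-zero if the checksum/id/version checks passed.
 * override_param: value of the proposed module parameter (default 0).
 */
static enum nvram_action nvram_policy(int nvram_valid, int override_param)
{
	if (nvram_valid)
		return NVRAM_USE_AS_IS;
	return override_param ? NVRAM_USE_DEFAULTS : NVRAM_FAIL_INIT;
}

/* A warning is issued for invalid NVRAM regardless of the parameter. */
static int nvram_should_warn(int nvram_valid)
{
	return !nvram_valid;
}
```

The disagreement in the thread is only about the default: the proposal above makes NVRAM_FAIL_INIT the default for invalid NVRAM, while the objection is that NVRAM_USE_DEFAULTS should be what users get without having to set anything.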
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> Sparc64 systems which have an on-board qla2xxx chip (such as
> SunBlade-1000 and SunBlade-2000, there are probably some other systems
> like this too) do not have any NVRAM information present, in fact the
> NVRAM is basically all 0's from what I can tell.
>
> This always worked just fine since the code would previously just use
> a bunch of defaults when an inconsistent NVRAM was detected.
>
> But the changeset below at the end of this email broke this and now
> I'm seeing bug reports from sparc64 users and I was just able to
> reproduce the problem myself just today as well. I verified that
> reverting the patch below gets things working again.
>
> Emanuele, you can feed the patch below to "patch -p1 -R" to get that
> working again so we can move on to the other sparc64 bug we're looking
> into :-)

I sent Emanuele the attached patch during the weekend...

> The failure mode isn't nice, it actually ends up crashing with an OOPS
> in qla2xxx_init_host_attr() because ha->node_name is NULL, it's
> supposed to be initialized by functions like qla2x00_nvram_config()

No, it's not very nice...

> Can we revert the patch below or do something similar to get things
> working again on sparc64?
>
> The most important thing which qla2x00_nvram_config() seems to want to
> get is the WWN port_name and node_name. These are provided in the OFW
> device tree so we could pluck them out of there with something like:
>
> #ifdef CONFIG_SPARC
> #include <asm/prom.h>
> #include <asm/pbm.h>
> #endif
>
> ...
>
> #ifdef CONFIG_SPARC
> 	struct pcidev_cookie *pcp = pdev->sysdata;
> 	u8 *port_name, *node_name;
>
> 	port_name = of_get_property(pcp->prom_node, "port-wwn", NULL);
> 	node_name = of_get_property(pcp->prom_node, "node-wwn", NULL);
> #endif
>
> Those will hold a pointer to the property values or NULL if the
> property does not exist. This is private data, so you should make
> copies of them into your local data structure and not use references
> to them.
>
> I don't see any OFW properties present that could be used to fill in
> the rest of the NVRAM parameters, so we'd need to use the defaults
> that the code before the change was using.

I'd be more inclined to do something like the above, rather than:

> But even if that fails, I think the fallback code should be put back,
> since it obviously was used by at least one system and it's probable
> that there are some other applications of using this qla2xxx chip that
> will have an empty NVRAM too.

Then they should really get their NVRAM corrected, if in fact their
NVRAMs are cleared.

> I can understand the apprehension in using a fixed port_name[] value,
> since it could conflict with other FC controllers on the mesh, but if
> that is so important just choose some random value that is a valid FC
> ID or use some characteristic ID that can be used to compose part of
> the port WWN in order to give it at least some uniqueness.

Look, there's a fine balance here that we must strike -- the solution
that you're proposing implies that there's some 'random' bit-space
within the IEEE NAA with which one can safely encode without stomping
on any valid OUI.

From 9ee6de3bbaa03390b83226e7bb84c49566a583b3 Mon Sep 17 00:00:00 2001
From: Andrew Vasquez <[EMAIL PROTECTED]>
Date: Wed, 11 Apr 2007 16:02:06 -0700
Subject: [PATCH] qla2xxx: Error-out during probe() if we're unable to complete HBA initialization.

Remove a stale check against ha->device_flags (DFLG_NO_CABLE) as
topology scanning is performed within the DPC-thread context.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_os.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index b78919a..0a36912 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1577,9 +1577,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto probe_failed;
 	}
 
-	if (qla2x00_initialize_adapter(ha) &&
-	    !(ha->device_flags & DFLG_NO_CABLE)) {
-
+	if (qla2x00_initialize_adapter(ha)) {
 		qla_printk(KERN_WARNING, ha,
 		    "Failed to initialize adapter\n");
 
-- 
1.5.1.1.107.g7a159
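To make the NAA/OUI point above concrete, here is a minimal userspace
sketch that pulls the NAA type and the IEEE OUI out of an 8-byte WWN. The
helper names are hypothetical, and only the common IEEE formats (NAA 1, 2
and 5) are handled. The layout follows the Fibre Channel name formats as
generally documented; it is not taken from the driver.

```c
#include <stdint.h>

#define WWN_SIZE 8

/* The NAA (Network Address Authority) type is the top 4 bits of byte 0. */
static unsigned int wwn_naa(const uint8_t wwn[WWN_SIZE])
{
	return wwn[0] >> 4;
}

/*
 * Extract the 24-bit IEEE OUI from a WWN.
 * Returns 0 and stores the OUI on success, or -1 for NAA types this
 * sketch does not handle (including an all-zero WWN, which has NAA 0).
 */
static int wwn_oui(const uint8_t wwn[WWN_SIZE], uint32_t *oui)
{
	switch (wwn_naa(wwn)) {
	case 1:	/* IEEE 48-bit: 12 zero bits, then a 48-bit address */
	case 2:	/* IEEE extended: 12 vendor bits, then a 48-bit address */
		*oui = ((uint32_t)wwn[2] << 16) |
		       ((uint32_t)wwn[3] << 8) |
		       wwn[4];
		return 0;
	case 5:	/* IEEE registered: the OUI directly follows the NAA nibble */
		*oui = ((uint32_t)(wwn[0] & 0x0f) << 20) |
		       ((uint32_t)wwn[1] << 12) |
		       ((uint32_t)wwn[2] << 4) |
		       (wwn[3] >> 4);
		return 0;
	default:
		return -1;
	}
}
```

Because every valid name embeds a registered OUI, a value picked at
random either lands in some vendor's space or is not a valid name at all.
That appears to be the concern raised above.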
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, Andrew Vasquez wrote:

> On Mon, 16 Apr 2007, David Miller wrote:
>
> > They DON'T
> > CARE, they want their systems to work and if you don't give them that
> > you're not being a good driver maintainer.
>
> Let's push aside attitudes and unrealistic statistics, could we
> perhaps agree to re-add the use of doctored NVRAM (and thus
> non-random WWPN/WWNN) when NVRAM is corrupted or non-present with a
> module-parameter (which defaults to 0) which indicates the user
> *really* knows what she is doing and recognizes WWPN collisions may
> occur -- non-zero the parameter value indicates doctored values will
> be used, zero value (the default) fails initialization. In both
> cases a big FAT warning is issued.
>
> > You BROKE things, therefore you must FIX it.
> >
> > Now I'm happy to code up the sparc OFW property bits but your attitude
> > and perspective on this absolutely has to change and the old fallback
> > code still has to go back in there, possible FC ID collisions or not.
>
> That would be great, I'd like to ensure the balance is maintained for
> *all* our users.

Ok, how about the following patch based on the one you posted which
adds the code to retrieve the WWPN/WWNN from firmware on SPARC, and
also adds the module-parameter override I mentioned above. Perhaps
the module-parameter should be set to non-zero in the case of SPARC,
to take care of your system configurations?

Regards,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index 74544ae..b26090d 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -62,6 +62,7 @@ extern int ql2xfdmienable;
 extern int ql2xallocfwdump;
 extern int ql2xextended_error_logging;
 extern int ql2xqfullrampup;
+extern int ql2xoverrideinvalidnvram;
 
 extern void qla2x00_sp_compl(scsi_qla_host_t *, srb_t *);
 
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 98c01cd..fa5df97 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -11,6 +11,11 @@
 
 #include "qla_devtbl.h"
 
+#ifdef CONFIG_SPARC
+#include <asm/prom.h>
+#include <asm/pbm.h>
+#endif
+
 /* XXX(hch): this is ugly, but we don't want to pull in exioctl.h */
 #ifndef EXT_IS_LUN_BIT_SET
 #define EXT_IS_LUN_BIT_SET(P,L) \
@@ -1393,6 +1398,42 @@ qla2x00_set_model_info(scsi_qla_host_t *ha, uint8_t *model, size_t len, char *de
 	}
 }
 
+/* On sparc systems, obtain port and node WWN from firmware
+ * properties.
+ */
+static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t *ha, nvram_t *nv)
+{
+#ifdef CONFIG_SPARC
+	struct pci_dev *pdev = ha->pdev;
+	struct pcidev_cookie *pcp = pdev->sysdata;
+	struct device_node *dp = pcp->prom_node;
+	u8 *val;
+	int len;
+
+	val = of_get_property(dp, "port-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->port_name, val, WWN_SIZE);
+
+	val = of_get_property(dp, "node-wwn", &len);
+	if (val && len >= WWN_SIZE)
+		memcpy(nv->node_name, val, WWN_SIZE);
+#endif
+}
+
+static inline int
+qla2x00_override_invalid_nvram(scsi_qla_host_t *ha)
+{
+	if (!ql2xoverrideinvalidnvram) {
+		qla_printk(KERN_WARNING, ha,
+		    "Reload the driver with the ql2xoverrideinvalidnvram \n");
+		qla_printk(KERN_WARNING, ha,
+"module parameter set to a non-zero value to ignore \n");
+		qla_printk(KERN_WARNING, ha,
+"this warning.\n");
+	}
+	return ql2xoverrideinvalidnvram;
+}
+
 /*
  * NVRAM configuration for ISP 2xxx
  *
@@ -1440,7 +1481,57 @@ qla2x00_nvram_config(scsi_qla_host_t *ha)
 		qla_printk(KERN_WARNING, ha,
 		    "Inconsistent NVRAM detected: checksum=0x%x id=%c version=0x%x.\n",
 		    chksum, nv->id[0], nv->nvram_version);
-		return QLA_FUNCTION_FAILED;
+		if (!qla2x00_override_invalid_nvram(ha))
+			return QLA_FUNCTION_FAILED;
+		qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
+		    "invalid -- WWPN) defaults.\n");
+
+		/*
+		 * Set default initialization control block.
+		 */
+		memset(nv, 0, ha->nvram_size);
+		nv->parameter_block_version = ICB_VERSION;
+
+		if (IS_QLA23XX(ha)) {
+			nv->firmware_options[0] = BIT_2 | BIT_1;
+			nv->firmware_options[1] = BIT_7 | BIT_5;
+			nv->add_firmware_options[0] = BIT_5;
+			nv->add_firmware_options[1] = BIT_5 | BIT_4;
+			nv->frame_payload_size = __constant_cpu_to_le16(2048);
+			nv->special_options[1] = BIT_7;
+		} else if (IS_QLA2200(ha)) {
+			nv->firmware_options[0] = BIT_2 | BIT_1;
+			nv->firmware_options[1] = BIT_7 | BIT_5;
+			nv->add_firmware_options[0
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 14:10:49 -0700
>
> > Ok, how about the following patch based on the one you posted which
> > adds the code to retrieve the WWPN/WWNN from firmware on SPARC, and
> > also adds the module-parameter override I mentioned above. Perhaps
> > the module-parameter should be set to non-zero in the case of SPARC,
> > to take care of your system configurations?
>
> I think it should default to non-zero always, in fact the option is
> completely pointless. The guy who hits this had a system which
> worked previously, and you're explicitly breaking it. That's wrong.

Sorry, 'it' didn't work... 'It' *never* did.

> How can you not see that this quality of implementation decision
> you're making stinks?

You're defending a position which itself left users with a false
sense of security and comfort. This is a *real* problem from an
enterprise perspective where FC reigns.

Fine, I'll agree that whacking users (and I'll wager the outliers)
with a 2x4 was a bit extreme, but I'd much rather handle those users
on a case-by-case basis, either by:

* If dealing with a PCI card, directing a user to a support staff at
  QLogic to resolve the NVRAM issues.

* If it's some on-board ISP with no NVRAM, as was your SPARC case,
  then add *proper* code to retrieve the data from some secondary
  persistent store.
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 15:25:17 -0700
>
> > Fine, I'll agree that whacking users (and I'll wager the outliers)
> > with a 2x4 was a bit extreme,
>
> And that, right there, is basically the end of the conversation.
>
> You don't do this to users, ever.
>
> Put a big loud kernel log message in there when this situation
> presents itself, use as many capital letters and scary language that
> you wish. Let them know that if things explode they get to keep the
> pieces.
>
> But at least try to give them something that works when you know that
> you can.
>
> You don't need to make someone's system unbootable in order to make
> them aware of a potential problem. It's very anti-social to approach

Sorry, but let's be realistic, this type of warning would have
*NEVER* been addressed if we kept the status quo -- your
modifications to read the wwpn/wwnn would have never been submitted,
everybody would have kept going on blissfully ignorant of the issue.
Changes such as these are a common Linux upstream idiom...

So, meeting in the middle, with the NVRAM bits restored along with
some ability for the user to *knowingly* recognize the problem, I
take it, is not going to work for you?
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:28:51 -0700
>
> > Sorry, but let's be realistic, this type of warning would have
> > *NEVER* been addressed if we kept the status quo
>
> Wrong. I watch the logs all the time and would have sent you a fix
> to use the Sparc firmware info as soon as I saw the kernel log
> message.

Dave, according to your earlier emails, the qla2xxx driver worked
'fine' in driver versions before commit
7aef45ac92f49e76d990b51b7ecd714b9a608be1. If that were the case,
then you would have seen the warning messages:

	...
	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
	    "invalid -- WWPN) defaults.\n");

> Anyone who has worked with me over the last 15 years will let you
> know emphatically that this is true.
>
> AND IN THE MEAN TIME I COULD GET WORK DONE AND MY SYSTEM WOULD BOOT!

I understand that, and recognize your contribution, that was never in
question.
Re: Major qla2xxx regression on sparc64
On Mon, 16 Apr 2007, David Miller wrote:

> From: Andrew Vasquez <[EMAIL PROTECTED]>
> Date: Mon, 16 Apr 2007 16:47:05 -0700
>
> > Dave, according to your earlier emails, the qla2xxx driver worked
> > 'fine' in driver versions before commit
> > 7aef45ac92f49e76d990b51b7ecd714b9a608be1. If that were the case,
> > then you would have seen the warning messages:
> >
> > 	...
> > 	qla_printk(KERN_WARNING, ha, "Falling back to functioning (yet "
> > 	    "invalid -- WWPN) defaults.\n");
>
> I have in fact seen the message several times and that message gives
> me no reason to believe something needs to be fixed.
>
> It should have said "PLEASE REPORT THIS to [EMAIL PROTECTED]" or
> something similar to indicate the severity better.
>
> "An invalid WWPN, what's that?" said the user. :)
>
> How about "FC IDs may conflict and cause miscommunication! Please
> report to driver author so this can be fixed!" or similar?

That verbiage sounds fine -- so would you consider the previous patch
I submitted (with module parameter) along with the wording above?

I'm in transit for a redeye to NY so I won't be able to modify the
patch. If you would be amenable to the above, Seokmann, could you
rework the patch?
Re: qla2xxx BUG: workqueue leaked lock or atomic
On Tue, 27 Feb 2007, Andre Noll wrote:

> On 10:26, Andrew Vasquez wrote:
> > You are loading some stale firmware that's left over on the card --
> > I'm not even sure what 4.00.70 is, as the latest release firmware is
> > 4.00.27.
>
> That's the firmware which came with the card. Anyway, I just upgraded
> the firmware, but the bug remains. The backtrace differs a bit though
> as now the tg3 network driver seems to be involved as well.
>
> Thanks for your help
> Andre
...
> [ 68.532665] BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()
> [ 68.532784]
> [ 68.532785] Call Trace:
> [ 68.532979] <IRQ> [<8024b877>] trace_hardirqs_on+0xd7/0x180
> [ 68.533168] [<80511f5b>] _spin_unlock_irq+0x2b/0x40
> [ 68.533295] [<88032747>] :qla2xxx:qla2x00_process_completed_request+0x137/0x1d0
> [ 68.533457] [<88032862>] :qla2xxx:qla2x00_status_entry+0x82/0xa40
> [ 68.533577] [<8024b17f>] __lock_acquire+0xcdf/0xd90
> [ 68.533693] [<80511ff2>] _spin_unlock_irqrestore+0x42/0x60
> [ 68.533816] [<880343fe>] :qla2xxx:qla24xx_intr_handler+0x4e/0x2b0
> [ 68.533942] [<88033551>] :qla2xxx:qla24xx_process_response_queue+0xc1/0x1c0
> [ 68.534102] [<88034584>] :qla2xxx:qla24xx_intr_handler+0x1d4/0x2b0

Ok, since 2.6.20, there's been a patch added to qla2xxx which drops
the spin_unlock_irq() call while attempting to ramp-up the
queue-depth:

	commit befede3dabd204e9c546cbfbe391b29286c57da2
	Author: Seokmann Ju <[EMAIL PROTECTED]>
	Date:   Tue Jan 9 11:37:52 2007 -0800

	    [SCSI] qla2xxx: correct locking while call starget_for_each_device()

	    Removed spin_unlock_irq()/spin_lock_irq() pairs surrounding
	    starget_for_each_device() calls.
	    As Matthew W. pointed out, starget_for_each_device() can be called
	    under a spinlock being held.
	    The change has been tested and verified on qla2xxx.ko module.
	    Thanks Matthew W. and Hisashi H. for help.

	    Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
	    Signed-off-by: Seokmann Ju <[EMAIL PROTECTED]>
	    Signed-off-by: James Bottomley <[EMAIL PROTECTED]>

	http://marc.theaimsgroup.com/?l=linux-scsi&m=116837234904583&w=2

Could you try the latest 2.6.21-rc which contains the correction?

Regards,
Andrew Vasquez
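The trace above shows _spin_unlock_irq() being reached from inside the
interrupt handler path. As a rough illustration of why that pairing is
suspect, here is a toy userspace model of the interrupt-enable state. It is
not kernel code: every name is hypothetical and a single flag stands in for
the CPU state. It relies only on the documented behaviour that the _irq
unlock variant re-enables interrupts unconditionally, while the
irqsave/irqrestore pair puts back whatever state it found.

```c
#include <stdbool.h>

/* Toy model: one flag standing in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

static unsigned long model_lock_irqsave(void)
{
	unsigned long flags = irqs_enabled;	/* remember the prior state */

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(unsigned long flags)
{
	irqs_enabled = flags != 0;		/* put back what was saved */
}

static void model_lock_irq(void)
{
	irqs_enabled = false;
}

static void model_unlock_irq(void)
{
	irqs_enabled = true;			/* unconditional enable */
}

/*
 * Drop and retake the lock with the _irq variants around a callback,
 * the pattern the quoted commit message says was removed.  Returns the
 * interrupt state the callback would have observed.
 */
static bool run_with_lock_dropped(void)
{
	bool seen;

	model_unlock_irq();
	seen = irqs_enabled;
	model_lock_irq();
	return seen;
}
```

In this model, entering with interrupts already disabled (as in a handler
that took the lock with the irqsave variant) still leaves the callback
running with them enabled. That appears to be the condition lockdep
reports in trace_hardirqs_on().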
Re: qla2xxx BUG: workqueue leaked lock or atomic
On Mon, 26 Feb 2007, Andre Noll wrote:

> On linux-2.6.20.1, we're seeing hard lockups with 2 raid systems
> connected to a qla2xxx card and used as a single volume via lvm.
> The system seems to lock up only if data gets written to both raid
> systems at the same time.
>
> On a standard kernel nothing makes it to the log, the system just
> freezes. So we tried a lockdep kernel which reports two BUGs during
> boot, see below.
>
> Could this be related to our problem?

Before we proceed further, could you retrieve the latest firmware
release for 24xx type HBAs:

> [ 64.151096] QLogic Fibre Channel HBA Driver
> [ 64.151405] ACPI: PCI Interrupt :05:08.0[A] -> GSI 32 (level, low) -> IRQ 32
> [ 64.151821] qla2xxx :05:08.0: Found an ISP2422, irq 32, iobase 0xc2006000
> [ 64.152231] qla2xxx :05:08.0: Configuring PCI space...
> [ 64.152498] qla2xxx :05:08.0: Configure NVRAM parameters...
> [ 64.159088] qla2xxx :05:08.0: Verifying loaded RISC code...
> [ 74.169623] qla2xxx :05:08.0: Firmware image unavailable.
> [ 74.169737] qla2xxx :05:08.0: Firmware images can be retrieved from: ftp://ftp.qlogic.com/outgoing/linux/firmware/.
> [ 74.169902] qla2xxx :05:08.0: Attempting to load (potentially outdated) firmware from flash.
> [ 74.760935] qla2xxx :05:08.0: Allocated (64 KB) for EFT...
> [ 74.761186] qla2xxx :05:08.0: Allocated (1413 KB) for firmware dump...
> [ 74.776988] scsi0 : qla2xxx
> [ 74.961451] qla2xxx :05:08.0:
> [ 74.961452] QLogic Fibre Channel HBA Driver: 8.01.07-k4
> [ 74.961453] QLogic HP AE369-60001 - QLA2340
> [ 74.961454] ISP2422: PCI-X Mode 1 (133 MHz) @ :05:08.0 hdma+, host#=0, fw=4.00.70 [IP]

You are loading some stale firmware that's left over on the card --
I'm not even sure what 4.00.70 is, as the latest release firmware is
4.00.27. You can retrieve the image here:

	ftp://ftp.qlogic.com/outgoing/linux/firmware/ql2400_fw.bin

Let's start there... before we move on to this:

> [ 75.778656] BUG: at kernel/lockdep.c:1860 trace_hardirqs_on()
> [ 75.778771]
> [ 75.778772] Call Trace:
> [ 75.778967] <IRQ> [<8024b877>] trace_hardirqs_on+0xd7/0x180
> [ 75.779154] [<8052bc1b>] _spin_unlock_irq+0x2b/0x40
> [ 75.779271] [<804605d7>] qla2x00_process_completed_request+0x137/0x1d0
> [ 75.779424] [<804606f2>] qla2x00_status_entry+0x82/0xa40
> [ 75.779541] [<8024b17f>] __lock_acquire+0xcdf/0xd90
> [ 75.779657] [<8052bcb2>] _spin_unlock_irqrestore+0x42/0x60
> [ 75.779775] [<8046228e>] qla24xx_intr_handler+0x4e/0x2b0
> [ 75.779892] [<804613e1>] qla24xx_process_response_queue+0xc1/0x1c0
> [ 75.780012] [<80462414>] qla24xx_intr_handler+0x1d4/0x2b0
> [ 75.780131] [<8025e950>] handle_IRQ_event+0x20/0x60

Hmm

Regards,
Andrew Vasquez
Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
On Fri, 02 Feb 2007, Randy Dunlap wrote:

> On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote:
>
> > On Fri, 2 Feb 2007 12:56:30 -0800
> > Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >
> > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats
> > > > > limit=2m passes=100 pattern=iot dlimit=2048
> >
> > What is this mysterious dt command, btw?
>
> I expect that it's the one here:
> http://www.scsifaq.org/RMiller_Tools/index.html

yep, that's the one.
Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
On Thu, 01 Feb 2007, Andrew Morton wrote:

> On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> > Basically what is happening from the FC side is the initiator executes
> > a simple dt test:
> >
> > 	dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats
> > 	limit=2m passes=100 pattern=iot dlimit=2048
> >
> > against a single lun (a very basic Windows target mode driver).
> > During the test a port-enable, port-disable script is running against
> > the switch's port that is connected to the target (this occurs every
> > sixty seconds, for a disabled duration of 2 seconds). Additionally,
> > the target itself is set to LOGO (logout) or drop off the topology
> > every 30 seconds.
>
> I don't understand what effect the port-enable/port-disable has upon the
> system. Will it cause I/O errors, or what?

No I/O errors should make their way to the upper-layers (block/FS).
The system *should* be shielded from the fibre-channel fabric events.
I just wanted to explain what the (basic sanity) test did.

> > This test runs fine up to 2.6.19.
>
> One thing we did in there was to give direct-io-against-blockdevs some
> special-case bio-preparation code. Perhaps this is tickling a bug somehow.
>
> We can revert that change like this:
>
>
> diff -puN fs/block_dev.c~a fs/block_dev.c
> --- a/fs/block_dev.c~a
> +++ a/fs/block_dev.c
> @@ -196,8 +196,47 @@ static void blk_unget_page(struct page *
>  	pvec->page[--pvec->idx] = page;
>  }
>
> +static int
> +blkdev_get_blocks(struct inode *inode, sector_t iblock,
> +		struct buffer_head *bh, int create)
...

Hmm, with this patch we've noted two main differences:

1) I/O throughput with the basic 'dd' command used (above) is back to
   60MB/s, rather than the appalling 20-22 MB/s we were seeing with
   2.6.20-rcX.

2) No panics -- so far with 2+ hours of testing. With our vanilla
   system of 2.6.20-rc7, the test could trigger the panic within 15 to
   20 minutes.

We'll let this run over the weekend -- I'll certainly let you know if
anything has changed (failures).

--
Andrew Vasquez
[BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
addresses
git-bisect good 69688262fb94e92a32f188b79c02dc32016d4927
# bad: [5faad620264290b17e80a8b0996b039ea0d5ac73] Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block
git-bisect bad 5faad620264290b17e80a8b0996b039ea0d5ac73
# bad: [3161986224a3faa8ccca3e665b7404d81e7ee3cf] fbdev: remove references to non-existent fbmon_valid_timings()
git-bisect bad 3161986224a3faa8ccca3e665b7404d81e7ee3cf
# bad: [c954e2a5d1c9662991a41282297ddebcadee0578] knfsd: nfsd4: make verify and nverify wrappers
git-bisect bad c954e2a5d1c9662991a41282297ddebcadee0578
# bad: [021d3a72459191a76e8e482ee4937ba6bc9fd712] knfsd: nfsd4: handling more nfsd_cross_mnt errors in nfsd4 readdir
git-bisect bad 021d3a72459191a76e8e482ee4937ba6bc9fd712
# bad: [b797b5beac966df5c5d96c0d39fe366f57135343] knfsd: svcrpc: fix gss krb5i memory leak
git-bisect bad b797b5beac966df5c5d96c0d39fe366f57135343
# bad: [b21a323710e77a27b2f66af901bd3640c30aba6e] remove the broken BLK_DEV_SWIM_IOP driver
git-bisect bad b21a323710e77a27b2f66af901bd3640c30aba6e
# bad: [029530f810dd5147f7e59b939eb22cfbe0beea12] one more EXPORT_UNUSED_SYMBOL removal
git-bisect bad 029530f810dd5147f7e59b939eb22cfbe0beea12

Not sure how much help it is, but while trying to instrument
as-iosched.c, we can see that the failure has a fairly stable
signature:

[15280.813479] as_dispatch_request: ad=810038704da8 ; reads=0 ; writes=1 ; dir=0 ;fifo_list[async]=81003f4071a0 ad->new_batch=0 ad->change_batch=0.
[15280.827032] as_dispatch_request: q=81003fdae050 ; #_reqs=128 ; lmerge=81003f4071a0 ;.

which means:

	static int as_dispatch_request(request_queue_t *q, int force)
	{
		...
		const int writes = !list_empty(&ad->fifo_list[REQ_ASYNC]);

'writes' is set, the batch_data_dir has not changed
(ad->batch_data_dir != REQ_SYNC) in the following segment:

		...
		if (writes) {
	dispatch_writes:
			BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));

			if (ad->batch_data_dir == REQ_SYNC) {
				ad->changed_batch = 1;

				/*
				 * new_batch might be 1 when the queue runs out of
				 * reads. A subsequent submission of a write might
				 * cause a change of batch before the read is finished.
				 */
				ad->new_batch = 0;
			}
			ad->batch_data_dir = REQ_ASYNC;
			ad->current_write_count = ad->write_batch_count;
			ad->write_batch_idled = 0;
			rq = ad->next_rq[ad->batch_data_dir];
			goto dispatch_request;
		}

ad->next_rq[ad->batch_data_dir] is NULL, and is then passed down to
as_move_to_dispatch() where the first dereference of rq:

	static void as_move_to_dispatch(struct as_data *ad, struct request *rq)
	{
		const int data_dir = rq_is_sync(rq);

borks the machine.

What's odd (perhaps it's just our rudimentary understanding of AS) is
that there are segments of code where ad->next_rq[REQ_ASYNC] is
checked against NULL (in 'writes' case it is not).

Anyway, any ideas or hints? Attached is the .config used.

Thanks,
Andrew Vasquez

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.19
# Fri Jan 19 16:53:19 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
# CONFIG_IKCONFIG is not set
# CONFIG_CPUSETS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_KMOD is not set
CONFIG_STOP_MACHINE=y
#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
# CONFIG_MK8 is not set
CONFIG_MPSC=y
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=128
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_INTERNODE_CACHE_BYTES=128
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
# CO
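For readers following the report above, here is a small user-space
sketch of the failure mode it describes. It is not as-iosched.c and not
a proposed fix: struct as_data_model and dispatch_checked() are
hypothetical names made up for illustration, and the REQ_SYNC/REQ_ASYNC
values are assumed from the "dir=0" in the log. It only models the state
from the report (async FIFO non-empty, so 'writes' is true, while
next_rq[REQ_ASYNC] is NULL) and shows the kind of NULL check the
'writes' path does not have.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed values for this sketch only (the log above prints dir=0 for async). */
enum { REQ_ASYNC = 0, REQ_SYNC = 1 };

struct request {
	int sync;
};

/* Hypothetical, heavily reduced stand-in for the scheduler's per-queue data. */
struct as_data_model {
	int fifo_nonempty[2];        /* models !list_empty(&ad->fifo_list[dir]) */
	struct request *next_rq[2];  /* cached "next request" per direction */
	int batch_data_dir;
};

/*
 * Models the 'writes' branch from the report, but checks the cached
 * pointer before handing it on. Returns the request to dispatch, or
 * NULL when there is nothing safe to dispatch.
 */
static struct request *dispatch_checked(struct as_data_model *ad)
{
	const int writes = ad->fifo_nonempty[REQ_ASYNC];
	struct request *rq;

	if (!writes)
		return NULL;

	ad->batch_data_dir = REQ_ASYNC;
	rq = ad->next_rq[ad->batch_data_dir];
	if (rq == NULL) {
		/* This is the state the report hits; without the check the
		 * caller would dereference a NULL request. */
		fprintf(stderr, "writes pending but next_rq[REQ_ASYNC] is NULL\n");
		return NULL;
	}
	return rq;
}
```

Whether a check like this is the right remedy, or whether the state
itself (FIFO non-empty with no cached next request) is the real bug, is
not settled by anything in this thread.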
Re: [-mm patch] make qla2x00_reg_remote_port() static
On Fri, 24 Nov 2006, Adrian Bunk wrote:

> On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> >...
> > Changes since 2.6.19-rc5-mm2:
> >...
> > git-scsi-misc.patch
> >...
> > git trees
> >...
>
> qla2x00_reg_remote_port() can now become static.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Acked-by: Andrew Vasquez <[EMAIL PROTECTED]>
Re: [PATCH 2.6.13] Warning in the qla2xxx driver
On Thu, 01 Sep 2005, Daniel Walker wrote:

> Remove possible uninitialized "sg" field warning in the qla24xx driver
>
> Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]>
>
> Index: linux-2.6.13/drivers/scsi/qla2xxx/qla_iocb.c
> ===
> --- linux-2.6.13.orig/drivers/scsi/qla2xxx/qla_iocb.c 2005-08-28 23:41:01.0 +
> +++ linux-2.6.13/drivers/scsi/qla2xxx/qla_iocb.c 2005-08-31 18:31:03.0 +
> @@ -744,7 +744,7 @@ qla24xx_start_scsi(srb_t *sp)
>  	uint32_t	index;
>  	uint32_t	handle;
>  	struct cmd_type_7 *cmd_pkt;
> -	struct scatterlist *sg;
> +	struct scatterlist *sg = NULL;
>  	uint16_t	cnt;
>  	uint16_t	req_cnt;
>  	uint16_t	tot_dsds;

This was already addressed in the following patch:

	http://marc.theaimsgroup.com/?l=linux-scsi&m=112510857722632&w=2

which was recently pulled by Linus:

	http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=131736d34ebc3251d79ddfd08a5e57a3e86decd4

--
av
Re: 2.6.13-rc7 qla2xxx unaligned accesses
On Thu, 25 Aug 2005, Keith Owens wrote:

> On Wed, 24 Aug 2005 11:22:52 -0700,
> Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >On Wed, 24 Aug 2005, Keith Owens wrote:
> >
> >> 2.6.13-rc7 + kdb on ia64. The qla2xxx drivers are getting unaligned
> >> accesses at startup.
> >>
> >> qla2300 :01:02.0: Found an ISP2312, irq 66, iobase 0xc0080f30
> >> qla2300 :01:02.0: Configuring PCI space...
> >> PCI: slot :01:02.0 has incorrect PCI cache line size of 0 bytes, correcting to 128
> >> qla2300 :01:02.0: Configure NVRAM parameters...
> >> qla2300 :01:02.0: Verifying loaded RISC code...
> >> qla2300 :01:02.0: Waiting for LIP to complete...
> >> qla2300 :01:02.0: Cable is unplugged...
> >> scsi1 : qla2xxx
> >> kernel unaligned access to 0xe0300667800c, ip=0xa001005cd0b1
> >
> >Yes, I have a fix for this in my patch-queue. I'll attach it here for
> >reference. I'll forward onto linux-scsi post 2.6.13.
> >
> >--
> >av
> >
> >---
> >
> >On some platforms the hard-casting of the 8 byte node_name
> >and port_name arrays to an u64 would cause unaligned-access
> >warnings. Generalize the conversions with consistent
> >shifting of WWN bytes.
> >
> >Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
> >---
> >
> > drivers/scsi/qla2xxx/qla_attr.c | 27 +--
> > 1 files changed, 17 insertions(+), 10 deletions(-)
> >
> >24e16c86578498fd71a3e33bebbd8be7323a03c6
> >diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> >--- a/drivers/scsi/qla2xxx/qla_attr.c
> >+++ b/drivers/scsi/qla2xxx/qla_attr.c
> >@@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h
> >
> > /* Host attributes. */
> >
> >+static u64
> >+wwn_to_u64(uint8_t *wwn)
> >+{
> >+	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
> >+	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
> >+	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
> >+	    (u64)wwn[6] << 8 | (u64)wwn[7];
> >+}
> >+
>
> Any reason you defined your own function instead of using the standard
> get_unaligned()?

I was unaware there was even such a helper. Anyway, the wwn_to_u64()
function adds another benefit -- clarity: we're converting an 8 byte
WWN array to its endian-agnostic 64bit value. I suppose we could
make it inline.

--
AV
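To make the comparison in this exchange concrete, here is a user-space
sketch of the two approaches, under stated assumptions. The first
function has the same shape as the wwn_to_u64() in the patch, written
with uint64_t so it compiles outside the kernel. The second,
wwn_via_unaligned_load(), is a hypothetical stand-in for "unaligned load
plus big-endian conversion": it uses memcpy() and a run-time endianness
probe, and is not the kernel's get_unaligned() or be64_to_cpu().

```c
#include <stdint.h>
#include <string.h>

/* Byte-shift conversion: reads one byte at a time, so alignment and
 * host endianness never matter. */
static uint64_t wwn_to_u64(const uint8_t *wwn)
{
	return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
	       (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
	       (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
	       (uint64_t)wwn[6] << 8  | (uint64_t)wwn[7];
}

/* Hypothetical alternative: copy the 8 bytes into an aligned variable,
 * then convert from big-endian (wire order) to host order. */
static uint64_t wwn_via_unaligned_load(const uint8_t *wwn)
{
	const uint16_t probe = 1;
	uint64_t raw;
	uint64_t swapped = 0;
	int i;

	memcpy(&raw, wwn, sizeof(raw));  /* valid for any source alignment */

	/* Big-endian host: the loaded value is already in the right order. */
	if (*(const uint8_t *)&probe == 0)
		return raw;

	/* Little-endian host: reverse the byte order. */
	for (i = 0; i < 8; i++) {
		swapped = (swapped << 8) | (raw & 0xff);
		raw >>= 8;
	}
	return swapped;
}
```

Both give the same value for the same 8 bytes, including when the array
starts at an odd address. The shift version needs no endianness
handling at all, which is the clarity argument made above.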
Re: 2.6.13-rc7 qla2xxx unaligned accesses
On Wed, 24 Aug 2005, Keith Owens wrote:

> 2.6.13-rc7 + kdb on ia64. The qla2xxx drivers are getting unaligned
> accesses at startup.
>
> qla2300 :01:02.0: Found an ISP2312, irq 66, iobase 0xc0080f30
> qla2300 :01:02.0: Configuring PCI space...
> PCI: slot :01:02.0 has incorrect PCI cache line size of 0 bytes, correcting to 128
> qla2300 :01:02.0: Configure NVRAM parameters...
> qla2300 :01:02.0: Verifying loaded RISC code...
> qla2300 :01:02.0: Waiting for LIP to complete...
> qla2300 :01:02.0: Cable is unplugged...
> scsi1 : qla2xxx
> kernel unaligned access to 0xe0300667800c, ip=0xa001005cd0b1

Yes, I have a fix for this in my patch-queue. I'll attach it here for
reference. I'll forward onto linux-scsi post 2.6.13.

--
av

---

On some platforms the hard-casting of the 8 byte node_name
and port_name arrays to an u64 would cause unaligned-access
warnings. Generalize the conversions with consistent
shifting of WWN bytes.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

 drivers/scsi/qla2xxx/qla_attr.c | 27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

24e16c86578498fd71a3e33bebbd8be7323a03c6
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h

 /* Host attributes. */

+static u64
+wwn_to_u64(uint8_t *wwn)
+{
+	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
+	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
+	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
+	    (u64)wwn[6] << 8 | (u64)wwn[7];
+}
+
 static void
 qla2x00_get_host_port_id(struct Scsi_Host *shost)
 {
@@ -360,16 +369,16 @@ qla2x00_get_starget_node_name(struct scs
 	struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
 	scsi_qla_host_t *ha = to_qla_host(host);
 	fc_port_t *fcport;
-	uint64_t node_name = 0;
+	u64 node_name = 0;

 	list_for_each_entry(fcport, &ha->fcports, list) {
 		if (starget->id == fcport->os_target_id) {
-			node_name = *(uint64_t *)fcport->node_name;
+			node_name = wwn_to_u64(fcport->node_name);
 			break;
 		}
 	}

-	fc_starget_node_name(starget) = be64_to_cpu(node_name);
+	fc_starget_node_name(starget) = node_name;
 }

 static void
@@ -378,16 +387,16 @@ qla2x00_get_starget_port_name(struct scs
 	struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
 	scsi_qla_host_t *ha = to_qla_host(host);
 	fc_port_t *fcport;
-	uint64_t port_name = 0;
+	u64 port_name = 0;

 	list_for_each_entry(fcport, &ha->fcports, list) {
 		if (starget->id == fcport->os_target_id) {
-			port_name = *(uint64_t *)fcport->port_name;
+			port_name = wwn_to_u64(fcport->port_name);
 			break;
 		}
 	}

-	fc_starget_port_name(starget) = be64_to_cpu(port_name);
+	fc_starget_port_name(starget) = port_name;
 }

 static void
@@ -460,9 +469,7 @@ struct fc_function_template qla2xxx_tran
 void
 qla2x00_init_host_attr(scsi_qla_host_t *ha)
 {
-	fc_host_node_name(ha->host) =
-	    be64_to_cpu(*(uint64_t *)ha->init_cb->node_name);
-	fc_host_port_name(ha->host) =
-	    be64_to_cpu(*(uint64_t *)ha->init_cb->port_name);
+	fc_host_node_name(ha->host) = wwn_to_u64(ha->init_cb->node_name);
+	fc_host_port_name(ha->host) = wwn_to_u64(ha->init_cb->port_name);
 	fc_host_supported_classes(ha->host) = FC_COS_CLASS3;
 }
Re: Fix up qla2xxx configuration bogosity
On Thu, 28 Jul 2005, James Bottomley wrote:

> On Wed, 2005-07-27 at 22:10 -0700, Andrew Vasquez wrote:
> > Would you also apply the attached patch which adds the appropriate
> > FW_LOADER pre-requisite and a separate entry for ISP24xx support.
>
> That's what I see reading the code; however, it looks like it's *only*
> the 24xx that needs it (qla24xx_load_risc_hotplug). The patch below
> pulls in the FW loader for every qlogic fibre driver, not just the
> qla24xx; is there a reason for doing this?

Yes, I've been working on a set of patches which add this
functionality across the board for the supported ISP types (21xx,
22xx, 23xx). I should have some patches for submission in next week's
time-frame. So rather than adding #if code around the relevant 24xx
specific code in qla2xxx, I chose the fw_loader path for all types.

--
Andrew Vasquez
Re: Fix up qla2xxx configuration bogosity
Linus,

In looking through your latest git-pull and update of the Kconfig quirks
in qla2xxx:

	Fix up qla2xxx configuration bogosity
	http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=e0aa8afd97536a9d94f82a07b4c4b3f05aef6f82;hp=e4ff4d7f9d85a2bc714307eb9113617182e62845

Would you also apply the attached patch which adds the appropriate
FW_LOADER pre-requisite and a separate entry for ISP24xx support.

Thanks to Adrian Bunk and Jesper Juhl for their efforts in fixing this
quirk.

Regards,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
--- a/drivers/scsi/qla2xxx/Kconfig
+++ b/drivers/scsi/qla2xxx/Kconfig
@@ -7,6 +7,7 @@ config SCSI_QLA21XX
 	tristate "QLogic ISP2100 host adapter family support"
 	depends on SCSI_QLA2XXX
 	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 21xx (ISP2100) host adapter family.
 
@@ -14,6 +15,7 @@ config SCSI_QLA22XX
 	tristate "QLogic ISP2200 host adapter family support"
 	depends on SCSI_QLA2XXX
 	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 22xx (ISP2200) host adapter family.
 
@@ -21,6 +23,7 @@ config SCSI_QLA2300
 	tristate "QLogic ISP2300 host adapter family support"
 	depends on SCSI_QLA2XXX
 	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 2300 (ISP2300 and ISP2312) host
 	  adapter family.
@@ -29,6 +32,7 @@ config SCSI_QLA2322
 	tristate "QLogic ISP2322 host adapter family support"
 	depends on SCSI_QLA2XXX
 	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 2322 (ISP2322) host adapter family.
 
@@ -36,6 +40,16 @@ config SCSI_QLA6312
 	tristate "QLogic ISP63xx host adapter family support"
 	depends on SCSI_QLA2XXX
 	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
 	  adapter family.
+
+config SCSI_QLA24XX
+	tristate "QLogic ISP24xx host adapter family support"
+	depends on SCSI_QLA2XXX
+	select SCSI_FC_ATTRS
+	select FW_LOADER
+	---help---
+	  This driver supports the QLogic 24xx (ISP2422 and ISP2432) host
+	  adapter family.
diff --git a/drivers/scsi/qla2xxx/Makefile b/drivers/scsi/qla2xxx/Makefile
--- a/drivers/scsi/qla2xxx/Makefile
+++ b/drivers/scsi/qla2xxx/Makefile
@@ -1,5 +1,4 @@
 EXTRA_CFLAGS += -DUNIQUE_FW_NAME
-EXTRA_CFLAGS += -DCONFIG_SCSI_QLA24XX -DCONFIG_SCSI_QLA24XX_MODULE
 
 qla2xxx-y := qla_os.o qla_init.o qla_mbx.o qla_iocb.o qla_isr.o qla_gs.o \
 		qla_dbg.o qla_sup.o qla_rscn.o qla_attr.o
Re: Incorrect driver getting loaded for Qlogic FC-HBA
On Wed, 27 Jul 2005, Rajat Jain wrote:

> On 7/27/05, Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >
> > A similar problem was noted with RHEL4, it seems the modules.pcimap
> > and pci.ids file were correct, but the pcitable file contained entries
> > for all ql[ae]23xx based HBAs to load qla2300.ko.
> >
> > It's my understanding that this was fixed for RHEL4 U1. Which distro
> > are you using? If you are using RHEL, and are still having problems,
> > I'd suggest you file a report with Redhat.
> >
> > Regards,
> > Andrew Vasquez
> >
>
> BINGO! I AM using RHEL 4. So does that mean I can rectify the problem
> by making appropriate changes to "pcitable" file?

I'm trying to get a firm answer from the folks who originally discovered
the problem some time back. It seems you have two options:

- during installation of RHEL4 (and not RHEL4U1), load with the
  'noprobe' option:

	linux noprobe

  and manually select the appropriate drivers to load.

- (post installation) modify /etc/modprobe.conf and rename the
  qla2300 entry to qla2322 (i.e.):

	alias scsi_hostadapter1 qla2322

  modify the modules.pcimap table to load qla2322 for the 2322
  device-id:

	qla2300 0x1077 0x2322 ...

  to:

	qla2322 0x1077 0x2322 ...

Beyond that, I'd suggest you log a report with Red Hat, as that's the
extent of the workaround knowledge without going to RHEL4U1.

Hope this helps,
Andrew Vasquez
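For anyone who would rather script the modules.pcimap edit described above than do it by hand, here is a rough Python sketch. It is only an illustration: the function name retarget_pcimap and the sample rows are made up for this example, and it assumes the usual whitespace-separated layout (module name, vendor, device, then further columns) with hexadecimal IDs. Back up the real file before trying anything like this.

```python
def retarget_pcimap(text, vendor, device, old_module, new_module):
    """Return pcimap text with old_module swapped for new_module on the
    rows that match one vendor/device pair. Other rows are left alone."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if (
            len(fields) >= 3
            and not line.lstrip().startswith("#")
            and fields[0] == old_module
            and int(fields[1], 16) == vendor   # IDs are hex, e.g. 0x00001077
            and int(fields[2], 16) == device
        ):
            # Replace only the leading module name; keep the other columns.
            line = line.replace(old_module, new_module, 1)
        out.append(line)
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")


# Hypothetical sample rows, shortened to three columns for readability.
SAMPLE = (
    "# pci module vendor device\n"
    "qla2300 0x00001077 0x00002300\n"
    "qla2300 0x00001077 0x00002322\n"
)

if __name__ == "__main__":
    print(retarget_pcimap(SAMPLE, 0x1077, 0x2322, "qla2300", "qla2322"), end="")
```

Only the row for device 0x2322 is changed; the 0x2300 row keeps loading qla2300. Note that this does nothing about the pcitable file, which is where the RHEL4 problem was reported to originate.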
Re: Incorrect driver getting loaded for Qlogic FC-HBA
On Tue, 26 Jul 2005, Rajat Jain wrote:

> On 7/26/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > On Mon, Jul 25, 2005 at 11:02:39AM +0900, Rajat Jain wrote:
> > > I'm using Kernel 2.6.9 and am having a Qlogic QLE2362 FC-HBA in my
> > > system. I selected all the Qlogic SCSI drivers while building the
> > > kernel. Now the problem is that every time I reboot, I have to
> > > MANUALLY modprobe the qla2322.ko module in the kernel and only then my
> > > HBA works. By default, the kernel loads qla2300.ko, which is not the
> > > correct driver for the card, and hence the HBA does not work. Here is
> > > the lspci output:
> >
> > "by default" the kernel does not load any modules. That's up to the
> > hotplug system, or some other package.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Thanks. I just checked .. that is right. So let me put it this way.
> Whenever I hot-plug my HBA into the system, the driver "qla2300" gets
> loaded, whereas the correct driver is "qla2322". This is evident from
> the output of "modules.pcimap" file and "lspci". The PCI device number
> of HBA is 2322, and in modules.pcimap file, qla2322 is supposed to be
> loaded when this HBA is hot-plugged. But module qla2300 is getting
> loaded.
>
> Any pointers on where could the problem be? Or how should I approach
> this problem?

A similar problem was noted with RHEL4. It seems the modules.pcimap
and pci.ids files were correct, but the pcitable file contained entries
for all ql[ae]23xx based HBAs to load qla2300.ko.

It's my understanding that this was fixed for RHEL4 U1. Which distro
are you using? If you are using RHEL, and are still having problems,
I'd suggest you file a report with Red Hat.

Regards,
Andrew Vasquez
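As a quick way to repeat the check discussed in this message (which module modules.pcimap actually names for a given vendor/device pair), something along these lines may help. It is a sketch under assumptions, not a tool from the thread: the function name modules_for_device and the sample rows are hypothetical, the file is assumed to be whitespace-separated with the module name first and hex vendor/device IDs next, and 0xffffffff is treated as the "any ID" wildcard.

```python
PCI_ANY_ID = 0xFFFFFFFF  # wildcard value, assumed to mean "match any ID"


def modules_for_device(pcimap_text, vendor, device):
    """List the modules whose pcimap rows match the given vendor/device."""
    matches = []
    for line in pcimap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        fields = line.split()
        if len(fields) < 3:
            continue  # not a complete row
        module = fields[0]
        row_vendor = int(fields[1], 16)
        row_device = int(fields[2], 16)
        if row_vendor in (vendor, PCI_ANY_ID) and row_device in (device, PCI_ANY_ID):
            matches.append(module)
    return matches


# Hypothetical sample rows, shortened to three columns for readability.
SAMPLE = (
    "# pci module vendor device\n"
    "qla2300 0x00001077 0x00002300\n"
    "qla2300 0x00001077 0x00002312\n"
    "qla2322 0x00001077 0x00002322\n"
)

if __name__ == "__main__":
    print(modules_for_device(SAMPLE, 0x1077, 0x2322))  # prints ['qla2322']
```

If this reports qla2322 for the 2322 device on the real file while qla2300 is still the module that gets loaded, the mapping is probably coming from somewhere else, such as the pcitable file mentioned above.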