On Thu, May 24, 2007 at 10:27:02AM -0700, Andrew Morton wrote:
> On Thu, 24 May 2007 11:07:56 -0500
> "Mike Miller (OS Dev)" <[EMAIL PROTECTED]> wrote:
> 
> > So I guess I found the answer to my own question. msi_free_irqs was 
> > apparently added
> > in 2.6.22-something. I don't find it in 2.6.21.2 or anywhere else. So 
> > somebody broke a
> > couple of things.
> > The most noticable is cciss hangs after turning on interrupts. The reason 
> > for that is
> > the kernel now looks at my array of MSI-X vectors in reverse order. We have 
> > 4 ways of
> > generating an interrupt on Smart Array hw. They are:
> > 
> > #       define DOORBELL_INT     0
> > #       define PERF_MODE_INT    1
> > #       define SIMPLE_MODE_INT  2
> > #       define MEMQ_MODE_INT    3
> > 

I apologize for the vagueness of the message. This dirty hack makes cciss work 
in the
.22-rc kernel. I have not yet figured out what broke.

-----------------------------------------------------------------------------------------
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 5acc6c4..7383483 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -3494,7 +3494,7 @@ static void cciss_shutdown(struct pci_dev *pdev)
        } else {
                printk(KERN_WARNING "Error flushing cache on controller %d\n", 
i);
        }
-       free_irq(hba[i]->intr[2], hba[i]);
+       free_irq(hba[i]->intr[SIMPLE_MODE_INT], hba[i]);
 }
 
 static void __devexit cciss_remove_one(struct pci_dev *pdev)
diff --git a/drivers/block/cciss.h b/drivers/block/cciss.h
index b70988d..26b5866 100644
--- a/drivers/block/cciss.h
+++ b/drivers/block/cciss.h
@@ -70,8 +70,8 @@ struct ctlr_info
        int     highest_lun;
        int     usage_count;  /* number of opens all all minor devices */
 #      define DOORBELL_INT     0
-#      define PERF_MODE_INT    1
-#      define SIMPLE_MODE_INT  2
+#      define PERF_MODE_INT    2
+#      define SIMPLE_MODE_INT  1
 #      define MEMQ_MODE_INT    3
        unsigned int intr[4];
        unsigned int msix_vector;
-----------------------------------------------------------------------------------------

> > For INTx these four lines are OR'ed together and run to one interrupt pin. 
> > MSI-X
> > breaks this hardware OR'ing so we must register either all 4 or at the 
> > least the
> > correct interrupt. When I first submitted the MSI/X support I was 
> > registering all 4.
> > Someone changed that to only register a single MSI-X vector. That worked 
> > fine until
> > 2.6.22-something. 
> > Now it appears that the kernel is looking at the array in reverse order. 
> > IOW, I must
> > register PERF_MODE_INT in order for cciss to work. That's messed up. 
> > Anybody want to

Have not yet found the change that caused this but my nasty little hack gets 
around it
for my testing. After I return from the long weekend I'll try to hunt this down.


> > `fess up to making these changes? :)
> > I'll keep working this, but I'm going to undo someones change when I figure 
> > out where
> > it's broke.
> > 
> 
> I'd guess that you're referring to Michael's changes.  If you can identify
> the offending code in a less vague fashion, more confident answers can
> be given ;)
> 
> I canot find any sign of anyone altering the IRQ handling in cciss.c after
> your initial MSI-support merge.  But that's perhaps isn't what you meant.
> it's all rather foggy.  Please either quote file-n-line, or grab a copy
> of git-blame.

Now for my original mail where the driver Oops'ed on rmmod. This patch prevents 
the
Oops but I'm not 100% sure it's right. Here's the Oops:


Completed flushing cache on controller 2
BUG: unable to handle kernel paging request at virtual address f8b2200c
 printing eip:
c01e9cc7
*pdpt = 0000000000003001
*pde = 0000000037e48067
*pte = 0000000000000000
Oops: 0002 [#1]
SMP
Modules linked in: cciss ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core 
sunrpc loop dm_multipath button battery asus_acpi ac tg3 floppy sg dm_snapshot 
dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptsas scsi_transport_sas 
mptspi scsi_transport_spi mptscsih mptbase sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c01e9cc7>]    Not tainted VLI
EFLAGS: 00010286   (2.6.22-rc2-gd2579053 #1)
EIP is at msi_free_irqs+0x81/0xbe
eax: f8b22000   ebx: f71f3180   ecx: f7fff280   edx: c1886eb8
esi: f7c4e800   edi: f7c4ec48   ebp: 00000002   esp: f5a0dec8
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process rmmod (pid: 5286, ti=f5a0d000 task=c47d2550 task.ti=f5a0d000)
Stack: 00000002 f8b72294 00000400 f8b69ca7 f8b6bc6c 00000002 00000000 00000000
       00000000 00000000 00000000 f5a997f4 f8b69d61 f7c5a4b0 f7c4e848 f7c4e848
       f7c4e800 f7c4e800 f8b72294 f7c4e848 f8b72294 c01e3cdf f7c4e848 c024c469
Call Trace:
 [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss]
 [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss]
 [<c01e3cdf>] pci_device_remove+0x16/0x35
 [<c024c469>] __device_release_driver+0x71/0x8e
 [<c024c56e>] driver_detach+0xa0/0xde
 [<c024bc5c>] bus_remove_driver+0x27/0x41
 [<c01e3ef3>] pci_unregister_driver+0xb/0x13
 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss]
 [<c0139ced>] sys_delete_module+0x110/0x135
 [<c0104c7a>] sysenter_past_esp+0x5f/0x85
 =======================
Code: 00 39 c2 74 5d 0f b6 03 24 1f 3c 11 75 24 8d 86 54 04 00 00 39 43 0c 75 
08 8b 43 14 e8 32 df f2 ff 0f b7 43 02 c1 e0 04 03 43 14 <c7> 40 0c 01 00 00 00 
8d 4b 0c 8b 43 0c 8b 51 04 89 50 04 89 02
EIP: [<c01e9cc7>] msi_free_irqs+0x81/0xbe SS:ESP 0068:f5a0dec8
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c011d16a>] __might_sleep+0xa7/0xad
 [<c013239a>] down_read+0x12/0x25
 [<c013db2f>] acct_collect+0x32/0x128
 [<c0121dc4>] do_exit+0x198/0x37e
 [<c0105f7b>] die+0x1da/0x1e2
 [<c03123ad>] do_page_fault+0x5b9/0x69c
 [<c01127b6>] smp_call_function+0x1a/0x1d
 [<c0311df4>] do_page_fault+0x0/0x69c
 [<c0310b1a>] error_code+0x72/0x78
 [<c011007b>] single_msr_reserve+0xf/0x3b
 [<c01e9cc7>] msi_free_irqs+0x81/0xbe
 [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss]
 [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss]
 [<c01e3cdf>] pci_device_remove+0x16/0x35
 [<c024c469>] __device_release_driver+0x71/0x8e
 [<c024c56e>] driver_detach+0xa0/0xde
 [<c024bc5c>] bus_remove_driver+0x27/0x41
 [<c01e3ef3>] pci_unregister_driver+0xb/0x13
 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss]
 [<c0139ced>] sys_delete_module+0x110/0x135
 [<c0104c7a>] sysenter_past_esp+0x5f/0x85
 =======================


Michael or Eric, would you please review this patch and see if it's OK? Adding 
an else
between the the if (list_is....) and the writel resolved the Oops. I'm not sure 
how this 
is supposed to work but using entry->mask_base after iounmap'ing seems wrong.


diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d9cbd58..0e67723 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -560,10 +560,11 @@ static int msi_free_irqs(struct pci_dev* dev)
                if (entry->msi_attrib.type == PCI_CAP_ID_MSIX) {
                        if (list_is_last(&entry->list, &dev->msi_list))
                                iounmap(entry->mask_base);
-
-                       writel(1, entry->mask_base + entry->msi_attrib.entry_nr
-                                 * PCI_MSIX_ENTRY_SIZE
-                                 + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
+                       else
+                               writel(1, entry->mask_base
+                                       + entry->msi_attrib.entry_nr
+                                       * PCI_MSIX_ENTRY_SIZE
+                                       + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
                }
                list_del(&entry->list);
                kfree(entry);
-----------------------------------------------------------------------------------------

I hope this clears up a little of the fog.

Thanks,
mikem

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to