For the record, I started getting this too, but only if I boot with pci=nomsi.
I'm on I965G, not GM.  GM has MSI disabled in the driver regardless,
due to errata.

On Sat, Nov 8, 2008 at 4:42 PM, Steven J Newbury <[EMAIL PROTECTED]> wrote:
> On Sat, 2008-11-08 at 21:26 +0000, Steven J Newbury wrote:
>> On Sat, 2008-11-08 at 10:27 -0500, Robert Noland wrote:
>> > On Fri, 2008-11-07 at 22:00 +0000, Steven J Newbury wrote:
>> > > On Fri, 2008-11-07 at 21:44 +0000, Steven J Newbury wrote:
>> > > > On Fri, 2008-11-07 at 20:45 +0000, Steven J Newbury wrote:
>> > > > > On Fri, 2008-11-07 at 11:11 -0800, Eric Anholt wrote:
>> > > > > > On Fri, 2008-11-07 at 14:01 +0000, Steven J Newbury wrote:
>> > > >
>> > > > > > > I'm on 965GM and I'm having a serious interrupt problem since this
>> > > > > > > patch went into for-review:
>> > > > > > >
>> > > > > > > Nov  7 04:20:22 infinity irq 16: nobody cared (try booting with the "irqpoll" option)
>> > > > > > > Nov  7 04:20:22 infinity Pid: 0, comm: swapper Not tainted 2.6.28-rc3-00236-g1d7eff8 #23
>> > > > > > > Nov  7 04:20:22 infinity Call Trace:
>> > > > > > > Nov  7 04:20:22 infinity <IRQ>  [<ffffffff80491a25>] ? i915_driver_irq_handler+0x53/0x186
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff80270b55>] __report_bad_irq+0x3d/0x8c
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff80270cb7>] note_interrupt+0x113/0x178
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff802713db>] handle_fasteoi_irq+0x99/0xc3
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff8020ee5f>] do_IRQ+0x9c/0x11d
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff8020c826>] ret_from_intr+0x0/0xa
>> > > > > > > Nov  7 04:20:22 infinity <EOI>  [<ffffffff804572c0>] ? acpi_idle_enter_simple+0x175/0x1a8
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff804572b6>] ? acpi_idle_enter_simple+0x16b/0x1a8
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff8052af56>] ? cpuidle_idle_call+0xa6/0xe0
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff8020b47a>] ? cpu_idle+0x4c/0xb0
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff80614551>] ? rest_init+0x75/0x77
>> > > > > > > Nov  7 04:20:22 infinity handlers:
>> > > > > > > Nov  7 04:20:22 infinity [<ffffffff804919d2>] (i915_driver_irq_handler+0x0/0x186)
>> > > > > > > Nov  7 04:20:22 infinity Disabling IRQ #16
>> > > > > > >
>> > > > > > > This happens after a random amount of time in X, although never very
>> > > > > > > long.  From this point on there are no interrupts generated unless I
>> > > > > > > switch vts away from X and back again.
>> > > > I'm wrong here.  Switching vts only "fixes" the second problem below.
>> > > >
>> > > > >   This gets interrupts working
>> > > > > > > again for a short while.
>> > > > > >
>> > > > > > Can you get /proc/dri/0/i915_gem_interrupt from before and just after
>> > > > > > the problem occurs?
>> > > > > >
>> > > > > I'll fire up a for-review kernel and see what it says.
>> > > >
>> > > > Before X:
>> > > >
>> > > > Interrupt enable:    00000000
>> > > > Interrupt identity:  00000000
>> > > > Interrupt mask:      fffedfff
>> > > > Pipe A stat:         00000203
>> > > > Pipe B stat:         80000206
>> > > > Interrupts received: 0
>> > > > Current sequence:    0
>> > > > Waiter sequence:     0
>> > > > IRQ sequence:        0
>> > > >
>> > > > After X has started:
>> > > >
>> > > > Interrupt enable:    00000051
>> > > > Interrupt identity:  00000002
>> > > > Interrupt mask:      fffedfac
>> > > > Pipe A stat:         00020204
>> > > > Pipe B stat:         00000206
>> > > > Interrupts received: 1327
>> > > > Current sequence:    1742
>> > > > Waiter sequence:     0
>> > > > IRQ sequence:        1738
>> > > >
>> > > > Interrupt enable:    00000051
>> > > > Interrupt identity:  00000002
>> > > > Interrupt mask:      fffedfac
>> > > > Pipe A stat:         00020204
>> > > > Pipe B stat:         00000206
>> > > > Interrupts received: 33424
>> > > > Current sequence:    43154
>> > > > Waiter sequence:     0
>> > > > IRQ sequence:        43132
>> > > >
>> > > > Interrupt enable:    00000051
>> > > > Interrupt identity:  00000002
>> > > > Interrupt mask:      fffedfac
>> > > > Pipe A stat:         00020204
>> > > > Pipe B stat:         00020000
>> > > > Interrupts received: 42250
>> > > > Current sequence:    58442
>> > > > Waiter sequence:     0
>> > > > IRQ sequence:        58434
>> > > > ____
>> > > >
>> > > > After interrupt failure:
>> > > >
>> > > > Interrupt enable:    00000051
>> > > > Interrupt identity:  00000000
>> > > > Interrupt mask:      fffedfac
>> > > > Pipe A stat:         00020204
>> > > > Pipe B stat:         00000206
>> > > > Interrupts received: 200097
>> > > > Current sequence:    96282
>> > > > Waiter sequence:     0
>> > > > IRQ sequence:        96282
>> > > >
>> > > > Output of 'cat /proc/interrupts' :
>> > > >            CPU0       CPU1
>> > > >   0:     309831     301848   IO-APIC-edge      timer
>> > > >   1:        964       1747   IO-APIC-edge      i8042
>> > > >   4:          1          1   IO-APIC-edge
>> > > >   8:          1          0   IO-APIC-edge      rtc0
>> > > >   9:          0          1   IO-APIC-fasteoi   acpi
>> > > >  12:      11555      16280   IO-APIC-edge      i8042
>> > > >  14:          0          0   IO-APIC-edge      ata_piix
>> > > >  15:          0          0   IO-APIC-edge      ata_piix
>> > > >  16:      99522     100479   IO-APIC-fasteoi   [EMAIL PROTECTED]:0000:00:02.0
>> > > >  19:          6          9   IO-APIC-fasteoi   yenta, firewire_ohci
>> > > >  20:         75         63   IO-APIC-fasteoi   uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb7
>> > > >  21:        204        216   IO-APIC-fasteoi   uhci_hcd:usb2, uhci_hcd:usb4, HDA Intel
>> > > >  22:        352        644   IO-APIC-fasteoi   uhci_hcd:usb5, ehci_hcd:usb6
>> > > >  43:       4898       5996   PCI-MSI-edge      ahci
>> > > > NMI:          0          0   Non-maskable interrupts
>> > > > LOC:     116278      86951   Local timer interrupts
>> > > > RES:      27385      27476   Rescheduling interrupts
>> > > > CAL:         91         32   Function call interrupts
>> > > > TLB:         32         96   TLB shootdowns
>> > > > TRM:          0          0   Thermal event interrupts
>> > > > THR:          0          0   Threshold APIC interrupts
>> > > > SPU:          0          0   Spurious interrupts
>> > > > ERR:          0
>> > > > MIS:          0
>> > >
>> > > Curiously, the i915_gem_interrupt count continues to rise despite no
>> > > more interrupts being recorded in /proc/interrupts.  Clearly interrupts
>> > > are not working, X is very slow, and glxgears reports interrupts are not
>> > > working correctly.
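To put a number on the mismatch described above, the per-CPU columns of the IRQ 16 line can be summed and compared with "Interrupts received". What follows is only a rough sketch: sum_irq_line() is a made-up helper, not anything from the kernel or the driver, and it assumes the usual /proc/interrupts layout of "label:" followed by one decimal count per CPU.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sum the per-CPU counts on one /proc/interrupts
 * line, e.g. " 16:      99522     100479   IO-APIC-fasteoi   ...".
 * Returns -1 if the line does not start with "<label>:". */
long sum_irq_line(const char *line, const char *label)
{
    size_t n = strlen(label);
    const char *p = line;
    char *end;
    long total = 0;

    /* Skip the leading padding in front of the IRQ label. */
    while (*p == ' ')
        p++;
    if (strncmp(p, label, n) != 0 || p[n] != ':')
        return -1;
    p += n + 1;

    /* Add up numeric columns; stop at the first non-numeric field
     * (the interrupt controller name). */
    for (;;) {
        long v = strtol(p, &end, 10);
        if (end == p)
            break;
        total += v;
        p = end;
    }
    return total;
}
```

For the snapshot taken after the failure, the IRQ 16 line sums to 99522 + 100479 = 200001, while i915_gem_interrupt reported 200097 interrupts received, a difference of 96. The two outputs were not captured at the same instant, so the gap is only indicative.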
>> > >
>> > > Currently:
>> > > cat /proc/dri/0/i915_gem_interrupt
>> > > Interrupt enable:    00000051
>> > > Interrupt identity:  00000002
>> > > Interrupt mask:      fffedfac
>> > > Pipe A stat:         00000000
>> > > Pipe B stat:         00000206
>> > > Interrupts received: 615479
>> > > Current sequence:    308340
>> > > Waiter sequence:     0
>> > > IRQ sequence:        308338
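The three register values in these dumps are easier to compare once they are reduced to bit sets. A minimal sketch of that arithmetic, assuming the conventional meaning that a 1 bit in the mask register means "masked"; the struct and function names are invented for illustration and are not taken from the i915 code:

```c
#include <stdint.h>

/* Snapshot of the three values printed by i915_gem_interrupt.
 * The struct itself is hypothetical, for illustration only. */
struct irq_regs {
    uint32_t enable;    /* "Interrupt enable"   */
    uint32_t identity;  /* "Interrupt identity" */
    uint32_t mask;      /* "Interrupt mask", assumed 1 = masked */
};

/* Bits that are not masked. */
uint32_t unmasked_bits(const struct irq_regs *r)
{
    return ~r->mask;
}

/* Bits flagged in identity that are unmasked but not enabled. */
uint32_t pending_but_not_enabled(const struct irq_regs *r)
{
    return r->identity & ~r->mask & ~r->enable;
}
```

With enable 00000051, identity 00000002 and mask fffedfac, unmasked_bits() gives 00012053 and pending_but_not_enabled() gives 00000002: the bit shown in the identity register is unmasked but is not set in the enable register. Whether that is expected depends on how the driver uses the enable register versus the mask register, which the dump alone does not settle.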
>> >
>> > Unless keithp's most recent patch moving BREADCRUMB_INDEX prevents some
>> > internal brain damage, messing with IER often seems to be a bad idea, at
>> > least on 965gm.  I've spent most of the week fighting this issue on
>> > FreeBSD.  Last night, I flipped the logic back to setting up IER during
>> > interrupt handler install and flipping bits in IMR to enable / disable
>> > irqs and everything is working correctly again.  I have made some other
>> > code changes in the handler, but none of them resolved the issue.
>> > Inverting the logic got everything working again, for both INTx and MSI.
>> > I know that it is published that MSI should not be used on the 965gm,
>> > but I've not seen any issues on my hardware.
>> >
>> > robert.
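The inverted logic Robert describes (program IER once when the handler is installed, then flip only IMR bits to enable or disable individual interrupts) could look roughly like the sketch below. It runs against a simulated register pair so it can be compiled anywhere; fake_hw and every function name here are assumptions made for illustration, not the FreeBSD or Linux driver code, and a real driver would go through its MMIO accessors.

```c
#include <stdint.h>

/* Simulated registers (hypothetical); a real driver uses MMIO. */
struct fake_hw {
    uint32_t ier;  /* interrupt enable register */
    uint32_t imr;  /* interrupt mask register, assumed 1 = masked */
};

/* At handler install: mask everything, then enable every source the
 * driver may ever want, exactly once. */
void irq_install(struct fake_hw *hw, uint32_t all_sources)
{
    hw->imr = 0xffffffffu;
    hw->ier = all_sources;
}

/* Afterwards IER is left alone; only IMR bits are flipped. */
void irq_enable(struct fake_hw *hw, uint32_t bits)
{
    hw->imr &= ~bits;
}

void irq_disable(struct fake_hw *hw, uint32_t bits)
{
    hw->imr |= bits;
}

/* In this model a source is delivered only if enabled and unmasked. */
uint32_t deliverable(const struct fake_hw *hw)
{
    return hw->ier & ~hw->imr;
}
```

After irq_install(&hw, 0x53), nothing is deliverable until irq_enable(&hw, 0x02) unmasks a bit, and hw.ier stays at 0x53 throughout. The value 0x53 is just an example taken from the unmasked bits in the dumps above, not a statement of what the driver programs.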
>> Now this is really weird, if I suspend to RAM and then resume, from that
>> point everything seems to work fine so far!?!  My guess is the
>> re-installation of the interrupt handler on resume occurs with different
>> register values compared to the initial setup.
>
> I hit send too soon.  It worked for a while, longer than it has
> previously, but it has happened again.
>
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> --
> _______________________________________________
> Dri-devel mailing list
> Dri-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dri-devel
>
