Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Harald Arnesen wrote:

> I have the same problem on my Lenovo T500. I think the graphics card is
> involved.
> 
> This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> nobody cared" during boot, not when I boot with the ATI card.

Confirming this. After a lot of hassle, I have bisected this reliably to

commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
Author: Daniel Vetter 
Date:   Sat Dec 1 13:53:45 2012 +0100

drm/i915: use the gmbus irq for waits

Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
happening in parallel.

Attaching dmesg.txt from the machine with 28c70f162a as head, with 
drm.debug=0xe.

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Jiri Kosina wrote:

> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.
> 
> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
> drm.debug=0xe.

Just a datapoint -- I have put a trivial debugging patch in place, and it 
reveals that "nobody cared" for irq 16 happens long after last

I915_WRITE(GMBUS4 + reg_offset, 0);

has been performed in gmbus_wait_hw_status(). On the other hand, if I 
comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
then it of course falls back to GPIO bit-banging, but the "nobody cared" 
for irq 16 is gone. 

So it seems like something gets severely confused by the I915_WRITE to 
GMBUS4 + reg_offset. So far this seems to have been reported solely on 
Lenovos as far as I can see (although completely different models), so it
might be some platform-specific quirk?

Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
16 at all. 
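
Some background on where the message comes from may help here. The kernel keeps per-IRQ statistics on how many interrupts no registered handler claimed and, as far as I know, disables the line and prints "nobody cared" once more than 99,900 out of 100,000 interrupts on it went unhandled. The sketch below models only that counting in userspace C; `irq_stats`, `note_irq` and the constant names are made up for illustration, and this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-IRQ bookkeeping, loosely modelled on the kernel's
 * spurious-interrupt accounting. Names and layout are illustrative only. */
struct irq_stats {
	unsigned int count;     /* interrupts seen in the current window */
	unsigned int unhandled; /* of those, how many no handler claimed */
	bool disabled;          /* set once the line has been given up on */
};

/* Assumed thresholds: more than 99,900 unhandled out of 100,000. */
enum { IRQ_WINDOW = 100000, IRQ_UNHANDLED_LIMIT = 99900 };

/* Record one interrupt. Returns true when this call trips the
 * "nobody cared" condition and the line gets disabled. */
static inline bool note_irq(struct irq_stats *s, bool handled)
{
	assert(s != NULL);

	if (s->disabled)
		return false;
	if (!handled)
		s->unhandled++;
	if (++s->count < IRQ_WINDOW)
		return false;

	/* Window complete: decide, then start a fresh window. */
	bool bad = s->unhandled > IRQ_UNHANDLED_LIMIT;
	s->count = 0;
	s->unhandled = 0;
	s->disabled = bad;
	return bad;
}
```

Fed 100,000 unclaimed interrupts in a row, `note_irq` reports the trip on the last one; with every second interrupt claimed it never trips. A handler that returns "not mine" only occasionally is therefore harmless; it takes a near-continuous stream of unclaimed interrupts to disable the line.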

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 02:33:13PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Harald Arnesen wrote:
> 
> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.

Wasn't this fixed by the merge from David
(2cc79544bd0aabb4b3cf467ead5df526d9134c64)?  I can't figure out the
exact commit that the merge message referred to though...

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > involved.
> > > 
> > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > nobody cared" during boot, not when I boot with the ATI card.
> > 
> > Confirming this. After a lot of hassle, I have bisected this reliably to
> > 
> > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > Author: Daniel Vetter 
> > Date:   Sat Dec 1 13:53:45 2012 +0100
> > 
> > drm/i915: use the gmbus irq for waits
> > 
> > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > happening in parallel.
> 
> Wasn't this fixed by the merge from David
> (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?

Why do you think it should, please?

(I am seeing this with a2362d247 still).

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Greg KH wrote:
> 
> > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > involved.
> > > > 
> > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > nobody cared" during boot, not when I boot with the ATI card.
> > > 
> > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > 
> > >   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > >   Author: Daniel Vetter 
> > >   Date:   Sat Dec 1 13:53:45 2012 +0100
> > > 
> > >   drm/i915: use the gmbus irq for waits
> > > 
> > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > what's 
> > > happening in parallel.
> > 
> > Wasn't this fixed by the merge from David
> > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> 
> Why do you think it should, please?

The line:
- Fix PCH irq handling race which resulted in missed gmbus/dp
  aux irqs and subsequent fallout (Paulo)

> (I am seeing this with a2362d247 still).

Ok, I guess it still isn't fixed properly, I was just guessing :)

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)

Ah, that one. I believe that should be irrelevant for GM chipsets, as they 
don't have an AUX line, right?

> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it still isn't fixed properly, I was just guessing :)

Seems like this is a different issue.

Thanks,

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although completely different models), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 :00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although completely different models), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Fri, Mar 15, 2013 at 08:47:39AM -0700, Greg KH wrote:
> On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Greg KH wrote:
> > 
> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)
> 
> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it still isn't fixed properly, I was just guessing :)

Yeah, the above fix is for pch split platforms, whereas these reports here
are for gm45 (which doesn't have the pch display split). Acking of gmbus
interrupts works differently on those, I'm testing right now whether I can
reproduce this fail.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Fri, 15 Mar 2013, Yinghai Lu wrote:

> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> >
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> >
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> >
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although completely different models), so it
> > might be some platform-specific quirk?
> >
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
interrupts go away.

My understanding from the other mail is that Daniel Vetter already has an 
idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
this datapoint regarding MSI will fit into it.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 2:12 AM, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
>
>> > Just a datapoint -- I have put a trivial debugging patch in place, and it
>> > reveals that "nobody cared" for irq 16 happens long after last
>> >
>> > I915_WRITE(GMBUS4 + reg_offset, 0);
>> >
>> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
>> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
>> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
>> > for irq 16 is gone.
>> >
>> > So it seems like something gets severely confused by the I915_WRITE to
>> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
>> > Lenovos as far as I can see (although completely different models), so it
>> > might be some platform-specific quirk?
>> >
>> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
>> > 16 at all.
>>
>> that device is using
>> i915 :00:02.0: irq 44 for MSI/MSI-X
>>
>> so can you try to boot with pci=nomsi?
>
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> interrupts go away.
>
> My understanding from the other mail is that Daniel Vetter already has an
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> this datapoint regarding MSI will fit into it.

What is the difference in /proc/interrupts with and without pci=nomsi?

Is drm forced to share irq 16?

Thanks

Yinghai


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Thomas Meyer
My laptop is an Acer 1810T. I see this error message each boot.

Kind regards
Thomas

Jiri Kosina wrote:

>On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> > 
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>> 
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>> 
>>  commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>>  Author: Daniel Vetter 
>>  Date:   Sat Dec 1 13:53:45 2012 +0100
>> 
>>  drm/i915: use the gmbus irq for waits
>> 
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
>> happening in parallel.
>> 
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
>> drm.debug=0xe.
>
>Just a datapoint -- I have put a trivial debugging patch in place, and it 
>reveals that "nobody cared" for irq 16 happens long after last
>
>   I915_WRITE(GMBUS4 + reg_offset, 0);
>
>has been performed in gmbus_wait_hw_status(). On the other hand, if I 
>comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
>then it of course falls back to GPIO bit-banging, but the "nobody cared" 
>for irq 16 is gone. 
>
>So it seems like something gets severely confused by the I915_WRITE to 
>GMBUS4 + reg_offset. So far this seems to have been reported solely on 
>Lenovos as far as I can see (although completely different models), so it 
>might be some platform-specific quirk?
>
>Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
>16 at all. 
>
>-- 
>Jiri Kosina
>SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
> 
> > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > reveals that "nobody cared" for irq 16 happens long after last
> > >
> > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > >
> > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > for irq 16 is gone.
> > >
> > > So it seems like something gets severely confused by the I915_WRITE to
> > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > Lenovos as far as I can see (although completely different models), so it
> > > might be some platform-specific quirk?
> > >
> > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > 16 at all.
> > 
> > that device is using
> > i915 :00:02.0: irq 44 for MSI/MSI-X
> > 
> > so can you try to boot with pci=nomsi?
> 
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> interrupts go away.
> 
> My understanding from the other mail is that Daniel Vetter already has an 
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> this datapoint regarding MSI will fit into it.

Yep, there's a big comment in the irq handler for that chipset saying that
we have a gaping race when using MSI interrupts, although the comment
boldly claims that the race is small enough to avoid the dreaded "nobody
cared" message. Looks like gmbus is good at hitting that race - on newer
chips it already brought up a similar race in handling pch interrupts.

Can you please give the below patch a whirl? It removes the probably racy
MSI race avoidance code and replaces it with the same trick Paulo used to
fix pch irq handling races.
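
The trick can be sketched outside the driver: mask everything in the enable register, handle what is pending, then restore the mask so that anything which arrived in the meantime produces a fresh edge. In this sketch the "registers" are plain struct fields, a message is assumed to be generated whenever (IIR & IER) goes from zero to nonzero, and `fake_hw`, `hw_raise` and `irq_handler_sketch` are made-up names. It is a toy model of the idea, not i915 code, and it says nothing about whether the trick is sufficient on GM45.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy interrupt block. Assumption: one message is sent each time
 * (iir & ier) changes from zero to nonzero. */
struct fake_hw {
	uint32_t iir;          /* pending interrupt bits */
	uint32_t ier;          /* enable mask */
	bool line;             /* last computed (iir & ier) != 0 */
	unsigned int msi_sent; /* messages generated so far */
};

/* Recompute the line state and count a message on a rising edge. */
static inline void hw_update(struct fake_hw *hw)
{
	bool now = (hw->iir & hw->ier) != 0;

	if (now && !hw->line)
		hw->msi_sent++;
	hw->line = now;
}

/* A device event sets pending bits. */
static inline void hw_raise(struct fake_hw *hw, uint32_t bits)
{
	hw->iir |= bits;
	hw_update(hw);
}

static inline void hw_write_ier(struct fake_hw *hw, uint32_t val)
{
	hw->ier = val;
	hw_update(hw);
}

/* Acknowledge (clear) the given pending bits. */
static inline void hw_ack_iir(struct fake_hw *hw, uint32_t bits)
{
	hw->iir &= ~bits;
	hw_update(hw);
}

/* Handler sketch; late_bits simulates an event that lands while the
 * handler is running. Returns the bits that were handled. */
static inline uint32_t irq_handler_sketch(struct fake_hw *hw, uint32_t late_bits)
{
	assert(hw != NULL);

	uint32_t ier = hw->ier;
	hw_write_ier(hw, 0);     /* mask everything: line drops to zero */

	uint32_t iir = hw->iir;
	hw_ack_iir(hw, iir);     /* ack only what was observed */
	hw_raise(hw, late_bits); /* new event arrives, still masked */

	hw_write_ier(hw, ier);   /* unmask: leftover bits cause a new edge */
	return iir;
}
```

In this model an event that arrives mid-handler stays pending in IIR and produces exactly one further message when the enable register is restored, so it is neither lost nor delivered while the handler is still working on the previous one.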

Thanks, Daniel
---
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3c7bb04..13de12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2684,7 +2684,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 iir, new_iir;
+	u32 iir, new_iir, ier;
 	u32 pipe_stats[I915_MAX_PIPES];
 	unsigned long irqflags;
 	int irq_received;
@@ -2692,9 +2692,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* irq race avoidance, copy&pasta from Paulo's PCH irq fix */
+	ier = I915_READ(IER);
+	I915_WRITE(IER, 0);
+	POSTING_READ(IER);
+
 	iir = I915_READ(IIR);
 
-	for (;;) {
+	do {
 		bool blc_event = false;
 
 		irq_received = iir != 0;
@@ -2792,7 +2797,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		 * stray interrupts.
 		 */
 		iir = new_iir;
-	}
+	} while (0);
+
+	I915_WRITE(IER, ier);
+	POSTING_READ(IER);
 
 	i915_update_dri1_breadcrumb(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Chris Wilson
On Mon, Mar 18, 2013 at 08:19:03PM +0100, Daniel Vetter wrote:
> On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Yinghai Lu wrote:
> > 
> > > > Just a datapoint -- I have put a trivial debugging patch in place, and 
> > > > it
> > > > reveals that "nobody cared" for irq 16 happens long after last
> > > >
> > > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > > >
> > > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > > comment out both GMBUS4 register offset writes in 
> > > > gmbus_wait_hw_status(),
> > > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > > for irq 16 is gone.
> > > >
> > > > So it seems like something gets severely confused by the I915_WRITE to
> > > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > > Lenovos as far as I can see (although completely different models),
> > > > > so it might be some platform-specific quirk?
> > > >
> > > > Honestly, I still don't understand how all the GMBUS stuff relates to 
> > > > IRQ
> > > > 16 at all.
> > > 
> > > that device is using
> > > i915 :00:02.0: irq 44 for MSI/MSI-X
> > > 
> > > so can you try to boot with pci=nomsi?
> > 
> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> > interrupts go away.
> > 
> > My understanding from the other mail is that Daniel Vetter already has an 
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> > this datapoint regarding MSI will fit into it.
> 
> Yep, there's a big comment in the irq handler for that chipset saying that
> we have a gaping race when using MSI interrupts, although the comment
> boldly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.
> 
> Can you please give the below patch a whirl? It removes the probably racy
> MSI race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Still nobody cares about irq16.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Daniel Vetter wrote:

> Yep, there's a big comment in the irq handler for that chipset saying that
> we have a gaping race when using MSI interrupts, although the comment
> boldly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.

I see ... will target my focus in that direction, thanks.

> Can you please give the below patch a whirl? It removes the probably racy
> MSI race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Unfortunately it didn't change anything, the spurious interrupt report is 
still there.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Yinghai Lu wrote:

> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> > interrupts go away.
> >
> > My understanding from the other mail is that Daniel Vetter already has an
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> > this datapoint regarding MSI will fit into it.
> 
> What is the difference in /proc/interrupts with and without pci=nomsi?
> 
> Is drm forced to share irq 16?

Yup, IRQ 16 is being shared, and one of the owners is i915.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 3:05 PM, Jiri Kosina wrote:
> On Mon, 18 Mar 2013, Yinghai Lu wrote:
>
>> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
>> > interrupts go away.
>> >
>> > My understanding from the other mail is that Daniel Vetter already has an
>> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
>> > this datapoint regarding MSI will fit into it.
>>
>> What is the difference in /proc/interrupts with and without pci=nomsi?
>>
>> Is drm forced to share irq 16?
>
> Yup, IRQ 16 is being shared, and one of the owners is i915.

The VGA device reports a strange INTx status (INTx+ despite DisINTx+):

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series
Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA
controller])
Subsystem: Lenovo Device 20e4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-  [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee0100c  Data: 4142
Capabilities: [d0] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: i915
Kernel modules: i915

It should be INTx- after we have set DisINTx+ in the control register.

So INTx cannot be disabled once it has been enabled?
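
For reference, the two flags being compared live in different config-space registers: DisINTx is bit 10 of the 16-bit command register (offset 0x04) and INTx is bit 3 of the status register (offset 0x06). As far as the PCI spec wording goes, the status bit reports a pending interrupt regardless of the disable bit, and the INTx# pin is only driven when an interrupt is pending and the disable bit is clear, so DisINTx+ together with INTx+ is not necessarily contradictory by itself. The sketch below only decodes raw register values; the helper names are made up for illustration and the values in the note afterwards are hypothetical, not taken from the machine in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x0400u /* command register, bit 10 */
#define PCI_STATUS_INTERRUPT     0x0008u /* status register, bit 3 */

static_assert(PCI_COMMAND_INTX_DISABLE == (1u << 10), "DisINTx is bit 10");
static_assert(PCI_STATUS_INTERRUPT == (1u << 3), "INTx status is bit 3");

/* True when legacy INTx assertion is disabled (lspci shows "DisINTx+"). */
static inline bool intx_disabled(uint16_t command)
{
	return (command & PCI_COMMAND_INTX_DISABLE) != 0;
}

/* True when the function has an interrupt pending (lspci shows "INTx+"). */
static inline bool intx_pending(uint16_t status)
{
	return (status & PCI_STATUS_INTERRUPT) != 0;
}

/* The INTx# pin is driven only when an interrupt is pending and the
 * disable bit is clear. */
static inline bool intx_line_asserted(uint16_t command, uint16_t status)
{
	return intx_pending(status) && !intx_disabled(command);
}
```

With hypothetical values command = 0x0407 and status = 0x0018, `intx_disabled` and `intx_pending` are both true while `intx_line_asserted` is false. Whether the hardware here actually honours the disable bit is exactly the open question.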

The VGA on my T420 looks right.

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation
Core Processor Family Integrated Graphics Controller (rev 09) (prog-if
00 [VGA controller])
Subsystem: Lenovo Device 21ce
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Harald Arnesen wrote:

> I have the same problem on my Lenovo T500. I think the graphics card is
> involved.
> 
> This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> nobody cared" during boot, not when I boot with the ATI card.

Confirming this. After a lot of hassle, I have bisected this reliably to

commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
Author: Daniel Vetter 
Date:   Sat Dec 1 13:53:45 2012 +0100

drm/i915: use the gmbus irq for waits

Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
happening in parallel.

Attaching dmesg.txt from the machine with 28c70f162a as head, with 
drm.debug=0xe.

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Jiri Kosina wrote:

> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.
> 
> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
> drm.debug=0xe.

Just a datapoint -- I have put a trivial debugging patch in place, and it 
reveals that "nobody cared" for irq 16 happens long after last

I915_WRITE(GMBUS4 + reg_offset, 0);

has been performed in gmbus_wait_hw_status(). On the other hand, if I 
comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
then it of course falls back to GPIO bit-banging, but the "nobody cared" 
for irq 16 is gone. 

So it seems like something gets severely confused by the I915_WRITE to 
GMBUS4 + reg_offset. So far this seems to have been reported solely on 
Lenovos as far as I can see (although a completely different types), so it 
might be some platform-specific quirk?

Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
16 at all. 

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 02:33:13PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Harald Arnesen wrote:
> 
> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.

Wasn't this fixed by the merge from David
(2cc79544bd0aabb4b3cf467ead5df526d9134c64)?  I can't figure out the
exact commit that the merge message referred to though...

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > involved.
> > > 
> > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > nobody cared" during boot, not when I boot with the ATI card.
> > 
> > Confirming this. After a lot of hassle, I have bisected this reliably to
> > 
> > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > Author: Daniel Vetter 
> > Date:   Sat Dec 1 13:53:45 2012 +0100
> > 
> > drm/i915: use the gmbus irq for waits
> > 
> > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > happening in parallel.
> 
> Wasn't this fixed by the merge from David
> (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?

Why do you think it should, please?

(I am seeing this with a2362d247 still).

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Greg KH wrote:
> 
> > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > involved.
> > > > 
> > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > nobody cared" during boot, not when I boot with the ATI card.
> > > 
> > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > 
> > >   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > >   Author: Daniel Vetter 
> > >   Date:   Sat Dec 1 13:53:45 2012 +0100
> > > 
> > >   drm/i915: use the gmbus irq for waits
> > > 
> > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > what's 
> > > happening in parallel.
> > 
> > Wasn't this fixed by the merge from David
> > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> 
> Why do you think it should, please?

The line:
- Fix PCH irq handling race which resulted in missed gmbus/dp
  aux irqs and subsequent fallout (Paulo)

> (I am seeing this with a2362d247 still).

OK, I guess it still isn't fixed properly; I was just guessing :)

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)

Ah, that one. I believe that should be irrelevant for GM chipsets, as they 
don't have an AUX line, right?

> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Seems like this is a different issue.

Thanks,

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although a completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 :00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Fri, Mar 15, 2013 at 08:47:39AM -0700, Greg KH wrote:
> On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Greg KH wrote:
> > 
> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)
> 
> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Yeah, the above fix is for pch split platforms, whereas the reports here
are for gm45 (which doesn't have the pch display split). Acking of gmbus
interrupts works differently on those; I'm testing right now whether I can
reproduce this failure.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Fri, 15 Mar 2013, Yinghai Lu wrote:

> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> >
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> >
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> >
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> >
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
interrupts go away.

My understanding from the other mail is that Daniel Vetter already has an 
idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
this datapoint regarding MSI will fit into it.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 2:12 AM, Jiri Kosina  wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
>
>> > Just a datapoint -- I have put a trivial debugging patch in place, and it
>> > reveals that "nobody cared" for irq 16 happens long after last
>> >
>> > I915_WRITE(GMBUS4 + reg_offset, 0);
>> >
>> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
>> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
>> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
>> > for irq 16 is gone.
>> >
>> > So it seems like something gets severely confused by the I915_WRITE to
>> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
>> > Lenovos as far as I can see (although a completely different types), so it
>> > might be some platform-specific quirk?
>> >
>> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
>> > 16 at all.
>>
>> that device is using
>> i915 :00:02.0: irq 44 for MSI/MSI-X
>>
>> so can you try to boot with pci=nomsi?
>
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> interrupts go away.
>
> My understanding from the other mail is that DAniel Vetter already has an
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> this datapoint regarding MSI will fit into it.

What is the difference in /proc/interrupts with and without pci=nomsi?

Is drm forced to share irq 16?

Thanks

Yinghai
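
To make the comparison concrete, here is a small sketch for spotting a shared line in /proc/interrupts output. It assumes the usual layout (IRQ number and colon first, comma-separated handler names last); `line_irq()` and `handler_count()` are hypothetical helpers, not anything from the kernel tree, and the count is only meaningful for rows that list at least one handler.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Return the IRQ number at the start of a /proc/interrupts line, or -1
 * for rows that are not numbered (the CPU header, "NMI:", and so on).
 */
static int line_irq(const char *line)
{
	char *end;
	long n;

	assert(line != NULL);
	while (isspace((unsigned char)*line))
		line++;
	if (!isdigit((unsigned char)*line))
		return -1;
	n = strtol(line, &end, 10);
	return *end == ':' ? (int)n : -1;
}

/*
 * Count the handlers on a numbered row. Handler names are separated by
 * commas, so this is "commas + 1"; it assumes at least one handler.
 */
static int handler_count(const char *line)
{
	int n = 1;

	assert(line != NULL);
	for (; *line != '\0'; line++)
		if (*line == ',')
			n++;
	return n;
}
```

Fed a made-up row such as ` 16:  1234  5678  IO-APIC-fasteoi  uhci_hcd:usb3, i915`, `line_irq()` gives 16 and `handler_count()` gives 2, i.e. the line is shared. An MSI row normally lists a single handler.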


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Thomas Meyer
My laptop is an Acer 1810T. I see this error message each boot.

Kind regards
Thomas

Jiri Kosina wrote:

>On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> > 
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>> 
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>> 
>>  commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>>  Author: Daniel Vetter 
>>  Date:   Sat Dec 1 13:53:45 2012 +0100
>> 
>>  drm/i915: use the gmbus irq for waits
>> 
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
>> happening in parallel.
>> 
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
>> drm.debug=0xe.
>
>Just a datapoint -- I have put a trivial debugging patch in place, and it 
>reveals that "nobody cared" for irq 16 happens long after last
>
>   I915_WRITE(GMBUS4 + reg_offset, 0);
>
>has been performed in gmbus_wait_hw_status(). On the other hand, if I 
>comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
>then it of course falls back to GPIO bit-banging, but the "nobody cared" 
>for irq 16 is gone. 
>
>So it seems like something gets severely confused by the I915_WRITE to 
>GMBUS4 + reg_offset. So far this seems to have been reported solely on 
>Lenovos as far as I can see (although a completely different types), so it 
>might be some platform-specific quirk?
>
>Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
>16 at all. 
>
>-- 
>Jiri Kosina
>SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
> 
> > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > reveals that "nobody cared" for irq 16 happens long after last
> > >
> > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > >
> > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > for irq 16 is gone.
> > >
> > > So it seems like something gets severely confused by the I915_WRITE to
> > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > Lenovos as far as I can see (although a completely different types), so it
> > > might be some platform-specific quirk?
> > >
> > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > 16 at all.
> > 
> > that device is using
> > i915 :00:02.0: irq 44 for MSI/MSI-X
> > 
> > so can you try to boot with pci=nomsi?
> 
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> interrupts go away.
> 
> My understanding from the other mail is that DAniel Vetter already has an 
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> this datapoint regarding MSI will fit into it.

Yep, there's a big comment in the irq handler for that chipset saying that
we have a gaping race when using MSI interrupts, although the comment
boldly claims that the race is small enough to avoid the dreaded "nobody
cared" message. Looks like gmbus is good at hitting that race - on newer
chips it already brought up a similar race in handling pch interrupts.

Can you please give the below patch a whirl? It removes the probably racy
MSI race avoidance code and replaces it with the same trick Paulo used to
fix pch irq handling races.

Thanks, Daniel
---
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3c7bb04..13de12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2684,7 +2684,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 iir, new_iir;
+	u32 iir, new_iir, ier;
 	u32 pipe_stats[I915_MAX_PIPES];
 	unsigned long irqflags;
 	int irq_received;
@@ -2692,9 +2692,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* irq race avoidance, copy&pasta from Paulo's PCH irq fix */
+	ier = I915_READ(IER);
+	I915_WRITE(IER, 0);
+	POSTING_READ(IER);
+
 	iir = I915_READ(IIR);
 
-	for (;;) {
+	do {
 		bool blc_event = false;
 
 		irq_received = iir != 0;
@@ -2792,7 +2797,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		 * stray interrupts.
 		 */
 		iir = new_iir;
-	}
+	} while (0);
+
+	I915_WRITE(IER, ier);
+	POSTING_READ(IER);
 
 	i915_update_dri1_breadcrumb(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Chris Wilson
On Mon, Mar 18, 2013 at 08:19:03PM +0100, Daniel Vetter wrote:
> On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Yinghai Lu wrote:
> > 
> > > > Just a datapoint -- I have put a trivial debugging patch in place, and 
> > > > it
> > > > reveals that "nobody cared" for irq 16 happens long after last
> > > >
> > > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > > >
> > > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > > comment out both GMBUS4 register offset writes in 
> > > > gmbus_wait_hw_status(),
> > > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > > for irq 16 is gone.
> > > >
> > > > So it seems like something gets severely confused by the I915_WRITE to
> > > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > Lenovos as far as I can see (although a completely different types), so 
> > > > it
> > > > might be some platform-specific quirk?
> > > >
> > > > Honestly, I still don't understand how all the GMBUS stuff relates to 
> > > > IRQ
> > > > 16 at all.
> > > 
> > > that device is using
> > > i915 :00:02.0: irq 44 for MSI/MSI-X
> > > 
> > > so can you try to boot with pci=nomsi?
> > 
> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> > interrupts go away.
> > 
> > My understanding from the other mail is that DAniel Vetter already has an 
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> > this datapoint regarding MSI will fit into it.
> 
> Yep, there's a big comment in the irq handler for that chipset that we
> have a gaping race with when using MSI interrupts. Although the comment
> bodly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.
> 
> Can you please give the below patch a whirl? It removes the probably race
> msi race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Still nobody cares about irq16.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Daniel Vetter wrote:

> Yep, there's a big comment in the irq handler for that chipset that we
> have a gaping race with when using MSI interrupts. Although the comment
> bodly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.

I see ... will target my focus in that direction, thanks.

> Can you please give the below patch a whirl? It removes the probably race
> msi race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Unfortunately it didn't change anything; the spurious interrupt report is 
still there.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Yinghai Lu wrote:

> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> > interrupts go away.
> >
> > My understanding from the other mail is that DAniel Vetter already has an
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> > this datapoint regarding MSI will fit into it.
> 
> What is /proc/interrupts difference between with and without pci=nomsi ?
> 
> drm is forced to share irq 16?

Yup, IRQ 16 is being shared, and one of the owners is i915.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 3:05 PM, Jiri Kosina  wrote:
> On Mon, 18 Mar 2013, Yinghai Lu wrote:
>
>> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
>> > interrupts go away.
>> >
>> > My understanding from the other mail is that DAniel Vetter already has an
>> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
>> > this datapoint regarding MSI will fit into it.
>>
>> What is /proc/interrupts difference between with and without pci=nomsi ?
>>
>> drm is forced to share irq 16?
>
> Yup, IRQ 16 is being shared, and one of the owners is i915.

The VGA device reports a strange INTx status...

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series
Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA
controller])
Subsystem: Lenovo Device 20e4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-  [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee0100c  Data: 4142
Capabilities: [d0] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: i915
Kernel modules: i915

It should be INTx- after we have set DisINTx+ in Control.

So INTx cannot be disabled once it has been enabled?

The VGA on my T420 looks right.

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation
Core Processor Family Integrated Graphics Controller (rev 09) (prog-if
00 [VGA controller])
Subsystem: Lenovo Device 21ce
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Harald Arnesen wrote:

> I have the same problem on my Lenovo T500. I think the graphics card is
> involved.
> 
> This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> nobody cared" during boot, not when I boot with the ATI card.

Confirming this. After a lot of hassle, I have bisected this reliably to

commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
Author: Daniel Vetter 
Date:   Sat Dec 1 13:53:45 2012 +0100

drm/i915: use the gmbus irq for waits

Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
happening in parallel.

Attaching dmesg.txt from the machine with 28c70f162a as head, with 
drm.debug=0xe.

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Jiri Kosina wrote:

> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.
> 
> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
> drm.debug=0xe.

Just a datapoint -- I have put a trivial debugging patch in place, and it 
reveals that "nobody cared" for irq 16 happens long after last

I915_WRITE(GMBUS4 + reg_offset, 0);

has been performed in gmbus_wait_hw_status(). On the other hand, if I 
comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
then it of course falls back to GPIO bit-banging, but the "nobody cared" 
for irq 16 is gone. 

So it seems like something gets severely confused by the I915_WRITE to 
GMBUS4 + reg_offset. So far this seems to have been reported solely on 
Lenovos as far as I can see (although a completely different types), so it 
might be some platform-specific quirk?

Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
16 at all. 

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 02:33:13PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Harald Arnesen wrote:
> 
> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.

Wasn't this fixed by the merge from David
(2cc79544bd0aabb4b3cf467ead5df526d9134c64)?  I can't figure out the
exact commit that the merge message referred to though...

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > involved.
> > > 
> > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > nobody cared" during boot, not when I boot with the ATI card.
> > 
> > Confirming this. After a lot of hassle, I have bisected this reliably to
> > 
> > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > Author: Daniel Vetter 
> > Date:   Sat Dec 1 13:53:45 2012 +0100
> > 
> > drm/i915: use the gmbus irq for waits
> > 
> > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > happening in parallel.
> 
> Wasn't this fixed by the merge from David
> (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?

Why do you think it should, please?

(I am seeing this with a2362d247 still).

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Greg KH wrote:
> 
> > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > involved.
> > > > 
> > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > nobody cared" during boot, not when I boot with the ATI card.
> > > 
> > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > 
> > >   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > >   Author: Daniel Vetter 
> > >   Date:   Sat Dec 1 13:53:45 2012 +0100
> > > 
> > >   drm/i915: use the gmbus irq for waits
> > > 
> > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > what's 
> > > happening in parallel.
> > 
> > Wasn't this fixed by the merge from David
> > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> 
> Why do you think it should, please?

The line:
- Fix PCH irq handling race which resulted in missed gmbus/dp
  aux irqs and subsequent fallout (Paulo)

> (I am seeing this with a2362d247 still).

Ok, I guess it isn't still fixed properly, just was guessing :)

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)

Ah, that one. I believe that should be irrelevant for GM chipsets, as they 
don't have AUX line, right?

> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Seems like this is a different issue.

Thanks,

-- 
Jiri Kosina
SUSE Labs
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although a completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 0000:00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?
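
For readers wondering how a GMBUS change can end in a message about IRQ 16: the "nobody cared" report comes from the kernel's generic spurious-interrupt accounting, not from i915 itself. One possible reading of the hint above is that the driver listens on MSI vector 44 while something still asserts the legacy line, so nobody claims IRQ 16. The sketch below is a simplified user-space model of that accounting, not kernel code. The struct and function names are made up for illustration, and the thresholds (a window of 100000 interrupts, more than 99900 of them unhandled) are quoted from memory of kernel/irq/spurious.c, so please check them against your tree.

```c
#include <stdbool.h>

/* Assumed thresholds, modelled on kernel/irq/spurious.c (verify locally). */
#define WINDOW         100000u
#define BAD_THRESHOLD   99900u

/* Hypothetical stand-in for the per-IRQ bookkeeping. */
struct irq_model {
	unsigned int irq_count;       /* interrupts seen in the current window */
	unsigned int irqs_unhandled;  /* of those, how many no handler claimed */
	bool disabled;                /* set once the line is given up on */
};

/* Account for one interrupt; returns true when the line would be
 * reported as "nobody cared" and disabled. */
static bool note_interrupt_model(struct irq_model *d, bool handled)
{
	if (d->disabled)
		return false;
	if (!handled)
		d->irqs_unhandled++;
	if (++d->irq_count < WINDOW)
		return false;

	/* End of window: decide, then start a new window. */
	d->irq_count = 0;
	if (d->irqs_unhandled > BAD_THRESHOLD) {
		d->disabled = true;
		return true;
	}
	d->irqs_unhandled = 0;
	return false;
}

/* Feed n interrupts; every k-th one is claimed by a handler (k == 0: none). */
static bool storm(struct irq_model *d, unsigned int n, unsigned int k)
{
	bool reported = false;

	for (unsigned int i = 1; i <= n; i++)
		reported |= note_interrupt_model(d, k != 0 && i % k == 0);
	return reported;
}
```

In this model a line where every interrupt goes unclaimed is reported at the end of the first window, while a shared line where another device claims half of the interrupts is not.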


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 0000:00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Fri, Mar 15, 2013 at 08:47:39AM -0700, Greg KH wrote:
> On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Greg KH wrote:
> > 
> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)
> 
> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Yeah, the above fix is for pch split platforms, whereas these reports here
are for gm45 (which doesn't have the pch display split). Acking of gmbus
interrupts works differently on those; I'm testing right now whether I can
reproduce this failure.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Fri, 15 Mar 2013, Yinghai Lu wrote:

> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> >
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> >
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> >
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> >
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 0000:00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
interrupts go away.

My understanding from the other mail is that Daniel Vetter already has an 
idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
this datapoint regarding MSI will fit into it.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 2:12 AM, Jiri Kosina  wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
>
>> > Just a datapoint -- I have put a trivial debugging patch in place, and it
>> > reveals that "nobody cared" for irq 16 happens long after last
>> >
>> > I915_WRITE(GMBUS4 + reg_offset, 0);
>> >
>> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
>> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
>> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
>> > for irq 16 is gone.
>> >
>> > So it seems like something gets severely confused by the I915_WRITE to
>> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
>> > Lenovos as far as I can see (although a completely different types), so it
>> > might be some platform-specific quirk?
>> >
>> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
>> > 16 at all.
>>
>> that device is using
>> i915 0000:00:02.0: irq 44 for MSI/MSI-X
>>
>> so can you try to boot with pci=nomsi?
>
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> interrupts go away.
>
> My understanding from the other mail is that Daniel Vetter already has an
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> this datapoint regarding MSI will fit into it.

What is the difference in /proc/interrupts with and without pci=nomsi?

Is drm forced to share irq 16?

Thanks

Yinghai


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Thomas Meyer
My laptop is an Acer 1810T. I see this error message on each boot.

Kind regards
Thomas

Jiri Kosina wrote:

>On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> > 
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>> 
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>> 
>>  commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>>  Author: Daniel Vetter 
>>  Date:   Sat Dec 1 13:53:45 2012 +0100
>> 
>>  drm/i915: use the gmbus irq for waits
>> 
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
>> happening in parallel.
>> 
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
>> drm.debug=0xe.
>
>Just a datapoint -- I have put a trivial debugging patch in place, and it 
>reveals that "nobody cared" for irq 16 happens long after last
>
>   I915_WRITE(GMBUS4 + reg_offset, 0);
>
>has been performed in gmbus_wait_hw_status(). On the other hand, if I 
>comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
>then it of course falls back to GPIO bit-banging, but the "nobody cared" 
>for irq 16 is gone. 
>
>So it seems like something gets severely confused by the I915_WRITE to 
>GMBUS4 + reg_offset. So far this seems to have been reported solely on 
>Lenovos as far as I can see (although a completely different types), so it 
>might be some platform-specific quirk?
>
>Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
>16 at all. 
>
>-- 
>Jiri Kosina
>SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
> 
> > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > reveals that "nobody cared" for irq 16 happens long after last
> > >
> > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > >
> > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > for irq 16 is gone.
> > >
> > > So it seems like something gets severely confused by the I915_WRITE to
> > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > Lenovos as far as I can see (although a completely different types), so it
> > > might be some platform-specific quirk?
> > >
> > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > 16 at all.
> > 
> > that device is using
> > i915 0000:00:02.0: irq 44 for MSI/MSI-X
> > 
> > so can you try to boot with pci=nomsi?
> 
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> interrupts go away.
> 
> My understanding from the other mail is that Daniel Vetter already has an 
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> this datapoint regarding MSI will fit into it.

Yep, there's a big comment in the irq handler for that chipset saying that
we have a gaping race when using MSI interrupts, although the comment
boldly claims that the race is small enough to avoid the dreaded "nobody
cared" message. Looks like gmbus is good at hitting that race - on newer
chips it already brought up a similar race in handling pch interrupts.

Can you please give the below patch a whirl? It removes the probably racy
MSI race avoidance code and replaces it with the same trick Paulo used to
fix pch irq handling races.
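
To make the trick concrete, here is a toy user-space model of it, written as a sketch under stated assumptions rather than a description of the real hardware. It assumes that an MSI message is only generated when (IIR & IER) goes from zero to non-zero, so an event that arrives while another bit is still pending produces no new message. Every name in it (fake_dev, update_line, handler, msi_count_after_race) is hypothetical; only the save/zero/restore sequence on IER follows the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model: not the real register semantics. */
struct fake_dev {
	uint32_t iir;       /* pending interrupt bits */
	uint32_t ier;       /* enabled interrupt bits */
	bool line;          /* last observed level of (iir & ier) != 0 */
	unsigned msi_sent;  /* number of MSI messages generated so far */
};

/* Assumption: a message is sent only on a 0 -> 1 transition of the level. */
static void update_line(struct fake_dev *d)
{
	bool level = (d->iir & d->ier) != 0;

	if (level && !d->line)
		d->msi_sent++;
	d->line = level;
}

static void raise_event(struct fake_dev *d, uint32_t bit)
{
	d->iir |= bit;
	update_line(d);
}

static void ack(struct fake_dev *d, uint32_t bits)
{
	d->iir &= ~bits;
	update_line(d);
}

static void write_ier(struct fake_dev *d, uint32_t val)
{
	d->ier = val;
	update_line(d);
}

/* One handler invocation. The event `late` arrives after IIR has been
 * read but before it is acked, which is the race window. */
static void handler(struct fake_dev *d, bool mask_ier, uint32_t late)
{
	uint32_t ier = d->ier;
	uint32_t iir;

	if (mask_ier)
		write_ier(d, 0);        /* force the level low */
	iir = d->iir;
	if (late)
		raise_event(d, late);
	ack(d, iir);                    /* ack only what was read */
	if (mask_ier)
		write_ier(d, ier);      /* pending bits now cause a fresh edge */
}

/* Returns how many messages were sent for two events, the second of
 * which lands inside the race window. */
static unsigned msi_count_after_race(bool mask_ier)
{
	struct fake_dev d = { .ier = 0x3 };

	raise_event(&d, 0x1);
	handler(&d, mask_ier, 0x2);
	return d.msi_sent;
}
```

In this model the unmasked handler leaves the second event pending with no message to announce it, while the masked variant gets a second message once IER is restored. Whether GM45 behaves like this model is exactly what the patch is meant to test.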

Thanks, Daniel
---
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3c7bb04..13de12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2684,7 +2684,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 iir, new_iir;
+	u32 iir, new_iir, ier;
 	u32 pipe_stats[I915_MAX_PIPES];
 	unsigned long irqflags;
 	int irq_received;
@@ -2692,9 +2692,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* irq race avoidance, copy&pasta from Paulo's PCH irq fix */
+	ier = I915_READ(IER);
+	I915_WRITE(IER, 0);
+	POSTING_READ(IER);
+
 	iir = I915_READ(IIR);
 
-	for (;;) {
+	do {
 		bool blc_event = false;
 
 		irq_received = iir != 0;
@@ -2792,7 +2797,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		 * stray interrupts.
 		 */
 		iir = new_iir;
-	}
+	} while (0);
+
+	I915_WRITE(IER, ier);
+	POSTING_READ(IER);
 
 	i915_update_dri1_breadcrumb(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Chris Wilson
On Mon, Mar 18, 2013 at 08:19:03PM +0100, Daniel Vetter wrote:
> On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Yinghai Lu wrote:
> > 
> > > > Just a datapoint -- I have put a trivial debugging patch in place, and 
> > > > it
> > > > reveals that "nobody cared" for irq 16 happens long after last
> > > >
> > > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > > >
> > > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > > comment out both GMBUS4 register offset writes in 
> > > > gmbus_wait_hw_status(),
> > > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > > for irq 16 is gone.
> > > >
> > > > So it seems like something gets severely confused by the I915_WRITE to
> > > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > Lenovos as far as I can see (although a completely different types), so 
> > > > it
> > > > might be some platform-specific quirk?
> > > >
> > > > Honestly, I still don't understand how all the GMBUS stuff relates to 
> > > > IRQ
> > > > 16 at all.
> > > 
> > > that device is using
> > > i915 0000:00:02.0: irq 44 for MSI/MSI-X
> > > 
> > > so can you try to boot with pci=nomsi?
> > 
> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> > interrupts go away.
> > 
> > My understanding from the other mail is that Daniel Vetter already has an 
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> > this datapoint regarding MSI will fit into it.
> 
> Yep, there's a big comment in the irq handler for that chipset saying that
> we have a gaping race when using MSI interrupts, although the comment
> boldly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.
> 
> Can you please give the below patch a whirl? It removes the probably racy
> MSI race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Still nobody cares about irq16.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Daniel Vetter wrote:

> Yep, there's a big comment in the irq handler for that chipset saying that
> we have a gaping race when using MSI interrupts, although the comment
> boldly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.

I see ... will target my focus in that direction, thanks.

> Can you please give the below patch a whirl? It removes the probably racy
> MSI race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Unfortunately it didn't change anything; the spurious interrupt report is 
still there.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Yinghai Lu wrote:

> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> > interrupts go away.
> >
> > My understanding from the other mail is that Daniel Vetter already has an
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> > this datapoint regarding MSI will fit into it.
> 
> What is the difference in /proc/interrupts with and without pci=nomsi?
> 
> Is drm forced to share irq 16?

Yup, IRQ 16 is being shared, and one of the owners is i915.
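
For anyone who wants to repeat this check on their own machine, a small helper along the following lines can answer "is irq N shared, and is i915 one of the owners?" from a single /proc/interrupts line. It is only a sketch: the function names are hypothetical, it assumes the usual layout of an interrupt number followed by a colon, per-CPU counts, the controller name and a comma-separated list of handler names, and it does no more validation than that.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `line` describes interrupt number `irq` and mentions
 * `owner` after the colon, 0 otherwise. Non-numeric rows such as
 * "NMI:" never match because strtol() consumes nothing. */
static int irq_line_has_owner(const char *line, int irq, const char *owner)
{
	char *end;
	long n = strtol(line, &end, 10);   /* skips leading blanks */

	if (end == line || *end != ':' || n != irq)
		return 0;
	return strstr(end + 1, owner) != NULL;
}

/* Counts the comma-separated handler names on the line, i.e. how many
 * drivers share the interrupt. Assumes at least one handler is listed;
 * returns 0 if the line has no colon at all. */
static int irq_line_sharers(const char *line)
{
	const char *p = strchr(line, ':');
	int count = 1;

	if (p == NULL)
		return 0;
	for (; *p != '\0'; p++)
		if (*p == ',')
			count++;
	return count;
}
```

Feeding it each line of /proc/interrupts from a boot with pci=nomsi and from a boot without it should show where i915 is registered in each case.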

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 3:05 PM, Jiri Kosina  wrote:
> On Mon, 18 Mar 2013, Yinghai Lu wrote:
>
>> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
>> > interrupts go away.
>> >
>> > My understanding from the other mail is that Daniel Vetter already has an
>> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
>> > this datapoint regarding MSI will fit into it.
>>
>> What is the difference in /proc/interrupts with and without pci=nomsi?
>>
>> Is drm forced to share irq 16?
>
> Yup, IRQ 16 is being shared, and one of the owners is i915.

The VGA device reports a strange INTx status...

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series
Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA
controller])
Subsystem: Lenovo Device 20e4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-  [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee0100c  Data: 4142
Capabilities: [d0] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: i915
Kernel modules: i915

It should be INTx-, after we have set DisINTx+ in Control.

So INTx cannot be disabled once it has been enabled?
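
As a reference for reading these flags: DisINTx in lspci corresponds to the Interrupt Disable bit (bit 10) of the PCI Command register, and INTx to the Interrupt Status bit (bit 3) of the Status register. The sketch below decodes the two from raw register values. The constant values match PCI_COMMAND_INTX_DISABLE and PCI_STATUS_INTERRUPT in the kernel's pci_regs.h; the helper names and the example register values are made up for illustration. As far as I read the PCI specification, the status bit reflects the device's internal interrupt state independently of the disable bit, so the combination may only mean "pending but masked from the pin"; whether it points at a real problem here is still open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as in include/uapi/linux/pci_regs.h, defined locally so
 * the sketch builds in user space. */
#define PCI_COMMAND_INTX_DISABLE 0x0400  /* Command bit 10: lspci "DisINTx" */
#define PCI_STATUS_INTERRUPT     0x0008  /* Status bit 3:   lspci "INTx"    */

/* Hypothetical helpers operating on raw 16-bit register values. */
static bool intx_disabled(uint16_t command)
{
	return (command & PCI_COMMAND_INTX_DISABLE) != 0;
}

static bool intx_pending(uint16_t status)
{
	return (status & PCI_STATUS_INTERRUPT) != 0;
}

/* The combination discussed above: DisINTx+ together with INTx+. */
static bool intx_pending_but_masked(uint16_t command, uint16_t status)
{
	return intx_disabled(command) && intx_pending(status);
}
```

With illustrative values, a Command of 0x0407 (I/O+ Mem+ BusMaster+ DisINTx+) and a Status of 0x0098 (Cap+ FastB2B+ INTx+) give the "pending but masked" combination, while a Status of 0x0090 does not.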

The VGA on my T420 looks right.

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation
Core Processor Family Integrated Graphics Controller (rev 09) (prog-if
00 [VGA controller])
Subsystem: Lenovo Device 21ce
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR-


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although a completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 :00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Fri, Mar 15, 2013 at 08:47:39AM -0700, Greg KH wrote:
> On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Greg KH wrote:
> > 
> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)
> 
> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Yeah, the above fix is for pch split platforms, whereas these reports here
are for gm45 (which doesn't have the pch display split). Acking of gmbus
interrupts works differently on those, I'm testing right now whether I can
reproduce this fail.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Fri, 15 Mar 2013, Yinghai Lu wrote:

> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> >
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> >
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> >
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> >
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
interrupts go away.

My understanding from the other mail is that Daniel Vetter already has an 
idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
this datapoint regarding MSI will fit into it.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 2:12 AM, Jiri Kosina  wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
>
>> > Just a datapoint -- I have put a trivial debugging patch in place, and it
>> > reveals that "nobody cared" for irq 16 happens long after last
>> >
>> > I915_WRITE(GMBUS4 + reg_offset, 0);
>> >
>> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
>> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
>> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
>> > for irq 16 is gone.
>> >
>> > So it seems like something gets severely confused by the I915_WRITE to
>> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
>> > Lenovos as far as I can see (although a completely different types), so it
>> > might be some platform-specific quirk?
>> >
>> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
>> > 16 at all.
>>
>> that device is using
>> i915 :00:02.0: irq 44 for MSI/MSI-X
>>
>> so can you try to boot with pci=nomsi?
>
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> interrupts go away.
>
> My understanding from the other mail is that DAniel Vetter already has an
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> this datapoint regarding MSI will fit into it.

What is the difference in /proc/interrupts with and without pci=nomsi?

Is drm forced to share irq 16?

Thanks

Yinghai


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Thomas Meyer
My laptop is an Acer 1810T. I see this error message each boot.

Kind regards
Thomas

Jiri Kosina wrote:

>On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> > 
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>> 
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>> 
>>  commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>>  Author: Daniel Vetter 
>>  Date:   Sat Dec 1 13:53:45 2012 +0100
>> 
>>  drm/i915: use the gmbus irq for waits
>> 
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
>> happening in parallel.
>> 
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
>> drm.debug=0xe.
>
>Just a datapoint -- I have put a trivial debugging patch in place, and it 
>reveals that "nobody cared" for irq 16 happens long after last
>
>   I915_WRITE(GMBUS4 + reg_offset, 0);
>
>has been performed in gmbus_wait_hw_status(). On the other hand, if I 
>comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
>then it of course falls back to GPIO bit-banging, but the "nobody cared" 
>for irq 16 is gone. 
>
>So it seems like something gets severely confused by the I915_WRITE to 
>GMBUS4 + reg_offset. So far this seems to have been reported solely on 
>Lenovos as far as I can see (although a completely different types), so it 
>might be some platform-specific quirk?
>
>Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
>16 at all. 
>
>-- 
>Jiri Kosina
>SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
> 
> > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > reveals that "nobody cared" for irq 16 happens long after last
> > >
> > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > >
> > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > for irq 16 is gone.
> > >
> > > So it seems like something gets severely confused by the I915_WRITE to
> > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > Lenovos as far as I can see (although a completely different types), so it
> > > might be some platform-specific quirk?
> > >
> > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > 16 at all.
> > 
> > that device is using
> > i915 :00:02.0: irq 44 for MSI/MSI-X
> > 
> > so can you try to boot with pci=nomsi?
> 
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> interrupts go away.
> 
> My understanding from the other mail is that DAniel Vetter already has an 
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> this datapoint regarding MSI will fit into it.

Yep, there's a big comment in the irq handler for that chipset saying that
we have a gaping race when using MSI interrupts, although the comment
boldly claims that the race is small enough to avoid the dreaded "nobody
cared" message. Looks like gmbus is good at hitting that race - on newer
chips it already brought up a similar race in handling pch interrupts.

Can you please give the below patch a whirl? It removes the probably racy
msi race avoidance code and replaces it with the same trick Paulo used to
fix pch irq handling races.

Thanks, Daniel
---
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3c7bb04..13de12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2684,7 +2684,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 iir, new_iir;
+	u32 iir, new_iir, ier;
 	u32 pipe_stats[I915_MAX_PIPES];
 	unsigned long irqflags;
 	int irq_received;
@@ -2692,9 +2692,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* irq race avoidance, copy&pasta from Paulo's PCH irq fix */
+	ier = I915_READ(IER);
+	I915_WRITE(IER, 0);
+	POSTING_READ(IER);
+
 	iir = I915_READ(IIR);
 
-	for (;;) {
+	do {
 		bool blc_event = false;
 
 		irq_received = iir != 0;
@@ -2792,7 +2797,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		 * stray interrupts.
 		 */
 		iir = new_iir;
-	}
+	} while (0);
+
+	I915_WRITE(IER, ier);
+	POSTING_READ(IER);
 
 	i915_update_dri1_breadcrumb(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
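
Editorial sketch, not part of Daniel's patch: the toy model below tries to
show why masking IER around the handler can help when interrupts behave like
edges, as with MSI. It assumes that one message is sent per 0-to-1 transition
of (IIR & IER), which is a simplification. Every name in it (toy_dev,
toy_scenario, EV_A, EV_B) is hypothetical, and none of it is i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy device model; all names are hypothetical, this is not i915 code. */
struct toy_dev {
	uint32_t iir;      /* pending event bits */
	uint32_t ier;      /* enable mask */
	bool line;         /* current level of (iir & ier) != 0 */
	unsigned msi_sent; /* messages generated so far */
};

#define EV_A 0x1u
#define EV_B 0x2u

/* Assumption: one MSI per 0 -> 1 transition of the masked level. */
static void toy_update(struct toy_dev *d)
{
	bool level = (d->iir & d->ier) != 0;

	if (level && !d->line)
		d->msi_sent++;
	d->line = level;
}

static void toy_raise(struct toy_dev *d, uint32_t bits)
{
	d->iir |= bits;
	toy_update(d);
}

static void toy_ack(struct toy_dev *d, uint32_t bits)
{
	d->iir &= ~bits;
	toy_update(d);
}

static void toy_write_ier(struct toy_dev *d, uint32_t val)
{
	d->ier = val;
	toy_update(d);
}

/*
 * Event A fires, the handler runs, and event B lands while the handler is
 * still busy with A. Returns the total number of MSIs the device sent.
 */
static unsigned toy_scenario(bool mask_ier)
{
	struct toy_dev d = { .iir = 0, .ier = EV_A | EV_B };
	uint32_t saved_ier = 0, iir;

	toy_raise(&d, EV_A);                  /* first edge: MSI #1 */
	assert(d.msi_sent == 1);

	if (mask_ier) {
		saved_ier = d.ier;
		toy_write_ier(&d, 0);         /* masked level drops to 0 */
	}

	iir = d.iir;                          /* handler snapshot: only A */
	toy_raise(&d, EV_B);                  /* B arrives mid-handler */
	toy_ack(&d, iir);                     /* handler acks what it saw */

	if (mask_ier)
		toy_write_ier(&d, saved_ier); /* B still pending: new edge */

	return d.msi_sent;
}
```

Under these assumptions, toy_scenario(false) ends with event B pending and no
second message, while toy_scenario(true) gets a second message when IER is
restored. The model only illustrates the intended mechanism; the replies in
this thread report that the patch did not cure the irq 16 message.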


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Chris Wilson
On Mon, Mar 18, 2013 at 08:19:03PM +0100, Daniel Vetter wrote:
> On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Yinghai Lu wrote:
> > 
> > > > Just a datapoint -- I have put a trivial debugging patch in place, and 
> > > > it
> > > > reveals that "nobody cared" for irq 16 happens long after last
> > > >
> > > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > > >
> > > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > > comment out both GMBUS4 register offset writes in 
> > > > gmbus_wait_hw_status(),
> > > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > > for irq 16 is gone.
> > > >
> > > > So it seems like something gets severely confused by the I915_WRITE to
> > > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > Lenovos as far as I can see (although a completely different types), so 
> > > > it
> > > > might be some platform-specific quirk?
> > > >
> > > > Honestly, I still don't understand how all the GMBUS stuff relates to 
> > > > IRQ
> > > > 16 at all.
> > > 
> > > that device is using
> > > i915 :00:02.0: irq 44 for MSI/MSI-X
> > > 
> > > so can you try to boot with pci=nomsi?
> > 
> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> > interrupts go away.
> > 
> > My understanding from the other mail is that DAniel Vetter already has an 
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> > this datapoint regarding MSI will fit into it.
> 
> Yep, there's a big comment in the irq handler for that chipset that we
> have a gaping race with when using MSI interrupts. Although the comment
> bodly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.
> 
> Can you please give the below patch a whirl? It removes the probably race
> msi race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Still nobody cares about irq16.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Daniel Vetter wrote:

> Yep, there's a big comment in the irq handler for that chipset that we
> have a gaping race with when using MSI interrupts. Although the comment
> bodly claims that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.

I see ... will target my focus in that direction, thanks.

> Can you please give the below patch a whirl? It removes the probably race
> msi race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Unfortunately it didn't change anything, the spurious interrupt report is 
still there.

-- 
Jiri Kosina
SUSE Labs
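
Editorial sketch for readers unfamiliar with the report itself: a rough
user-space model of the kind of bookkeeping behind "nobody cared". It assumes
the rule of thumb that an IRQ line is reported and disabled when more than
99,900 out of 100,000 interrupts went unhandled, and it leaves out the
time-based details of the real kernel code. All names (toy_irq_desc,
toy_note_interrupt, toy_window_disables) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_WINDOW    100000u /* interrupts per evaluation window */
#define TOY_THRESHOLD  99900u /* unhandled count that trips the report */

/* Hypothetical per-IRQ bookkeeping, loosely modelled on the kernel's. */
struct toy_irq_desc {
	unsigned irq_count;      /* interrupts seen in this window */
	unsigned irqs_unhandled; /* of those, how many nobody claimed */
	bool disabled;           /* set once "nobody cared" would fire */
};

/* Record one interrupt; 'handled' is false if every handler declined it. */
static void toy_note_interrupt(struct toy_irq_desc *d, bool handled)
{
	assert(d);
	if (d->disabled)
		return;
	if (!handled)
		d->irqs_unhandled++;
	if (++d->irq_count < TOY_WINDOW)
		return;

	/* End of window: decide, then start counting afresh. */
	if (d->irqs_unhandled > TOY_THRESHOLD)
		d->disabled = true;
	d->irq_count = 0;
	d->irqs_unhandled = 0;
}

/* Run one full window in which the first 'handled' interrupts are claimed. */
static bool toy_window_disables(unsigned handled)
{
	struct toy_irq_desc d = { 0 };
	unsigned i;

	assert(handled <= TOY_WINDOW);
	for (i = 0; i < TOY_WINDOW; i++)
		toy_note_interrupt(&d, i < handled);
	return d.disabled;
}
```

In this model a line survives as long as at least 100 interrupts per window
are claimed by some handler, which is why a device that keeps raising an
interrupt no driver acknowledges trips the report fairly quickly.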


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Yinghai Lu wrote:

> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> > interrupts go away.
> >
> > My understanding from the other mail is that DAniel Vetter already has an
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> > this datapoint regarding MSI will fit into it.
> 
> What is /proc/interrupts difference between with and without pci=nomsi ?
> 
> drm is forced to share irq 16?

Yup, IRQ 16 is being shared, and one of the owners is i915.

-- 
Jiri Kosina
SUSE Labs
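
Editorial sketch: one way to answer the "is IRQ 16 shared?" question
mechanically is to count the handler names on a /proc/interrupts row. The
helper below assumes the usual layout (IRQ number, a colon, per-CPU counts,
the chip name, then a comma-separated list of handlers) and is hypothetical,
not an existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/interrupts row. On success, store the IRQ number in *irq
 * and return the number of comma-separated handler names on the row.
 * Returns -1 for rows without a numeric IRQ (header, NMI, LOC, ...).
 */
static int irq_row_handlers(const char *row, int *irq)
{
	const char *p;
	int n = 1;

	assert(row && irq);
	if (sscanf(row, " %d:", irq) != 1)
		return -1;

	p = strchr(row, ':');
	if (!p)
		return -1;

	/* Each comma after the IRQ number separates two handler names. */
	for (; *p; p++)
		if (*p == ',')
			n++;
	return n;
}
```

Fed a row such as " 16:  1200  0  IO-APIC-fasteoi  uhci_hcd:usb3, i915" (a
made-up example, not taken from the affected machine), it returns 2, meaning
the line is shared. An MSI row with a single i915 entry returns 1.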


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 3:05 PM, Jiri Kosina  wrote:
> On Mon, 18 Mar 2013, Yinghai Lu wrote:
>
>> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
>> > interrupts go away.
>> >
>> > My understanding from the other mail is that DAniel Vetter already has an
>> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
>> > this datapoint regarding MSI will fit into it.
>>
>> What is /proc/interrupts difference between with and without pci=nomsi ?
>>
>> drm is forced to share irq 16?
>
> Yup, IRQ 16 is being shared, and one of the owners is i915.

The VGA device reports a strange INTx status:

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series
Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA
controller])
Subsystem: Lenovo Device 20e4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR- [...] [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee0100c  Data: 4142
Capabilities: [d0] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: i915
Kernel modules: i915

It should be INTx-, after we have set DisINTx+ in control.

So INTx cannot be disabled once it has been enabled?

The VGA on my T420 looks right.

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation
Core Processor Family Integrated Graphics Controller (rev 09) (prog-if
00 [VGA controller])
Subsystem: Lenovo Device 21ce
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR- [...]
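
Editorial sketch: the two lspci flags discussed above map to single bits in
PCI configuration space, Interrupt Disable (Command register, bit 10, shown
as DisINTx) and Interrupt Status (Status register, bit 3, shown as INTx). As
far as I understand the PCI specification, the status bit reflects a pending
interrupt inside the device regardless of the disable bit, and the INTx pin
should only be driven when status is set and disable is clear. The decoder
below works on a raw config-space dump; its names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CFG_COMMAND      0x04u   /* offset of the Command register */
#define CFG_STATUS       0x06u   /* offset of the Status register */
#define CMD_INTX_DISABLE 0x0400u /* Command bit 10, lspci "DisINTx" */
#define STS_INTX_PENDING 0x0008u /* Status bit 3, lspci "INTx" */

struct intx_state {
	bool disabled;   /* DisINTx+ */
	bool pending;    /* INTx+ */
	bool may_assert; /* pending and not disabled */
};

/* Config space is little-endian; read a 16-bit register from a byte dump. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* 'cfg' must hold at least the first 8 bytes of config space. */
static struct intx_state decode_intx(const uint8_t *cfg)
{
	struct intx_state s;
	uint16_t command, status;

	assert(cfg);
	command = cfg_read16(cfg, CFG_COMMAND);
	status = cfg_read16(cfg, CFG_STATUS);

	s.disabled = (command & CMD_INTX_DISABLE) != 0;
	s.pending = (status & STS_INTX_PENDING) != 0;
	s.may_assert = s.pending && !s.disabled;
	return s;
}
```

With DisINTx+ and INTx+ together, as in the first dump, a device that follows
the specification would report may_assert as false. A legacy interrupt that
still reaches IRQ 16 in that state would point at the hardware not honouring
the disable bit, which is a guess on my part and not something this thread
has established.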


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 02:33:13PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Harald Arnesen wrote:
> 
> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.

Wasn't this fixed by the merge from David
(2cc79544bd0aabb4b3cf467ead5df526d9134c64)?  I can't figure out the
exact commit that the merge message referred to though...

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > involved.
> > > 
> > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > nobody cared" during boot, not when I boot with the ATI card.
> > 
> > Confirming this. After a lot of hassle, I have bisected this reliably to
> > 
> > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > Author: Daniel Vetter 
> > Date:   Sat Dec 1 13:53:45 2012 +0100
> > 
> > drm/i915: use the gmbus irq for waits
> > 
> > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > happening in parallel.
> 
> Wasn't this fixed by the merge from David
> (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?

Why do you think it should, please?

(I am seeing this with a2362d247 still).

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Greg KH wrote:
> 
> > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > involved.
> > > > 
> > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > nobody cared" during boot, not when I boot with the ATI card.
> > > 
> > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > 
> > >   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > >   Author: Daniel Vetter 
> > >   Date:   Sat Dec 1 13:53:45 2012 +0100
> > > 
> > >   drm/i915: use the gmbus irq for waits
> > > 
> > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > what's 
> > > happening in parallel.
> > 
> > Wasn't this fixed by the merge from David
> > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> 
> Why do you think it should, please?

The line:
- Fix PCH irq handling race which resulted in missed gmbus/dp
  aux irqs and subsequent fallout (Paulo)

> (I am seeing this with a2362d247 still).

OK, I guess it still isn't fixed properly; I was just guessing :)

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > > > I have the same problem on my Lenovo T500. I think the graphics card 
> > > > > is
> > > > > involved.
> > > > > 
> > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 
> > > > > 16:
> > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out 
> > > > what's 
> > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)

Ah, that one. I believe that should be irrelevant for GM chipsets, as they 
don't have an AUX line, right?

> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it isn't still fixed properly, just was guessing :)

Seems like this is a different issue.

Thanks,

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although a completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 :00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although a completely different types), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
> 
> > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > reveals that "nobody cared" for irq 16 happens long after last
> > >
> > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > >
> > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > for irq 16 is gone.
> > >
> > > So it seems like something gets severely confused by the I915_WRITE to
> > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > Lenovos as far as I can see (although of completely different types), so it
> > > might be some platform-specific quirk?
> > >
> > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > 16 at all.
> > 
> > that device is using
> > i915 :00:02.0: irq 44 for MSI/MSI-X
> > 
> > so can you try to boot with pci=nomsi?
> 
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> interrupts go away.
> 
> My understanding from the other mail is that Daniel Vetter already has an 
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> this datapoint regarding MSI will fit into it.

Yep, there's a big comment in the irq handler for that chipset saying that we
have a gaping race when using MSI interrupts. The comment boldly claims,
though, that the race is small enough to avoid the dreaded "nobody
cared" message. Looks like gmbus is good at hitting that race - on newer
chips it already brought up a similar race in handling pch interrupts.

Can you please give the below patch a whirl? It removes the probably racy
MSI race avoidance code and replaces it with the same trick Paulo used to
fix pch irq handling races.

Thanks, Daniel
---
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3c7bb04..13de12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2684,7 +2684,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 iir, new_iir;
+	u32 iir, new_iir, ier;
 	u32 pipe_stats[I915_MAX_PIPES];
 	unsigned long irqflags;
 	int irq_received;
@@ -2692,9 +2692,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* irq race avoidance, copy&pasta from Paulo's PCH irq fix */
+	ier = I915_READ(IER);
+	I915_WRITE(IER, 0);
+	POSTING_READ(IER);
+
 	iir = I915_READ(IIR);
 
-	for (;;) {
+	do {
 		bool blc_event = false;
 
 		irq_received = iir != 0;
@@ -2792,7 +2797,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		 * stray interrupts.
 		 */
 		iir = new_iir;
-	}
+	} while (0);
+
+	I915_WRITE(IER, ier);
+	POSTING_READ(IER);
 
 	i915_update_dri1_breadcrumb(dev);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
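
For readers following the patch in this message, here is a toy user-space
model of the idea behind masking IER around the handler body. It is only a
sketch under stated assumptions: it assumes that one message-signalled
interrupt is sent whenever (IIR & IER) goes from zero to non-zero, and every
name in it (irq_model, model_raise, handle_masked and so on) is made up for
illustration and is not i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (assumption): a message is sent on each 0 -> non-zero
 * transition of (iir & ier), mimicking an edge-triggered MSI. */
struct irq_model {
	uint32_t iir;   /* pending event bits */
	uint32_t ier;   /* enabled event bits */
	bool line;      /* last observed state of (iir & ier) != 0 */
	unsigned msgs;  /* messages sent so far */
};

/* Re-evaluate the combined signal and count a message on a rising edge. */
static void model_update(struct irq_model *m)
{
	assert(m != NULL);
	bool now = (m->iir & m->ier) != 0;

	if (now && !m->line)
		m->msgs++;
	m->line = now;
}

static void model_raise(struct irq_model *m, uint32_t bits)
{
	m->iir |= bits;
	model_update(m);
}

static void model_ack(struct irq_model *m, uint32_t bits)
{
	m->iir &= ~bits;
	model_update(m);
}

static void model_set_ier(struct irq_model *m, uint32_t ier)
{
	m->ier = ier;
	model_update(m);
}

/* Handler without masking: the event 'late' lands between reading IIR and
 * acking it. Returns how many new messages were sent for it. */
static unsigned handle_unmasked(struct irq_model *m, uint32_t late)
{
	unsigned before = m->msgs;
	uint32_t seen = m->iir;

	model_raise(m, late);
	model_ack(m, seen);
	return m->msgs - before;
}

/* Same handler, but with IER cleared on entry and restored on exit. */
static unsigned handle_masked(struct irq_model *m, uint32_t late)
{
	unsigned before = m->msgs;
	uint32_t ier = m->ier;

	model_set_ier(m, 0);
	uint32_t seen = m->iir;
	model_raise(m, late);
	model_ack(m, seen);
	model_set_ier(m, ier);  /* pending bits now produce a fresh edge */
	return m->msgs - before;
}
```

In this model an event that lands while the handler runs produces no new
message in the unmasked variant, because the combined signal never drops to
zero, while restoring IER in the masked variant produces a fresh edge for the
bit that is still pending. Whether the GM45 hardware behaves exactly like
this is an assumption of the sketch.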


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Chris Wilson
On Mon, Mar 18, 2013 at 08:19:03PM +0100, Daniel Vetter wrote:
> On Mon, Mar 18, 2013 at 10:12:49AM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Yinghai Lu wrote:
> > 
> > > > > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > > > > reveals that "nobody cared" for irq 16 happens long after last
> > > > >
> > > > > I915_WRITE(GMBUS4 + reg_offset, 0);
> > > > >
> > > > > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > > > > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > > > > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > > > > for irq 16 is gone.
> > > > >
> > > > > So it seems like something gets severely confused by the I915_WRITE to
> > > > > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > > > > Lenovos as far as I can see (although of completely different types), so it
> > > > > might be some platform-specific quirk?
> > > > >
> > > > > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > > > > 16 at all.
> > > 
> > > that device is using
> > > i915 :00:02.0: irq 44 for MSI/MSI-X
> > > 
> > > so can you try to boot with pci=nomsi?
> > 
> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
> > interrupts go away.
> > 
> > > My understanding from the other mail is that Daniel Vetter already has an 
> > > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
> > > this datapoint regarding MSI will fit into it.
> > 
> > Yep, there's a big comment in the irq handler for that chipset saying that we
> > have a gaping race when using MSI interrupts. The comment boldly claims,
> > though, that the race is small enough to avoid the dreaded "nobody
> > cared" message. Looks like gmbus is good at hitting that race - on newer
> > chips it already brought up a similar race in handling pch interrupts.
> > 
> > Can you please give the below patch a whirl? It removes the probably racy
> > MSI race avoidance code and replaces it with the same trick Paulo used to
> > fix pch irq handling races.

Still nobody cares about irq16.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Daniel Vetter wrote:

> Yep, there's a big comment in the irq handler for that chipset saying that we
> have a gaping race when using MSI interrupts. The comment boldly claims,
> though, that the race is small enough to avoid the dreaded "nobody
> cared" message. Looks like gmbus is good at hitting that race - on newer
> chips it already brought up a similar race in handling pch interrupts.

I see ... will target my focus in that direction, thanks.

> Can you please give the below patch a whirl? It removes the probably racy
> MSI race avoidance code and replaces it with the same trick Paulo used to
> fix pch irq handling races.

Unfortunately it didn't change anything, the spurious interrupt report is 
still there.

-- 
Jiri Kosina
SUSE Labs
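
For context on where the report comes from: the "nobody cared" message is
printed by the kernel's generic spurious interrupt accounting rather than by
i915 itself. The sketch below models that accounting in user space. The window
of 100000 interrupts and the limit of 99900 unhandled ones are what I believe
kernel/irq/spurious.c uses, but treat both as assumptions; the struct and
function names are hypothetical, and the real code has further details (such
as timing checks) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed thresholds, see the note above. */
#define IRQ_WINDOW       100000u
#define UNHANDLED_LIMIT   99900u

struct irq_stats {
	unsigned count;      /* interrupts seen in the current window */
	unsigned unhandled;  /* of those, how many no handler claimed */
	bool disabled;       /* set once the line has been given up on */
};

/*
 * Account for one interrupt on the line. 'handled' is false when every
 * handler sharing the line said the interrupt was not theirs.
 * Returns true on the call that trips the "nobody cared" report.
 */
static bool note_irq(struct irq_stats *s, bool handled)
{
	assert(s != NULL);

	if (s->disabled)
		return false;
	if (!handled)
		s->unhandled++;
	if (++s->count < IRQ_WINDOW)
		return false;

	/* End of window: decide, then start counting afresh. */
	bool bad = s->unhandled > UNHANDLED_LIMIT;

	s->count = 0;
	s->unhandled = 0;
	if (bad)
		s->disabled = true;
	return bad;
}
```

With this model a line only gets reported when almost every interrupt in a
window goes unclaimed, which is why an occasional stray interrupt on a shared
line is normally harmless.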


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Mon, 18 Mar 2013, Yinghai Lu wrote:

> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> > interrupts go away.
> >
> > My understanding from the other mail is that Daniel Vetter already has an
> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> > this datapoint regarding MSI will fit into it.
> 
> What is /proc/interrupts difference between with and without pci=nomsi ?
> 
> drm is forced to share irq 16?

Yup, IRQ 16 is being shared, and one of the owners is i915.

-- 
Jiri Kosina
SUSE Labs
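
One way to answer the /proc/interrupts question mechanically is to count the
handler names on the irq 16 line, since handlers that share a line are listed
separated by commas. A minimal sketch, assuming that format for numbered
device IRQ lines; the function names are made up for illustration and the
sample lines in the usage note are hypothetical, not taken from the affected
machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the handlers named on one numbered /proc/interrupts line, e.g.
 * " 16:  100  200  IO-APIC-fasteoi  uhci_hcd:usb3, i915".
 * Assumes at least one handler is registered and that handler names
 * are comma-separated. Returns 0 if the line has no "NN:" prefix.
 */
static unsigned handler_count(const char *line)
{
	assert(line != NULL);

	const char *colon = strchr(line, ':');  /* end of the IRQ number */
	unsigned n = 1;

	if (colon == NULL)
		return 0;
	for (const char *p = colon + 1; *p != '\0'; p++)
		if (*p == ',')
			n++;
	return n;
}

/* True if 'driver' appears on the line together with another handler. */
static bool irq_shared_with(const char *line, const char *driver)
{
	return handler_count(line) > 1 && strstr(line, driver) != NULL;
}
```

For example, irq_shared_with(" 16:  100  200  IO-APIC-fasteoi  uhci_hcd:usb3, i915", "i915")
is true, while a line such as " 44:  10  20  PCI-MSI-edge  i915" names a
single handler and yields false.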


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 3:05 PM, Jiri Kosina  wrote:
> On Mon, 18 Mar 2013, Yinghai Lu wrote:
>
>> > Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
>> > interrupts go away.
>> >
>> > My understanding from the other mail is that Daniel Vetter already has an
>> > idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
>> > this datapoint regarding MSI will fit into it.
>>
>> What is /proc/interrupts difference between with and without pci=nomsi ?
>>
>> drm is forced to share irq 16?
>
> Yup, IRQ 16 is being shared, and one of the owners is i915.

The VGA device reports a strange INTx status...

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 20e4
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-  [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0100c  Data: 4142
	Capabilities: [d0] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: i915
	Kernel modules: i915

It should be INTx-, after we have set DisINTx+ in Control.

So INTx cannot be disabled after it has been enabled before?

The VGA on my T420 looks right.

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 21ce
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-
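
The DisINTx and INTx flags discussed in this message correspond to bit 10 of
the PCI command register and bit 3 of the PCI status register. Below is a
small sketch that decodes the two bits and flags the combination called
strange here. The helper names are hypothetical, and the register values in
the usage note are merely built from the flags quoted in the lspci output,
assuming INTx+ in Status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x0400u  /* command register, bit 10 */
#define PCI_STATUS_INTERRUPT     0x0008u  /* status register, bit 3 */

/* lspci shows this as DisINTx+ in the Control line. */
static bool intx_disabled(uint16_t command)
{
	return (command & PCI_COMMAND_INTX_DISABLE) != 0;
}

/* lspci shows this as INTx+ in the Status line. */
static bool intx_pending(uint16_t status)
{
	return (status & PCI_STATUS_INTERRUPT) != 0;
}

/* The combination questioned above: INTx disabled, yet status pending. */
static bool intx_looks_stuck(uint16_t command, uint16_t status)
{
	return intx_disabled(command) && intx_pending(status);
}
```

With I/O+, Mem+, BusMaster+ and DisINTx+ the command register would read
0x0407, and a status of 0x0098 (Cap+, FastB2B+, INTx+) makes
intx_looks_stuck() return true. As far as I understand the PCI specification,
the status bit reports a pending interrupt condition inside the device whether
or not INTx delivery is disabled, so the combination alone may not prove that
the legacy line is being asserted.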


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Jiri Kosina wrote:

> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.
> 
> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
> drm.debug=0xe.

Just a datapoint -- I have put a trivial debugging patch in place, and it 
reveals that "nobody cared" for irq 16 happens long after last

I915_WRITE(GMBUS4 + reg_offset, 0);

has been performed in gmbus_wait_hw_status(). On the other hand, if I 
comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
then it of course falls back to GPIO bit-banging, but the "nobody cared" 
for irq 16 is gone. 

So it seems like something gets severely confused by the I915_WRITE to 
GMBUS4 + reg_offset. So far this seems to have been reported solely on 
Lenovos as far as I can see (although of completely different types), so it 
might be some platform-specific quirk?

Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
16 at all. 

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 02:33:13PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Harald Arnesen wrote:
> 
> > I have the same problem on my Lenovo T500. I think the graphics card is
> > involved.
> > 
> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > nobody cared" during boot, not when I boot with the ATI card.
> 
> Confirming this. After a lot of hassle, I have bisected this reliably to
> 
>   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>   Author: Daniel Vetter 
>   Date:   Sat Dec 1 13:53:45 2012 +0100
> 
>   drm/i915: use the gmbus irq for waits
> 
> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> happening in parallel.

Wasn't this fixed by the merge from David
(2cc79544bd0aabb4b3cf467ead5df526d9134c64)?  I can't figure out the
exact commit that the merge message referred to though...

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > involved.
> > > 
> > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > nobody cared" during boot, not when I boot with the ATI card.
> > 
> > Confirming this. After a lot of hassle, I have bisected this reliably to
> > 
> > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > Author: Daniel Vetter 
> > Date:   Sat Dec 1 13:53:45 2012 +0100
> > 
> > drm/i915: use the gmbus irq for waits
> > 
> > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > happening in parallel.
> 
> Wasn't this fixed by the merge from David
> (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?

Why do you think it should, please?

(I am seeing this with a2362d247 still).

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Greg KH
On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> On Fri, 15 Mar 2013, Greg KH wrote:
> 
> > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > involved.
> > > > 
> > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > nobody cared" during boot, not when I boot with the ATI card.
> > > 
> > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > 
> > >   commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > >   Author: Daniel Vetter 
> > >   Date:   Sat Dec 1 13:53:45 2012 +0100
> > > 
> > >   drm/i915: use the gmbus irq for waits
> > > 
> > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > > > happening in parallel.
> > 
> > Wasn't this fixed by the merge from David
> > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> 
> Why do you think it should, please?

The line:
- Fix PCH irq handling race which resulted in missed gmbus/dp
  aux irqs and subsequent fallout (Paulo)

> (I am seeing this with a2362d247 still).

Ok, I guess it still isn't fixed properly, I was just guessing :)

greg k-h


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Jiri Kosina
On Fri, 15 Mar 2013, Greg KH wrote:

> > > > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > > > involved.
> > > > > > 
> > > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)

Ah, that one. I believe that should be irrelevant for GM chipsets, as they 
don't have an AUX line, right?

> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it still isn't fixed properly, I was just guessing :)

Seems like this is a different issue.

Thanks,

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-15 Thread Yinghai Lu
On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:

> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after last
>
> I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although of completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.

that device is using
i915 :00:02.0: irq 44 for MSI/MSI-X

so can you try to boot with pci=nomsi?


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-17 Thread Shawn Starr
On Friday, March 15, 2013 12:14:28 PM Yinghai Lu wrote:
> On Fri, Mar 15, 2013 at 8:14 AM, Jiri Kosina  wrote:
> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> > 
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> > 
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> > 
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although of completely different types), so it
> > might be some platform-specific quirk?
> > 
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

I can try disabling MSI with 3.9.0-0.rc2.git0.4.fc20. -rc3 is not yet 
available in rawhide.

thanks,
Shawn


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Daniel Vetter
On Fri, Mar 15, 2013 at 08:47:39AM -0700, Greg KH wrote:
> On Fri, Mar 15, 2013 at 04:37:56PM +0100, Jiri Kosina wrote:
> > On Fri, 15 Mar 2013, Greg KH wrote:
> > 
> > > > > > I have the same problem on my Lenovo T500. I think the graphics card is
> > > > > > involved.
> > > > > > 
> > > > > > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
> > > > > > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
> > > > > > nobody cared" during boot, not when I boot with the ATI card.
> > > > 
> > > > Confirming this. After a lot of hassle, I have bisected this reliably to
> > > > 
> > > > commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
> > > > Author: Daniel Vetter 
> > > > Date:   Sat Dec 1 13:53:45 2012 +0100
> > > > 
> > > > drm/i915: use the gmbus irq for waits
> > > > 
> > > > > Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
> > > > > happening in parallel.
> > > 
> > > Wasn't this fixed by the merge from David
> > > (2cc79544bd0aabb4b3cf467ead5df526d9134c64)?
> > 
> > Why do you think it should, please?
> 
> The line:
>   - Fix PCH irq handling race which resulted in missed gmbus/dp
> aux irqs and subsequent fallout (Paulo)
> 
> > (I am seeing this with a2362d247 still).
> 
> Ok, I guess it still isn't fixed properly, I was just guessing :)

Yeah, the above fix is for pch split platforms, whereas these reports here
are for gm45 (which doesn't have the pch display split). Acking of gmbus
interrupts works differently on those, I'm testing right now whether I can
reproduce this fail.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Jiri Kosina
On Fri, 15 Mar 2013, Yinghai Lu wrote:

> > Just a datapoint -- I have put a trivial debugging patch in place, and it
> > reveals that "nobody cared" for irq 16 happens long after last
> >
> > I915_WRITE(GMBUS4 + reg_offset, 0);
> >
> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
> > for irq 16 is gone.
> >
> > So it seems like something gets severely confused by the I915_WRITE to
> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
> > Lenovos as far as I can see (although of completely different types), so it
> > might be some platform-specific quirk?
> >
> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> > 16 at all.
> 
> that device is using
> i915 :00:02.0: irq 44 for MSI/MSI-X
> 
> so can you try to boot with pci=nomsi?

Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost 
interrupts go away.

My understanding from the other mail is that Daniel Vetter already has an 
idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully 
this datapoint regarding MSI will fit into it.

-- 
Jiri Kosina
SUSE Labs


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Yinghai Lu
On Mon, Mar 18, 2013 at 2:12 AM, Jiri Kosina  wrote:
> On Fri, 15 Mar 2013, Yinghai Lu wrote:
>
>> > Just a datapoint -- I have put a trivial debugging patch in place, and it
>> > reveals that "nobody cared" for irq 16 happens long after last
>> >
>> > I915_WRITE(GMBUS4 + reg_offset, 0);
>> >
>> > has been performed in gmbus_wait_hw_status(). On the other hand, if I
>> > comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
>> > then it of course falls back to GPIO bit-banging, but the "nobody cared"
>> > for irq 16 is gone.
>> >
>> > So it seems like something gets severely confused by the I915_WRITE to
>> > GMBUS4 + reg_offset. So far this seems to have been reported solely on
>> > Lenovos as far as I can see (although of completely different types), so it
>> > might be some platform-specific quirk?
>> >
>> > Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
>> > 16 at all.
>>
>> that device is using
>> i915 :00:02.0: irq 44 for MSI/MSI-X
>>
>> so can you try to boot with pci=nomsi?
>
> Yes, switching from MSI to IO-APIC-fasteoi makes the report about lost
> interrupts go away.
>
> My understanding from the other mail is that Daniel Vetter already has an
> idea what might be going wrong with IRQ acking on GM45 chipsets; hopefully
> this datapoint regarding MSI will fit into it.

What is the difference in /proc/interrupts with and without pci=nomsi?

Is drm forced to share irq 16?

Thanks

Yinghai


Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)

2013-03-18 Thread Thomas Meyer
My laptop is an Acer 1810T. I see this error message each boot.

Kind regards
Thomas

Jiri Kosina wrote:

>On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> > 
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>> 
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>> 
>>  commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>>  Author: Daniel Vetter 
>>  Date:   Sat Dec 1 13:53:45 2012 +0100
>> 
>>  drm/i915: use the gmbus irq for waits
>> 
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's 
>> happening in parallel.
>> 
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with 
>> drm.debug=0xe.
>
>Just a datapoint -- I have put a trivial debugging patch in place, and it 
>reveals that "nobody cared" for irq 16 happens long after last
>
>   I915_WRITE(GMBUS4 + reg_offset, 0);
>
>has been performed in gmbus_wait_hw_status(). On the other hand, if I 
>comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(), 
>then it of course falls back to GPIO bit-banging, but the "nobody cared" 
>for irq 16 is gone. 
>
>So it seems like something gets severely confused by the I915_WRITE to 
>GMBUS4 + reg_offset. So far this seems to have been reported solely on 
>Lenovos as far as I can see (although of completely different types), so it 
>might be some platform-specific quirk?
>
>Honestly, I still don't understand how all the GMBUS stuff relates to IRQ 
>16 at all. 
>
>-- 
>Jiri Kosina
>SUSE Labs


