Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-08 Thread Jesse Barnes
On Fri,  8 Feb 2013 17:35:18 -0200
Paulo Zanoni  wrote:

> From: Paulo Zanoni 
> 
> On ILK/SNB all we need to do is to enable the "poison" bit, but on
> IVB/HSW we need to enable the CPU error interrupt register, which is
> responsible not only for poison interrupts, but also other things.
> This includes the "unclaimed register" interrupt, so on the IVB irq
> handler we now need to: (i) check whether the interrupt was triggered by an
> unclaimed register and (ii) mask the error interrupt bit so we don't
> risk generating "unclaimed register" interrupts from inside the
> interrupt handler.
> 
> Signed-off-by: Paulo Zanoni 
> ---

OTOH there's nothing the user can do about it... so we might do a
WARN_ONCE or something here instead.  But even then, I'm not sure
there's much *we* can do about these, as they indicate a corruption in
the communication between the CPU and PCH.
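
For reference, the IVB handler flow described in the quoted commit
message (mask the error interrupt bit, then check which error fired)
can be sketched roughly as below. This is a userspace simulation, not
the patch itself: the register struct, the bit positions and every
function name are hypothetical stand-ins, and plain variables replace
MMIO reads and writes.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit layout; the real i915 register definitions differ. */
#define DE_ERR_INT_BIT    (1u << 30) /* "CPU error" source in the display IRQ regs */
#define ERR_POISON        (1u << 31) /* poison reported in the error register */
#define ERR_UNCLAIMED_REG (1u << 13) /* unclaimed register access reported */

/* Plain variables standing in for hardware registers (assumption). */
struct fake_regs {
	uint32_t de_imr;  /* interrupt mask: 1 = source masked */
	uint32_t de_iir;  /* interrupt identity: pending sources */
	uint32_t err_int; /* error detail: which CPU error fired */
};

struct err_counts {
	unsigned int poison;
	unsigned int unclaimed;
};

/* (i) Find out which error triggered the interrupt, then acknowledge it. */
static void handle_cpu_error(struct fake_regs *r, struct err_counts *c)
{
	uint32_t err = r->err_int;

	if (err & ERR_POISON) {
		c->poison++;
		fprintf(stderr, "CPU poison interrupt\n");
	}
	if (err & ERR_UNCLAIMED_REG) {
		c->unclaimed++;
		fprintf(stderr, "unclaimed register interrupt\n");
	}

	/* Simulated ack: clear exactly the bits we have just handled. */
	r->err_int &= ~err;
}

static void irq_handler(struct fake_regs *r, struct err_counts *c)
{
	uint32_t saved_imr = r->de_imr;
	uint32_t iir;

	/* (ii) Mask the error interrupt while the handler itself runs. */
	r->de_imr = saved_imr | DE_ERR_INT_BIT;

	iir = r->de_iir;
	if (iir & DE_ERR_INT_BIT)
		handle_cpu_error(r, c);

	/* Simulated ack of everything that was pending on entry. */
	r->de_iir &= ~iir;

	/* Restore the mask that was in place before the handler ran. */
	r->de_imr = saved_imr;
}
```

The counters exist only so the simulation can be inspected afterwards;
the discussion in this thread is about whether and how such events
should be reported at all.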

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-08 Thread Paulo Zanoni
Hi

2013/2/8 Jesse Barnes :
> On Fri,  8 Feb 2013 17:35:18 -0200
> Paulo Zanoni  wrote:
>
>> From: Paulo Zanoni 
>>
>> On ILK/SNB all we need to do is to enable the "poison" bit, but on
>> IVB/HSW we need to enable the CPU error interrupt register, which is
>> responsible not only for poison interrupts, but also other things.
>> This includes the "unclaimed register" interrupt, so on the IVB irq
>> handler we now need to: (i) check whether the interrupt was triggered by an
>> unclaimed register and (ii) mask the error interrupt bit so we don't
>> risk generating "unclaimed register" interrupts from inside the
>> interrupt handler.
>>
>> Signed-off-by: Paulo Zanoni 
>> ---
>
> OTOH there's nothing the user can do about it... so we might do a
> WARN_ONCE or something here instead.

Well, so far I haven't seen the message. If we conclude it happens
*too much*, then we can use WARN_ONCE.

> But even then, I'm not sure
> there's much *we* can do about these, as they indicate a corruption in
> the communication between the CPU and PCH.

At least if we get the message we may be able to understand and/or
reproduce the problems. So far we don't even know whether the problem
is happening or not... And when there's a display bug, we don't know
if it's caused by "poison".

>
> --
> Jesse Barnes, Intel Open Source Technology Center



-- 
Paulo Zanoni


Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-08 Thread Jesse Barnes
On Fri, 8 Feb 2013 17:54:23 -0200
Paulo Zanoni  wrote:

> Hi
> 
> 2013/2/8 Jesse Barnes :
> > On Fri,  8 Feb 2013 17:35:18 -0200
> > Paulo Zanoni  wrote:
> >
> >> From: Paulo Zanoni 
> >>
> >> On ILK/SNB all we need to do is to enable the "poison" bit, but on
> >> IVB/HSW we need to enable the CPU error interrupt register, which is
> >> responsible not only for poison interrupts, but also other things.
> >> This includes the "unclaimed register" interrupt, so on the IVB irq
> >> handler we now need to: (i) check whether the interrupt was triggered by an
> >> unclaimed register and (ii) mask the error interrupt bit so we don't
> >> risk generating "unclaimed register" interrupts from inside the
> >> interrupt handler.
> >>
> >> Signed-off-by: Paulo Zanoni 
> >> ---
> >
> > OTOH there's nothing the user can do about it... so we might do a
> > WARN_ONCE or something here instead.
> 
> Well, so far I haven't seen the message. If we conclude it happens
> *too much*, then we can use WARN_ONCE.
> 
> > But even then, I'm not sure
> > there's much *we* can do about these, as they indicate a corruption in
> > the communication between the CPU and PCH.
> 
> At least if we get the message we may be able to understand and/or
> reproduce the problems. So far we don't even know whether the problem
> is happening or not... And when there's a display bug, we don't know
> if it's caused by "poison".

OK, I guess the DRM_ERROR won't hurt if/until we see reports.  Then we
can dig in and see whether keeping the message makes sense.

-- 
Jesse Barnes, Intel Open Source Technology Center


Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-09 Thread Ben Widawsky
On Fri, 8 Feb 2013 11:42:39 -0800
Jesse Barnes  wrote:

> On Fri,  8 Feb 2013 17:35:18 -0200
> Paulo Zanoni  wrote:
> 
> > From: Paulo Zanoni 
> > 
> > On ILK/SNB all we need to do is to enable the "poison" bit, but on
> > IVB/HSW we need to enable the CPU error interrupt register, which is
> > responsible not only for poison interrupts, but also other things.
> > This includes the "unclaimed register" interrupt, so on the IVB irq
> > handler we now need to: (i) check whether the interrupt was
> > triggered by an unclaimed register and (ii) mask the error
> > interrupt bit so we don't risk generating "unclaimed register"
> > interrupts from inside the interrupt handler.
> > 
> > Signed-off-by: Paulo Zanoni 
> > ---
> 
> OTOH there's nothing the user can do about it... so we might do a
> WARN_ONCE or something here instead.  But even then, I'm not sure
> there's much *we* can do about these, as they indicate a corruption in
> the communication between the CPU and PCH.
> 

I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
isn't helpful since the backtrace wouldn't really be meaningful.

If, OTOH, you wanted to save this information into the error state, I
could get behind that.


Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-14 Thread Paulo Zanoni
Hi

2013/2/9 Ben Widawsky :
> On Fri, 8 Feb 2013 11:42:39 -0800
> Jesse Barnes  wrote:
>
>> On Fri,  8 Feb 2013 17:35:18 -0200
>> Paulo Zanoni  wrote:
>>
>> > From: Paulo Zanoni 
>> >
>> > On ILK/SNB all we need to do is to enable the "poison" bit, but on
>> > IVB/HSW we need to enable the CPU error interrupt register, which is
>> > responsible not only for poison interrupts, but also other things.
>> > This includes the "unclaimed register" interrupt, so on the IVB irq
>> > handler we now need to: (i) check whether the interrupt was
>> > triggered by an unclaimed register and (ii) mask the error
>> > interrupt bit so we don't risk generating "unclaimed register"
>> > interrupts from inside the interrupt handler.
>> >
>> > Signed-off-by: Paulo Zanoni 
>> > ---
>>
>> OTOH there's nothing the user can do about it... so we might do a
>> WARN_ONCE or something here instead.  But even then, I'm not sure
>> there's much *we* can do about these, as they indicate a corruption in
>> the communication between the CPU and PCH.
>>
>
> I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
> isn't helpful since the backtrace wouldn't really be meaningful.

Why isn't it helpful? Right now we don't even know whether this
problem happens or not. We're completely "blind" to a possible problem
that may be affecting us in some specific cases. Knowing that it
happens and how often it happens is IMHO certainly better than closing
our eyes and pretending it doesn't exist.

>
> If OTOH, you wanted to save away this information into error state; I
> could get behind that.




-- 
Paulo Zanoni


Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts

2013-02-14 Thread Ben Widawsky
On Thu, 14 Feb 2013 18:35:32 -0200
Paulo Zanoni  wrote:

> Hi
> 
> 2013/2/9 Ben Widawsky :
> > On Fri, 8 Feb 2013 11:42:39 -0800
> > Jesse Barnes  wrote:
> >
> >> On Fri,  8 Feb 2013 17:35:18 -0200
> >> Paulo Zanoni  wrote:
> >>
> >> > From: Paulo Zanoni 
> >> >
> >> > On ILK/SNB all we need to do is to enable the "poison" bit, but
> >> > on IVB/HSW we need to enable the CPU error interrupt register,
> >> > which is responsible not only for poison interrupts, but also
> >> > other things. This includes the "unclaimed register" interrupt,
> >> > so on the IVB irq handler we now need to: (i) check whether the
> >> > interrupt was triggered by an unclaimed register and (ii) mask
> >> > the error interrupt bit so we don't risk generating "unclaimed
> >> > register" interrupts from inside the interrupt handler.
> >> >
> >> > Signed-off-by: Paulo Zanoni 
> >> > ---
> >>
> >> OTOH there's nothing the user can do about it... so we might do a
> >> WARN_ONCE or something here instead.  But even then, I'm not sure
> >> there's much *we* can do about these, as they indicate a
> >> corruption in the communication between the CPU and PCH.
> >>
> >
> > I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
> > isn't helpful since the backtrace wouldn't really be meaningful.
> 
> Why isn't it helpful? Right now we don't even know whether this
> problem happens or not, we're completely "blind" to a possible problem
> that may be affecting us in some specific cases and we don't even
> know. Knowing that it happens and how often it happens is IMHO
> certainly better than closing our eyes and pretending it doesn't
> exist.
> 

I suppose you're right. I'm strongly of the opinion that we won't
ever see this error because the system will crap out before we'd be
able to get that info - of course I cannot prove that, and I don't
know enough about what exactly poison means. I just think it sucks that
we have yet another gen specific thing which has TBD value. I certainly
won't nak it, and of course if it proves useful, I'll be most
apologetic.

As for the WARN being unhelpful, it's the same problem again. You're
getting the notifications via interrupt, so a backtrace is useless on
IVB/HSW. Perhaps it makes sense on ILK/SNB. Reinventing a "do this
once" macro isn't worthwhile either, so I guess WARN_ON with the
assumption that we ignore the backtrace is fine on IVB/HSW.
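
To make the trade-off concrete: what is being asked for above is a
report that says the event happened (and ideally how often) without a
backtrace. A minimal userspace sketch of "print once, count every hit"
follows. The struct and function names are hypothetical and only
illustrate the behaviour under discussion; the kernel already provides
printk_once() for the print-once half.

```c
#include <stdbool.h>
#include <stdio.h>

/* Per-message state: has it been printed yet, and how many hits so far. */
struct once_log {
	bool printed;
	unsigned long hits;
};

/*
 * Count every occurrence, but print the message only on the first one.
 * No backtrace is produced, since in interrupt context it would only
 * show the interrupt path. Returns true if this call printed.
 */
static bool report_once(struct once_log *s, const char *msg)
{
	s->hits++;
	if (s->printed)
		return false;

	s->printed = true;
	fprintf(stderr, "%s\n", msg);
	return true;
}
```

Calling report_once() from the error path would keep the log quiet
after the first event while the hits field still answers the "how
often" question, for example if it were dumped into the error state.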

> >
> > If OTOH, you wanted to save away this information into error state;
> > I could get behind that.
> 
> 
> 
> 
