Re: [Intel-gfx] [PATCH 1/4] drm/i915: Remove unused file arg from execbuf

2013-03-14 Thread Chris Wilson
I was hoping for something a little more magical. I couldn't spot any mistakes - the only thing missing is a hint as to why these are useful. Acked-by: Chris Wilson ch...@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH 4/9] drm/i915: Don't touch South display when PCH_NOP

2013-03-14 Thread Jani Nikula
On Wed, 13 Mar 2013, Ben Widawsky b...@bwidawsk.net wrote: Interrupts, clock gating, and gmbus are all within the "this will hang the CPU" range when we have PCH_NOP. There is a bit of a hack in init clock gating. We want to do most of the clock gating, but the part we skip will hang the

[Intel-gfx] [PATCH] drm/i915: Sanity check incoming ioctl data for a NULL pointer

2013-03-14 Thread Chris Wilson
In order to prevent a potential NULL dereference with hostile userspace, we need to check whether the ioctl was passed an invalid args pointer. Reported-by: Tommi Rantala tt.rant...@gmail.com Link: http://lkml.kernel.org/r/ca+ydwtpubvbwxbt-tdgpuvj1eu7itmcho_2b3w13hkd5+jw...@mail.gmail.com
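The check described here can be illustrated with a minimal userspace sketch. The dispatcher `dispatch_ioctl`, the handler type, and the example handler `get_value` are hypothetical names for illustration; they are not taken from the patch or from the DRM core.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler signature; not the real DRM ioctl table entry. */
typedef int (*ioctl_handler)(void *args);

/* Example handler that dereferences its argument unconditionally. */
static int get_value(void *args)
{
    *(int *)args = 42;
    return 0;
}

/* Reject a NULL args pointer before any handler can dereference it. */
static int dispatch_ioctl(ioctl_handler fn, void *args)
{
    if (fn == NULL)
        return -ENOSYS;
    if (args == NULL)
        return -EINVAL;
    return fn(args);
}
```

Placing the check in the dispatcher rather than in each handler means no individual handler has to remember it, which is one plausible reason to validate at that level.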

[Intel-gfx] [PATCH v2 03/16] drm/i915: reference count for i915_hw_contexts

2013-03-14 Thread Mika Kuoppala
In preparation for analysing which context was guilty of a GPU hang, store a kref'd context pointer in the request struct. This allows us to inspect contexts when the GPU is reset even if those contexts have already been released by userspace. v2: track i915_hw_context pointers instead of using
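A rough userspace sketch of the idea: a request takes its own reference on a context, so the context outlives a release by its creator. The types `struct hw_context` and `struct gpu_request` and all helper names are hypothetical, and the plain integer counter stands in for the kernel's atomic `kref`; this version is not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's context and request structs. */
struct hw_context {
    int refcount;
    int id;
};

struct gpu_request {
    struct hw_context *ctx; /* holds one reference while the request lives */
};

static struct hw_context *ctx_create(int id)
{
    struct hw_context *c = malloc(sizeof(*c));
    if (c) {
        c->refcount = 1; /* the creator's reference */
        c->id = id;
    }
    return c;
}

static void ctx_get(struct hw_context *c)
{
    c->refcount++;
}

/* Drop one reference; returns 1 if this freed the context. */
static int ctx_put(struct hw_context *c)
{
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}

/* The request pins the context for as long as it is on the request list. */
static void request_init(struct gpu_request *rq, struct hw_context *c)
{
    ctx_get(c);
    rq->ctx = c;
}

/* Retiring the request drops its reference; returns 1 if the context died. */
static int request_retire(struct gpu_request *rq)
{
    struct hw_context *c = rq->ctx;
    rq->ctx = NULL;
    return ctx_put(c);
}
```

With this shape, reset-time code can still read `rq->ctx` after userspace has dropped its own reference, because the final free only happens when the last request is retired.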

[Intel-gfx] [PATCH v2 01/16] drm/i915: return context from i915_switch_context()

2013-03-14 Thread Mika Kuoppala
In preparation for the next commit, return context that was switched to from i915_switch_context(). v2: context in return value instead of param. (Ben Widawsky) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h|5 +++--
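The cover letter for this series mentions that i915_switch_context returns ERR_PTR. As an illustration only, the following userspace sketch re-creates the error-pointer idiom; the three helpers imitate the kernel's `ERR_PTR`/`PTR_ERR`/`IS_ERR`, while `struct hw_ctx`, the `contexts` table, and `switch_context` are hypothetical and do not reflect the driver's actual code.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error codes occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical context table, for illustration only. */
struct hw_ctx {
    int id;
};

static struct hw_ctx contexts[2] = { { 0 }, { 1 } };

/* Return the context switched to, or an encoded errno on failure. */
static struct hw_ctx *switch_context(int id)
{
    if (id < 0 || id >= 2)
        return ERR_PTR(-ENOENT);
    return &contexts[id];
}
```

Returning the context directly lets the caller use the result without a separate out-parameter, while still distinguishing failure through `IS_ERR`.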

[Intel-gfx] [PATCH v2 07/16] drm/i915: introduce i915_hangcheck_ring_hung

2013-03-14 Thread Mika Kuoppala
In preparation to track per ring progress in hangcheck, add i915_hangcheck_ring_hung. v2: omit dev parameter (Ben Widawsky) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 29 + 1 file changed, 17 insertions(+), 12

[Intel-gfx] [PATCH v2 09/16] drm/i915: remove i915_hangcheck_hung

2013-03-14 Thread Mika Kuoppala
Rework of per ring hangcheck made this obsolete. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h |1 - drivers/gpu/drm/i915/i915_irq.c | 21 - 2 files changed, 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h

[Intel-gfx] [PATCH v2 08/16] drm/i915: detect hang using per ring hangcheck_score

2013-03-14 Thread Mika Kuoppala
Add per ring score of possible culprit for gpu hang. If ring is busy and not waiting, it will get the highest score across calls to i915_hangcheck_elapsed. This way we are most likely to find the ring that caused the hang among the waiting ones. Signed-off-by: Mika Kuoppala
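One way to picture the scoring is the sketch below. The snippet only says that a busy, non-waiting ring ends up with the highest score; the specific weights (2 and 1), the reset to zero for idle rings, and every name here (`struct ring_state`, `hangcheck_tick`, `most_likely_culprit`) are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical per-ring state; field names are illustrative only. */
struct ring_state {
    int busy;    /* has outstanding work */
    int waiting; /* blocked on another ring, e.g. on a semaphore */
    int score;   /* accumulated across hangcheck ticks */
};

/* One hangcheck tick. The weights 2 and 1 are assumptions. */
static void hangcheck_tick(struct ring_state *rings, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!rings[i].busy)
            rings[i].score = 0;  /* idle rings are never suspects */
        else if (rings[i].waiting)
            rings[i].score += 1; /* stuck, but likely a victim */
        else
            rings[i].score += 2; /* busy and not waiting: prime suspect */
    }
}

/* Index of the ring with the highest score, or -1 if no ring scored. */
static int most_likely_culprit(const struct ring_state *rings, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (rings[i].score > 0 &&
            (best < 0 || rings[i].score > rings[best].score))
            best = (int)i;
    }
    return best;
}
```

Because waiting rings accumulate score more slowly, a ring that is merely blocked behind the real offender does not overtake it across repeated ticks.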

[Intel-gfx] [PATCH v2 10/16] drm/i915: add struct ctx_reset_state

2013-03-14 Thread Mika Kuoppala
To count context losses, add struct ctx_reset_state for both i915_hw_context and drm_i915_file_private. drm_i915_file_private is used when there is no context. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_dma.c |4 +++-

[Intel-gfx] [PATCH v2 13/16] drm/i915: mark rings which were waiting when hang happened

2013-03-14 Thread Mika Kuoppala
For guilty batchbuffer analysis later on ring resets, mark all waiting rings so that we can skip them when trying to find a true culprit for the gpu hang. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c |3 ++-

[Intel-gfx] [PATCH v2 16/16] drm/i915: add i915_gem_context_get_reset_status_ioctl

2013-03-14 Thread Mika Kuoppala
This ioctl returns context reset status for specified context. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com CC: i...@freedesktop.org --- drivers/gpu/drm/i915/i915_dma.c |1 + drivers/gpu/drm/i915/i915_drv.c | 61 +++

[Intel-gfx] [PATCH v2 00/16] arb robustness enablers v2

2013-03-14 Thread Mika Kuoppala
Hi, Reworked patchset for guilty context detection. Main changes since the last posting: - i915_add_request cleanup - i915_switch_context returns ERR_PTR - 3 seconds to declare hang regardless if guilty was found - semaphore kicking for hung rings (from Chris) - test case for the new interface -

[Intel-gfx] [PATCH v2 14/16] drm/i915: find guilty batch buffer on ring resets

2013-03-14 Thread Mika Kuoppala
After the hangcheck timer has declared the gpu to be hung, rings are reset. In ring reset, when clearing the request list, do post mortem analysis to find out the guilty batch buffer. Select requests for further analysis by inspecting the completed sequence number which has been updated into the HWS page.
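The selection step can be sketched in userspace as follows, assuming requests are kept in submission order and that sequence numbers are 32-bit and may wrap. `struct pending_request`, `seqno_passed`, and `first_incomplete` are hypothetical names for this sketch, and the real analysis in the patch may consider more than the first unfinished request.

```c
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe test: has sequence number a reached or passed b? */
static int seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Hypothetical request record, listed in submission order. */
struct pending_request {
    uint32_t seqno;
    int batch_id;
};

/*
 * Given the last completed seqno reported by the hardware, return the
 * index of the first request that has not completed, or -1 if every
 * request finished. That request is the first candidate for blame.
 */
static int first_incomplete(const struct pending_request *reqs, size_t n,
                            uint32_t completed)
{
    for (size_t i = 0; i < n; i++) {
        if (!seqno_passed(completed, reqs[i].seqno))
            return (int)i;
    }
    return -1;
}
```

The signed-difference comparison keeps the ordering correct across a 32-bit wrap, which a plain `>=` on the raw values would not.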

[Intel-gfx] [PATCH v2 05/16] drm/i915: pass seqno to i915_hangcheck_ring_idle

2013-03-14 Thread Mika Kuoppala
In preparation for next commit, pass seqno as a parameter to i915_hangcheck_ring_idle as it will be used inside i915_hangcheck_elapsed. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-)

[Intel-gfx] [PATCH v2 11/16] drm/i915: add reset_state for hw_contexts

2013-03-14 Thread Mika Kuoppala
For arb-robustness, every context needs to have its own reset state tracking. Default context will be handled in the same way as the no-context case further down in the patch set. For no-context case, the reset state will be stored in the file_priv part. v2: handle default context inside

[Intel-gfx] [PATCH v2 02/16] drm/i915: cleanup i915_add_request

2013-03-14 Thread Mika Kuoppala
Only execbuffer needs all the parameters. Clean up everything else behind a macro. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h|8 +--- drivers/gpu/drm/i915/i915_gem.c| 10 +-

[Intel-gfx] [PATCH v2 04/16] drm/i915: Resurrect ring kicking for semaphores, selectively

2013-03-14 Thread Mika Kuoppala
From: Chris Wilson ch...@chris-wilson.co.uk Once we thought we got semaphores working, we disabled kicking the ring if hangcheck fired whilst waiting upon a ring as it was doing more harm than good: commit 4e0e90dcb8a7df1229c69e30abebb59b0b3c2a1f Author: Daniel Vetter daniel.vet...@ffwll.ch

[Intel-gfx] [PATCH v2 12/16] drm/i915: add batch object and context to i915_add_request()

2013-03-14 Thread Mika Kuoppala
In order to track down a batch buffer and context which caused the ring to hang, store reference to bo and context into the request struct. Request can also cause gpu to hang after the batch in the flush section in the ring. To detect this add start of the flush portion offset into the request.

[Intel-gfx] [PATCH v2 06/16] drm/i915: track ring progression using seqnos

2013-03-14 Thread Mika Kuoppala
Instead of relying on acthd, track ring seqno progression to detect if ring has hung. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h |2 -- drivers/gpu/drm/i915/i915_irq.c | 30 +-
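A minimal sketch of seqno-based progress tracking, under the assumption that a periodic check samples each ring's current seqno. The struct, the function name, and the `threshold` parameter are hypothetical and only illustrate the general technique.

```c
#include <stdint.h>

/* Hypothetical per-ring bookkeeping for the periodic check. */
struct ring_check {
    uint32_t last_seqno; /* seqno observed at the previous check */
    int stalled_ticks;   /* consecutive checks without progress */
};

/*
 * Called once per check interval. Returns 1 when a busy ring has shown
 * no seqno progress for `threshold` consecutive checks, otherwise 0.
 * The threshold is a tunable assumed for this sketch.
 */
static int ring_check_progress(struct ring_check *rc, uint32_t seqno,
                               int busy, int threshold)
{
    if (!busy || seqno != rc->last_seqno) {
        /* Idle, or the seqno moved: the ring is making progress. */
        rc->last_seqno = seqno;
        rc->stalled_ticks = 0;
        return 0;
    }
    rc->stalled_ticks++;
    return rc->stalled_ticks >= threshold;
}
```

Comparing seqnos only answers "did any request complete since last time", so a single long-running batch looks the same as a hang until the threshold expires; the threshold has to be chosen with that in mind.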

[Intel-gfx] [PATCH v2 15/16] drm/i915: refuse to submit more batchbuffers from guilty context

2013-03-14 Thread Mika Kuoppala
If a context has recently submitted faulty batchbuffers guilty of a gpu hang and decides to keep submitting more crap, ban it permanently. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.c| 23 ++-
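The banning policy could look roughly like the sketch below. The snippet does not say what counts as "recently", so the `BAN_PERIOD` value, the time unit, the `-EIO` return code, and all names are assumptions made for illustration.

```c
#include <errno.h>

/* Assumed window, in arbitrary time units, for "recently". */
#define BAN_PERIOD 12

/* Hypothetical per-context hang bookkeeping. */
struct ctx_hang_stats {
    int banned;       /* once set, never cleared */
    int has_hung;     /* context has been found guilty at least once */
    long last_guilty; /* timestamp of the most recent guilty verdict */
};

/* Record a guilty verdict; ban if the previous one was recent. */
static void mark_guilty(struct ctx_hang_stats *s, long now)
{
    if (s->has_hung && now - s->last_guilty < BAN_PERIOD)
        s->banned = 1;
    s->has_hung = 1;
    s->last_guilty = now;
}

/* Gate for new submissions: 0 if allowed, negative errno if banned. */
static int submit_allowed(const struct ctx_hang_stats *s)
{
    return s->banned ? -EIO : 0;
}
```

A single hang does not ban the context here; only a second hang inside the window does, which tolerates an occasional faulty batch while stopping a context that hangs the GPU repeatedly.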

[Intel-gfx] [PATCH 04/10] [v2] drm/i915: Don't touch South Display when PCH_NOP

2013-03-14 Thread Ben Widawsky
Interrupts, clock gating, and GMBUS are all within the "this will hang the CPU" range when we have PCH_NOP. There is a bit of a hack in init clock gating. We want to do most of the clock gating, but the part we skip will hang the system. It could probably be abstracted a bit better, but I don't

Re: [Intel-gfx] [PATCH 1/4] drm/i915: Remove unused file arg from execbuf

2013-03-14 Thread Ben Widawsky
On Thu, Mar 14, 2013 at 08:56:41AM +, Chris Wilson wrote: I was hoping for something a little more magical. I couldn't spot any mistakes - the only thing missing is a hint as to why these are useful. Acked-by: Chris Wilson ch...@chris-wilson.co.uk -Chris I'm pretty sure you asked for

Re: [Intel-gfx] [PATCH v3] drm/i915: bounds check execbuffer relocation count

2013-03-14 Thread Daniel Vetter
On Wed, Mar 13, 2013 at 9:28 PM, Daniel Vetter dan...@ffwll.ch wrote: On Tue, Mar 12, 2013 at 09:07:46AM +, Chris Wilson wrote: On Mon, Mar 11, 2013 at 05:31:45PM -0700, Kees Cook wrote: It is possible to wrap the counter used to allocate the buffer for relocation copies. This could lead
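The wrap described in the quoted patch text can be guarded against along these lines. This is a userspace sketch, not the patch itself: `RELOC_ENTRY_SIZE` is an assumed entry size, and `total_relocs` is a hypothetical helper that sums per-object relocation counts while refusing any total whose byte size would exceed `INT32_MAX`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size in bytes of one relocation entry, for this sketch only. */
#define RELOC_ENTRY_SIZE 32u

/*
 * Sum the per-object relocation counts. Fails with -EINVAL, leaving
 * *out untouched, if the running total could make the later
 * total * RELOC_ENTRY_SIZE allocation size exceed INT32_MAX.
 */
static int total_relocs(const uint32_t *counts, size_t n, uint32_t *out)
{
    const uint32_t max = INT32_MAX / RELOC_ENTRY_SIZE;
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++) {
        /* total <= max always holds here, so max - total cannot wrap. */
        if (counts[i] > max - total)
            return -EINVAL;
        total += counts[i];
    }
    *out = total;
    return 0;
}
```

Comparing against `max - total` instead of computing `total + counts[i]` first is the important detail: the addition itself is what would wrap, so it must not happen before the check.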

Re: [Intel-gfx] [PATCH v3] drm/i915: bounds check execbuffer relocation count

2013-03-14 Thread Kees Cook
On Thu, Mar 14, 2013 at 9:57 AM, Daniel Vetter daniel.vet...@ffwll.ch wrote: On Wed, Mar 13, 2013 at 9:28 PM, Daniel Vetter dan...@ffwll.ch wrote: On Tue, Mar 12, 2013 at 09:07:46AM +, Chris Wilson wrote: On Mon, Mar 11, 2013 at 05:31:45PM -0700, Kees Cook wrote: It is possible to wrap

Re: [Intel-gfx] i915 black screen introduced by ACPI changes

2013-03-14 Thread Chris Li
On Mon, Mar 11, 2013 at 6:16 AM, Jani Nikula jani.nik...@linux.intel.com wrote: Interesting snippets from your dmesgs: 1) good [0.00] Linux version 3.6.0-rc6+ (chr...@ideapad.lan) (gcc version 4.7.2 20121109 (Red Hat 4.7.2-8) (GCC) ) #25 SMP Wed Feb 20 12:55:06 PST 2013 ... [

[Intel-gfx] [QA 03/15] Testing report for `drm-intel-testing` (was: Updated -next)

2013-03-14 Thread Sun, Yi
Summary We finished a new round of kernel testing. Generally, in this cycle, 4 new bugs are filed, 21 bugs are still open, 3 bugs are closed. 3 bugs out of 4 new ones are related to eDP display. Test Environment Kernel: (drm-intel-testing) d08a6eb2690b1ac6f0582feb41c2ccbea945285f