Re: [Intel-gfx] [PATCH igt] gem_concurrent_blit: Don't call igt_require() outside of a subtest/fixture

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote: > gem_concurrent_blit tries to ensure that it doesn't try and run a test > that would grind the system to a halt, i.e. unexpectedly cause swap > thrashing. It currently calls intel_require_memory(), but outside of > the subtest (as the t

Re: [Intel-gfx] [PATCH 01/13] drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:40AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Same effect for slightly less source code and resulting binary. > > Signed-off-by: Tvrtko Ursulin Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/i915/intel_lrc.c | 15 ++- > 1 file chan
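
For readers unfamiliar with the idiom in this patch, the following is a minimal userspace sketch of why a delete followed by a tail insert collapses into a single move. The type `list_node` and the helpers `list_init`, `list_unlink`, `list_append` and `list_move_to_tail` are hypothetical stand-ins written for this illustration; they only approximate the kernel's `struct list_head`, `list_del`, `list_add_tail` and `list_move_tail`, and are not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's
 * struct list_head. Userspace stand-in for illustration only. */
struct list_node {
    struct list_node *prev, *next;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

/* Detach n from whatever list it is on (analogue of list_del). */
static void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert n just before head, i.e. at the tail (analogue of list_add_tail). */
static void list_append(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* One call instead of two: unlink n, then append it to head. */
static void list_move_to_tail(struct list_node *n, struct list_node *head)
{
    list_unlink(n);
    list_append(n, head);
}
```

The behaviour is the same either way; the single helper just gives call sites one line to read instead of two.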

Re: [Intel-gfx] [PATCH 02/13] drm/i915: Don't need a timer to wake us up

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:41AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Looks like the sleeping loop in __i915_wait_request can be > simplified by using io_schedule_timeout instead of setting > up and destroying a timer. > > Signed-off-by: Tvrtko Ursulin > Cc: Chris Wilson

Re: [Intel-gfx] [PATCH 03/13] drm/i915: Avoid invariant conditionals in lrc interrupt handler

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:42AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > There is no need to check on what Gen we are running on every > interrupt and every command submission. We can instead set up > some of that when engines are initialized, store it in the > engine structure

Re: [Intel-gfx] [PATCH 04/13] drm/i915: Fail engine initialization if LRCA is incorrectly aligned

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:43AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > LRCA can change only when it goes from unpinned to pinned so it > makes sense to check its alignment at that point rather than at > every batch buffer submission. > > Furthermore, if we check it at pin tim

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Cache LRCA in the context

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:44AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > We are not allowed to call i915_gem_obj_ggtt_offset from irq > context without the big kernel lock held. > > LRCA lifetime is well defined so cache it so it can be looked up > cheaply from the interrupt co

Re: [Intel-gfx] [PATCH 06/13] drm/i915: Only grab timestamps when needed

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:45AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > No need to call ktime_get_raw_ns twice per unlimited wait and can > also eliminate a local variable. > > Signed-off-by: Tvrtko Ursulin > --- > drivers/gpu/drm/i915/i915_gem.c | 12 +++- > 1 file ch

Re: [Intel-gfx] [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 01:29:14PM +, Tvrtko Ursulin wrote: > > On 08/01/16 11:29, Tvrtko Ursulin wrote: > >From: Tvrtko Ursulin > > > >Purpose is to catch places which iterate the object VMA list > >without holding the big lock. > > > >Implemented by open coding list_for_each_entry to make t

Re: [Intel-gfx] [PATCH 07/13] drm/i915: Introduce dedicated object VMA iterator

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:44:04AM +, Chris Wilson wrote: > On Fri, Jan 08, 2016 at 11:29:46AM +, Tvrtko Ursulin wrote: > > From: Tvrtko Ursulin > > > > Purpose is to catch places which iterate the object VMA list > > without holding the big lock. > > > > Implemented by open coding list_

Re: [Intel-gfx] [PATCH 11/13] drm/i915: Cache ringbuffer GTT address

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:50AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Purpose is to avoid calling i915_gem_obj_ggtt_offset from the > interrupt context without the big lock held. > > Signed-off-by: Tvrtko Ursulin > --- > drivers/gpu/drm/i915/intel_lrc.c| 3 +-- > d

Re: [Intel-gfx] [PATCH 09/13] drm/i915: Remove two impossible asserts

2016-01-11 Thread Daniel Vetter
On Fri, Jan 08, 2016 at 11:29:48AM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Engine initialization would have failed if those two weren't > pinned and calling i915_gem_obj_is_pinned is illegal from irq > context without the big lock held. > > Signed-off-by: Tvrtko Ursulin Reviewe

Re: [Intel-gfx] [PATCH] drm/i915: Support to enable TRTT on GEN9

2016-01-11 Thread Chris Wilson
On Mon, Jan 11, 2016 at 01:09:50PM +0530, Goel, Akash wrote: > > > On 1/10/2016 11:09 PM, Chris Wilson wrote: > >On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.g...@intel.com wrote: > >>From: Akash Goel > >> > >>Gen9 has an additional address translation hardware support in form of > >>Tiled Re

Re: [Intel-gfx] [PATCH igt] gem_concurrent_blit: Don't call igt_require() outside of a subtest/fixture

2016-01-11 Thread Chris Wilson
On Mon, Jan 11, 2016 at 09:00:13AM +0100, Daniel Vetter wrote: > On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote: > > gem_concurrent_blit tries to ensure that it doesn't try and run a test > > that would grind the system to a halt, i.e. unexpectedly cause swap > > thrashing. It current

[Intel-gfx] ✗ warning: Fi.CI.BAT

2016-01-11 Thread Patchwork
== Summary == Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest Test gem_storedw_loop: Subgroup basic-render: dmesg-warn -> PASS (bdw-ultra) dmesg-warn -> PASS (skl-i7k-2) UN

[Intel-gfx] ✗ failure: Fi.CI.BAT

2016-01-11 Thread Patchwork
== Summary == HEAD is now at ff88655 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest Applying: drm/i915: Use passed plane state for sprite planes, v4. Using index info to reconstruct a base tree... M drivers/gpu/drm/i915/intel_drv.h M drivers/gpu/drm/i915/intel_s

Re: [Intel-gfx] [PATCH igt] core/sighelper: Interrupt everyone in the process group

2016-01-11 Thread Chris Wilson
On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote: > On Fri, Jan 08, 2016 at 08:44:29AM +, Chris Wilson wrote: > > Some stress tests create both the signal helper and a lot of competing > > processes. In these tests, the parent is just waiting upon the children, > > and the intentio

Re: [Intel-gfx] [PATCH 2/2] drm/i915/gen9: Calculate edram size

2016-01-11 Thread Chris Wilson
On Mon, Jan 11, 2016 at 08:50:43AM +0100, Daniel Vetter wrote: > On Fri, Jan 08, 2016 at 06:58:45PM +0200, Mika Kuoppala wrote: > > With gen9+ the edram capabilities are defined so > > that we can calculate the edram (ellc) size accordingly. > > > > Note that there are undefined combinations for s

Re: [Intel-gfx] [PATCH igt] core/sighelper: Interrupt everyone in the process group

2016-01-11 Thread Daniel Vetter
On Mon, Jan 11, 2016 at 08:54:59AM +, Chris Wilson wrote: > On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote: > > On Fri, Jan 08, 2016 at 08:44:29AM +, Chris Wilson wrote: > > > Some stress tests create both the signal helper and a lot of competing > > > processes. In these tes

Re: [Intel-gfx] [PATCH igt] gem_concurrent_blit: Don't call igt_require() outside of a subtest/fixture

2016-01-11 Thread Daniel Vetter
On Mon, Jan 11, 2016 at 08:52:24AM +, Chris Wilson wrote: > On Mon, Jan 11, 2016 at 09:00:13AM +0100, Daniel Vetter wrote: > > On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote: > > > gem_concurrent_blit tries to ensure that it doesn't try and run a test > > > that would grind the sy

[Intel-gfx] ✗ failure: Fi.CI.BAT

2016-01-11 Thread Patchwork
== Summary == Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC integration manifest Test gem_storedw_loop: Subgroup basic-render: pass -> DMESG-WARN (skl-i5k-2) UNSTABLE dmesg-warn -> PASS (bdw-

[Intel-gfx] [PATCH 002/190] drm/i915: Move the mb() following release-mmap into release-mmap

2016-01-11 Thread Chris Wilson
As paranoia, we want to ensure that the CPU's PTEs have been revoked for the object before we return from i915_gem_release_mmap(). This allows us to rely on there being no outstanding memory accesses and guarantees serialisation of the code against concurrent access just by calling i915_gem_release

[Intel-gfx] [PATCH 004/190] drm/i915: Fix some invalid requests cancellations

2016-01-11 Thread Chris Wilson
As we add the VMA to the request early, it may be cancelled during execbuf reservation. This will leave the context object pointing to a dangling request; i915_wait_request() simply skips the wait and so we may unbind the object whilst it is still active. However, if at any point we make a change

[Intel-gfx] [PATCH 001/190] drm: Release driver references to handle before making it available again

2016-01-11 Thread Chris Wilson
When userspace closes a handle, we remove it from the file->object_idr and then tell the driver to drop its references to that file/handle. However, as the file/handle is already available again for reuse, it may be reallocated back to userspace and active on a new object before the driver has had

[Intel-gfx] [PATCH 006/190] drm/i915: Add GEM debugging Kconfig option

2016-01-11 Thread Chris Wilson
Currently there is a #define to enable extra BUG_ON for debugging requests and associated activities. I want to expand its use to cover all of GEM internals (so that we can saturate the code with asserts). We can add a Kconfig option to make it easier to enable - with the usual caveats of not enabl

[Intel-gfx] [PATCH 007/190] drm/i915: Hide the atomic_read(reset_counter) behind a helper

2016-01-11 Thread Chris Wilson
This is principally a little bit of syntactic sugar to hide the atomic_read()s throughout the code to retrieve the current reset_counter. It also provides the other utility functions to check the reset state on the already read reset_counter, so that (in later patches) we can read it once and do mul

[Intel-gfx] [PATCH 011/190] drm/i915: Simplify reset_counter handling during atomic modesetting

2016-01-11 Thread Chris Wilson
Now that the reset_counter is stored on the request, we can rearrange the code to handle reading the counter versus waiting during the atomic modesetting for readability (by deleting the hairiest of codes). Signed-off-by: Chris Wilson Cc: Daniel Vetter Reviewed-by: Daniel Vetter --- drivers/gp

[Intel-gfx] [PATCH 005/190] drm/i915: Force clean compilation with -Werror

2016-01-11 Thread Chris Wilson
Our driver compiles clean (nowadays thanks to 0day) but for me, at least, it would be beneficial if the compiler threw an error rather than a warning when it found a piece of suspect code. (I use this to compile-check patch series and want to break on the first compiler error in order to fix the pa

[Intel-gfx] [PATCH 017/190] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+

2016-01-11 Thread Chris Wilson
In order to ensure seqno/irq coherency, we currently read a ring register. We are not sure quite how it works, only that it does. Experiments show that e.g. doing a clflush(seqno) instead is not sufficient, but we can remove the forcewake dance from the mmio access. v2: Baytrail wants a clflush too.

[Intel-gfx] [PATCH 010/190] drm/i915: Store the reset counter when constructing a request

2016-01-11 Thread Chris Wilson
As the request is only valid during the same global reset epoch, we can record the current reset_counter when constructing the request and reuse it when waiting upon that request in future. This removes a very hairy atomic check serialised by the struct_mutex at the time of waiting and allows us to
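
The idea described in this patch, sampling the reset counter once when a request is built and comparing it later, can be sketched as below. This is only an illustration under stated assumptions: `gpu_state`, `request`, `request_init` and `request_is_stale` are hypothetical names invented here, not the i915 structures or functions, and the sketch ignores the atomic accesses and locking the real driver needs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the i915 structures. */
struct gpu_state {
    unsigned int reset_counter;   /* assumed to change on every global reset */
};

struct request {
    unsigned int reset_epoch;     /* reset_counter sampled at construction */
};

/* Record the epoch once, at the point the request is constructed. */
static void request_init(struct request *rq, const struct gpu_state *gpu)
{
    rq->reset_epoch = gpu->reset_counter;
}

/* A request is only meaningful within the epoch it was built in, so a
 * waiter can detect an intervening reset with a plain comparison. */
static bool request_is_stale(const struct request *rq,
                             const struct gpu_state *gpu)
{
    return rq->reset_epoch != gpu->reset_counter;
}
```

The comparison at wait time replaces any need to re-derive the reset state from scratch; how the counter itself is advanced is left out of this sketch.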

[Intel-gfx] [PATCH 020/190] drm/i915: Remove the lazy_coherency parameter from request-completed?

2016-01-11 Thread Chris Wilson
Now that we have split out the seqno-barrier from the engine->get_seqno() callback itself, we can move the users of the seqno-barrier to the required callsites simplifying the common code and making the required workaround handling much more explicit. Signed-off-by: Chris Wilson --- drivers/gpu/

[Intel-gfx] [PATCH 026/190] drm/i915: Stop setting wraparound seqno on initialisation

2016-01-11 Thread Chris Wilson
We have testcases to ensure that seqno wraparound works fine, so we can forgo forcing everyone to encounter seqno wraparound during early uptime. seqno wraparound incurs a full GPU stall so not forcing it will eliminate one jitter from the early system. Using the testcases, we have very determinist

[Intel-gfx] [PATCH 031/190] drm/i915: Harden detection of missed interrupts

2016-01-11 Thread Chris Wilson
Only declare a missed interrupt if we find that the GPU is idle with waiters and a hangcheck interval has passed in which no new user interrupts have been raised. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 6 ++ drivers/gpu/drm/i915/i915_irq.c | 10 +++

[Intel-gfx] [PATCH 013/190] drm/i915: Suppress error message when GPU resets are disabled

2016-01-11 Thread Chris Wilson
If we do not have low-level support for resetting the GPU, or if the user has explicitly disabled resetting the device, the failure is expected. Since it is an expected failure, we should be using a lower priority message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just emit the expect

[Intel-gfx] [PATCH 041/190] drm/i915: Allow userspace to request no-error-capture upon GPU hangs

2016-01-11 Thread Chris Wilson
igt likes to inject GPU hangs into its command streams. However, as we expect these hangs, we don't actually want them recorded in the dmesg output or stored in the i915_error_state (usually). To accommodate this, allow userspace to set a flag on the context that any hang emanating from that context

[Intel-gfx] [PATCH 032/190] drm/i915: Remove debug noise on detecting fault-injection of missed interrupts

2016-01-11 Thread Chris Wilson
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs for the handling of a "missed interrupt", adding it to the dmesg at INFO is just noise. When it happens for real, we still class it as an ERROR. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_irq.c | 3 --- 1 fi

[Intel-gfx] [PATCH 030/190] drm/i915: Move the get/put irq locking into the caller

2016-01-11 Thread Chris Wilson
With only a single callsite for intel_engine_cs->irq_get and ->irq_put, we can reduce the code size by moving the common preamble into the caller, and we can also eliminate the reference counting. For completeness, as we are no longer doing reference counting on irq, rename the get/put vfunctions

[Intel-gfx] [PATCH 023/190] drm/i915: Only apply one barrier after a breadcrumb interrupt is posted

2016-01-11 Thread Chris Wilson
If we flag the seqno as potentially stale upon receiving an interrupt, we can use that information to reduce the frequency that we apply the heavyweight coherent seqno read (i.e. if we wake up a chain of waiters). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 15

[Intel-gfx] [PATCH 012/190] drm/i915: Prevent leaking of -EIO from i915_wait_request()

2016-01-11 Thread Chris Wilson
Reporting -EIO from i915_wait_request() has proven very troublematic over the years, with numerous hard-to-reproduce bugs cropping up in the corner case where a reset occurs and the code wasn't expecting such an error. If we reset the GPU or have detected a hang and wish to reset the GPU, t

[Intel-gfx] [PATCH 019/190] drm/i915: Separate out the seqno-barrier from engine->get_seqno

2016-01-11 Thread Chris Wilson
In order to simplify the next couple of patches, extract the lazy_coherency optimisation out of the engine->get_seqno() vfunc into its own callback. v2: Rename the barrier to engine->irq_seqno_barrier to try and better reflect that the barrier is only required after the user interrupt before readi

[Intel-gfx] [PATCH 036/190] drm/i915: Restore waitboost credit to the synchronous waiter

2016-01-11 Thread Chris Wilson
Ideally, we want to automagically have the GPU respond to the instantaneous load by reclocking itself. However, reclocking occurs relatively slowly, and to the client waiting for a result from the GPU, too late. To compensate and reduce the client latency, we allow the first wait from a client to b

[Intel-gfx] [PATCH 050/190] drm/i915: Refactor duplicate object vmap functions

2016-01-11 Thread Chris Wilson
We now have two implementations for vmapping a whole object, one for dma-buf and one for the ringbuffer. If we couple the vmapping into the obj->pages lifetime, then we can reuse an obj->vmapping for both and at the same time couple it into the shrinker. v2: Mark the failable kmalloc() as __GFP_NO

[Intel-gfx] [PATCH 048/190] drm/i915: Disable waitboosting for fence_wait()

2016-01-11 Thread Chris Wilson
We want to restrict waitboosting to known process contexts, where we can track which clients are receiving waitboosts and prevent excessive power wasting. For fence_wait() we do not have any client tracking and so that leaves it open to abuse. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915

[Intel-gfx] [PATCH 015/190] drm/i915: Remove the dedicated hangcheck workqueue

2016-01-11 Thread Chris Wilson
The queue only ever contains at most one item and has no special flags. It is just a very simple wrapper around the system-wq - a complication with no benefits. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_dma.c | 11 --- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/d

[Intel-gfx] [PATCH 045/190] drm/i915: Move releasing of the GEM request from free to retire/cancel

2016-01-11 Thread Chris Wilson
If we move the release of the GEM request (i.e. decoupling it from the various lists used for client and context tracking) after it is complete (either by the GPU retiring the request, or by the caller cancelling the request), we can remove the requirement that the final unreference of the GEM requ

[Intel-gfx] [PATCH 059/190] drm/i915: Rename request->ringbuf to request->ring

2016-01-11 Thread Chris Wilson
Now that we have disambiguated ring and engine, we can use the clearer and more consistent name for the intel_ringbuffer pointer in the request. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c| 8 +- drivers/gpu/drm/i915/i915_gem_context.c| 2 +- drivers/gpu/d

[Intel-gfx] [PATCH 040/190] drm/i915: Record the ringbuffer associated with the request

2016-01-11 Thread Chris Wilson
The request tells us where to read the ringbuf from, so use that information to simplify the error capture. If no request was active at the time of the hang, the ring is idle and there is no information inside the ring pertaining to the hang. Note carefully that this will reduce the amount of info

[Intel-gfx] [PATCH 043/190] drm/i915: Skip capturing an error state if we already have one

2016-01-11 Thread Chris Wilson
As we only ever keep the first error state around, we can avoid some work that can be quite intrusive if we don't record the error the second time around. This does move the race whereby the user could discard one error state as the second is being captured, but that race exists in the current code

[Intel-gfx] [PATCH 058/190] drm/i915: Rename request->ring to request->engine

2016-01-11 Thread Chris Wilson
In order to disambiguate between the pointer to the intel_engine_cs (called ring) and the intel_ringbuffer (called ringbuf), rename s/ring/engine/. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 11 +-- drivers/gpu/drm/i915/i915_drv.h | 2 +- drive

[Intel-gfx] [PATCH 055/190] drm/i915: Unify intel_logical_ring_emit and intel_ring_emit

2016-01-11 Thread Chris Wilson
Both perform the same actions with more or less indirection, so just unify the code. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c| 2 +- drivers/gpu/drm/i915/i915_gem_context.c| 8 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 34 - drivers/gpu/d

[Intel-gfx] [PATCH 037/190] drm/i915: Add background commentary to "waitboosting"

2016-01-11 Thread Chris Wilson
Describe the intent of boosting the GPU frequency to maximum before waiting on the GPU. RPS waitboosting was introduced with commit b29c19b645287f7062e17d70fa4e9781a01a5d88 Author: Chris Wilson Date: Wed Sep 25 17:34:56 2013 +0100 drm/i915: Boost RPS frequency for CPU stalls but lacked a

[Intel-gfx] [PATCH 047/190] drm/i915: Rename request reference/unreference to get/put

2016-01-11 Thread Chris Wilson
Now that we derive requests from struct fence, swap over to its nomenclature for references. It's shorter and more idiomatic across the kernel. s/i915_gem_request_reference/i915_gem_request_get/ s/i915_gem_request_unreference/i915_gem_request_put/ Signed-off-by: Chris Wilson --- drivers/gpu/drm

[Intel-gfx] [PATCH 008/190] drm/i915: Simplify checking of GPU reset_counter in display pageflips

2016-01-11 Thread Chris Wilson
If, when we store the reset_counter for the operation, we ensure that it is not wedged or in the middle of a reset, we can then assert that if any reset occurs the reset_counter must change. Later we can just compare the operation's reset epoch against the current counter to see if we need

[Intel-gfx] [PATCH 021/190] drm/i915: Use HWS for seqno tracking everywhere

2016-01-11 Thread Chris Wilson
By using the same address for storing the HWS on every platform, we can remove the platform specific vfuncs and reduce the get-seqno routine to a single read of a cached memory location. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 10 ++-- drivers/gpu/drm/i915/i915

[Intel-gfx] [PATCH 051/190] drm,i915: Introduce drm_malloc_gfp()

2016-01-11 Thread Chris Wilson
I have instances where I want to use drm_malloc_ab() but with a custom gfp mask. And with those, where I want a temporary allocation, I want to try a high-order kmalloc() before using a vmalloc(). So refactor my usage into drm_malloc_gfp(). Signed-off-by: Chris Wilson Cc: dri-de...@lists.freedes

[Intel-gfx] [PATCH 054/190] drm/i915: Use the new rq->i915 field where appropriate

2016-01-11 Thread Chris Wilson
In a few frequent cases, having a direct pointer to the drm_i915_private from the request is very useful. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c| 7 +++--- drivers/gpu/drm/i915/i915_gem_context.c| 21 +- drivers/gpu/drm/i915/i915_gem_exec

[Intel-gfx] [PATCH 027/190] drm/i915: Only query timestamp when measuring elapsed time

2016-01-11 Thread Chris Wilson
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as we only need to compute the elapsed time for a timed wait. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i91

[Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-11 Thread Chris Wilson
In the next patch, request tracking is made more generic and for that we need a new expanded struct and to separate out the logic changes from the mechanical churn, we split out the structure renaming into this patch. v2: Writer's block. Add some spiel about why we track requests. v3: Now i915_gem

[Intel-gfx] [PATCH 056/190] drm/i915: Unify intel_ring_begin()

2016-01-11 Thread Chris Wilson
Combine the near identical implementations of intel_logical_ring_begin() and intel_ring_begin() - the only difference is that the logical wait has to check for a matching ring (which is assumed by legacy). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c| 141 ++--

[Intel-gfx] [PATCH 067/190] drm/i915: Unify legacy/execlists emission of MI_BATCHBUFFER_START

2016-01-11 Thread Chris Wilson
Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly the same thing, add MI_BATCHBUFFER_START to the request's ringbuffer - we need only one vfunc. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 +-- drivers/gpu/drm/i915/i915_gem_render_state

[Intel-gfx] [PATCH 038/190] drm/i915: Flush the RPS bottom-half when the GPU idles

2016-01-11 Thread Chris Wilson
Make sure that the RPS bottom-half is flushed before we set the idle frequency when we decide the GPU is idle. This should prevent any races with the bottom-half and setting the idle frequency, and ensures that the bottom-half is bounded by the GPU's rpm reference taken for when it is active (i.e.

[Intel-gfx] [PATCH 029/190] drm/i915: Convert trace-irq to the breadcrumb waiter

2016-01-11 Thread Chris Wilson
If we convert the tracing over from direct use of ring->irq_get() and over to the breadcrumb infrastructure, we only have a single user of the ring->irq_get and so we will be able to simplify the driver routines (eliminating the redundant validation and irq refcounting). v2: Move to a signaling fr

[Intel-gfx] [PATCH 068/190] drm/i915: Unify adding requests between ringbuffer and execlists

2016-01-11 Thread Chris Wilson
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_request.c | 8 +- drivers/gpu/drm/i915/intel_lrc.c| 14 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c | 129 +--- drivers/gpu/drm/i915/intel_ringbuffer.h | 21 +++--- 4 files changed, 87 insertion

[Intel-gfx] [PATCH 022/190] drm/i915: Check the CPU cached value of seqno after waking the waiter

2016-01-11 Thread Chris Wilson
If we have multiple waiters, we may find that many complete on the same wake up. If we first inspect the seqno from the CPU cache, we may reduce the number of heavyweight coherent seqno reads we require. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 14 ++ 1 file

[Intel-gfx] [PATCH 065/190] drm/i915: Remove obsolete engine->gpu_caches_dirty

2016-01-11 Thread Chris Wilson
Space for flushing the GPU cache prior to completing the request is preallocated and so cannot fail. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c| 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +--- drivers/gpu/drm/i915/i915_gem_gtt.c| 18

[Intel-gfx] [PATCH 060/190] drm/i915: Rename backpointer from intel_ringbuffer to intel_engine_cs

2016-01-11 Thread Chris Wilson
Having ringbuf->ring point to an engine is confusing, so rename it once again to ring->engine. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_guc_submission.c | 10 +++--- drivers/gpu/drm/i915/intel_lrc.c | 35 +-- drivers/gpu/drm/i915/intel_ringbuffer.c|

[Intel-gfx] [PATCH 046/190] drm/i915: Derive GEM requests from dma-fence

2016-01-11 Thread Chris Wilson
dma-buf provides a generic fence class for interoperation between drivers. Internally we use the request structure as a fence, and so with only a little bit of interfacing we can rebase those requests on top of dma-buf fences. This will allow us, in the future, to pass those fences back to userspac

[Intel-gfx] [PATCH 016/190] drm/i915: Make queueing the hangcheck work inline

2016-01-11 Thread Chris Wilson
Since the function is a small wrapper around schedule_delayed_work(), move it inline to remove the function call overhead for the principal caller. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 17 - drivers/gpu/drm/i915/i915_irq.c | 16 2 fil

[Intel-gfx] [PATCH 009/190] drm/i915: Tighten reset_counter for reset status

2016-01-11 Thread Chris Wilson
In the reset_counter, we use two bits to track a GPU hang and reset. The low bit is a "reset-in-progress" flag that we set to signal when we need to break waiters in order for the recovery task to grab the mutex. As soon as the recovery task has the mutex, we can clear that flag (which we do by inc

[Intel-gfx] [PATCH 039/190] drm/i915: Remove stop-rings debugfs interface

2016-01-11 Thread Chris Wilson
Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process. v2: Replace the i915_stop_rings with a dummy implementation as igt e

[Intel-gfx] [PATCH 024/190] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor

2016-01-11 Thread Chris Wilson
When reading from the HWS page, we use barrier() to prevent the compiler optimising away the read from the volatile (may be updated by the GPU) memory address. This is more suited to READ_ONCE(); make it so. Signed-off-by: Chris Wilson Cc: Daniel Vetter --- drivers/gpu/drm/i915/intel_ringbuffer
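
As a rough illustration of the point above, a single volatile-qualified load expresses "read this location exactly once" more directly than a plain load paired with a compiler barrier. The macro `READ_ONCE_SKETCH`, the struct `status_page` and the function `read_status` below are hypothetical names for this userspace sketch; the macro only approximates the kernel's `READ_ONCE()` for scalar types and relies on the GCC/Clang `__typeof__` extension.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace approximation for scalar types: force one load through a
 * volatile-qualified lvalue so the compiler cannot elide or repeat it. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical status page: dwords that another agent may update. */
struct status_page {
    uint32_t dword[16];
};

/* Read one slot of the page with a single, non-elidable load. */
static uint32_t read_status(const struct status_page *page, unsigned int index)
{
    return READ_ONCE_SKETCH(page->dword[index]);
}
```

Note that this constrains only the compiler; it says nothing about cache coherency with the device, which the surrounding patches in the series treat as a separate concern.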

[Intel-gfx] [PATCH 075/190] drm/i915: Refactor activity tracking for requests

2016-01-11 Thread Chris Wilson
With the introduction of requests, we amplified the number of atomic refcounted objects we use and update every execbuffer; from none to several references, and a set of references that need to be changed. We also introduced interesting side-effects in the order of retiring requests and objects. I

[Intel-gfx] [PATCH 078/190] drm/i915: Split early global GTT initialisation

2016-01-11 Thread Chris Wilson
Initialising the global GTT is tricky as we wish to use the drm_mm range manager during the modesetting initialisation (to capture stolen allocations from the BIOS) before we actually enable GEM. To overcome this, we currently setup the drm_mm first and then carefully rebind them. Signed-off-by: C

[Intel-gfx] [PATCH 033/190] drm/i915: Only start retire worker when idle

2016-01-11 Thread Chris Wilson
The retire worker is a low frequency task that makes sure we retire outstanding requests if userspace is being lax. We only need to start it once as it remains active until the GPU is idle, so do a cheap test before the more expensive queue_work(). A consequence of this is that we need correct lock

[Intel-gfx] [PATCH 070/190] drm/i915: Unify legacy/execlists submit_execbuf callbacks

2016-01-11 Thread Chris Wilson
Now that emitting requests is identical between legacy and execlists, we can use the same function to build up the ring for submitting to either engine. (With the exception of i915_switch_contexts(), but in time that will also be handled gracefully.) Signed-off-by: Chris Wilson --- drivers/gpu/d

[Intel-gfx] [PATCH 014/190] drm/i915: Delay queuing hangcheck to wait-request

2016-01-11 Thread Chris Wilson
We can forgo queuing the hangcheck from the start of every request until we wait upon a request. This reduces the overhead of every request, but may increase the latency of detecting a hang. However, if nothing ever waits upon a hang, did it ever hang? It also improves the robustness of the wa

[Intel-gfx] [PATCH 061/190] drm/i915: Rename intel_context[engine].ringbuf

2016-01-11 Thread Chris Wilson
Perform s/ringbuf/ring/ on the context struct for consistency with the ring/engine split. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 6 +-- drivers/gpu/drm/i

[Intel-gfx] [PATCH 034/190] drm/i915: Do not keep postponing the idle-work

2016-01-11 Thread Chris Wilson
Rather than persistently postponing the idle-work every time somebody calls i915_gem_retire_requests() (potentially ensuring that we never reach the idle state), queue the work the first time we detect all requests are complete. Then if in 100ms, more requests have been queued, we will abort the idl

[Intel-gfx] [PATCH 018/190] drm/i915: Slaughter the thundering i915_wait_request herd

2016-01-11 Thread Chris Wilson
One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batch

[Intel-gfx] [PATCH 083/190] drm/i915: Be more careful when unbinding vma

2016-01-11 Thread Chris Wilson
When we call i915_vma_unbind(), we will wait upon outstanding rendering. This will also trigger a retirement phase, which may update the object lists. If we extend request tracking to the VMA itself (rather than keep it at the encompassing object), then there is a potential that the obj->vma_list

[Intel-gfx] [PATCH 035/190] drm/i915: Remove redundant queue_delayed_work() from throttle ioctl

2016-01-11 Thread Chris Wilson
We know, by design, that whilst the GPU is active (and thus we are throttling) the retire_worker is queued. Therefore attempting to requeue it with queue_delayed_work() is a no-op and we can safely remove it. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 3 --- 1 file changed

[Intel-gfx] [PATCH 044/190] drm/i915: Move GEM request routines to i915_gem_request.c

2016-01-11 Thread Chris Wilson
Migrate the request operations out of the main body of i915_gem.c and into their own C file for easier expansion. v2: Move __i915_add_request() across as well Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/i915_drv.h | 205 +

[Intel-gfx] [PATCH 080/190] drm/i915: Store owning file on the i915_address_space

2016-01-11 Thread Chris Wilson
For the global GTT (and aliasing GTT), the address space is owned by the device (it is a global resource) and so the per-file owner field is NULL. For per-process GTT (where we create an address space per context), each is owned by the opening file. We can use this ownership information to both dis

[Intel-gfx] [PATCH 077/190] drm/i915: Amalgamate GGTT/ppGTT vma debug list walkers

2016-01-11 Thread Chris Wilson
As we can now have multiple VMA inside the global GTT (with partial mappings, rotations, etc), it is no longer true that there may just be a single GGTT entry and so we should walk the full vma_list to count up the actual usage. In addition to unifying the two walkers, switch from multiplying the o

[Intel-gfx] [PATCH 053/190] drm/i915: Convert i915_semaphores_is_enabled over to early sanitize

2016-01-11 Thread Chris Wilson
Rather than recomputing whether semaphores are enabled, we can do that computation once during early initialisation as the i915.semaphores module parameter is now read-only. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_dma.c |

[Intel-gfx] [PATCH 066/190] drm/i915: Simplify request_alloc by returning the allocated request

2016-01-11 Thread Chris Wilson
It is simpler and leads to more readable code through the callstack if the allocation returns the allocated struct through the return value. The importance of this is that it no longer looks like we accidentally allocate requests as a side-effect of calling certain functions. Signed-off-by: Chris W

[Intel-gfx] [PATCH 057/190] drm/i915: Remove the identical implementations of request space reservation

2016-01-11 Thread Chris Wilson
Now that we share intel_ring_begin(), reserving space for the tail of the request is identical between legacy/execlists and so the tautology can be removed. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_request.c | 7 +++ drivers/gpu/drm/i915/intel_lrc.c| 15

[Intel-gfx] [PATCH 071/190] drm/i915: Simplify calling engine->sync_to

2016-01-11 Thread Chris Wilson
Since requests can no longer be generated as a side-effect of intel_ring_begin(), we know that the seqno will be unchanged during ring-emission. This predictability then means we do not have to check for the seqno wrapping around whilst emitting the semaphore for engine->sync_to(). Signed-off-by:

[Intel-gfx] [PATCH 074/190] drm/i915: Rename request->list to link for consistency

2016-01-11 Thread Chris Wilson
We use "list" to denote the list and "link" to denote an element on that list. Rename request->list to match this idiom. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 4 ++-- drivers/gpu/drm/i915/i915_gem.c | 12 ++-- drivers/gpu/drm/i915/i915_gem_req

[Intel-gfx] [PATCH 062/190] drm/i915: Rename extern functions operating on intel_engine_cs

2016-01-11 Thread Chris Wilson
Using intel_ring_* to refer to the intel_engine_cs functions is most confusing! Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c| 10 +++ drivers/gpu/drm/i915/i915_dma.c| 8 +++--- drivers/gpu/drm/i915/i915_drv.h| 4 +-- drivers/gpu/drm/i9

[Intel-gfx] [PATCH 025/190] drm/i915: Broadwell execlists needs exactly the same seqno w/a as legacy

2016-01-11 Thread Chris Wilson
In legacy mode, we use the gen6 seqno barrier to insert a delay after the interrupt before reading the seqno (as the seqno write is not flushed before the interrupt is sent, the interrupt arrives before the seqno is visible). Execlists ignored the evidence of igt. Note that it is harder, but not impo

[Intel-gfx] [PATCH 042/190] drm/i915: Clean up GPU hang message

2016-01-11 Thread Chris Wilson
Remove some redundant kernel messages as we deduce a hung GPU and capture the error state. v2: Fix "hang" vs "no progress" message whilst I was there Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_irq.c | 21 +++-- 1 file changed, 7 insertions(+), 14 deletions(-) dif

[Intel-gfx] [PATCH 081/190] drm/i915: i915_vma_move_to_active prep patch

2016-01-11 Thread Chris Wilson
This patch is broken out of the next just to remove the code motion from that patch and make it more readable. What we do here is move the i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three stages (read, write, fenced) together so that future modifications to active handling are a

[Intel-gfx] [PATCH 079/190] drm/i915: Reduce the pointer dance of i915_is_ggtt()

2016-01-11 Thread Chris Wilson
The multiple levels of indirection do nothing but hinder the compiler, and the pointer chasing turns out to be quite painful but painless to fix. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c| 13 ++--- drivers/gpu/drm/i915/i915_drv.h| 7 --- driver

[Intel-gfx] [PATCH 069/190] drm/i915: Remove duplicate golden render state init from execlists

2016-01-11 Thread Chris Wilson
Now that we use the same vfuncs for emitting the batch buffer in both execlists and legacy, the golden render state initialisation is identical between both. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_render_state.c | 22 -- drivers/gpu/drm/i915/i915_gem_render

[Intel-gfx] [PATCH 072/190] drm/i915: Execlists cannot pin a context without the object

2016-01-11 Thread Chris Wilson
Given that the intel_lr_context_pin cannot succeed without the object, we cannot reach intel_lr_context_unpin() without first allocating that object - so we can remove the redundant test. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c | 19 --- 1 file changed, 8

[Intel-gfx] [PATCH 049/190] drm/i915: Disable waitboosting for mmioflips/semaphores

2016-01-11 Thread Chris Wilson
Since commit a6f766f3975185af66a31a2cea2cd38721645999 Author: Chris Wilson Date: Mon Apr 27 13:41:20 2015 +0100 drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts and commit bcafc4e38b6ad03f48989b7ecaff03845b5b7acf Author: Chris Wilson Date: Mon Apr 27 13:41:21 2015 +0100

[Intel-gfx] [PATCH 003/190] drm/i915: Add an optional selection from i915 of CONFIG_MMU_NOTIFIER

2016-01-11 Thread Chris Wilson
userptr requires mmu-notifier for full unprivileged support. Most systems have mmu-notifier support already enabled as a requirement for virtualisation support, but we should make the option for i915 to take advantage of mmu-notifiers explicit (and enable by default so that regular userspace can ta

[Intel-gfx] [PATCH 086/190] drm/i915: Mark the context and address space as closed

2016-01-11 Thread Chris Wilson
When the user closes the context, mark it and the dependent address space as closed. As we use an asynchronous destruct method, this has two purposes. First it allows us to flag the closed context and detect internal errors if we try to create any new objects for it (as it is removed from the user's nam

[Intel-gfx] [PATCH 052/190] drm/i915: Treat ringbuffer writes as write to normal memory

2016-01-11 Thread Chris Wilson
Ringbuffers are now being written to either through LLC or WC paths, so treating them as simply iomem is no longer adequate. However, for the older !llc hardware, the hardware is documented as treating the TAIL register update as serialising, so we can relax the barriers when filling the rings (b

[Intel-gfx] [PATCH 085/190] drm/i915: Release vma when the handle is closed

2016-01-11 Thread Chris Wilson
In order to prevent a leak of the vma on shared objects, we need to hook into the object_close callback to destroy the vma on the object for this file. However, if we destroyed that vma immediately we may cause unexpected application stalls as we try to unbind a busy vma - hence we defer the unbind
