On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote:
> gem_concurrent_blit tries to ensure that it doesn't try and run a test
> that would grind the system to a halt, i.e. unexpectedly cause swap
> thrashing. It currently calls intel_require_memory(), but outside of
> the subtest (as the t
On Fri, Jan 08, 2016 at 11:29:40AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> Same effect for slightly less source code and resulting binary.
>
> Signed-off-by: Tvrtko Ursulin
Reviewed-by: Daniel Vetter
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 15 ++-
> 1 file chan
On Fri, Jan 08, 2016 at 11:29:41AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> Looks like the sleeping loop in __i915_wait_request can be
> simplified by using io_schedule_timeout instead of setting
> up and destroying a timer.
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Chris Wilson
On Fri, Jan 08, 2016 at 11:29:42AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> There is no need to check on what Gen we are running on every
> interrupt and every command submission. We can instead set up
> some of that when engines are initialized, store it in the
> engine structure
On Fri, Jan 08, 2016 at 11:29:43AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> LRCA can change only when it goes from unpinned to pinned so it
> makes sense to check its alignment at that point rather than at
> every batch buffer submission.
>
> Furthermore, if we check it at pin tim
On Fri, Jan 08, 2016 at 11:29:44AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> We are not allowed to call i915_gem_obj_ggtt_offset from irq
> context without the big kernel lock held.
>
> LRCA lifetime is well defined so cache it so it can be looked up
> cheaply from the interrupt co
On Fri, Jan 08, 2016 at 11:29:45AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> No need to call ktime_get_raw_ns twice per unlimited wait and can
> also eliminate a local variable.
>
> Signed-off-by: Tvrtko Ursulin
> ---
> drivers/gpu/drm/i915/i915_gem.c | 12 +++-
> 1 file ch
On Fri, Jan 08, 2016 at 01:29:14PM +, Tvrtko Ursulin wrote:
>
> On 08/01/16 11:29, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin
> >
> >Purpose is to catch places which iterate the object VMA list
> >without holding the big lock.
> >
> >Implemented by open coding list_for_each_entry to make t
On Fri, Jan 08, 2016 at 11:44:04AM +, Chris Wilson wrote:
> On Fri, Jan 08, 2016 at 11:29:46AM +, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin
> >
> > Purpose is to catch places which iterate the object VMA list
> > without holding the big lock.
> >
> > Implemented by open coding list_
On Fri, Jan 08, 2016 at 11:29:50AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.
>
> Signed-off-by: Tvrtko Ursulin
> ---
> drivers/gpu/drm/i915/intel_lrc.c| 3 +--
> d
On Fri, Jan 08, 2016 at 11:29:48AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin
>
> Engine initialization would have failed if those two weren't
> pinned and calling i915_gem_obj_is_pinned is illegal from irq
> context without the big lock held.
>
> Signed-off-by: Tvrtko Ursulin
Reviewe
On Mon, Jan 11, 2016 at 01:09:50PM +0530, Goel, Akash wrote:
>
>
> On 1/10/2016 11:09 PM, Chris Wilson wrote:
> >On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.g...@intel.com wrote:
> >>From: Akash Goel
> >>
> >>Gen9 has an additional address translation hardware support in form of
> >>Tiled Re
On Mon, Jan 11, 2016 at 09:00:13AM +0100, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote:
> > gem_concurrent_blit tries to ensure that it doesn't try and run a test
> > that would grind the system to a halt, i.e. unexpectedly cause swap
> > thrashing. It current
== Summary ==
Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly:
2016y-01m-11d-07h-30m-16s UTC integration manifest
Test gem_storedw_loop:
Subgroup basic-render:
dmesg-warn -> PASS (bdw-ultra)
dmesg-warn -> PASS (skl-i7k-2) UN
== Summary ==
HEAD is now at ff88655 drm-intel-nightly: 2016y-01m-11d-07h-30m-16s UTC
integration manifest
Applying: drm/i915: Use passed plane state for sprite planes, v4.
Using index info to reconstruct a base tree...
M drivers/gpu/drm/i915/intel_drv.h
M drivers/gpu/drm/i915/intel_s
On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 08:44:29AM +, Chris Wilson wrote:
> > Some stress tests create both the signal helper and a lot of competing
> > processes. In these tests, the parent is just waiting upon the children,
> > and the intentio
On Mon, Jan 11, 2016 at 08:50:43AM +0100, Daniel Vetter wrote:
> On Fri, Jan 08, 2016 at 06:58:45PM +0200, Mika Kuoppala wrote:
> > With gen9+ the edram capabilities are defined so
> > that we can calculate the edram (ellc) size accordingly.
> >
> > Note that there are undefined combinations for s
On Mon, Jan 11, 2016 at 08:54:59AM +, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
> > On Fri, Jan 08, 2016 at 08:44:29AM +, Chris Wilson wrote:
> > > Some stress tests create both the signal helper and a lot of competing
> > > processes. In these tes
On Mon, Jan 11, 2016 at 08:52:24AM +, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 09:00:13AM +0100, Daniel Vetter wrote:
> > On Fri, Jan 08, 2016 at 09:10:38AM +, Chris Wilson wrote:
> > > gem_concurrent_blit tries to ensure that it doesn't try and run a test
> > > that would grind the sy
== Summary ==
Built on ff88655b3a5467bbc3be8c67d3e05ebf182557d3 drm-intel-nightly:
2016y-01m-11d-07h-30m-16s UTC integration manifest
Test gem_storedw_loop:
Subgroup basic-render:
pass -> DMESG-WARN (skl-i5k-2) UNSTABLE
dmesg-warn -> PASS (bdw-
As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses and guarantees
serialisation of the code against concurrent access just by calling
i915_gem_release
As we add the VMA to the request early, it may be cancelled during
execbuf reservation. This will leave the context object pointing to a
dangling request; i915_wait_request() simply skips the wait and so we
may unbind the object whilst it is still active.
However, if at any point we make a change
When userspace closes a handle, we remove it from the file->object_idr
and then tell the driver to drop its references to that file/handle.
However, as the file/handle is already available again for reuse, it may
be reallocated back to userspace and active on a new object before the
driver has had
Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabl
This is principally a little bit of syntactic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do mul
Now that the reset_counter is stored on the request, we can rearrange
the code to handle reading the counter versus waiting during the atomic
modesetting for readability (by deleting the hairiest of codes).
Signed-off-by: Chris Wilson
Cc: Daniel Vetter
Reviewed-by: Daniel Vetter
---
drivers/gp
Our driver compiles clean (nowadays thanks to 0day) but for me, at least,
it would be beneficial if the compiler threw an error rather than a
warning when it found a piece of suspect code. (I use this to
compile-check patch series and want to break on the first compiler error
in order to fix the pa
In order to ensure seqno/irq coherency, we currently read a ring register.
We are not sure quite how it works, only that it does. Experiments show
that e.g. doing a clflush(seqno) instead is not sufficient, but we can
remove the forcewake dance from the mmio access.
v2: Baytrail wants a clflush too.
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to
Now that we have split out the seqno-barrier from the
engine->get_seqno() callback itself, we can move the users of the
seqno-barrier to the required callsites simplifying the common code and
making the required workaround handling much more explicit.
Signed-off-by: Chris Wilson
---
drivers/gpu/
We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall so not forcing it
will eliminate one jitter from the early system. Using the testcases, we
have very determinist
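For readers unfamiliar with what "seqno wraparound works fine" requires, here is a minimal sketch of a wrap-safe comparison on a 32-bit sequence counter. It is not taken from the patch: the helper name seqno_passed() is hypothetical, and the sketch assumes the two values being compared are always less than 2^31 apart.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns true once `current` has reached or passed `target`, treating the
 * 32-bit seqno space as circular. The unsigned subtraction wraps modulo 2^32
 * and the result is then interpreted as a signed distance. (The conversion
 * to int32_t is implementation-defined in ISO C, but behaves as two's
 * complement on the compilers commonly used for this kind of code.)
 */
static bool seqno_passed(uint32_t current, uint32_t target)
{
	return (int32_t)(current - target) >= 0;
}
```

With this form, seqno_passed(2, 0xfffffffe) is true even though 2 is numerically smaller, because the counter has wrapped in between.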
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 6 ++
drivers/gpu/drm/i915/i915_irq.c | 10 +++
If we do not have low-level support for resetting the GPU, or if the user
has explicitly disabled resetting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expect
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this,
allow userspace to set a flag on the context that any hang emanating
from that context
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 3 ---
1 fi
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.
For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 15
Reporting -EIO from i915_wait_request() has proven very troublesome
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.
If we reset the GPU or have detected a hang and wish to reset the
GPU, t
In order to simplify the next couple of patches, extract the
lazy_coherency optimisation out of the engine->get_seqno() vfunc into
its own callback.
v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
readi
Ideally, we want to automagically have the GPU respond to the
instantaneous load by reclocking itself. However, reclocking occurs
relatively slowly, and to the client waiting for a result from the GPU,
too late. To compensate and reduce the client latency, we allow the
first wait from a client to b
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the vmapping into the
obj->pages lifetime, then we can reuse an obj->vmapping for both and at
the same time couple it into the shrinker.
v2: Mark the failable kmalloc() as __GFP_NO
We want to restrict waitboosting to known process contexts, where we can
track which clients are receiving waitboosts and prevent excessive power
wasting. For fence_wait() we do not have any client tracking and so that
leaves it open to abuse.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_dma.c | 11 ---
drivers/gpu/drm/i915/i915_drv.h | 1 -
drivers/gpu/d
If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM requ
Now that we have disambiguated ring and engine, we can use the clearer
and more consistent name for the intel_ringbuffer pointer in the
request.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c| 8 +-
drivers/gpu/drm/i915/i915_gem_context.c| 2 +-
drivers/gpu/d
The request tells us where to read the ringbuf from, so use that
information to simplify the error capture. If no request was active at
the time of the hang, the ring is idle and there is no information
inside the ring pertaining to the hang.
Note carefully that this will reduce the amount of info
As we only ever keep the first error state around, we can avoid some
work that can be quite intrusive if we don't record the error the second
time around. This does move the race whereby the user could discard one
error state as the second is being captured, but that race exists in the
current code
In order to disambiguate between the pointer to the intel_engine_cs
(called ring) and the intel_ringbuffer (called ringbuf), rename
s/ring/engine/.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 11 +--
drivers/gpu/drm/i915/i915_drv.h | 2 +-
drive
Both perform the same actions with more or less indirection, so just
unify the code.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c| 2 +-
drivers/gpu/drm/i915/i915_gem_context.c| 8 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 34 -
drivers/gpu/d
Describe the intent of boosting the GPU frequency to maximum before
waiting on the GPU.
RPS waitboosting was introduced with
commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson
Date: Wed Sep 25 17:34:56 2013 +0100
drm/i915: Boost RPS frequency for CPU stalls
but lacked a
Now that we derive requests from struct fence, swap over to its
nomenclature for references. It's shorter and more idiomatic across the
kernel.
s/i915_gem_request_reference/i915_gem_request_get/
s/i915_gem_request_unreference/i915_gem_request_put/
Signed-off-by: Chris Wilson
---
drivers/gpu/drm
If, when we store the reset_counter for the operation, we ensure that
it is not wedged or in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can just
compare the operation's reset epoch against the current counter to see
if we need
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 10 ++--
drivers/gpu/drm/i915/i915
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().
So refactor my usage into drm_malloc_gfp().
Signed-off-by: Chris Wilson
Cc: dri-de...@lists.freedes
In a few frequent cases, having a direct pointer to the drm_i915_private
from the request is very useful.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c| 7 +++---
drivers/gpu/drm/i915/i915_gem_context.c| 21 +-
drivers/gpu/drm/i915/i915_gem_exec
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
we only need to compute the elapsed time for a timed wait.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i91
In the next patch, request tracking is made more generic and for that we
need a new expanded struct and to separate out the logic changes from
the mechanical churn, we split out the structure renaming into this
patch.
v2: Writer's block. Add some spiel about why we track requests.
v3: Now i915_gem
Combine the near identical implementations of intel_logical_ring_begin()
and intel_ring_begin() - the only difference is that the logical wait
has to check for a matching ring (which is assumed by legacy).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_lrc.c| 141 ++--
Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly
the same thing, add MI_BATCHBUFFER_START to the request's ringbuffer -
we need only one vfunc.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 +--
drivers/gpu/drm/i915/i915_gem_render_state
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e.
If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).
v2: Move to a signaling fr
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_request.c | 8 +-
drivers/gpu/drm/i915/intel_lrc.c| 14 ++--
drivers/gpu/drm/i915/intel_ringbuffer.c | 129 +---
drivers/gpu/drm/i915/intel_ringbuffer.h | 21 +++---
4 files changed, 87 insertion
If we have multiple waiters, we may find that many complete on the same
wake up. If we first inspect the seqno from the CPU cache, we may reduce
the number of heavyweight coherent seqno reads we require.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 14 ++
1 file
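The idea in the description above (check the cheap, possibly stale value first and only fall back to the expensive coherent read when needed) can be sketched in plain C as follows. This is an illustration under stated assumptions, not the driver code: struct fake_engine, coherent_read() and request_completed() are hypothetical names, and the "coherent read" is simulated by copying a second field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an engine; not the i915 structure. */
struct fake_engine {
	uint32_t cached_seqno;       /* value visible without extra work */
	uint32_t hw_seqno;           /* what a coherent read would return */
	unsigned int coherent_reads; /* counts the expensive reads */
};

/* Wrap-safe "a has reached b" on a circular 32-bit counter. */
static bool passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Simulated heavyweight read: refreshes the cached value. */
static uint32_t coherent_read(struct fake_engine *e)
{
	e->coherent_reads++;
	e->cached_seqno = e->hw_seqno;
	return e->cached_seqno;
}

/* Try the cached value first; pay for the coherent read only on a miss. */
static bool request_completed(struct fake_engine *e, uint32_t target)
{
	if (passed(e->cached_seqno, target))
		return true;
	return passed(coherent_read(e), target);
}
```

When several waiters are woken together, the first one to miss refreshes the cached value and later waiters whose targets are already covered return without another coherent read.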
Space for flushing the GPU cache prior to completing the request is
preallocated and so cannot fail.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_context.c| 2 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +---
drivers/gpu/drm/i915/i915_gem_gtt.c| 18
Having ringbuf->ring point to an engine is confusing, so rename it once
again to ring->engine.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_guc_submission.c | 10 +++---
drivers/gpu/drm/i915/intel_lrc.c | 35 +--
drivers/gpu/drm/i915/intel_ringbuffer.c|
dma-buf provides a generic fence class for interoperation between
drivers. Internally we use the request structure as a fence, and so with
only a little bit of interfacing we can rebase those requests on top of
dma-buf fences. This will allow us, in the future, to pass those fences
back to userspac
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principal
caller.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 17 -
drivers/gpu/drm/i915/i915_irq.c | 16
2 fil
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by inc
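A minimal sketch of the low-bit scheme described above, for illustration only. The message is cut off, so this assumes the flag is cleared by incrementing the counter; it models only the low bit, not the second bit the description mentions, and every name here (RESET_IN_PROGRESS_FLAG, begin_reset(), finish_reset(), epoch_changed()) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the low "reset-in-progress" bit. */
#define RESET_IN_PROGRESS_FLAG 1u

static bool reset_in_progress(uint32_t reset_counter)
{
	return (reset_counter & RESET_IN_PROGRESS_FLAG) != 0;
}

/* A hang was detected: set the low bit so that waiters break out. */
static uint32_t begin_reset(uint32_t reset_counter)
{
	return reset_counter | RESET_IN_PROGRESS_FLAG;
}

/*
 * Assumption: the flag is cleared by incrementing. Adding one to an odd
 * counter clears the low bit and moves the counter to a new even value,
 * so anything stored before the hang no longer matches.
 */
static uint32_t finish_reset(uint32_t reset_counter)
{
	return reset_counter + 1;
}

/* Compare a previously stored counter against the current one. */
static bool epoch_changed(uint32_t stored, uint32_t current)
{
	return stored != current;
}
```

Starting from a counter of 4, begin_reset() gives 5 (flag set) and finish_reset() gives 6 (flag clear, new epoch).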
Now that we have (near) universal GPU recovery code, we can inject a
real hang from userspace and not need any fakery. Not only does this
mean that the testing is far more realistic, but we can simplify the
kernel in the process.
v2: Replace the i915_stop_rings with a dummy implementation as igt
e
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.
Signed-off-by: Chris Wilson
Cc: Daniel Vetter
---
drivers/gpu/drm/i915/intel_ringbuffer
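To illustrate the READ_ONCE() idiom mentioned above outside the kernel, here is a small userspace sketch. READ_ONCE_U32() is a simplified stand-in written for this example (the kernel's own macro is more general), and read_hws_seqno() is a hypothetical helper, not the function touched by the patch.

```c
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(): the volatile-qualified
 * access makes the compiler perform exactly one load of the object each
 * time the expression is evaluated, instead of caching or eliding it.
 */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Hypothetical helper: read one dword from a status page that another
 * agent (here, notionally the GPU) may update at any time. */
static uint32_t read_hws_seqno(const uint32_t *hws_page, unsigned int index)
{
	return READ_ONCE_U32(hws_page[index]);
}
```

Compared with a plain load followed by barrier(), the volatile access attaches the "do not optimise away" requirement to the specific read rather than to everything around it.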
With the introduction of requests, we amplified the number of atomic
refcounted objects we use and update every execbuffer; from none to
several references, and a set of references that need to be changed. We
also introduced interesting side-effects in the order of retiring
requests and objects.
I
Initialising the global GTT is tricky as we wish to use the drm_mm range
manager during the modesetting initialisation (to capture stolen
allocations from the BIOS) before we actually enable GEM. To overcome
this, we currently setup the drm_mm first and then carefully rebind
them.
Signed-off-by: C
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct lock
Now that emitting requests is identical between legacy and execlists, we
can use the same function to build up the ring for submitting to either
engine. (With the exception of i915_switch_contexts(), but in time that
will also be handled gracefully.)
Signed-off-by: Chris Wilson
---
drivers/gpu/d
We can defer queuing the hangcheck from the start of every request
until we wait upon a request. This reduces the overhead of every
request, but may increase the latency of detecting a hang. However, if
nothing ever waits upon a hang, did it ever hang? It also improves the
robustness of the wa
Perform s/ringbuf/ring/ on the context struct for consistency with the
ring/engine split.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c| 2 +-
drivers/gpu/drm/i915/i915_drv.h| 2 +-
drivers/gpu/drm/i915/i915_guc_submission.c | 6 +--
drivers/gpu/drm/i
Rather than persistently postponing the idle-work every time somebody
calls i915_gem_retire_requests() (potentially ensuring that we never
reach the idle state), queue the work the first time we detect all
requests are complete. Then if in 100ms, more requests have been queued,
we will abort the idl
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batch
When we call i915_vma_unbind(), we will wait upon outstanding rendering.
This will also trigger a retirement phase, which may update the object
lists. If we extend request tracking to the VMA itself (rather than
keep it at the encompassing object), then there is a potential that the
obj->vma_list
We know, by design, that whilst the GPU is active (and thus we are
throttling) the retire_worker is queued. Therefore attempting to requeue
it with queue_delayed_work() is a no-op and we can safely remove it.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 3 ---
1 file changed
Migrate the request operations out of the main body of i915_gem.c and
into their own C file for easier expansion.
v2: Move __i915_add_request() across as well
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/Makefile | 1 +
drivers/gpu/drm/i915/i915_drv.h | 205 +
For the global GTT (and aliasing GTT), the address space is owned by the
device (it is a global resource) and so the per-file owner field is
NULL. For per-process GTT (where we create an address space per
context), each is owned by the opening file. We can use this ownership
information to both dis
As we can now have multiple VMA inside the global GTT (with partial
mappings, rotations, etc), it is no longer true that there may just be a
single GGTT entry and so we should walk the full vma_list to count up
the actual usage. In addition to unifying the two walkers, switch from
multiplying the o
Rather than recomputing whether semaphores are enabled, we can do that
computation once during early initialisation as the i915.semaphores
module parameter is now read-only.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_dma.c |
It is simpler and leads to more readable code through the callstack if
the allocation returns the allocated struct through the return value.
The importance of this is that it no longer looks like we accidentally
allocate requests as a side-effect of calling certain functions.
Signed-off-by: Chris W
Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists and so the tautology
can be removed.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_request.c | 7 +++
drivers/gpu/drm/i915/intel_lrc.c| 15
Since requests can no longer be generated as a side-effect of
intel_ring_begin(), we know that the seqno will be unchanged during
ring-emission. This predictability then means we do not have to check
for the seqno wrapping around whilst emitting the semaphore for
engine->sync_to().
Signed-off-by:
We use "list" to denote the list and "link" to denote an element on that
list. Rename request->list to match this idiom.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
drivers/gpu/drm/i915/i915_gem.c | 12 ++--
drivers/gpu/drm/i915/i915_gem_req
Using intel_ring_* to refer to the intel_engine_cs functions is most
confusing!
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c| 10 +++
drivers/gpu/drm/i915/i915_dma.c| 8 +++---
drivers/gpu/drm/i915/i915_drv.h| 4 +--
drivers/gpu/drm/i9
In legacy mode, we use the gen6 seqno barrier to insert a delay after
the interrupt before reading the seqno (as the seqno write is not
flushed before the interrupt is sent, the interrupt arrives before the
seqno is visible). Execlists ignored the evidence of igt.
Note that is harder, but not impo
Remove some redundant kernel messages as we deduce a hung GPU and
capture the error state.
v2: Fix "hang" vs "no progress" message whilst I was there
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 21 +++--
1 file changed, 7 insertions(+), 14 deletions(-)
dif
This patch is broken out of the next just to remove the code motion from
that patch and make it more readable. What we do here is move the
i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three
stages (read, write, fenced) together so that future modifications to
active handling are a
The multiple levels of indirect do nothing but hinder the compiler and
the pointer chasing turns out to be quite painful but painless to fix.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c| 13 ++---
drivers/gpu/drm/i915/i915_drv.h| 7 ---
driver
Now that we use the same vfuncs for emitting the batch buffer in both
execlists and legacy, the golden render state initialisation is
identical between both.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_render_state.c | 22 --
drivers/gpu/drm/i915/i915_gem_render
Given that the intel_lr_context_pin cannot succeed without the object,
we cannot reach intel_lr_context_unpin() without first allocating that
object - so we can remove the redundant test.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_lrc.c | 19 ---
1 file changed, 8
Since
commit a6f766f3975185af66a31a2cea2cd38721645999
Author: Chris Wilson
Date: Mon Apr 27 13:41:20 2015 +0100
drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts
and
commit bcafc4e38b6ad03f48989b7ecaff03845b5b7acf
Author: Chris Wilson
Date: Mon Apr 27 13:41:21 2015 +0100
userptr requires mmu-notifier for full unprivileged support. Most
systems have mmu-notifier support already enabled as a requirement for
virtualisation support, but we should make the option for i915 to take
advantage of mmu-notifiers explicit (and enable by default so that
regular userspace can ta
When the user closes the context mark it and the dependent address space
as closed. As we use an asynchronous destruct method, this has two purposes.
First it allows us to flag the closed context and detect internal errors if
we try to create any new objects for it (as it is removed from the user's
nam
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documented as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (b
In order to prevent a leak of the vma on shared objects, we need to
hook into the object_close callback to destroy the vma on the object for
this file. However, if we destroyed that vma immediately we may cause
unexpected application stalls as we try to unbind a busy vma - hence we
defer the unbind