These tests exercise the userptr ioctl to create shared buffers
between CPU and GPU. They contain error and normal usage scenarios.
They also contain a couple of stress tests which copy buffers between
CPU and GPU. These tests rely on the softpin patch in order to pin buffers
to a certain VA.
On 29 November 2015 at 12:47, Daniel Vetter wrote:
> On Fri, Nov 27, 2015 at 07:37:53PM -0500, Ilia Mirkin wrote:
>> On Fri, Nov 27, 2015 at 10:40 AM, Emil Velikov
>> wrote:
>> > On 27 November 2015 at 15:10, Daniel Vetter
We have relied upon the sole caller (wait_ioctl) validating the timeout
argument. However, when waiting for multiple requests I forgot to ensure
that the timeout was still positive on the later requests. This is more
simply done inside __i915_wait_request.
Fixes a minor regression introduced in
We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall so not forcing it
will eliminate one jitter from the early system.
Advancing the global next_seqno after a GPU
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d59dd555e64..ccceb43f14ac 100644
Instead of querying the reset counter before every access to the ring,
query it the first time we touch the ring, and do a final compare when
submitting the request. For correctness, we need to then sanitize how
the reset_counter is incremented to prevent broken submission and
waiting across
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
we only need to compute the elapsed time for a timed wait.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every
When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quickly as possible. However, if it is a slow
request, we
If we do not have low-level support for resetting the GPU, or if the user
has explicitly disabled resetting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the
We can forgo queuing the hangcheck from the start of every request
until we wait upon a request. This reduces the overhead of every
request, but may increase the latency of detecting a hang. However, if
nothing ever waits upon a hang, did it ever hang? It also improves the
robustness of the
Now that we have split out the seqno-barrier from the
engine->get_seqno() callback itself, we can move the users of the
seqno-barrier to the required callsites simplifying the common code and
making the required workaround handling much more explicit.
Signed-off-by: Chris Wilson
Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.
v2: Check for logical
The first 3 patches are cc:stable for reducing the negative
side-effects of busywaiting. Following on from them, we have a set of
patches to prevent the thundering herd issue with multiple concurrent
waiters and to optimise the waiters. Since this has been flagged as
severely impacting some client
The busywait in __i915_spin_request() does not respect pending signals
and so may consume the entire timeslice for the task instead of
returning to userspace to handle the signal.
In the worst case this could cause a delay in signal processing of 20ms,
which would be a noticeable jitter in cursor
In order to simplify the next couple of patches, extract the
lazy_coherency optimisation out of the engine->get_seqno() vfunc into
its own callback.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 10 -
drivers/gpu/drm/i915/i915_drv.h
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 6 +--
After the GPU reset and we discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and ready for fresh work). The impact of this is
negligible, as all the requests will be considered completed by this
point, it just
On Fri, Nov 27, 2015 at 07:37:53PM -0500, Ilia Mirkin wrote:
> On Fri, Nov 27, 2015 at 10:40 AM, Emil Velikov
> wrote:
> > On 27 November 2015 at 15:10, Daniel Vetter wrote:
> >> It only leads to bloodshed and tears - we don't bother to restore
On Fri, Nov 27, 2015 at 03:29:52PM +, Emil Velikov wrote:
> On 27 November 2015 at 15:10, Daniel Vetter wrote:
> > This allows us to ditch a ton of ugly #ifdefs from a bunch of drm modeset
> > drivers.
> >
> > v2: Make the dummy function actually return a sane value,
Hello,
We launched Intel GPU Tools on 7 platforms: Skylake-Y, Braswell-M,
Broadwell-U, Baytrail-M, Haswell-U, Ivy Bridge and Sandy Bridge to
validate tag drm-intel-testing-2015-11-20 (kernel 4.4-rc1).
Results:
| Avail | Black | Skip | Not
On Saturday, November 28, 2015 10:34:24 AM Imre Deak wrote:
> The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
> suspend hooks as errors, but they still show up as errors in dmesg. Tune
> them down.
>
> One problem caused by this was noticed by Daniel: the i915 driver
>
On Sat, Nov 28, 2015 at 05:36:54AM +1000, Dave Airlie wrote:
> On 28 November 2015 at 05:05, Linus Torvalds
> wrote:
> > On Thu, Nov 19, 2015 at 8:07 PM, Dave Airlie wrote:
> >>
> >> core: Atomic fixes and Atomic helper fixes
> >> i915: Revert for
From: Akash Goel
When the object is moved out of CPU read domain, the cachelines
are not invalidated immediately. The invalidation is deferred until the
next time the object is brought back into the CPU read domain.
But the invalidation is done unconditionally, i.e. even for the
(Resent because of moderation)
This implements a highly needed feature in a minimal, non-intrusive way.
Consider a Limited Range display (like most TVs) where you want to watch a
decoded video. The TV is set to limited range and the intel driver also
uses full scaling Limited 16:235 mode, e.g. if
On 11/25/2015 3:30 PM, Daniel Vetter wrote:
> On Wed, Nov 25, 2015 at 02:57:47PM +0530, Goel, Akash wrote:
> > On 11/25/2015 2:51 PM, Daniel Vetter wrote:
> > > On Tue, Nov 24, 2015 at 10:39:38PM +, Chris Wilson wrote:
> > > > On Tue, Nov 24, 2015 at 07:14:31PM +0100, Daniel Vetter wrote:
> > > > > On Tue, Nov