From: John Harrison
Various projects desire a mechanism for managing dependencies between
work items asynchronously. This can also include work items across
completely different and independent systems. For example, an
application wants to retrieve a frame from a video in device,
using it for rende
From: John Harrison
The change to the implementation of i915_gem_request_completed() means
that the lazy coherency flag is no longer used. This can now be
removed to simplify the interface.
For: VIZ-5190
Signed-off-by: John Harrison
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu
From: John Harrison
The request structure is reference counted. When the count reached
zero, the request was immediately freed and all associated objects
were unreferenced/unallocated. This meant that the driver mutex lock
must be held at the point where the count reaches zero. This was fine
while
From: Peter Lawthers
In the 3.14 kernel, a signaled fence was indicated by the status field
== 1. In 4.x, a status == 0 indicates signaled, status < 0 indicates error,
and status > 0 indicates active.
This patch wraps the check for a signaled fence in a function so that
callers no longer need t
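The status conventions described above can be sketched as a small wrapper. This is only an illustration, not the code from the patch: the helper name fence_status_signaled() and the LEGACY_FENCE_STATUS switch are hypothetical, and only the status values quoted in the message are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical build switch: 1 models the 3.14 convention, 0 the 4.x one. */
#ifndef LEGACY_FENCE_STATUS
#define LEGACY_FENCE_STATUS 0
#endif

static_assert(LEGACY_FENCE_STATUS == 0 || LEGACY_FENCE_STATUS == 1,
	      "LEGACY_FENCE_STATUS must be 0 or 1");

/* Hypothetical helper: callers ask "is it signaled?" instead of
 * comparing the raw status value themselves. */
static bool fence_status_signaled(int status)
{
#if LEGACY_FENCE_STATUS
	/* 3.14: a signaled fence has status == 1. */
	return status == 1;
#else
	/* 4.x: 0 is signaled, < 0 is an error, > 0 is still active. */
	return status == 0;
#endif
}
```

With the default (4.x) setting, fence_status_signaled(0) is true, while an active fence (status > 0) or a failed one (status < 0) is not reported as signaled.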
From: John Harrison
The intended usage model for struct fence is that the signalled status
should be set on demand rather than polled. That is, there should not
be a need for a 'signaled' function to be called every time the status
is queried. Instead, 'something' should be done to enable a signal
From: John Harrison
Added the '_complete' trace event which occurs when a fence/request is
signaled as complete. Also moved the notify event from the IRQ handler
code to inside the notify function itself.
v3: Added the current ring seqno to the notify trace point.
For: VIZ-5190
Signed-off-by: J
From: John Harrison
There is a construct in the Linux kernel called 'struct fence' that is
intended to keep track of work that is executed on hardware. I.e. it
solves the basic problem that the driver's 'struct
drm_i915_gem_request' is trying to address. The request structure does
quite a lot more
From: Maarten Lankhorst
Debug output assumes all sync points are built on top of Android sync points
and when we start creating them from dma-fences will NULL ptr deref unless
taught about this.
v4: Corrected patch ownership.
Signed-off-by: Maarten Lankhorst
Signed-off-by: Tvrtko Ursulin
Cc:
From: Maarten Lankhorst
This allows users of dma fences to create an Android fence.
v2: Added kerneldoc. (Tvrtko Ursulin).
v4: Updated comments from review feedback by Maarten.
Signed-off-by: Maarten Lankhorst
Signed-off-by: Tvrtko Ursulin
Cc: Maarten Lankhorst
Cc: Daniel Vetter
Cc: Jesse B
From: John Harrison
The sync code has a facility for dumping current state information via
debugfs. It also has a way to re-use the same code for dumping to the
kernel log on an internal error. However, the redirection was rather
clunky and split the output across multiple prints at arbitrary
bou
From: John Harrison
The fence object used inside the request structure requires a sequence
number. Although this is not used by the i915 driver itself, it could
potentially be used by non-i915 code if the fence is passed outside of
the driver. This is the intention as it allows external kernel dr
From: John Harrison
There is a construct in the Linux kernel called 'struct fence' that is
intended to keep track of work that is executed on hardware. I.e. it
solves the basic problem that the driver's 'struct
drm_i915_gem_request' is trying to address. The request structure does
quite a lot more
On Fri, Dec 11, 2015 at 05:05:11AM +, Jindal, Sonika wrote:
> How about following instead of two levels of check in the while loop:
>
> unsigned int retry = 3;
>
> do {
> live_status = intel_digital_port_connected(dev_priv,
> hdmi_to_dig_port(intel_hdmi));
>
On Thu, 2015-12-10 at 23:14 +0100, Rafael J. Wysocki wrote:
> On Thursday, December 10, 2015 11:20:40 PM Imre Deak wrote:
> > On Thu, 2015-12-10 at 22:42 +0100, Rafael J. Wysocki wrote:
> > > On Thursday, December 10, 2015 10:36:37 PM Rafael J. Wysocki
> > > wrote:
> > > > On Thursday, December 10,
On 11/12/15 12:19, Tvrtko Ursulin wrote:
On 11/12/15 11:22, Ankitprasad Sharma wrote:
On Wed, 2015-12-09 at 14:06 +, Tvrtko Ursulin wrote:
Hi,
On 09/12/15 12:46, ankitprasad.r.sha...@intel.com wrote:
From: Ankitprasad Sharma
[snip!]
+/**
+ * Requested flags (currently used fo
On 11/12/15 05:16, Ankitprasad Sharma wrote:
On Thu, 2015-12-10 at 14:15 +, Tvrtko Ursulin wrote:
On 10/12/15 13:17, Ankitprasad Sharma wrote:
On Thu, 2015-12-10 at 09:43 +, Tvrtko Ursulin wrote:
Hi,
Two more comments below:
On 09/12/15 12:46, ankitprasad.r.sha...@intel.com wrote:
On Fri, Dec 11, 2015 at 12:19:09PM +, Dave Gordon wrote:
> On 10/12/15 08:58, Daniel Vetter wrote:
> >On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:
> >>I think I missed i915_gem_phys_pwrite().
> >>
> >>i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
> >>set_
On 11/12/15 11:22, Ankitprasad Sharma wrote:
On Wed, 2015-12-09 at 14:06 +, Tvrtko Ursulin wrote:
Hi,
On 09/12/15 12:46, ankitprasad.r.sha...@intel.com wrote:
From: Ankitprasad Sharma
Extend the drm_i915_gem_create structure to add support for
creating Stolen memory backed objects. Adde
On 10/12/15 08:58, Daniel Vetter wrote:
On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:
I think I missed i915_gem_phys_pwrite().
i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
set_to_gtt_domain()), but isn't called for all cases (or can return before
the set_d
Hi,
Some random comments, mostly from the point of view of solving the
thundering herd problem.
On 23/11/15 11:34, john.c.harri...@intel.com wrote:
From: John Harrison
The intended usage model for struct fence is that the signalled status
should be set on demand rather than polled. That is
On Wed, 2015-12-09 at 14:06 +, Tvrtko Ursulin wrote:
> Hi,
>
> On 09/12/15 12:46, ankitprasad.r.sha...@intel.com wrote:
> > From: Ankitprasad Sharma
> >
> > Extend the drm_i915_gem_create structure to add support for
> > creating Stolen memory backed objects. Added a new flag through
> > whic
On 12/11/2015 04:55 PM, Thulasimani, Sivakumar wrote:
On 12/10/2015 8:32 PM, Ville Syrjälä wrote:
On Thu, Dec 10, 2015 at 08:09:01PM +0530, Thulasimani, Sivakumar wrote:
On 12/10/2015 7:08 PM, Ville Syrjälä wrote:
On Thu, Dec 10, 2015 at 03:15:37PM +0200, Ville Syrjälä wrote:
On Thu, Dec 1
After we reset the GPU and discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and ready for fresh work). The impact of this is
negligible, as all the requests will be considered completed by this
point, it just brings
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 3 ---
1 fi
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
we only need to compute the elapsed time for a timed wait.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i91
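One way to read the description above is that the clock is only consulted when a timeout was actually supplied. The following userspace sketch illustrates that idea under stated assumptions: wait_with_optional_timeout(), read_clock_ns() and the clock_reads counter are hypothetical names, and clock_gettime() stands in for the kernel's ktime_get_raw_ns().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Instrumentation for this sketch only: counts how often the clock is read. */
static unsigned int clock_reads;

static int64_t read_clock_ns(void)
{
	struct timespec ts;

	clock_reads++;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical wait wrapper. timeout_ns is NULL for an untimed wait; for a
 * timed wait, *timeout_ns is reduced by the elapsed time (clamped at zero).
 */
static void wait_with_optional_timeout(void (*do_wait)(void), int64_t *timeout_ns)
{
	int64_t start = 0;

	assert(do_wait != NULL);

	/* Only a timed wait needs a start timestamp. */
	if (timeout_ns)
		start = read_clock_ns();

	do_wait();

	/* Only a timed wait needs the elapsed time. */
	if (timeout_ns) {
		int64_t remaining = *timeout_ns - (read_clock_ns() - start);

		*timeout_ns = remaining > 0 ? remaining : 0;
	}
}

/* Trivial wait body used to exercise the wrapper. */
static void no_op_wait(void)
{
}
```

An untimed call, wait_with_optional_timeout(no_op_wait, NULL), never touches the clock, while a timed call reads it exactly twice.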
Describe the intent of boosting the GPU frequency to maximum before
waiting on the GPU.
RPS waitboosting was introduced with
commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson
Date: Wed Sep 25 17:34:56 2013 +0100
drm/i915: Boost RPS frequency for CPU stalls
but lacked a
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 6 ++
drivers/gpu/drm/i915/i915_irq.c | 7 +
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e.
If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).
Signed-off-by: Chris Wilso
When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quickly as possible. However, if it is a slow
request, we wan
Ideally, we want to automagically have the GPU respond to the
instantaneous load by reclocking itself. However, reclocking occurs
relatively slowly, and to the client waiting for a result from the GPU,
too late. To compensate and reduce the client latency, we allow the
first wait from a client to b
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.
For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.
Signed-off-by: Chris Wilson
Cc: Daniel Vetter
---
drivers/gpu/drm/i915/intel_ringbuffer
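The difference between a compiler barrier and a single forced load can be illustrated in userspace. The macro below is a simplified stand-in for the kernel's READ_ONCE() (scalar types only, relying on the GNU __typeof__ extension), and struct hw_status_page is a hypothetical layout, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro: the volatile-qualified access
 * makes the compiler perform the load instead of reusing a cached value. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical status page: written by the device, read by the CPU. */
struct hw_status_page {
	uint32_t seqno;
};

static uint32_t read_hws_seqno(struct hw_status_page *hws)
{
	assert(hws != NULL);

	/* One explicit load per call, with no separate barrier() needed. */
	return READ_ONCE(hws->seqno);
}
```

The intent is stated at the point of the access itself, which is the property the commit message appears to be after.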
If we do not have lowlevel support for resetting the GPU, or if the user
has explicitly disabled resetting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expect
Reporting -EIO from i915_wait_request() has proven very problematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.
If we reset the GPU or have detected a hang and wish to reset the
GPU, t
In legacy mode, we use the gen6 seqno barrier to insert a delay after
the interrupt before reading the seqno (as the seqno write is not
flushed before the interrupt is sent, the interrupt arrives before the
seqno is visible). Execlists ignored the evidence of igt.
Signed-off-by: Chris Wilson
---
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct lock
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by inc
We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall so not forcing it
will eliminate one jitter from the early system. Using the testcases, we
have very determinist
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batch
This is principally a little bit of syntactic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do mul
Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.
v2: Check for logical invers
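The policy described above can be sketched in userspace as follows. This is a rough illustration rather than the driver's __i915_spin_request(): the struct layouts are hypothetical, and it assumes seqnos are assigned consecutively, so a request counts as "currently being processed" once its predecessor's seqno has been reached.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define SPIN_TIMEOUT_NS 2000 /* the 2 microsecond spin window from the text */

struct engine {                 /* hypothetical stand-in */
	volatile uint32_t hw_seqno; /* last seqno the GPU has completed */
};

struct request {                /* hypothetical stand-in */
	uint32_t seqno;
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Signed difference keeps the comparison correct across seqno wraparound. */
static bool seqno_passed(uint32_t hw, uint32_t target)
{
	return (int32_t)(hw - target) >= 0;
}

/* Returns true if the request completed within the spin window. */
static bool spin_request(const struct engine *e, const struct request *rq)
{
	uint64_t deadline;

	assert(e != NULL && rq != NULL);

	/* Still queued behind other work: do not burn CPU cycles on it. */
	if (!seqno_passed(e->hw_seqno, rq->seqno - 1))
		return false;

	deadline = now_ns() + SPIN_TIMEOUT_NS;
	do {
		if (seqno_passed(e->hw_seqno, rq->seqno))
			return true;
	} while (now_ns() < deadline);

	return false;
}
```

A caller would fall back to the interrupt-driven wait whenever spin_request() returns false.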
Now that the reset_counter is stored on the request, we can rearrange
the code to handle reading the counter versus waiting during the atomic
modesetting for readability (by deleting the hairiest of codes).
Signed-off-by: Chris Wilson
Cc: Daniel Vetter
---
drivers/gpu/drm/i915/intel_display.c |
We can forgo queuing the hangcheck from the start of every request to
until we wait upon a request. This reduces the overhead of every
request, but may increase the latency of detecting a hang. However, if
nothing ever waits upon a hang, did it ever hang? It also improves the
robustness of the wa
If we have multiple waiters, we may find that many complete on the same
wake up. If we first inspect the seqno from the CPU cache, we may reduce
the number of heavyweight coherent seqno reads we require.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 14 ++
1 file
Now that we have split out the seqno-barrier from the
engine->get_seqno() callback itself, we can move the users of the
seqno-barrier to the required callsites simplifying the common code and
making the required workaround handling much more explicit.
Signed-off-by: Chris Wilson
---
drivers/gpu/
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 10 ++--
drivers/gpu/drm/i915/i915
The biggest change is the revised bottom-half for handling user
interrupts (now we use the waiter on the oldest request as the
bottom-half). That and the review feedback from Daniel on handling resets
(and hangcheck) during the wait. Oh, and some interrupt/seqno timing review.
Available from
http://c
The busywait in __i915_spin_request() does not respect pending signals
and so may consume the entire timeslice for the task instead of
returning to userspace to handle the signal.
In the worst case this could cause a delay in signal processing of 20ms,
which would be a noticeable jitter in cursor
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_dma.c | 11 ---
drivers/gpu/drm/i915/i915_drv.h | 1 -
drivers/gpu/d
In order to simplify the next couple of patches, extract the
lazy_coherency optimisation out of the engine->get_seqno() vfunc into
its own callback.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 6 ++---
drivers/gpu/drm/i915/i915_drv.h | 12 ++
driv
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principal
caller.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 17 -
drivers/gpu/drm/i915/i915_irq.c | 16
2 fil
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to
If, when we store the reset_counter for the operation, we ensure that
it is not wedged or in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can just
compare the operation's reset epoch against the current counter to see
if we need
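The reset-epoch idea from the two descriptions above can be sketched with C11 atomics. Everything named here is hypothetical: the struct, the helpers, and the -EAGAIN return value are illustrative choices, and the sketch assumes the low bit of the counter marks a reset in progress and that each phase of a reset increments the counter once.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RESET_IN_PROGRESS 0x1u /* assumed: low bit set while a reset runs */

static atomic_uint reset_counter; /* hypothetical global reset epoch */

struct request {               /* hypothetical stand-in */
	unsigned int reset_epoch;  /* counter value sampled at construction */
};

/* Record the epoch once, refusing to start while a reset is under way. */
static int request_init(struct request *rq)
{
	unsigned int epoch = atomic_load(&reset_counter);

	assert(rq != NULL);
	if (epoch & RESET_IN_PROGRESS)
		return -EAGAIN; /* illustrative error code only */

	rq->reset_epoch = epoch;
	return 0;
}

/* Later checks are a plain comparison against the stored epoch. */
static bool request_reset_occurred(const struct request *rq)
{
	assert(rq != NULL);
	return atomic_load(&reset_counter) != rq->reset_epoch;
}

/* Assumed scheme: one increment sets the low bit, the next clears it. */
static void reset_begin(void)
{
	atomic_fetch_add(&reset_counter, 1);
}

static void reset_end(void)
{
	atomic_fetch_add(&reset_counter, 1);
}
```

Because a request can only sample an even (idle) counter value, any later reset necessarily changes the counter, so the comparison in request_reset_occurred() is sufficient.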
In order to ensure seqno/irq coherency, we currently read a ring register.
We are not sure quite how it works, only that it does. Experiments show
that e.g. doing a clflush(seqno) instead is not sufficient, but we can
remove the forcewake dance from the mmio access.
v2: Baytrail wants a clflush too.
On 12/10/2015 8:32 PM, Ville Syrjälä wrote:
On Thu, Dec 10, 2015 at 08:09:01PM +0530, Thulasimani, Sivakumar wrote:
On 12/10/2015 7:08 PM, Ville Syrjälä wrote:
On Thu, Dec 10, 2015 at 03:15:37PM +0200, Ville Syrjälä wrote:
On Thu, Dec 10, 2015 at 03:01:02PM +0530, Kumar, Shobhit wrote:
On
On 10/12/15 22:14, Rafael J. Wysocki wrote:
On Thursday, December 10, 2015 11:20:40 PM Imre Deak wrote:
On Thu, 2015-12-10 at 22:42 +0100, Rafael J. Wysocki wrote:
On Thursday, December 10, 2015 10:36:37 PM Rafael J. Wysocki wrote:
On Thursday, December 10, 2015 11:43:50 AM Imre Deak wrote:
O
On Fri, 11 Dec 2015 07:07:53 +0100,
Libin Yang wrote:
>
> >>> diff --git a/drivers/gpu/drm/i915/intel_audio.c
> >>> b/drivers/gpu/drm/i915/intel_audio.c
> >>> index 9aa83e7..5ad2e66 100644
> >>> --- a/drivers/gpu/drm/i915/intel_audio.c
> >>> +++ b/drivers/gpu/drm/i915/intel_audio.c
> >>> @@ -262,
>
>
>-Original Message-
>From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel Vetter
>Sent: Thursday, December 10, 2015 12:53 PM
>To: Morton, Derek J
>Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org; Wood, Thomas
>Subject: Re: [Intel-gfx] [PATCH i-g-t] gem_flink_race/p
From: Dhanya
This patch will verify color correction capability of a display driver.
Gamma/CSC/De-gamma for SKL/BXT supported.
Signed-off-by: Dhanya
---
tests/.gitignore | 1 +
tests/Makefile.sources | 1 +
tests/kms_color.c | 684
Hi Dave -
Here are some i915 fixes for v4.4, sorry for being late this week.
BR,
Jani.
The following changes since commit 527e9316f8ec44bd53d90fb9f611fa752bb9:
Linux 4.4-rc4 (2015-12-06 15:43:12 -0800)
are available in the git repository at:
git://anongit.freedesktop.org/drm-intel ta
This patch wraps the get_ddi_pll() methods for
SKL/BXT/HSW+ with a common intel_get_ddi_pll()
method, and exports it, so that it can be shared
by other users also.
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/intel_display.c | 18 --
drivers/gpu/drm/i915/intel_drv.h |
To support USB type C alternate DP mode, the display driver needs to
know the number of lanes required by the DP panel as well as number
of lanes that can be supported by the type-C cable. Sometimes, the
type-C cable may limit the bandwidth even if Panel can support
more lanes. To address these sce
This patch exports the intel_{enable/disable}_shared_dpll
methods so that they can be called from other files also.
Subsequent patches need to call this from intel_ddi.c
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/intel_display.c | 4 ++--
drivers/gpu/drm/i915/intel_drv.h | 2 ++
2 f
Looping over the crtc list and finding an unused crtc
has users other than load_detect(). Hence move it to
a common function so that we can re-use the logic.
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/intel_display.c | 37 ++--
drivers/gpu/drm/i915/intel_
Do not call intel_get_shared_dpll() if there exists a
valid shared DPLL already.
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/intel_ddi.c | 70
drivers/gpu/drm/i915/intel_display.c | 2 +-
drivers/gpu/drm/i915/intel_drv.h | 2 +-
3 files chan
We do not need to loop through crtc_state to get the
encoder if we already have a valid one available.
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/intel_ddi.c | 11 ---
drivers/gpu/drm/i915/intel_display.c | 2 +-
drivers/gpu/drm/i915/intel_drv.h | 3 ++-
3 files change
Retrying with reduced lanes/bw and updating the final
available lanes/bw to DPCD is needed for upfront link
train logic. Hence, this patch adds these methods
and exports them so that these can be called from
other files like ddi.c/display.c.
Signed-off-by: Durgadoss R
---
drivers/gpu/drm/i915/in
This patch series adds upfront link training support to enable
USB type C based DP on BXT platform.
To support USB type C alternate DP mode, the display driver needs to
know the number of lanes required by the DP panel as well as number
of lanes that can be supported by the type-C cable. Sometimes
On Thu, Dec 03, 2015 at 10:14:54AM +0100, Daniel Vetter wrote:
> On Tue, Dec 01, 2015 at 11:05:35AM +, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/intel_display.c
> > b/drivers/gpu/drm/i915/intel_display.c
> > index 4447e73b54db..73c61b94f7fd 100644
> > --- a/drivers/gpu/drm/i915
When testing this patch on my BXT-M I received this error message
Hardware name: Intel Corp. Broxton M/RVP, BIOS
BXTM_IFWI_X64_R_2015_49_2_03 11/25/2015
[0.00] [ cut here ]
[0.00] WARNING: CPU: 0 PID: 0 at drivers/iommu/dmar.c:829
warn_invalid_dmar+0x81/0xa
On 12/4/2015 8:52 PM, Imre Deak wrote:
On Thu, 2015-12-03 at 16:43 -0800, Bob Paauwe wrote:
On Tue, 1 Dec 2015 19:43:05 +0200
Imre Deak wrote:
On Tue, 2015-12-01 at 09:22 -0800, Bob Paauwe wrote:
On Tue, 1 Dec 2015 15:56:55 +0200
Imre Deak wrote:
On Mon, 2015-11-30 at 16:23 -0800, Bob Paau
On Tue, 2015-12-01 at 04:17 +0530, Deepak M wrote:
> Currently there is an entry to get the complete opregion
> dump, this patch adds entry to get the VBT alone from
> the opregion.
>
> Adding this entry helps developer to get the VBT easily,
> instead of following the old way where we get the comp
On Tue, 2015-12-01 at 04:17 +0530, Deepak M wrote:
> Calling the validate_vbt before assigning the opregion vbt blob.
> Size of the VBT blob can't be more than 6KB when VBT is present
> in mailbox 4.
>
> Cc: Jani Nikula
Tested-by: Mika Kahola
> Signed-off-by: Deepak M
> ---
> drivers/gpu/drm/i91
On Tue, 2015-12-01 at 04:17 +0530, Deepak M wrote:
> Mailbox 5 is the BIOS to Driver Notification mailbox, intended
> to support BIOS to Driver event notification or data storage
> for BIOS to Driver data synchronization purpose. Mailbox 5 is
> the extension of mailbox 3.
>
> Cc: Jani Nikula
Tested
RC6 setup is shared between the BIOS and the driver. The BIOS sets up a subset of the RC6
setup registers. If those are not set up, the driver should not enable RC6.
For implementing this, driver can check RC_CTRL0 and RC_CTRL1 values
to know if BIOS has enabled HW/SW RC6.
This will also enable user to control RC6 using B
On Tue, 2015-12-01 at 04:17 +0530, Deepak M wrote:
> v3: rebase
>
> Cc: Jani Nikula
Tested-by: Mika Kahola
> Signed-off-by: Deepak M
> ---
> drivers/gpu/drm/i915/intel_opregion.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_opregion.c
On Tue, 2015-12-01 at 04:17 +0530, Deepak M wrote:
> From: vkorjani
>
> New sequence element for i2c has been added in the
> mipi sequence block of the VBT. This patch parses
> and executes the i2c sequence.
>
> v2: Add i2c_put_adapter call(Jani), rebase
> v3: corrected the retry loop(Jani), reba
On Tue, Dec 08, 2015 at 06:41:54PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä
>
> Show a sensible name for the plane in debug messages. The driver
> may supply its own name, otherwise the core generates the name
> ("plane-0", "plane-1" etc.).
>
> v2: kstrdup() the name passe