[Intel-gfx] Canceled event: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (intel-gfx@lists.freedesktop.org)

2023-04-17 Thread mario . kleiner . de
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:CANCEL
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170848Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=et
 na...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.freedesktop
 .org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=xo
 rg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@lists.freed
 esktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=am
 d-gfx list;X-NUM-GUESTS=0:mailto:amd-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=in
 tel-gfx;X-NUM-GUESTS=0:mailto:intel-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=No
 uveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;CN=mario.
 kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=bo
 a...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=li
 bre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-dev@lists.l
 ibre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=ML
  mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=me
 mb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=fr
 eedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lists.freedes
 ktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=dr
 oidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=wa
 yland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-devel@lists
 .freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=dr
 i-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=si
 gles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=ev
 e...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;X-NUM
 -GUESTS=0:mailto:bibby.hs...@mediatek.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN="G
 arg, Rohan";X-NUM-GUESTS=0:mailto:rohan.g...@intel.com
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for X
 DC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc202
 3.x.org\n \nAs usual\, the conference is free of charge and open to the gen
 eral\npublic. If you plan on attending\, please make sure to register as ea
 rly\nas possible!\n \nIn order to register as attendee\, you will therefore
  need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/
 event/4/registrations/\n \nIn addition to registration\, the CfP is now ope
 n for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:
 ~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\n
 Join with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more a
 bout Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease d
 o not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170847Z
LOCATION:
SEQUENCE:1
STATUS:CANCELLED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
END:VEVENT
END:VCALENDAR




[Intel-gfx] Invitation: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (intel-gfx@lists.freedesktop.org)

2023-04-17 Thread mario . kleiner . de
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170311Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=etna...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.f
 reedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=xorg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@l
 ists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=amd-gfx list;X-NUM-GUESTS=0:mailto:amd-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=intel-gfx;X-NUM-GUESTS=0:mailto:intel-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=Nouveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;RSVP=TRUE
 ;CN=mario.kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner.de@gmail
 .com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=bo...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=libre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-d
 e...@lists.libre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=ML mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=memb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=freedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lis
 ts.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=droidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=wayland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-d
 e...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=dri-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=sigles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=eve...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
X-MICROSOFT-CDO-OWNERAPPTID:148915568
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for X
 DC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc202
 3.x.org\n \nAs usual\, the conference is free of charge and open to the gen
 eral\npublic. If you plan on attending\, please make sure to register as ea
 rly\nas possible!\n \nIn order to register as attendee\, you will therefore
  need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/
 event/4/registrations/\n \nIn addition to registration\, the CfP is now ope
 n for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:
 ~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\n
 Join with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more a
 bout Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease d
 o not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170310Z
LOCATION:
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:intel-gfx@lists.freedesktop.org
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:This is an event reminder
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:intel-gfx@lists.freedesktop.org
TRIGGER:-P0DT7H30M0S
END:VALARM
END:VEVENT
END:VCALENDAR




Re: [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended

2022-02-14 Thread Mario Kleiner
On Fri, Feb 11, 2022 at 9:44 AM Sebastian Andrzej Siewior
 wrote:
>
> On 2022-01-27 00:29:37 [+0100], Mario Kleiner wrote:
> > Hi, first thank you for implementing these preempt disables according to
> Hi Mario,
>
> > the markers i left long ago. And sorry for the rather late reply.
> >
> > I had a look at the code, as of Linux 5.16, and did also a little test run
> > (of a standard kernel, not with PREEMPT_RT, only
> > CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:
> >
> > > The area covers only register reads and writes. The part that worries me
> > > is:
> > > - __intel_get_crtc_scanline() the worst case is 100us if no match is
> > >   found.
> > >
> >
> > This one can be a problem indeed on (maybe all?) modern Intel gpu's since
> > Haswell, ie. the last ~10 years. I was able to reproduce it on my Kabylake
> > Intel gpu.
> >
> > Most of the time that for-loop with up to 100 repetitions (~ 100
> > udelay(1) + one mmio register read) (cf.
> > https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
> > will not execute, because most of the time that function gets called from
> > the vblank irq handler and then that trigger condition (if
> > (HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
> > as part of power-saving on behalf of userspace context, whenever the
> > desktop graphics goes idle for two video refresh cycles. If the desktop
> > shows graphics activity again, and vblank interrupts need to get reenabled,
> > the probability of hitting that case is then ~1-4% depending on video mode.
> > How many loops it runs also varies.
> >
> > On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
> > desktop, I observed about one hit every couple of seconds of regular use,
> > and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
> > can take a bit longer than 1 usec?
>
> it should get very close to this. Maybe something else extended the time
> depending on what you observe.
>

Probably all the other stuff in that for-loop adds a microsecond. I
don't have a good feeling how long a typical mmio register read is
expected to take, except for quite a bit less than 1 usec from my
experience.

> > So that's too much for preempt-rt. What one could do is the following:
> >
> > 1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
> > before the udelay(1); and a preempt_disable() again after it. Or
> > potentially around the whole for-loop if the overhead of
> > preempt_en/disable() is significant?
>
> It is very optimized on x86 ;)

Good! So adding a disable/enable pair into each of those loop
iterations won't hurt.

>
> > 2. In intel_get_crtc_scanline() also wrap the call to
> > __intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
> > so we can be sure that __intel_get_crtc_scanline() always gets called with
> > preemption disabled.
> >
> > Why should this work ok'ish? The point of the original preempt disable
> > inside i915_get_crtc_scanoutpos
> > <https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/i915_get_crtc_scanoutpos>
> > is that those two *stime = ktime_get() and *etime = ktime_get() clock
> > queries happen as close to the scanout position query as possible to get a
> > small confidence interval for when exactly the scanoutpos was
> > read/determined from the display hardware. error = (etime - stime) is the
> > error margin. If that margin becomes greater than 20 usecs, then the
> > higher-level code will consider the measurement invalid and repeat the
> > whole procedure up to 3 times before giving up.
>
> The preempt-disable is needed then? The task is preemptible here on
> PREEMPT_RT but it _may_ not come to this. The difference vs !RT is that
> an interrupt will preempt this code without it.
>

Yes, it is needed, as that chunk of code between the two ktime_get()
queries should ideally not get interrupted by anything.
The "try up to three times" higher level logic in calling code is just
to cover the hopefully rare cases where something still preempts,
e.g., a NMI or such.

I have not ever tested this on a PREEMPT_RT kernel in at least a
decade, but on regular kernels, e.g., Ubuntu generic or Ubuntu
low-latency kernels I haven't observed more than one retry when it
mattered, and usually the code executes in 0-2 usecs on my test
machines, way below the limit of 20 usecs at which a measurement is
considered failed and then retried. So the retries are sufficient as
long as all preventable preemption is prevented. Hence the
preempt_disable() ann

Re: [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended

2022-01-26 Thread Mario Kleiner
On Tue, Dec 14, 2021 at 3:03 PM Sebastian Andrzej Siewior <
bige...@linutronix.de> wrote:

> From: Mike Galbraith 
>
> Mario Kleiner suggests in commit
>   ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into
> kms driver.")
>
> spots where preemption should be disabled on PREEMPT_RT. The
> difference is that on PREEMPT_RT the intel_uncore::lock disables neither
> preemption nor interrupts and so the region remains preemptible.
>
>
Hi, first thank you for implementing these preempt disables according to
the markers i left long ago. And sorry for the rather late reply.

I had a look at the code, as of Linux 5.16, and did also a little test run
(of a standard kernel, not with PREEMPT_RT, only
CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:

> The area covers only register reads and writes. The part that worries me
> is:
> - __intel_get_crtc_scanline() the worst case is 100us if no match is
>   found.
>

This one can be a problem indeed on (maybe all?) modern Intel gpu's since
Haswell, ie. the last ~10 years. I was able to reproduce it on my Kabylake
Intel gpu.

Most of the time that for-loop with up to 100 repetitions (~ 100
udelay(1) + one mmio register read) (cf.
https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
will not execute, because most of the time that function gets called from
the vblank irq handler and then that trigger condition (if
(HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
as part of power-saving on behalf of userspace context, whenever the
desktop graphics goes idle for two video refresh cycles. If the desktop
shows graphics activity again, and vblank interrupts need to get reenabled,
the probability of hitting that case is then ~1-4% depending on video mode.
How many loops it runs also varies.

On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
desktop, I observed about one hit every couple of seconds of regular use,
and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
can take a bit longer than 1 usec?

So that's too much for preempt-rt. What one could do is the following:

1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
before the udelay(1); and a preempt_disable() again after it. Or
potentially around the whole for-loop if the overhead of
preempt_en/disable() is significant?

2. In intel_get_crtc_scanline() also wrap the call to
__intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
so we can be sure that __intel_get_crtc_scanline() always gets called with
preemption disabled.
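
The two steps above can be sketched as a small userspace model. This is only an illustration, not the actual i915 code: preempt_disable(), preempt_enable() and udelay() are stand-ins that merely count nesting, read_scanline_reg() is a hypothetical fake register, and the model_* function names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel primitives; they only track nesting. */
static int preempt_depth;
static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { assert(preempt_depth > 0); preempt_depth--; }
static void udelay(unsigned usecs) { (void)usecs; /* real code busy-waits */ }

/* Hypothetical fake scanline register: reads 0 for the next `stuck_reads` reads. */
static int stuck_reads;
static int read_scanline_reg(void)
{
	if (stuck_reads > 0) {
		stuck_reads--;
		return 0;
	}
	return 42;
}

/* Step 1: expects preemption disabled by the caller, and opens a
 * preemption window around each udelay(1) of the up-to-100x loop. */
static int model_get_scanline_locked(bool has_ddi)
{
	int position = read_scanline_reg();

	if (has_ddi && !position) {
		for (int i = 0; i < 100; i++) {
			preempt_enable();
			udelay(1);
			preempt_disable();

			int temp = read_scanline_reg();
			if (temp != position) {
				position = temp;
				break;
			}
		}
	}
	return position;
}

/* Step 2: the outer wrapper guarantees the inner function always
 * runs with preemption disabled. */
static int model_get_scanline(bool has_ddi)
{
	preempt_disable();
	int position = model_get_scanline_locked(has_ddi);
	preempt_enable();
	return position;
}
```

The point of the model is only that the enable/disable pairs stay balanced on every exit path of the loop, including the early break.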

Why should this work ok'ish? The point of the original preempt disable
inside i915_get_crtc_scanoutpos
<https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/i915_get_crtc_scanoutpos>
is that those two *stime = ktime_get() and *etime = ktime_get() clock
queries happen as close to the scanout position query as possible to get a
small confidence interval for when exactly the scanoutpos was
read/determined from the display hardware. error = (etime - stime) is the
error margin. If that margin becomes greater than 20 usecs, then the
higher-level code will consider the measurement invalid and repeat the
whole procedure up to 3 times before giving up.
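
A minimal sketch of that stime/etime bracketing with retries, as a self-contained simulation rather than the real drm code. The struct sim, sim_clock(), sim_query_scanoutpos() and query_with_retry() names are hypothetical; the 20 usecs threshold and the 3 attempts would be passed in by the caller as max_error_ns and max_tries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated hardware: query number i advances the
 * simulated clock by cost_ns[i] (the last cost repeats). */
struct sim {
	int64_t now_ns;
	const int64_t *cost_ns;
	int n_costs;
	int queries;
};

static int64_t sim_clock(struct sim *s) { return s->now_ns; }

static int sim_query_scanoutpos(struct sim *s)
{
	int i = s->queries < s->n_costs ? s->queries : s->n_costs - 1;

	s->now_ns += s->cost_ns[i];
	s->queries++;
	return 100 + i; /* arbitrary fake scanout position */
}

struct stamped_pos {
	int position;
	int64_t stime_ns, etime_ns;
	int tries;
	bool valid;
};

/* Bracket the position query with two clock reads; accept the sample
 * only if (etime - stime) does not exceed max_error_ns, else retry. */
static struct stamped_pos query_with_retry(struct sim *s, int64_t max_error_ns,
					   int max_tries)
{
	struct stamped_pos r = { 0 };

	assert(max_tries > 0);
	for (int attempt = 1; attempt <= max_tries; attempt++) {
		r.tries = attempt;
		r.stime_ns = sim_clock(s);
		r.position = sim_query_scanoutpos(s);
		r.etime_ns = sim_clock(s);
		if (r.etime_ns - r.stime_ns <= max_error_ns) {
			r.valid = true;
			break;
		}
	}
	return r;
}
```

With a first query that costs 150 usecs and a second that costs 1 usec, the first sample is rejected and the second attempt is accepted; if every query is slow, the result stays invalid after the last attempt.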

Normally, in my experience with different graphics chips, one would observe
error < 3 usecs, so the measurement almost always succeeds at first try,
only very rarely takes two attempts. The preempt disable is meant to make
sure that this stays the case on a PREEMPT_RT kernel.

The problem here are the relatively rare cases where we hit that up to 100
iterations for-loop. Here even on a regular kernel, due to hardware quirks,
we already exceed the 20 usecs tolerance by a huge amount of more than 100
usecs, leading to a retry of the measurement. And my tests showed that
often the two succeeding retries also fail, because hardware quirks can
apparently create a blackout situation approaching 1 msec, so we lose
anyway, regardless if we get preempted on a RT kernel or not. That's why
enabling preemption on RT again during that for-loop should not make the
situation worse and at least keep RT as real-time as intended.

In practice I would also expect that this failure case is the one least
likely to impair userspace applications greatly. The cases that
mostly matter are the ones executed during vblank hardware irq, where the
for-loop never executes and error margin and preempt off time is only about
1 usec. My own software which depends on very precise timestamps from the
mechanism never reported >> 20 usecs errors during startup tests or runtime
tests.


> - intel_crtc_scanlines_since_frame_timestamp() not sure how long this
>   may take in the worst case.
>
>
intel_crtc_scanlines_since_frame_timestamp() should be harmless. That
do-while loop just tries to make sure that two

Re: [Intel-gfx] [PATCH v2 2/7] drm/uAPI: Add "active bpc" as feedback channel for "max bpc" drm property

2021-06-14 Thread Mario Kleiner
On Thu, Jun 10, 2021 at 9:55 AM Pekka Paalanen  wrote:
>
> On Tue,  8 Jun 2021 19:43:15 +0200
> Werner Sembach  wrote:
>
> > Add a new general drm property "active bpc" which can be used by graphic
> > drivers to report the applied bit depth per pixel back to userspace.
> >

Maybe "bit depth per pixel" -> "bit depth per pixel color component"
for slightly more clarity?

> > While "max bpc" can be used to change the color depth, there was no way to
> > check which one actually got used. While in theory the driver chooses the
> > best/highest color depth within the max bpc setting a user might not be
> > fully aware what his hardware is or isn't capable of. This is meant as a
> > quick way to double check the setup.
> >
> > In the future, automatic color calibration for screens might also depend on
> > this information being available.
> >
> > Signed-off-by: Werner Sembach 
> > ---
> >  drivers/gpu/drm/drm_atomic_uapi.c |  2 ++
> >  drivers/gpu/drm/drm_connector.c   | 41 +++
> >  include/drm/drm_connector.h   | 15 +++
> >  3 files changed, 58 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> > index 268bb69c2e2f..7ae4e40936b5 100644
> > --- a/drivers/gpu/drm/drm_atomic_uapi.c
> > +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> > @@ -873,6 +873,8 @@ drm_atomic_connector_get_property(struct drm_connector *connector,
> >   *val = 0;
> >   } else if (property == connector->max_bpc_property) {
> >   *val = state->max_requested_bpc;
> > + } else if (property == connector->active_bpc_property) {
> > + *val = state->active_bpc;
> >   } else if (connector->funcs->atomic_get_property) {
> >   return connector->funcs->atomic_get_property(connector,
> >   state, property, val);
> > diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> > index 7631f76e7f34..c0c3c09bfed0 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -1195,6 +1195,14 @@ static const struct drm_prop_enum_list dp_colorspaces[] = {
> >   *   drm_connector_attach_max_bpc_property() to create and attach the
> >   *   property to the connector during initialization.
> >   *
> > + * active bpc:
> > + *   This read-only range property tells userspace the pixel color bit depth
> > + *   actually used by the hardware display engine on "the cable" on a
> > + *   connector. The chosen value depends on hardware capabilities, both
> > + *   display engine and connected monitor, and the "max bpc" property.
> > + *   Drivers shall use drm_connector_attach_active_bpc_property() to install
> > + *   this property.
> > + *
>
> This description is now clear to me, but I wonder, is it also how
> others understand it wrt. dithering?
>
> Dithering done on monitor is irrelevant, because we are talking about
> "on the cable" pixels. But since we are talking about "on the cable"
> pixels, also dithering done by the display engine must not factor in.
> Should the dithering done by display engine result in higher "active
> bpc" number than what is actually transmitted on the cable?
>
> I cannot guess what userspace would want exactly. I think the
> strict "on the cable" interpretation is a safe bet, because it then
> gives a lower limit on observed bpc. Dithering settings should be
> exposed with other KMS properties, so userspace can factor those in.
> But to be absolutely sure, we'd have to ask some color management
> experts.
>
> Cc'ing Mario in case he has an opinion.
>

Thanks. I like this a lot, in fact such a connector property was on my
todo list / wish list for something like that!

I agree with the "active bpc" definition here in this patch and
Pekka's comments. I want what goes out over the cable, not including
any effects of dithering. At least AMD's amdgpu-kms driver exposes
"active bpc" already as a per-connector property in debugfs, and i use
reported output from there a lot to debug problems with respect to HDR
display or high color precision output, and to verify i'm not fooling
myself wrt. what goes out, compared to what dithering may "fake" on
top of it.

Software like mine would greatly benefit from getting this directly
off the connector, ie. as a RandR output property, just like with "max
bpc", as mapping X11 output names to driver output names is a guessing
game, directing regular users to those debugfs files is tedious and
error prone, and many regular users don't have root permissions
anyway.

Sometimes one wants to prioritize "active bpc" over resolution or
refresh rate, and especially on now more common HDR displays, and
actual bit depth also changes depending on bandwidth requirements vs.
availability, and how well DP link training went with a flaky or loose
cable, like only getting 10 bpc for HDR-10 when running on less than
maximum 

Re: [Intel-gfx] [PATCH 18/18] drm/i915/display13: Enabling dithering after the CC1 pipe

2021-02-18 Thread Mario Kleiner
On Fri, Feb 19, 2021 at 4:22 AM Mario Kleiner 
wrote:

>
>
> On Thu, Feb 11, 2021 at 1:29 PM Ville Syrjälä <
> ville.syrj...@linux.intel.com> wrote:
>
>> On Thu, Jan 28, 2021 at 11:24:13AM -0800, Matt Roper wrote:
>> > From: Nischal Varide 
>> >
>> > If the panel is 12bpc then dithering is not enabled in the legacy
>> > dithering block; instead it is enabled after the C1 CC1 pipe post
>> > color space conversion. For a 6bpc panel dithering is enabled in the
>> > legacy block.
>>
>> Dithering is probably going to require a whole uapi bikeshed.
>> Not sure we can just enable it unilaterally.
>>
>> Ccing dri-devel, and Mario who had issues with dithering in the
>> past...
>>
> Thanks for the cc Ville!
>
> The problem with dithering on Intel is that various tested Intel gpu's
> (Ironlake, IvyBridge, Haswell, Skylake iirc.) are dithering when they
> shouldn't. If one has a standard 8 bpc framebuffer feeding into a standard
> (legacy) 256 slots, 8 bit wide lut which was loaded with an identity
> mapping, feeding into a standard 8 bpc video output (DVI/HDMI/DP), the
> expected result is that pixels rendered into the framebuffer show up
> unmodified at the video output. What happens instead is that some dithering
> is needlessly applied. This is bad for various neuroscience/medical
> research equipment that requires pixels to pass unmodified in a pure 8 bpc
> configuration, e.g., because some digital info is color-encoded in-band in
> the rendered image to control research hardware, a la "if rgb pixel (123,
> 12, 23) is detected in the digital video stream, emit some trigger signal,
> or timestamp that moment with a hw clock, or start or stop some scientific
> recording equipment". Also there exist specialized visual stimulators to
> drive special displays with more than 12 bpc, e.g., 16 bpc, and so they
> encode the 8MSB of 16 bpc color values in pixels in even columns, and the
> 8LSB in the odd columns of the framebuffer. Unexpected dithering makes such
> equipment completely unusable. By now I must have spent months of my life,
> just trying to deal with dithering induced problems on different gpu's due
> to hw quirks or bugs somewhere in the graphics stack.
>
> Atm. the intel kms driver disables dithering for anything with >= 8 bpc as
> a fix for this harmful hardware quirk.
>
> Ideally we'd have uapi that makes dithering controllable per connector
> (on/off/auto, selectable depth), also in a way that those controls are
> exposed as RandR output properties, easily controllable by X clients. And
> some safe default in case the client can't access the properties (like I'd
> expect to happen with the dozens of Wayland compositors under the sun).
> Various drivers had this over time, e.g., AMD classic kms path (if i don't
> misremember) and nouveau, but some of it also got lost in the new atomic
> kms variants, and Intel never exposed this.
>
> Or maybe some method that checks the values actually stored in the hw
> lut's, CTM etc. and if the values suggest no dithering should be needed,
> disable the dithering. E.g., if output depth is 8 bpc, one only needs
> dithering if the slots in the final active hw lut do have any meaningful
> values in the lower bits below the top 8 MSB, ie. if the content is
> actually > 8 bpc net bit depth.
>
> -mario
>
>
One cup of coffee later... I think this specific patch should be ok wrt. my
use cases. The majority of the above mentioned research devices are
single/dual-link DVI receivers, ie. 8 bpc video sinks. I'm only aware of
one recent device that has a DisplayPort receiver that could act as a > 8
bpc video sink. See the following link for advanced examples of such
devices: https://vpixx.com/our-products/video-i-o-hub/

I cannot think of a use case that would require more than 8 bits for inband
signalling given that that was good enough for the last 20 years, or for
encoding very high color precision content -- the 16 bpc precision that one
can get out of the current even/odd pixel = 8 MSB + 8 LSB encoding scheme
should be enough for the foreseeable future. Therefore dithering shouldn't
pose a problem if it leaves the 8 MSB of each pixel color component intact,
and spatial dithering as employed here usually only touches the least
significant bit (or maybe the 2 LSB's?).

As this patch only enables dithering on 12 bpc video sinks, if i understand
pipe_bpp correctly, it could only "corrupt" one bit and leave at least the
10-11 MSB's intact, right?

pipe_bpp == 24 is the case that would really hurt a lot of researchers if
dithering would be enabled without providing good uapi or other mechanisms
to prevent it.

So:

Acked-by: Mario Kleiner 

One suggestion: It would be good to also add a bit of

Re: [Intel-gfx] [PATCH 18/18] drm/i915/display13: Enabling dithering after the CC1 pipe

2021-02-18 Thread Mario Kleiner
On Thu, Feb 11, 2021 at 1:29 PM Ville Syrjälä 
wrote:

> On Thu, Jan 28, 2021 at 11:24:13AM -0800, Matt Roper wrote:
> > From: Nischal Varide 
> >
> > If the panel is 12bpc then dithering is not enabled in the legacy
> > dithering block; instead it is enabled after the C1 CC1 pipe post
> > color space conversion. For a 6bpc panel dithering is enabled in the
> > legacy block.
>
> Dithering is probably going to require a whole uapi bikeshed.
> Not sure we can just enable it unilaterally.
>
> Ccing dri-devel, and Mario who had issues with dithering in the
> past...
>
Thanks for the cc Ville!

The problem with dithering on Intel is that various tested Intel gpu's
(Ironlake, IvyBridge, Haswell, Skylake iirc.) are dithering when they
shouldn't. If one has a standard 8 bpc framebuffer feeding into a standard
(legacy) 256 slots, 8 bit wide lut which was loaded with an identity
mapping, feeding into a standard 8 bpc video output (DVI/HDMI/DP), the
expected result is that pixels rendered into the framebuffer show up
unmodified at the video output. What happens instead is that some dithering
is needlessly applied. This is bad for various neuroscience/medical
research equipment that requires pixels to pass unmodified in a pure 8 bpc
configuration, e.g., because some digital info is color-encoded in-band in
the rendered image to control research hardware, a la "if rgb pixel (123,
12, 23) is detected in the digital video stream, emit some trigger signal,
or timestamp that moment with a hw clock, or start or stop some scientific
recording equipment". Also there exist specialized visual stimulators to
drive special displays with more than 12 bpc, e.g., 16 bpc, and so they
encode the 8MSB of 16 bpc color values in pixels in even columns, and the
8LSB in the odd columns of the framebuffer. Unexpected dithering makes such
equipment completely unusable. By now I must have spent months of my life,
just trying to deal with dithering induced problems on different gpu's due
to hw quirks or bugs somewhere in the graphics stack.
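
To illustrate why even a one-bit change is fatal for such equipment, here is a small sketch of the even/odd column encoding and of the in-band trigger match. The function names are made up for this example, and the assumption that dithering toggles the least significant bit of an 8 bpc pixel is only a simple model of what the hardware might do.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 16 bpc value: 8 MSB go into the even-column pixel,
 * 8 LSB into the odd-column pixel. */
static void encode16(uint16_t value, uint8_t *even_px, uint8_t *odd_px)
{
	assert(even_px && odd_px);
	*even_px = (uint8_t)(value >> 8);
	*odd_px = (uint8_t)(value & 0xff);
}

/* What the receiving device would reconstruct from the pixel pair. */
static uint16_t decode16(uint8_t even_px, uint8_t odd_px)
{
	return (uint16_t)(((unsigned)even_px << 8) | odd_px);
}

/* Model dithering as a toggle of the LSB of the even-column pixel and
 * return the resulting error in the decoded 16 bpc value. */
static int roundtrip_error_with_msb_flip(uint16_t value)
{
	uint8_t even_px, odd_px;

	encode16(value, &even_px, &odd_px);
	even_px ^= 1;
	return (int)decode16(even_px, odd_px) - (int)value;
}

/* In-band trigger: only an exact match of the example pixel counts. */
static int is_trigger_pixel(uint8_t r, uint8_t g, uint8_t b)
{
	return r == 123 && g == 12 && b == 23;
}
```

In this model a single toggled bit in the even-column pixel shifts the decoded value by 256 steps out of 65536, and a single toggled bit in any channel makes the trigger pixel go undetected.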

Atm. the intel kms driver disables dithering for anything with >= 8 bpc as
a fix for this harmful hardware quirk.

Ideally we'd have uapi that makes dithering controllable per connector
(on/off/auto, selectable depth), also in a way that those controls are
exposed as RandR output properties, easily controllable by X clients. And
some safe default in case the client can't access the properties (like I'd
expect to happen with the dozens of Wayland compositors under the sun).
Various drivers had this over time, e.g., AMD classic kms path (if i don't
misremember) and nouveau, but some of it also got lost in the new atomic
kms variants, and Intel never exposed this.

Or maybe some method that checks the values actually stored in the hw
lut's, CTM etc. and if the values suggest no dithering should be needed,
disable the dithering. E.g., if output depth is 8 bpc, one only needs
dithering if the slots in the final active hw lut do have any meaningful
values in the lower bits below the top 8 MSB, ie. if the content is
actually > 8 bpc net bit depth.
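
A rough sketch of such a check, under the assumption that lut entries are stored as 16 bit values with the significant bits at the top and zero padding below. Real hardware lut formats differ, and some expand 8 bit values by bit replication, which this simple test would misread as extra precision. lut_exceeds_depth() is a hypothetical helper, not an existing driver function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if any lut entry carries non-zero bits below the top
 * out_bpc bits, ie. if the content has more net bit depth than the
 * output and dithering could be useful. */
static bool lut_exceeds_depth(const uint16_t *lut, size_t n, unsigned out_bpc)
{
	assert(out_bpc >= 1 && out_bpc <= 16);

	/* Mask of the bits that cannot be sent at out_bpc, e.g. 0x00ff for 8 bpc. */
	uint16_t low_mask = (uint16_t)(0xffffu >> out_bpc);

	for (size_t i = 0; i < n; i++) {
		if (lut[i] & low_mask)
			return true;
	}
	return false;
}
```

For an identity lut with 256 slots holding i << 8 this returns false at 8 bpc, so dithering could stay off in the pure 8 bpc case described above.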

-mario

>
> > Cc: Uma Shankar 
> > Signed-off-by: Nischal Varide 
> > Signed-off-by: Bhanuprakash Modem 
> > Signed-off-by: Matt Roper 
> > ---
> >  drivers/gpu/drm/i915/display/intel_color.c   | 16 
> >  drivers/gpu/drm/i915/display/intel_display.c |  9 -
> >  drivers/gpu/drm/i915/i915_reg.h  |  3 ++-
> >  3 files changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
> > index ff7dcb7088bf..9a0572bbc5db 100644
> > --- a/drivers/gpu/drm/i915/display/intel_color.c
> > +++ b/drivers/gpu/drm/i915/display/intel_color.c
> > @@ -1604,6 +1604,20 @@ static u32 icl_csc_mode(const struct intel_crtc_state *crtc_state)
> >   return csc_mode;
> >  }
> >
> > +static u32 dither_after_cc1_12bpc(const struct intel_crtc_state *crtc_state)
> > +{
> > + u32 gamma_mode = crtc_state->gamma_mode;
> > + struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
> > +
> > + if (HAS_DISPLAY13(i915)) {
> > + if (!crtc_state->dither_force_disable &&
> > + (crtc_state->pipe_bpp == 36))
> > + gamma_mode |= GAMMA_MODE_DITHER_AFTER_CC1;
> > + }
> > +
> > + return gamma_mode;
> > +}
> > +
> >  static int icl_color_check(struct intel_crtc_state *crtc_state)
> >  {
> >   int ret;
> > @@ -1614,6 +1628,8 @@ static int icl_color_check(struct intel_crtc_state *crtc_state)
> >
> >   crtc_state->gamma_mode = icl_gamma_mode(crtc_state);
> >
> > + crtc_state->gamma_mode = dither_after_cc1_12bpc(crtc_state);
> > +
> >   crtc_state->csc_mode = icl_csc_mode(crtc_state);
> >
> >   crtc_state->preload_luts = intel_can_preload_luts(crtc_state);
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 

[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017 (v3)

2020-03-15 Thread Mario Kleiner
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.
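
As a worked example of the arithmetic behind this, the sketch below converts
a DPCD bandwidth code into the same numeric units as the 162000 / 270000 /
324000 rates in this patch (code times 0.27 Gbps), and compares the data
rate a mode needs against what the link offers. The function names, the
300000 kHz pixel clock and the 4-lane assumption used below are made up for
illustration; they are not measured values for this panel, and the fit
check is a simplified model of an 8b/10b link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DPCD bandwidth code -> link rate: 0x0a -> 270000, 0x0c -> 324000. */
int bw_code_to_link_rate(uint8_t bw_code)
{
	return bw_code * 27000;
}

/*
 * Simplified fit check (assumption: 8b/10b coding, so each lane carries
 * one payload byte per link clock). Both sides are in kBytes/s.
 */
bool mode_fits(int pixel_clock_khz, int bpp, int link_rate, int lanes)
{
	assert(pixel_clock_khz > 0 && bpp > 0 && link_rate > 0 && lanes > 0);

	long required = ((long)pixel_clock_khz * bpp + 7) / 8;
	long available = (long)link_rate * lanes;

	return required <= available;
}
```

With a hypothetical 300000 kHz pixel clock on 4 lanes, 24 bpp fits at
270000 (900000 <= 1080000) but 30 bpp does not (1125000 > 1080000), while
30 bpp fits at 324000 (1125000 <= 1296000).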

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
eDP setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

v2: Follow Jani's proposal of defining quirk_rates[] instead
of just appending 324000. This for better clarity.

v3: Rebased onto current drm-tip, as of 16-March-2020. Adapt
to new edid_quirks parameter of drm_dp_has_quirk().

Signed-off-by: Mario Kleiner 
Tested-by: Mario Kleiner 
Cc: Jani Nikula 
---
 drivers/gpu/drm/drm_dp_helper.c |  2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 11 +++
 include/drm/drm_dp_helper.h |  7 +++
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index c6fbe6e6bc9d..8ba4531e808d 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1238,6 +1238,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), 
false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, 
BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+   /* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low 
DP_MAX_LINK_RATE */
+   { OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, 
BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };
 
 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 0a417cd2af2b..ef2e06e292d5 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -164,6 +164,17 @@ static void intel_dp_set_sink_rates(struct intel_dp 
*intel_dp)
};
int i, max_rate;
 
+   if (drm_dp_has_quirk(&intel_dp->desc, 0,
+DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+   /* Needed, e.g., for Apple MBP 2017, 15 inch eDP Retina panel */
+   static const int quirk_rates[] = { 162000, 270000, 324000 };
+
+   memcpy(intel_dp->sink_rates, quirk_rates, sizeof(quirk_rates));
+   intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
+
+   return;
+   }
+
max_rate = 
drm_dp_bw_code_to_link_rate(intel_dp->dpcd[DP_MAX_LINK_RATE]);
 
for (i = 0; i < ARRAY_SIZE(dp_rates); i++) {
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index c6119e4c169a..9d87cdf2740a 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1548,6 +1548,13 @@ enum drm_dp_quirk {
 * capabilities advertised.
 */
DP_QUIRK_FORCE_DPCD_BACKLIGHT,
+   /**
+* @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+*
+* The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+* the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+*/
+   DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };
 
 /**
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 2/2] drm/dp: Add function to parse EDID descriptors for adaptive sync limits

2020-03-06 Thread Mario Kleiner
Just as a comment, u8 for max_vfreq in struct drm_adaptive_sync_info
might be not very future proof?

I just read that ASUS announced a "TUF Gaming VG259QM" monitor which
seems to have an adaptive sync range of 48 Hz to 280 Hz, exceeding the
max 255 Hz of u8?
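
To illustrate the concern, a small C sketch of what happens when a 280 Hz
limit is stored into 8-bit fields. The struct and function names are
hypothetical stand-ins, not the ones from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a range struct that uses u8 fields. */
struct sync_range_u8 {
	uint8_t min_vfreq;
	uint8_t max_vfreq;
};

/* Store a refresh range given in Hz; values above 255 wrap modulo 256. */
struct sync_range_u8 store_range_u8(unsigned int min_hz, unsigned int max_hz)
{
	assert(min_hz <= max_hz);

	struct sync_range_u8 r = { (uint8_t)min_hz, (uint8_t)max_hz };
	return r;
}
```

Storing 48 Hz to 280 Hz this way yields min_vfreq = 48 but max_vfreq = 24
(280 - 256), so the stored maximum ends up below the minimum.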

-mario



On Fri, Mar 6, 2020 at 4:02 PM Kazlauskas, Nicholas
 wrote:
>
> On 2020-03-05 8:42 p.m., Manasi Navare wrote:
> > Adaptive Sync is a VESA feature so add a DRM core helper to parse
> > the EDID's detailed descriptors to obtain the adaptive sync monitor range.
> > Store this info as part of drm_display_info so it can be used
> > across all drivers.
> > This part of the code is stripped out of amdgpu's function
> > amdgpu_dm_update_freesync_caps() to make it generic and be used
> > across all DRM drivers
> >
> > v4:
> > * Use is_display_descriptor() (Ville)
> > * Name the monitor range flags (Ville)
> > v3:
> > * Remove the edid parsing restriction for just DP (Nicholas)
> > * Use drm_for_each_detailed_block (Ville)
> > * Make the drm_get_adaptive_sync_range function static (Harry, Jani)
> > v2:
> > * Change vmin and vmax to use u8 (Ville)
> > * Dont store pixel clock since that is just a max dotclock
> > and not related to VRR mode (Manasi)
> >
> > Cc: Ville Syrjälä 
> > Cc: Harry Wentland 
> > Cc: Clinton A Taylor 
> > Cc: Kazlauskas Nicholas 
> > Signed-off-by: Manasi Navare 
>
> Looks good to me now. I'm fine with whether we want to rename the flags
> or not, I don't have much of a preference either way.
>
> Series is:
>
> Reviewed-by: Nicholas Kazlauskas 
>
> Regards,
> Nicholas Kazlauskas
>
> > ---
> >   drivers/gpu/drm/drm_edid.c  | 44 +
> >   include/drm/drm_connector.h | 22 +++
> >   2 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index ad41764a4ebe..61ed544d9535 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -4938,6 +4938,47 @@ static void drm_parse_cea_ext(struct drm_connector 
> > *connector,
> >   }
> >   }
> >
> > +static
> > +void get_adaptive_sync_range(struct detailed_timing *timing,
> > +  void *info_adaptive_sync)
> > +{
> > + struct drm_adaptive_sync_info *adaptive_sync = info_adaptive_sync;
> > + const struct detailed_non_pixel *data = &timing->data.other_data;
> > + const struct detailed_data_monitor_range *range = &data->data.range;
> > +
> > + if (!is_display_descriptor((const u8 *)timing, 
> > EDID_DETAIL_MONITOR_RANGE))
> > + return;
> > +
> > + /*
> > +  * Check for flag range limits only. If flag == 1 then
> > +  * no additional timing information provided.
> > +  * Default GTF, GTF Secondary curve and CVT are not
> > +  * supported
> > +  */
> > + if (range->flags != EDID_RANGE_LIMITS_ONLY_FLAG)
> > + return;
> > +
> > + adaptive_sync->min_vfreq = range->min_vfreq;
> > + adaptive_sync->max_vfreq = range->max_vfreq;
> > +}
> > +
> > +static
> > +void drm_get_adaptive_sync_range(struct drm_connector *connector,
> > +  const struct edid *edid)
> > +{
> > + struct drm_display_info *info = &connector->display_info;
> > +
> > + if (!version_greater(edid, 1, 1))
> > + return;
> > +
> > + drm_for_each_detailed_block((u8 *)edid, get_adaptive_sync_range,
> > + &info->adaptive_sync);
> > +
> > + DRM_DEBUG_KMS("Adaptive Sync refresh rate range is %d Hz - %d Hz\n",
> > +   info->adaptive_sync.min_vfreq,
> > +   info->adaptive_sync.max_vfreq);
> > +}
> > +
> >   /* A connector has no EDID information, so we've got no EDID to compute 
> > quirks from. Reset
> >* all of the values which would have been set from EDID
> >*/
> > @@ -4960,6 +5001,7 @@ drm_reset_display_info(struct drm_connector 
> > *connector)
> >   memset(&info->hdmi, 0, sizeof(info->hdmi));
> >
> >   info->non_desktop = 0;
> > + memset(&info->adaptive_sync, 0, sizeof(info->adaptive_sync));
> >   }
> >
> >   u32 drm_add_display_info(struct drm_connector *connector, const struct 
> > edid *edid)
> > @@ -4975,6 +5017,8 @@ u32 drm_add_display_info(struct drm_connector 
> > *connector, const struct edid *edi
> >
> >   info->non_desktop = !!(quirks & EDID_QUIRK_NON_DESKTOP);
> >
> > + drm_get_adaptive_sync_range(connector, edid);
> > +
> >   DRM_DEBUG_KMS("non_desktop set to %d\n", info->non_desktop);
> >
> >   if (edid->revision < 3)
> > diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> > index 0df7a95ca5d9..2b22c0fa42c4 100644
> > --- a/include/drm/drm_connector.h
> > +++ b/include/drm/drm_connector.h
> > @@ -254,6 +254,23 @@ enum drm_panel_orientation {
> >   DRM_MODE_PANEL_ORIENTATION_RIGHT_UP,
> >   };
> >
> > +/**
> > + * struct drm_adaptive_sync_info - Panel's Adaptive Sync capabilities for
> > + * &drm_display_info
> > + *
> > + * This struct is used to store a Panel's 

Re: [Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017

2020-03-05 Thread Mario Kleiner
On Wed, Mar 4, 2020 at 4:32 PM Jani Nikula  wrote:
>
> On Sat, 29 Feb 2020, Mario Kleiner  wrote:
> > This fixes a problem found on the MacBookPro 2017 Retina panel.
> >
> > The panel reports 10 bpc color depth in its EDID, and the
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
> > but the DP_MAX_LINK_RATE dpcd register only reports
> > 2.7 Gbps (multiplier value 0xa) as possible, in direct
> > contradiction of what the firmware successfully set up.
> >
> > This restricts the panel to 8 bpc, not providing the full
> > color depth of the panel.
> >
> > This patch adds a quirk specific to the MBP 2017 15" Retina
> > panel to add the additional 324000 kbps link rate during
> > eDP setup.
> >
> > Link to previous discussion of a different attempted fix
> > with Ville and Jani:
> >
> > https://patchwork.kernel.org/patch/11325935/
> >
> > Signed-off-by: Mario Kleiner 
> > Cc: Ville Syrjälä 
> > Cc: Jani Nikula 
> > ---
> >  drivers/gpu/drm/drm_dp_helper.c | 2 ++
> >  drivers/gpu/drm/i915/display/intel_dp.c | 7 +++
> >  include/drm/drm_dp_helper.h | 7 +++
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_helper.c 
> > b/drivers/gpu/drm/drm_dp_helper.c
> > index 5a103e9b3c86..36a371c016cb 100644
> > --- a/drivers/gpu/drm/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > @@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
> >   { OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), 
> > false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
> >   /* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
> >   { OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, 
> > BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
> > + /* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low 
> > DP_MAX_LINK_RATE */
> > + { OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, 
> > BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
> >  };
> >
> >  #undef OUI
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 4074d83b1a5f..1f6bd659ad41 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -178,6 +178,13 @@ static void intel_dp_set_sink_rates(struct intel_dp 
> > *intel_dp)
> >   }
> >
> >   intel_dp->num_sink_rates = i;
> > +
> > + if (drm_dp_has_quirk(&intel_dp->desc,
> > + DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
> > + /* Needed for Apple MBP 2017, 15 inch eDP Retina panel */
> > + intel_dp->sink_rates[i] = 324000;
> > + intel_dp->num_sink_rates++;
> > + }
>
> If we can isolate the quirk to this one function, I'll be happy. \o/
>

Me too \o/ - Patch v2 is out, following your proposal, retested on the
machine, works. cat ... i915_display_info reports a pipe depth of 30
bpp, instead of 24 bpp.

I didn't add a stable tag, but wonder if a cc stable tag could be
added by you, if you think it is minimal enough, to get it also into
the kernels for the spring distro updates. In any case, case closed.

Thanks for the review,
-mario

> However, even if this might work on said machine, I'd prefer it if we
> didn't give the idea that you could just append a value in sink_rates
> (it must be sorted). How about putting something like this in the
> beginning of the function, to be a bit more explicit:
>
> if (quirk) {
> static const int quirk_rates[] = { 162000, 270000, 324000 };
>
> memcpy(intel_dp->sink_rates, quirk_rates, 
> sizeof(quirk_rates));
> intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
>
> return;
> }
>
> BR,
> Jani.
>
> >  }
> >
> >  /* Get length of rates array potentially limited by max_rate. */
> > diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> > index 262faf9e5e94..4b86a1f2a559 100644
> > --- a/include/drm/drm_dp_helper.h
> > +++ b/include/drm/drm_dp_helper.h
> > @@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
> >* The DSC caps can be read from the physical aux instead.
> >*/
> >   DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
> > + /**
> > +  * @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
> > +  *
> > +  * The device supports a link rate of 3.24 Gbps (multiplier 0xc) 
> > despite
> > +  * the DP_MAX_LINK_RATE register reporting a lower max multiplier.
> > +  */
> > + DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
> >  };
> >
> >  /**
>
> --
> Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017 (v2)

2020-03-05 Thread Mario Kleiner
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
eDP setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

v2: Follow Jani's proposal of defining quirk_rates[] instead
of just appending 324000. This for better clarity.
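
A small sketch of why the explicit table is the safer form: the sink rate
array must stay sorted, and blindly appending 324000 only keeps it sorted
when the reported maximum is below that. rates_sorted() and append_rate()
are hypothetical helpers written for this illustration, and the 540000
case is a made-up scenario, not something observed on this panel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if rates[] is strictly ascending, as the sink rate table must be. */
bool rates_sorted(const int *rates, size_t n)
{
	assert(rates != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		if (rates[i - 1] >= rates[i])
			return false;
	}
	return true;
}

/* v1 approach: append the quirk rate at the end; returns the new count.
 * The caller must provide room for at least n + 1 entries. */
size_t append_rate(int *rates, size_t n, int rate)
{
	rates[n] = rate;
	return n + 1;
}
```

Appending to { 162000, 270000 } happens to stay sorted, but appending to a
table that already ends in 540000 would not, whereas a fixed
{ 162000, 270000, 324000 } table is sorted by construction.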

Signed-off-by: Mario Kleiner 
Tested-by: Mario Kleiner 
Cc: Ville Syrjälä 
Cc: Jani Nikula 
---
 drivers/gpu/drm/drm_dp_helper.c |  2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 11 +++
 include/drm/drm_dp_helper.h |  7 +++
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 5a103e9b3c86..36a371c016cb 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), 
false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, 
BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+   /* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low 
DP_MAX_LINK_RATE */
+   { OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, 
BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };
 
 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 4074d83b1a5f..c0d2c70b04fb 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -169,6 +169,17 @@ static void intel_dp_set_sink_rates(struct intel_dp 
*intel_dp)
};
int i, max_rate;
 
+   if (drm_dp_has_quirk(&intel_dp->desc,
+DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+   /* Needed, e.g., for Apple MBP 2017, 15 inch eDP Retina panel */
+   static const int quirk_rates[] = { 162000, 270000, 324000 };
+
+   memcpy(intel_dp->sink_rates, quirk_rates, sizeof(quirk_rates));
+   intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
+
+   return;
+   }
+
max_rate = 
drm_dp_bw_code_to_link_rate(intel_dp->dpcd[DP_MAX_LINK_RATE]);
 
for (i = 0; i < ARRAY_SIZE(dp_rates); i++) {
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 262faf9e5e94..4b86a1f2a559 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
 * The DSC caps can be read from the physical aux instead.
 */
DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
+   /**
+* @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+*
+* The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+* the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+*/
+   DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };
 
 /**
-- 
2.20.1



[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017

2020-02-28 Thread Mario Kleiner
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
eDP setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

Signed-off-by: Mario Kleiner 
Cc: Ville Syrjälä 
Cc: Jani Nikula 
---
 drivers/gpu/drm/drm_dp_helper.c | 2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 7 +++
 include/drm/drm_dp_helper.h | 7 +++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 5a103e9b3c86..36a371c016cb 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), 
false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, 
BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+   /* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low 
DP_MAX_LINK_RATE */
+   { OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, 
BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };
 
 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 4074d83b1a5f..1f6bd659ad41 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -178,6 +178,13 @@ static void intel_dp_set_sink_rates(struct intel_dp 
*intel_dp)
}
 
intel_dp->num_sink_rates = i;
+
+   if (drm_dp_has_quirk(&intel_dp->desc,
+   DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+   /* Needed for Apple MBP 2017, 15 inch eDP Retina panel */
+   intel_dp->sink_rates[i] = 324000;
+   intel_dp->num_sink_rates++;
+   }
 }
 
 /* Get length of rates array potentially limited by max_rate. */
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 262faf9e5e94..4b86a1f2a559 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
 * The DSC caps can be read from the physical aux instead.
 */
DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
+   /**
+* @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+*
+* The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+* the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+*/
+   DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };
 
 /**
-- 
2.20.1



Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-10 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 10:26 PM Harry Wentland  wrote:

>
>
> On 2020-01-09 4:04 p.m., Mario Kleiner wrote:
>
> On Thu, Jan 9, 2020 at 8:49 PM Alex Deucher  wrote:
>
>> On Thu, Jan 9, 2020 at 11:47 AM Mario Kleiner
>>  wrote:
>> >
>> > On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher 
>> wrote:
>> >>
>> >> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner
>> >>  wrote:
>> >> >
>> As Harry mentioned in the other thread, won't this only work if the
>> display was brought up by the vbios?  In the suspend/resume case,
>> won't we just fall back to 2.7Gbps?
>>
>> Alex
>>
>>
> Adding Harry to cc...
>
> The code is only executed for eDP. On the Intel side, it seems that
> intel_edp_init_dpcd() gets only called during driver load / modesetting
> init, so not on resume.
>
> On the AMD DC side, dc_link_detect_helper() has this early no-op return at
> the beginning:
>
> if ((link->connector_signal == SIGNAL_TYPE_LVDS ||
>   link->connector_signal == SIGNAL_TYPE_EDP) &&
>   link->local_sink)
>   return true;
>
>
> So i guess if link->local_sink doesn't get NULL'ed during a suspend/resume
> cycle, then we never reach the setup code that would overwrite with non
> vbios settings?
>
> Sounds reasonable to me, given that eDP panels are usually fixed internal
> panels, nothing that gets hot(un-)plugged?
>
> I can't test, because suspend/resume with the Polaris gpu on the MBP 2017
> is totally broken atm., just as vgaswitcheroo can't do its job. Looks like
> powering down the gpu works, but powering up doesn't. And also modesetting
> at vgaswitcheroo switch time is no-go, because the DDC/AUX lines apparently
> can't be switched on that Apple gmux, and handover of that data seems to be
> not implemented in current vgaswitcheroo. At the moment switching between
> AMD only or Intel+AMD Prime setup is quite a pita...
>
>
> I haven't followed the entire discussion on the i915 thread but for the
> amdgpu dc patch I would prefer a DPCD quirk to override the reported link
> settings with the correct link rate.
>
> Harry
>
>
Ok, as you wish. How do i do that? Is there already some DP related
official mechanism, or do i just add some if-statement to

detect_edp_sink_caps()
that matches on a new EDID quirk to be defined for that panel in
drm_edid etc., and then

if (edid quirk for that panel)
    dpcd[DP_MAX_LINK_RATE] = 0xc;

The other question would be if we should do it for this panel on AMD DC at
all? I see my original patch more as something to fix other odd (Apple?)
panels, than for this specific one. As mentioned above, photometer testing
on AMD DC with a Polaris on the MBP 2017 suggests that the default 2.7 Gbps
8 bit mode + AMD's spatial dithering provides higher quality results for >=
10 bpc framebuffers than actually running the panel at 10 bit without
dithering.

As a little side-note, for squeezing out more precision than the 10 bpc
framebuffers we officially have in Mesa/OpenGL, my software Psychtoolbox
has some special hacks, playing funny tricks with resizing X-Screens,
applying bit-twiddling shaders to images and MMIO programming the gpu
"behind the back" of the driver, to get the gpu into RGBA16161616 linear
scanout mode. That gives up to 12 bpc precision on that panel according to
photometer measurements. While AMD's dithering with the panel in 8 bit + 4
bit spatial dithering gives pretty good results, panel at 10 bit + 2 bit
spatial dithering has some artifacts. And even at a normal 10 bit
framebuffer, the 8 bit panel + 2 bit dithering seems to give better results
than 10 bit panel mode.
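
For readers unfamiliar with the "8 bit + 2 bit spatial dithering" idea, here
is a generic ordered-dither sketch in C. It is only an illustration under
the assumption of a 2x2 threshold pattern; it is not AMD's hardware
algorithm, whose details I don't know, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reduce a 10-bit sample to 8 bits with a 2x2 ordered dither. The four
 * thresholds 0..3 cover the two dropped bits, so the average over a 2x2
 * block approximates the original 10-bit value.
 */
uint8_t dither10to8(uint16_t v10, unsigned int x, unsigned int y)
{
	static const uint8_t threshold[2][2] = { { 0, 2 }, { 3, 1 } };

	assert(v10 <= 1023);

	unsigned int v8 = (v10 + threshold[y & 1][x & 1]) >> 2;

	return (uint8_t)(v8 > 255 ? 255 : v8); /* clamp at the top end */
}

/* Sum of the four dithered outputs of a uniform 2x2 block. */
unsigned int sum2x2(uint16_t v10)
{
	return dither10to8(v10, 0, 0) + dither10to8(v10, 1, 0) +
	       dither10to8(v10, 0, 1) + dither10to8(v10, 1, 1);
}
```

For a uniform input of 513 the block comes out as 128, 128, 129, 128, so
the sum over the four pixels is again 513 and the spatial average is
128.25, i.e. the 10-bit level expressed in 8-bit steps.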

-mario


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-10 Thread Mario Kleiner
On Fri, Jan 10, 2020 at 2:32 PM Ville Syrjälä 
wrote:

> On Thu, Jan 09, 2020 at 09:19:07PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 7:24 PM Ville Syrjälä <
> ville.syrj...@linux.intel.com>
> > wrote:
> >
> > > On Thu, Jan 09, 2020 at 06:57:14PM +0100, Mario Kleiner wrote:
> > > > On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä <
> > > ville.syrj...@linux.intel.com>
> > > > wrote:
> > > >
> > > > > On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > > > > > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <
> > > > > ville.syrj...@linux.intel.com>
> > > > > > wrote:
> > > > > >
> > >
> >
> > > wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC
> > > > identified it as DP 1.1, eDP 1.3, and these extended caps seem to be
> only
> > > > part of DP 1.3+ if i understand the comments in
> > > > intel_dp_extended_receiver_capabilities() correctly.
> > >
> > >
> > Ok, looking at previous debug output logs shows that those extended caps
> > are not present on the systems, ie. that extended caps bit is not set. So
> > dpcd[0x1] == 0xa.
> >
> >
> > > Yeah, but you never know how creative they've been with the DPCD in
> > > such a proprietary machine. A full DPCD dump from /dev/drm_dp_aux* would
> > > be nice. Can you file a bug and attach the DPCD dump there so we have a
> > > good reference on what we're talking about (also for future if/when
> > > someone eventually starts to wonder why we have such hacks in the
> > > code)?
> > >
> > >
> > True, it's Apple which likes to "Think different..." :/
> >
> > Will do. But is there a proper/better way to do the /dev/drm_dp_aux0
> dump?
> > I used cat /dev/drm_dp_aux0 > dump, and that hangs, but if i interrupt it
> > after a few seconds, i get a dump file of 512k size, which seems
> excessive?
> > On AMD DC atm., in case that matters.
>
> It can take a while to dump the whole thing. If there are errors in some
> parts (against the spec but some devices simply don't care about the
> spec) you may need to use ddrescue/etc. to dump everything that can be
> dumped.
>
Ok, it is Bugzilla bug 206157:

https://bugzilla.kernel.org/show_bug.cgi?id=206157

I attached the first ~ 5000 Bytes of DPCD dump, as there is a 5k file size
limit. The total dump is 512 kB, mostly zeros.

-mario

-- 
> Ville Syrjälä
> Intel
>


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 8:49 PM Alex Deucher  wrote:

> On Thu, Jan 9, 2020 at 11:47 AM Mario Kleiner
>  wrote:
> >
> > On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher 
> wrote:
> >>
> >> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner
> >>  wrote:
> >> >
> As Harry mentioned in the other thread, won't this only work if the
> display was brought up by the vbios?  In the suspend/resume case,
> won't we just fall back to 2.7Gbps?
>
> Alex
>
>
Adding Harry to cc...

The code is only executed for eDP. On the Intel side, it seems that
intel_edp_init_dpcd() gets only called during driver load / modesetting
init, so not on resume.

On the AMD DC side, dc_link_detect_helper() has this early no-op return at
the beginning:

if ((link->connector_signal == SIGNAL_TYPE_LVDS ||
  link->connector_signal == SIGNAL_TYPE_EDP) &&
  link->local_sink)
  return true;


So i guess if link->local_sink doesn't get NULL'ed during a suspend/resume
cycle, then we never reach the setup code that would overwrite with non
vbios settings?

Sounds reasonable to me, given that eDP panels are usually fixed internal
panels, nothing that gets hot(un-)plugged?

I can't test, because suspend/resume with the Polaris gpu on the MBP 2017
is totally broken atm., just as vgaswitcheroo can't do its job. Looks like
powering down the gpu works, but powering up doesn't. And also modesetting
at vgaswitcheroo switch time is no-go, because the DDC/AUX lines apparently
can't be switched on that Apple gmux, and handover of that data seems to be
not implemented in current vgaswitcheroo. At the moment switching between
AMD only or Intel+AMD Prime setup is quite a pita...

-mario


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 7:24 PM Ville Syrjälä 
wrote:

> On Thu, Jan 09, 2020 at 06:57:14PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä <
> ville.syrj...@linux.intel.com>
> > wrote:
> >
> > > On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > > > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <
> > > ville.syrj...@linux.intel.com>
> > > > wrote:
> > > >
>

> wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC
> > identified it as DP 1.1, eDP 1.3, and these extended caps seem to be only
> > part of DP 1.3+ if i understand the comments in
> > intel_dp_extended_receiver_capabilities() correctly.
>
>
Ok, looking at previous debug output logs shows that those extended caps
are not present on the systems, ie. that extended caps bit is not set. So
dpcd[0x1] == 0xa.


> Yeah, but you never know how creative they've been with the DPCD in
> such a proprietary machine. A full DPCD dump from /dev/drm_dp_aux* would
> be nice. Can you file a bug and attach the DPCD dump there so we have a
> good reference on what we're talking about (also for future if/when
> someone eventually starts to wonder why we have such hacks in the
> code)?
>
>
True, it's Apple which likes to "Think different..." :/

Will do. But is there a proper/better way to do the /dev/drm_dp_aux0 dump?
I used cat /dev/drm_dp_aux0 > dump, and that hangs, but if i interrupt it
after a few seconds, i get a dump file of 512k size, which seems excessive?
On AMD DC atm., in case that matters.

However, the file shows DPCD_REV 1.1, maximum 0xa and no extended caps
(DP_TRAINING_AUX_RD_INTERVAL aka [0xe] == 0x00).
 -mario
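
A bounded read may be a gentler alternative to an open-ended cat on the AUX device. The AUX address space is 20 bits wide, which is probably why an unbounded read keeps going for so long. The sketch below reads only a requested byte range and decodes the fields discussed above. The helper names (dpcd_read_range, dpcd_max_link_rate_khz, dpcd_has_extended_caps) are made up for illustration, and it assumes the device node supports lseek() followed by read():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define DPCD_CAP_SIZE 16	/* receiver capability field, DPCD 0x000-0x00f */

/*
 * Read exactly len bytes starting at DPCD address `offset` from an AUX
 * character device (or any seekable file). Returns 0 on success, -1 on
 * error or short read, so the caller never sits in an open-ended read.
 * Hypothetical helper, not part of any existing tool.
 */
int dpcd_read_range(const char *path, off_t offset, uint8_t *buf, size_t len)
{
	size_t done = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
		close(fd);
		return -1;
	}

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n <= 0) {	/* error or end of data */
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}

	close(fd);
	return 0;
}

/* DP_MAX_LINK_RATE (DPCD 0x001) counts in units of 0.27 Gbps = 27000 kHz. */
int dpcd_max_link_rate_khz(const uint8_t *caps)
{
	return caps[0x1] * 27000;
}

/* Bit 7 of DPCD 0x00e flags the presence of the extended capability field. */
int dpcd_has_extended_caps(const uint8_t *caps)
{
	return (caps[0xe] & 0x80) != 0;
}
```

A call such as dpcd_read_range("/dev/drm_dp_aux0", 0, caps, DPCD_CAP_SIZE) would fetch just the first 16 bytes; larger ranges can be requested the same way if a fuller dump is wanted for a bug report.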


-- 
> Ville Syrjälä
> Intel
>


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä 
wrote:

> On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <
> ville.syrj...@linux.intel.com>
> > wrote:
> >
> > > On Thu, Jan 09, 2020 at 05:26:57PM +0200, Ville Syrjälä wrote:
> > > > On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote:
> > > > > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > > > > firmware chooses link settings at boot which support enough
> > > > > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > > > > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > >
> > > Does it actually or do we just ignore the fact that it reports
> 3.24Gbps?
> > >
> > > If it really reports 3.24 then we should be able to just add that to
> > > dp_rates[] in intel_dp_set_sink_rates() and be done with it.
> > >
> > > Although we'd likely want to skip 3.24 unless it really is reported
> > > as the max so as to not use that non-standard rate on other displays.
> > > So would require a bit fancier logic for that.
> > >
> > >
> > Was also my initial thought, but the DP_MAX_LINK_RATE reg reports 2.7
> Gbps
> > as maximum.
>
> So dpcd[0x1] == 0xa ?
>
>
Yes. [*]


> What about the magic second version of DP_MAX_LINK_RATE at 0x2201 ?
> Hmm. I guess we should already be reading that via
> intel_dp_extended_receiver_capabilities().
>

Yes, you do.

[*] Well, i have to recheck on the machine. I started this work on the AMD
side and checked what AMD DC gave me, haven't rechecked stuff under i915
that i already knew from AMD. Comparing the implementations, there are some
peculiar differences that may matter:

intel_dp_extended_receiver_capabilities() is more "paranoid" than AMD DC's
retrieve_link_cap() function in deciding if the extended receiver caps are
valid. Intel's implementation copies only the first 6 Bytes of extended
receiver caps into the dpcd[] arrays, whereas AMD copies 16 Bytes. Not sure
about the differences, but one of you may wanna check why this is, and if
it matters somehow.

Btw. your proposed

/* blah */
if (max_rate > ...)

wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC
identified it as DP 1.1, eDP 1.3, and these extended caps seem to be only
part of DP 1.3+ if i understand the comments in
intel_dp_extended_receiver_capabilities() correctly.

-mario



>
> --
> Ville Syrjälä
> Intel
>


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher  wrote:

> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner
>  wrote:
> >
> > If the current eDP link rate, as read from hw, provides a
> > higher bandwidth than the standard link rates, then add the
> > current link rate to the link_rates array for consideration
> > in future mode-sets.
> >
> > These initial current eDP link settings have been set up by
> > firmware during boot, so they should work on the eDP panel.
> > Therefore use them if the firmware thinks they are good and
> > they provide higher link bandwidth, e.g., to enable higher
> > resolutions / color depths.
> >
> > This fixes a problem found on the MacBookPro 2017 Retina panel:
> >
> > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > so intel_dp_set_sink_rates() would cap at that. This restricts
> > achievable color depth to 8 bpc, not providing the full color
> > depth of the panel. With this commit, we can use firmware setting
> > and get the full 10 bpc advertised by the Retina panel.
>
> Would it make more sense to just add a quirk for this particular
> panel?  Would there be cases where the link was programmed wrong and
> then we end up using that additional link speed as supported?
>
> Alex
>
>
Not sure. This MBP 2017 is the only non-ancient laptop i now have. I'd
assume many other Apple Retina panels would behave similarly. The panel's DPCD
regs report DP 1.1 and eDP 1.3, so the flexible table with additional modes
from eDP 1.4+ does not exist. According to Wikipedia, eDP 1.4 was introduced
in February 2013 and this is a mid 2017 machine, so Apple seems to be quite
behind. Therefore i assume we'd need a lot of quirks over time.

That said:

1. The logic in amdgpu's DC for the same purpose is a bit different than on
the intel side.

2. DC allows overriding DP link settings, that's how i initially tested
this, so one could do the "quirk" via something like that in a bootup
script. So on AMD one could work around the lack of the patch and of quirks.

3. I spent a lot of time with a photometer, testing the quality of the 10
bit: It turns out that running the panel at 8 bit + AMD's spatial dithering
that kicks in gives better results than running the panel in native 10 bit.
Maybe the panel is not really a 10 bit one, but just pretends to be and
then uses its own dithering to achieve 10 bit. So at least on AMD one is
better off precision-wise with the 8 bit panel default with this specific
panel.

On Intel, however, we don't dither for > 6 bpc panels atm., so running the
panel at 10 bpc is the only way to get a 10 bit display. We don't dither
there because of some oddities in the way Intel hw dithers at higher bit
depths: it also dithers pixel values where it shouldn't. That makes it
impossible to get an identity passthrough of an 8 bpc framebuffer to the
outputs, which breaks all kinds of special display equipment that needs
that identity passthrough to work.

-mario

>
> > Signed-off-by: Mario Kleiner 
> > Cc: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 2f31d226c6eb..aa3e0b5108c6 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >  {
> > struct drm_i915_private *dev_priv =
> > to_i915(dp_to_dig_port(intel_dp)->base.base.dev);
> > +   int max_rate;
> > +   u8 link_bw;
> >
> > /* this function is meant to be called only once */
> > WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0);
> > @@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> > else
> > intel_dp_set_sink_rates(intel_dp);
> >
> > +   /*
> > +* If the firmware programmed a rate higher than the standard
> sink rates
> > +* during boot, then add that rate as a valid sink rate, as fw
> knows
> > +* this is a good rate and we get extra bandwidth.
> > +*
> > +* Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which
> is only
> > +* eDP 1.1, but supports the unusual rate of 324000 kHz at
> bootup, for
> > +* 10 bpc 

Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä 
wrote:

> On Thu, Jan 09, 2020 at 05:26:57PM +0200, Ville Syrjälä wrote:
> > On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote:
> > > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > > firmware chooses link settings at boot which support enough
> > > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
>
> Does it actually or do we just ignore the fact that it reports 3.24Gbps?
>
> If it really reports 3.24 then we should be able to just add that to
> dp_rates[] in intel_dp_set_sink_rates() and be done with it.
>
> Although we'd likely want to skip 3.24 unless it really is reported
> as the max so as to not use that non-standard rate on other displays.
> So would require a bit fancier logic for that.
>
>
Was also my initial thought, but the DP_MAX_LINK_RATE reg reports 2.7 Gbps
as maximum.
-mario


Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 4:27 PM Ville Syrjälä 
wrote:

> On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote:
> > If the current eDP link rate, as read from hw, provides a
> > higher bandwidth than the standard link rates, then add the
> > current link rate to the link_rates array for consideration
> > in future mode-sets.
> >
> > These initial current eDP link settings have been set up by
> > firmware during boot, so they should work on the eDP panel.
> > Therefore use them if the firmware thinks they are good and
> > they provide higher link bandwidth, e.g., to enable higher
> > resolutions / color depths.
> >
> > This fixes a problem found on the MacBookPro 2017 Retina panel:
> >
> > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > so intel_dp_set_sink_rates() would cap at that. This restricts
> > achievable color depth to 8 bpc, not providing the full color
> > depth of the panel. With this commit, we can use firmware setting
> > and get the full 10 bpc advertised by the Retina panel.
> >
> > Signed-off-by: Mario Kleiner 
> > Cc: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 2f31d226c6eb..aa3e0b5108c6 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >  {
> >   struct drm_i915_private *dev_priv =
> >   to_i915(dp_to_dig_port(intel_dp)->base.base.dev);
> > + int max_rate;
> > + u8 link_bw;
> >
> >   /* this function is meant to be called only once */
> >   WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0);
> > @@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >   else
> >   intel_dp_set_sink_rates(intel_dp);
> >
> > + /*
> > +  * If the firmware programmed a rate higher than the standard sink
> rates
> > +  * during boot, then add that rate as a valid sink rate, as fw
> knows
> > +  * this is a good rate and we get extra bandwidth.
> > +  *
> > +  * Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which
> is only
> > +  * eDP 1.1, but supports the unusual rate of 324000 kHz at bootup,
> for
> > +  * 10 bpc / 30 bit color depth.
> > +  */
> > + if (!intel_dp->use_rate_select &&
> > + (drm_dp_dpcd_read(&intel_dp->aux, DP_LINK_BW_SET, &link_bw, 1)
> == 1) &&
> > + (link_bw > 0) && (intel_dp->num_sink_rates <
> DP_MAX_SUPPORTED_RATES)) {
> > + max_rate = drm_dp_bw_code_to_link_rate(link_bw);
> > + if (max_rate >
> intel_dp->sink_rates[intel_dp->num_sink_rates - 1]) {
> > + intel_dp->sink_rates[intel_dp->num_sink_rates] =
> max_rate;
> > + intel_dp->num_sink_rates++;
> > + DRM_DEBUG_KMS("Adding max bandwidth eDP rate %d
> kHz.\n",
> > +   max_rate);
> > + }
>
> Hmm. I guess we could do this. But please put it into a separate
> function so we don't end up with that super ugly if condition.
>
>
Ok. Does static void intel_edp_add_bootup_rate() sound good to you? Or
intel_edp_add_fw_rate()?

> The debug message should probably be a bit more explicit. Eg.
> something like:
> "Firmware using non-standard link rate %d kHz. Including it in sink
> rates.\n"
>

Ok.


> I'm also wondering if we shouldn't just add the link rate to the sink
> rates regardless of whether it's the highest rate or not...
>
>
I tried to be conservative, and simple, but yes, one could add it anyway.
Would need to preserve the order in the sink_rates[] array.
Your choice, you're the expert :)
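
For the "add it anyway" variant, preserving the order of sink_rates[] could look roughly like the sketch below. This is standalone illustration code, not the i915 implementation: add_sink_rate_sorted() and MAX_SUPPORTED_RATES are hypothetical stand-ins for whatever helper and DP_MAX_SUPPORTED_RATES limit the driver would actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SUPPORTED_RATES 8	/* stand-in for DP_MAX_SUPPORTED_RATES */

/*
 * Insert new_rate (kHz) into an ascending rates[] array, keeping it sorted.
 * Returns true if the rate was added, false if it was already present or
 * the array is full. rates[] must have room for MAX_SUPPORTED_RATES entries.
 */
bool add_sink_rate_sorted(int *rates, int *num_rates, int new_rate)
{
	int pos = 0;

	/* Find the first slot whose rate is not below the new one. */
	while (pos < *num_rates && rates[pos] < new_rate)
		pos++;

	if (pos < *num_rates && rates[pos] == new_rate)
		return false;	/* already a known sink rate */

	if (*num_rates >= MAX_SUPPORTED_RATES)
		return false;	/* no room left */

	/* Shift the tail up by one entry and drop the new rate in place. */
	memmove(&rates[pos + 1], &rates[pos],
		(size_t)(*num_rates - pos) * sizeof(rates[0]));
	rates[pos] = new_rate;
	(*num_rates)++;

	return true;
}
```

With rates 162000, 270000 and 540000 already present, adding 324000 would land between 270000 and 540000, while adding 270000 again would be rejected as a duplicate.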


> > + }
> > +
> >   intel_dp_set_common_rates(intel_dp);
> >
> >   /* Read the eDP DSC DPCD registers */
> > --
> > 2.24.0
> >
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Ville Syrjälä
> Intel
>


[Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.

2020-01-09 Thread Mario Kleiner
If the current eDP link rate, as read from hw, provides a
higher bandwidth than the standard link rates, then add the
current link rate to the link_rates array for consideration
in future mode-sets.

These initial current eDP link settings have been set up by
firmware during boot, so they should work on the eDP panel.
Therefore use them if the firmware thinks they are good and
they provide higher link bandwidth, e.g., to enable higher
resolutions / color depths.

This fixes a problem found on the MacBookPro 2017 Retina panel:

The panel reports 10 bpc color depth in its EDID, and the UEFI
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
so intel_dp_set_sink_rates() would cap at that. This restricts
achievable color depth to 8 bpc, not providing the full color
depth of the panel. With this commit, we can use firmware setting
and get the full 10 bpc advertised by the Retina panel.

Signed-off-by: Mario Kleiner 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 2f31d226c6eb..aa3e0b5108c6 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
 {
struct drm_i915_private *dev_priv =
to_i915(dp_to_dig_port(intel_dp)->base.base.dev);
+   int max_rate;
+   u8 link_bw;
 
/* this function is meant to be called only once */
WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0);
@@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
else
intel_dp_set_sink_rates(intel_dp);
 
+   /*
+* If the firmware programmed a rate higher than the standard sink rates
+* during boot, then add that rate as a valid sink rate, as fw knows
+* this is a good rate and we get extra bandwidth.
+*
+* Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which is only
+* eDP 1.1, but supports the unusual rate of 324000 kHz at bootup, for
+* 10 bpc / 30 bit color depth.
+*/
+   if (!intel_dp->use_rate_select &&
+       (drm_dp_dpcd_read(&intel_dp->aux, DP_LINK_BW_SET, &link_bw, 1) == 1) &&
+       (link_bw > 0) && (intel_dp->num_sink_rates < DP_MAX_SUPPORTED_RATES)) {
+       max_rate = drm_dp_bw_code_to_link_rate(link_bw);
+       if (max_rate > intel_dp->sink_rates[intel_dp->num_sink_rates - 1]) {
+           intel_dp->sink_rates[intel_dp->num_sink_rates] = max_rate;
+           intel_dp->num_sink_rates++;
+           DRM_DEBUG_KMS("Adding max bandwidth eDP rate %d kHz.\n",
+                         max_rate);
+       }
+   }
+
intel_dp_set_common_rates(intel_dp);
 
/* Read the eDP DSC DPCD registers */
-- 
2.24.0
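
The bandwidth argument in the commit message above can be sketched with a little arithmetic. The 324000 figure is a link symbol clock in kHz, which corresponds to 3.24 Gbps per lane, and a DP_LINK_BW_SET code counts in steps of 27000 kHz. The function names below are illustrative only, and the 4-lane configuration used in the note after the code is an assumption, not something stated in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* A link bandwidth code is in units of 0.27 Gbps per lane = 27000 kHz. */
int bw_code_to_link_rate_khz(uint8_t bw_code)
{
	return bw_code * 27000;
}

/*
 * Payload bandwidth in kBytes/s: with 8b/10b channel coding each lane
 * carries one data byte per link symbol clock.
 */
int max_data_rate_kbytes(int link_rate_khz, int lanes)
{
	return link_rate_khz * lanes;
}

/* Highest pixel clock (kHz) that fits the link at a given bits per pixel. */
int max_pixel_clock_khz(int link_rate_khz, int lanes, int bpp)
{
	int64_t kbits = (int64_t)max_data_rate_kbytes(link_rate_khz, lanes) * 8;

	return (int)(kbits / bpp);
}
```

Under the 4-lane assumption, code 0x0a (270000 kHz) allows a pixel clock of up to 360000 kHz at 24 bpp but only 288000 kHz at 30 bpp, whereas code 0x0c (324000 kHz) raises the 30 bpp limit to 345600 kHz. A mode whose pixel clock falls between those two 30 bpp limits would be capped at 8 bpc unless the higher rate is accepted.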



Re: [Intel-gfx] [PATCH 3/3] drm/atomic: Rename crtc_state->pageflip_flags to async_flip

2019-09-05 Thread Mario Kleiner
On Wed, Sep 4, 2019 at 2:57 PM Kazlauskas, Nicholas
 wrote:
>
> On 2019-09-03 3:06 p.m., Daniel Vetter wrote:
> > It's the only flag anyone actually cares about. Plus if we're unlucky,
> > the atomic ioctl might need a different flag for async flips. So
> > better to abstract this away from the uapi a bit.
> >
> > Cc: Maarten Lankhorst 
> > Cc: Michel Dänzer 
> > Cc: Alex Deucher 
> > Cc: Adam Jackson 
> > Cc: Sean Paul 
> > Cc: David Airlie 
> > Signed-off-by: Daniel Vetter 
> > Cc: Maxime Ripard 
> > Cc: Daniel Vetter 
> > Cc: Nicholas Kazlauskas 
> > Cc: Leo Li 
> > Cc: Harry Wentland 
> > Cc: David Francis 
> > Cc: Mario Kleiner 
> > Cc: Bhawanpreet Lakha 
> > Cc: Ben Skeggs 
> > Cc: "Christian König" 
> > Cc: Ilia Mirkin 
> > Cc: Sam Ravnborg 
> > Cc: Chris Wilson 
> > ---
>
> Series is:
>
> Reviewed-by: Nicholas Kazlauskas 
>
> I would like to see a new flag eventually show up for atomic as well,
> but the existing one is effectively broken at this point and I would
> hope that no userspace is setting it expecting that it actually does
> something.

You mean it is generally broken? My software uses non-vsync'ed flips
for diagnostic purpose and iirc some gpu + driver combo didn't work as
expected anymore. But i thought that was one specific driver bug
(maybe on AMD + DC)?

>
> At this point we don't really gain anything from enabling atomic in DDX
> I think, most drivers already make use of DRM helpers to map these
> legacy IOCTLs to atomic anyway.
>

One thing i wanted to try, once i hopefully find some time in late
2019 / early 2020 (if nobody else starts working on such a thing
earlier), would be to add the ability to pass in a target flip time to
the pageflip ioctl for use with VRR. For that i thought adding a new
pageflip flag (a la DRM_MODE_PAGE_FLIP_TARGETTIME) would be a good way
to reuse the existing page_flip_target ioctl and redefine the "uint32
sequence" field of struct drm_mode_crtc_page_flip_target to pass in
the target time - or at least the lower 32 bits of a target time.

So that would be one more page flip flag for the future. I'd like this
to be workable from X11, and the current DDXes don't use the atomic
interface, apart from the modesetting DDX where it just got disabled
by default in xserver master due to various unresolved bugs afaik?
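
To make the "lower 32 bits of a target time" idea a bit more concrete, here is a rough sketch of how a truncated timestamp could be expanded again on the receiving side. Everything in it is hypothetical: no such flag or helpers exist, and the choice of nanoseconds as the unit is my assumption. With nanoseconds, 32 bits cover a window of roughly +/- 2.1 seconds around the current time, which should be plenty for a flip deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Sender side: keep only the low 32 bits of the 64-bit target time. */
uint32_t flip_time_encode(uint64_t target_ns)
{
	return (uint32_t)target_ns;
}

/*
 * Receiver side: reconstruct the 64-bit time closest to now_ns whose low
 * 32 bits match low32. Valid as long as the real target lies within about
 * 2^31 ns of now_ns, in either direction.
 */
uint64_t flip_time_decode(uint32_t low32, uint64_t now_ns)
{
	/* Signed distance from "now" to the target, modulo 2^32. */
	int32_t delta = (int32_t)(low32 - (uint32_t)now_ns);

	return now_ns + (uint64_t)(int64_t)delta;
}
```

The subtraction is done in 32-bit unsigned arithmetic so that a target just past a 2^32 ns boundary still decodes correctly, and a target slightly in the past yields a small negative delta instead of a jump of about four seconds into the future.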

thanks,
-mario

> Nicholas Kazlauskas
>
> >   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 ++---
> >   drivers/gpu/drm/drm_atomic_helper.c   | 2 +-
> >   drivers/gpu/drm/drm_atomic_state_helper.c | 2 +-
> >   drivers/gpu/drm/nouveau/dispnv50/wndw.c   | 4 ++--
> >   include/drm/drm_crtc.h| 8 
> >   5 files changed, 10 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 0a71ed1e7762..2f0ef0820f00 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -5756,8 +5756,7 @@ static void amdgpu_dm_commit_planes(struct 
> > drm_atomic_state *state,
> >* change FB pitch, DCC state, rotation or mirroing.
> >*/
> >   bundle->flip_addrs[planes_count].flip_immediate =
> > - (crtc->state->pageflip_flags &
> > -  DRM_MODE_PAGE_FLIP_ASYNC) != 0 &&
> > + crtc->state->async_flip &&
> >   acrtc_state->update_type == UPDATE_TYPE_FAST;
> >
> >   timestamp_ns = ktime_get_ns();
> > @@ -6334,7 +6333,7 @@ static void amdgpu_dm_atomic_commit_tail(struct 
> > drm_atomic_state *state)
> >   amdgpu_dm_enable_crtc_interrupts(dev, state, true);
> >
> >   for_each_new_crtc_in_state(state, crtc, new_crtc_state, j)
> > - if (new_crtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC)
> > + if (new_crtc_state->async_flip)
> >   wait_for_vblank = false;
> >
> >   /* update planes when needed per crtc*/
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> > b/drivers/gpu/drm/drm_atomic_helper.c
> > index e9c6112e7f73..1e5293eb66e3 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -3263,7 +3263,7 @@ static int page_flip_common(struct drm_atomic_state 
> > *state,
> >   return PTR_ERR(crtc_state);
> >
> >   c

Re: [Intel-gfx] [PATCH xf86-video-intel v3 2/2] sna: Support 10bpc gamma via the GAMMA_LUT crtc property

2019-07-09 Thread Mario Kleiner
Hi Ville,

now somebody just needs to merge these two 10 bit gamma lut patches
into intel-ddx?

thanks,
-mario

On Fri, May 17, 2019 at 3:51 PM Ville Syrjala
 wrote:
>
> From: Ville Syrjälä 
>
> Probe the GAMMA_LUT/GAMMA_LUT_SIZE props and utilize them when
> running with > 8bpc.
>
> v2: s/sna_crtc_id/__sna_crtc_id/ in DBG since we have a sna_crtc
> v3: Fix the vg "bluered" typo (Mario)
> This time I even build tested with vg support
>
> Cc: Mario Kleiner 
> Signed-off-by: Ville Syrjälä 
> Reviewed-and-tested-by: Mario Kleiner 
> ---
>  src/sna/sna_display.c | 247 +++---
>  1 file changed, 208 insertions(+), 39 deletions(-)
>
> diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
> index 41edfec12839..d6210cc7bbc8 100644
> --- a/src/sna/sna_display.c
> +++ b/src/sna/sna_display.c
> @@ -127,6 +127,7 @@ struct local_mode_obj_get_properties {
> uint32_t obj_type;
> uint32_t pad;
>  };
> +#define LOCAL_MODE_OBJECT_CRTC 0x
>  #define LOCAL_MODE_OBJECT_PLANE 0x
>
>  struct local_mode_set_plane {
> @@ -229,6 +230,11 @@ struct sna_crtc {
> } primary;
> struct list sprites;
>
> +   struct drm_color_lut *gamma_lut;
> +   uint64_t gamma_lut_prop;
> +   uint64_t gamma_lut_blob;
> +   uint32_t gamma_lut_size;
> +
> uint32_t mode_serial, flip_serial;
>
> uint32_t last_seq, wrap_seq;
> @@ -317,6 +323,9 @@ static void __sna_output_dpms(xf86OutputPtr output, int 
> dpms, int fixup);
>  static void sna_crtc_disable_cursor(struct sna *sna, struct sna_crtc *crtc);
>  static bool sna_crtc_flip(struct sna *sna, struct sna_crtc *crtc,
>   struct kgem_bo *bo, int x, int y);
> +static void sna_crtc_gamma_set(xf86CrtcPtr crtc,
> +  CARD16 *red, CARD16 *green,
> +  CARD16 *blue, int size);
>
>  static bool is_zaphod(ScrnInfoPtr scrn)
>  {
> @@ -3150,11 +3159,9 @@ sna_crtc_set_mode_major(xf86CrtcPtr crtc, 
> DisplayModePtr mode,
>mode->VDisplay <= sna->mode.max_crtc_height);
>
>  #if HAS_GAMMA
> -   drmModeCrtcSetGamma(sna->kgem.fd, __sna_crtc_id(sna_crtc),
> -   crtc->gamma_size,
> -   crtc->gamma_red,
> -   crtc->gamma_green,
> -   crtc->gamma_blue);
> +   sna_crtc_gamma_set(crtc,
> +  crtc->gamma_red, crtc->gamma_green,
> +  crtc->gamma_blue, crtc->gamma_size);
>  #endif
>
> saved_kmode = sna_crtc->kmode;
> @@ -3212,12 +3219,44 @@ void sna_mode_adjust_frame(struct sna *sna, int x, 
> int y)
>
>  static void
>  sna_crtc_gamma_set(xf86CrtcPtr crtc,
> -  CARD16 *red, CARD16 *green, CARD16 *blue, int size)
> +  CARD16 *red, CARD16 *green, CARD16 *blue, int size)
>  {
> -   assert(to_sna_crtc(crtc));
> -   drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd,
> -   sna_crtc_id(crtc),
> -   size, red, green, blue);
> +   struct sna *sna = to_sna(crtc->scrn);
> +   struct sna_crtc *sna_crtc = to_sna_crtc(crtc);
> +   struct drm_color_lut *lut = sna_crtc->gamma_lut;
> +   uint32_t blob_size = size * sizeof(lut[0]);
> +   uint32_t blob_id;
> +   int ret, i;
> +
> +   DBG(("%s: gamma_size %d\n", __FUNCTION__, size));
> +
> +   if (!lut) {
> +   assert(size == 256);
> +
> +   drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd,
> +   sna_crtc_id(crtc),
> +   size, red, green, blue);
> +   return;
> +   }
> +
> +   assert(size == sna_crtc->gamma_lut_size);
> +
> +   for (i = 0; i < size; i++) {
> +   lut[i].red = red[i];
> +   lut[i].green = green[i];
> +   lut[i].blue = blue[i];
> +   }
> +
> +   ret = drmModeCreatePropertyBlob(sna->kgem.fd, lut, blob_size, &blob_id);
> +   if (ret)
> +   return;
> +
> +   ret = drmModeObjectSetProperty(sna->kgem.fd,
> +  sna_crtc->id, DRM_MODE_OBJECT_CRTC,
> +  sna_crtc->gamma_lut_prop,
> +  blob_id);
> +
> +   drmModeDestroyPropertyBlob(sna->kgem.fd, blob_id);
>  }
>
>  static void
> @@ -3229,6 +3268,8 @@ sna_crtc_destroy(xf86CrtcPtr crtc)
> if (sna_crtc == NULL)
>

Re: [Intel-gfx] [PATCH xf86-video-intel v2 2/2] sna: Support 10bpc gamma via the GAMMA_LUT crtc property

2019-05-16 Thread Mario Kleiner
On Fri, Apr 26, 2019 at 6:32 PM Ville Syrjala
 wrote:
>
> From: Ville Syrjälä 
>
> Probe the GAMMA_LUT/GAMMA_LUT_SIZE props and utilize them when
> running with > 8bpc.
>
> v2: s/sna_crtc_id/__sna_crtc_id/ in DBG since we have a sna_crtc
>
> Cc: Mario Kleiner 
> Signed-off-by: Ville Syrjälä 
> ---
>  src/sna/sna_display.c | 245 +++---
>  1 file changed, 207 insertions(+), 38 deletions(-)
>
> diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
> index 41edfec12839..6d671dce8c14 100644
> --- a/src/sna/sna_display.c
> +++ b/src/sna/sna_display.c
> @@ -127,6 +127,7 @@ struct local_mode_obj_get_properties {
> uint32_t obj_type;
> uint32_t pad;
>  };
> +#define LOCAL_MODE_OBJECT_CRTC 0x
>  #define LOCAL_MODE_OBJECT_PLANE 0x
>
>  struct local_mode_set_plane {
> @@ -229,6 +230,11 @@ struct sna_crtc {
> } primary;
> struct list sprites;
>
> +   struct drm_color_lut *gamma_lut;
> +   uint64_t gamma_lut_prop;
> +   uint64_t gamma_lut_blob;
> +   uint32_t gamma_lut_size;
> +
> uint32_t mode_serial, flip_serial;
>
> uint32_t last_seq, wrap_seq;
> @@ -317,6 +323,9 @@ static void __sna_output_dpms(xf86OutputPtr output, int 
> dpms, int fixup);
>  static void sna_crtc_disable_cursor(struct sna *sna, struct sna_crtc *crtc);
>  static bool sna_crtc_flip(struct sna *sna, struct sna_crtc *crtc,
>   struct kgem_bo *bo, int x, int y);
> +static void sna_crtc_gamma_set(xf86CrtcPtr crtc,
> +  CARD16 *red, CARD16 *green,
> +  CARD16 *blue, int size);
>
>  static bool is_zaphod(ScrnInfoPtr scrn)
>  {
> @@ -3150,11 +3159,9 @@ sna_crtc_set_mode_major(xf86CrtcPtr crtc, 
> DisplayModePtr mode,
>mode->VDisplay <= sna->mode.max_crtc_height);
>
>  #if HAS_GAMMA
> -   drmModeCrtcSetGamma(sna->kgem.fd, __sna_crtc_id(sna_crtc),
> -   crtc->gamma_size,
> -   crtc->gamma_red,
> -   crtc->gamma_green,
> -   crtc->gamma_blue);
> +   sna_crtc_gamma_set(crtc,
> +  crtc->gamma_red, crtc->gamma_green,
> +  crtc->gamma_blue, crtc->gamma_size);
>  #endif
>
> saved_kmode = sna_crtc->kmode;
> @@ -3212,12 +3219,44 @@ void sna_mode_adjust_frame(struct sna *sna, int x, 
> int y)
>
>  static void
>  sna_crtc_gamma_set(xf86CrtcPtr crtc,
> -  CARD16 *red, CARD16 *green, CARD16 *blue, int size)
> +  CARD16 *red, CARD16 *green, CARD16 *blue, int size)
>  {
> -   assert(to_sna_crtc(crtc));
> -   drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd,
> -   sna_crtc_id(crtc),
> -   size, red, green, blue);
> +   struct sna *sna = to_sna(crtc->scrn);
> +   struct sna_crtc *sna_crtc = to_sna_crtc(crtc);
> +   struct drm_color_lut *lut = sna_crtc->gamma_lut;
> +   uint32_t blob_size = size * sizeof(lut[0]);
> +   uint32_t blob_id;
> +   int ret, i;
> +
> +   DBG(("%s: gamma_size %d\n", __FUNCTION__, size));
> +
> +   if (!lut) {
> +   assert(size == 256);
> +
> +   drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd,
> +   sna_crtc_id(crtc),
> +   size, red, green, blue);
> +   return;
> +   }
> +
> +   assert(size == sna_crtc->gamma_lut_size);
> +
> +   for (i = 0; i < size; i++) {
> +   lut[i].red = red[i];
> +   lut[i].green = green[i];
> +   lut[i].blue = blue[i];
> +   }
> +
> +   ret = drmModeCreatePropertyBlob(sna->kgem.fd, lut, blob_size, &blob_id);
> +   if (ret)
> +   return;
> +
> +   ret = drmModeObjectSetProperty(sna->kgem.fd,
> +  sna_crtc->id, DRM_MODE_OBJECT_CRTC,
> +  sna_crtc->gamma_lut_prop,
> +  blob_id);
> +
> +   drmModeDestroyPropertyBlob(sna->kgem.fd, blob_id);
>  }
>
>  static void
> @@ -3229,6 +3268,8 @@ sna_crtc_destroy(xf86CrtcPtr crtc)
> if (sna_crtc == NULL)
> return;
>
> +   free(sna_crtc->gamma_lut);
> +
> list_for_each_entry_safe(sprite, sn, &sna_crtc->sprites, link)
> free(sprite);
>
> @@ -3663,6 +3704,55 @@ b

Re: [Intel-gfx] [PATCH xf86-video-intel v2 1/2] sna: Refactor property parsing

2019-05-16 Thread Mario Kleiner
On Fri, Apr 26, 2019 at 6:32 PM Ville Syrjala
 wrote:
>
> From: Ville Syrjälä 
>
> Generalize the code that parses the plane properties to be useable
> for crtc (or any kms object) properties as well.
>
> v2: plane 'type' prop is enum not range!
>
> Cc: Mario Kleiner 
> Signed-off-by: Ville Syrjälä 
> ---

This patch is

Reviewed-and-tested-by: Mario Kleiner 

-mario

>  src/sna/sna_display.c | 69 ++-
>  1 file changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
> index 119ea981d243..41edfec12839 100644
> --- a/src/sna/sna_display.c
> +++ b/src/sna/sna_display.c
> @@ -215,6 +215,7 @@ struct sna_crtc {
> uint32_t rotation;
> struct plane {
> uint32_t id;
> +   uint32_t type;
> struct {
> uint32_t prop;
> uint32_t supported;
> @@ -3391,33 +3392,40 @@ void sna_crtc_set_sprite_colorspace(xf86CrtcPtr crtc,
>  p->color_encoding.values[colorspace]);
>  }
>
> -static int plane_details(struct sna *sna, struct plane *p)
> +typedef void (*parse_prop_func)(struct sna *sna,
> +   struct drm_mode_get_property *prop,
> +   uint64_t value,
> +   void *data);
> +static void parse_props(struct sna *sna,
> +  uint32_t obj_type, uint32_t obj_id,
> +  parse_prop_func parse_prop,
> +  void *data)
>  {
>  #define N_STACK_PROPS 32 /* must be a multiple of 2 */
> struct local_mode_obj_get_properties arg;
> uint64_t stack[N_STACK_PROPS + N_STACK_PROPS/2];
> uint64_t *values = stack;
> uint32_t *props = (uint32_t *)(values + N_STACK_PROPS);
> -   int i, type = DRM_PLANE_TYPE_OVERLAY;
> +   int i;
>
> memset(&arg, 0, sizeof(struct local_mode_obj_get_properties));
> -   arg.obj_id = p->id;
> -   arg.obj_type = LOCAL_MODE_OBJECT_PLANE;
> +   arg.obj_id = obj_id;
> +   arg.obj_type = obj_type;
>
> arg.props_ptr = (uintptr_t)props;
> arg.prop_values_ptr = (uintptr_t)values;
> arg.count_props = N_STACK_PROPS;
>
> if (drmIoctl(sna->kgem.fd, LOCAL_IOCTL_MODE_OBJ_GETPROPERTIES, &arg))
> -   return -1;
> +   return;
>
> DBG(("%s: object %d (type %x) has %d props\n", __FUNCTION__,
> -p->id, LOCAL_MODE_OBJECT_PLANE, arg.count_props));
> +obj_id, obj_type, arg.count_props));
>
> if (arg.count_props > N_STACK_PROPS) {
> values = malloc(2*sizeof(uint64_t)*arg.count_props);
> if (values == NULL)
> -   return -1;
> +   return;
>
> props = (uint32_t *)(values + arg.count_props);
>
> @@ -3444,27 +3452,48 @@ static int plane_details(struct sna *sna, struct 
> plane *p)
> DBG(("%s: prop[%d] .id=%ld, .name=%s, .flags=%x, 
> .value=%ld\n", __FUNCTION__, i,
>  (long)props[i], prop.name, (unsigned)prop.flags, 
> (long)values[i]));
>
> -   if (strcmp(prop.name, "type") == 0) {
> -   type = values[i];
> -   } else if (prop_is_rotation(&prop)) {
> -   parse_rotation_prop(sna, p, &prop, values[i]);
> -   } else if (prop_is_color_encoding(&prop)) {
> -   parse_color_encoding_prop(sna, p, &prop, values[i]);
> -   }
> +   parse_prop(sna, &prop, values[i], data);
> }
>
> -   p->rotation.supported &= DBG_NATIVE_ROTATION;
> -   if (!xf86ReturnOptValBool(sna->Options, OPTION_ROTATION, TRUE))
> -   p->rotation.supported = RR_Rotate_0;
> -
> if (values != stack)
> free(values);
>
> -   DBG(("%s: plane=%d type=%d\n", __FUNCTION__, p->id, type));
> -   return type;
>  #undef N_STACK_PROPS
>  }
>
> +static bool prop_is_type(const struct drm_mode_get_property *prop)
> +{
> +   return prop_has_type_and_name(prop, 3, "type");
> +}
> +
> +static void plane_parse_prop(struct sna *sna,
> +struct drm_mode_get_property *prop,
> +uint64_t value, void *data)
> +{
> +   struct plane *p = data;
> +
> +   if (prop_is_type(prop))
> +   p->type = value;
> +   else if (prop_is_rotation(prop))
> +   parse_rotation_prop(sna, p, prop, value);
> +   else if (prop_is

Re: [Intel-gfx] [PATCH xf86-video-intel] sna/uxa: Fix colormap handling at screen depth 30. (v2)

2019-01-20 Thread Mario Kleiner
On Mon, Oct 15, 2018 at 6:21 PM Ville Syrjälä 
wrote:

> On Tue, Jun 12, 2018 at 06:20:35PM +0200, Mario Kleiner wrote:
> > The various clut handling functions like a setup
> > consistent with the x-screen color depth. Otherwise
> > we observe improper sampling in the gamma tables
> > at depth 30.
> >
> > Therefore replace hard-coded bitsPerRGB = 8 by actual
> > bits per channel scrn->rgbBits. Also use this for call
> > to xf86HandleColormaps().
> >
> > Tested for uxa and sna at depths 8, 16, 24 and 30 on
> > IvyBridge, and tested at depth 24 and 30 that xgamma
> > and gamma table animations work, and with measurement
> > equipment to make sure identity gamma ramps actually
> > are identity mappings at the output.
> >
> > v2: Also deal with X-Server 1.19 and earlier, which as of
> > v1.19.6 lack a fix to color palette handling and can
> > not deal with depths/bpc > 24/8 bpc. On < 1.20 we skip
> > xf86HandleColormaps() setup at > 8 bpc. This disables
> > color palette handling on such servers at > 8 bpc, but
> > still keeps RandR gamma table handling intact.
> >
> > Tested on 1.19.6 and 1.20.0 to do the right thing.
> >
> > Signed-off-by: Mario Kleiner 
>
> Forgot this didn't get applied. It did make sense to me at the
> time when I was looking at the explosions with depth 30.
> Still seems to do the trick on 1.19, and redshift still works
> so
>
> Reviewed-by: Ville Syrjälä 
>
>
Thanks Ville!

Now it just needs to get merged, please. Chris?

One last missing piece is support for 1024 slot gamma tables in i965-kms,
or gamma table bypass for such high bit depth framebuffers to make them
actually useful. Ville, i think you mentioned working on that around spring
last year?

Thanks,
-mario

> ---
> >  src/sna/sna_driver.c   | 9 ++---
> >  src/uxa/intel_driver.c | 6 +-
> >  2 files changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
> > index 2007e354..8c79d43b 100644
> > --- a/src/sna/sna_driver.c
> > +++ b/src/sna/sna_driver.c
> > @@ -1152,7 +1152,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
> >   if (!miInitVisuals(, , , ,
> ,
> >  ,
> >  ((unsigned long)1 << (scrn->bitsPerPixel - 1)),
> > -8, -1))
> > +scrn->rgbBits, -1))
> >   return FALSE;
> >
> >   if (!miScreenInit(screen, NULL,
> > @@ -1223,8 +1223,11 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
> >   if (!miCreateDefColormap(screen))
> >   return FALSE;
> >
> > - if (sna->mode.num_real_crtc &&
> > - !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
> > + /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
> > + if (sna->mode.num_real_crtc && (scrn->rgbBits <= 8 ||
> > + XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
> > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
> > +  sna_load_palette, NULL,
> >CMAP_RELOAD_ON_MODE_SWITCH |
> >CMAP_PALETTED_TRUECOLOR))
> >   return FALSE;
> > diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c
> > index 3703c412..77c0dc00 100644
> > --- a/src/uxa/intel_driver.c
> > +++ b/src/uxa/intel_driver.c
> > @@ -991,7 +991,11 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL)
> >   if (!miCreateDefColormap(screen))
> >   return FALSE;
> >
> > - if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL,
> > + /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
> > + if ((scrn->rgbBits <= 8 ||
> > + XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
> > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
> > +  I830LoadPalette, NULL,
> >CMAP_RELOAD_ON_MODE_SWITCH |
> >CMAP_PALETTED_TRUECOLOR)) {
> >   return FALSE;
> > --
> > 2.17.1
>
> --
> Ville Syrjälä
> Intel
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH xf86-video-intel] sna/uxa: Fix colormap handling at screen depth 30. (v2)

2018-06-12 Thread Mario Kleiner
The various clut handling functions like a setup
consistent with the x-screen color depth. Otherwise
we observe improper sampling in the gamma tables
at depth 30.

Therefore replace hard-coded bitsPerRGB = 8 by actual
bits per channel scrn->rgbBits. Also use this for call
to xf86HandleColormaps().

Tested for uxa and sna at depths 8, 16, 24 and 30 on
IvyBridge, and tested at depth 24 and 30 that xgamma
and gamma table animations work, and with measurement
equipment to make sure identity gamma ramps actually
are identity mappings at the output.

v2: Also deal with X-Server 1.19 and earlier, which as of
v1.19.6 lack a fix to color palette handling and can
not deal with depths/bpc > 24/8 bpc. On < 1.20 we skip
xf86HandleColormaps() setup at > 8 bpc. This disables
color palette handling on such servers at > 8 bpc, but
still keeps RandR gamma table handling intact.

Tested on 1.19.6 and 1.20.0 to do the right thing.

Signed-off-by: Mario Kleiner 
---
 src/sna/sna_driver.c   | 9 ++---
 src/uxa/intel_driver.c | 6 +-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
index 2007e354..8c79d43b 100644
--- a/src/sna/sna_driver.c
+++ b/src/sna/sna_driver.c
@@ -1152,7 +1152,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
if (!miInitVisuals(, , , , ,
   ,
   ((unsigned long)1 << (scrn->bitsPerPixel - 1)),
-  8, -1))
+  scrn->rgbBits, -1))
return FALSE;
 
if (!miScreenInit(screen, NULL,
@@ -1223,8 +1223,11 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
if (!miCreateDefColormap(screen))
return FALSE;
 
-   if (sna->mode.num_real_crtc &&
-   !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
+   /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
+   if (sna->mode.num_real_crtc && (scrn->rgbBits <= 8 ||
+   XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
+   !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+sna_load_palette, NULL,
 CMAP_RELOAD_ON_MODE_SWITCH |
 CMAP_PALETTED_TRUECOLOR))
return FALSE;
diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c
index 3703c412..77c0dc00 100644
--- a/src/uxa/intel_driver.c
+++ b/src/uxa/intel_driver.c
@@ -991,7 +991,11 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL)
if (!miCreateDefColormap(screen))
return FALSE;
 
-   if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL,
+   /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
+   if ((scrn->rgbBits <= 8 ||
+   XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
+   !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+I830LoadPalette, NULL,
 CMAP_RELOAD_ON_MODE_SWITCH |
 CMAP_PALETTED_TRUECOLOR)) {
return FALSE;
-- 
2.17.1


[Intel-gfx] Depth 30 colormap handling fixes for servers 1.20+ and < 1.20

2018-06-12 Thread Mario Kleiner
Hi,

finally here's an updated patch that for depth 30 now works on both
Server 1.20 with the full colormap + gamma table handling, and for
servers < 1.20 with the RandR gamma tables working fine and the colormap
processing skipped.
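
As a rough illustration of the rule described above (colormap setup runs at <= 8 bpc on any server, and at > 8 bpc only on server 1.20+), here is a minimal standalone sketch. `server_version()`, `want_colormap_setup()` and `colormap_slots()` are made-up helpers for this example and the version packing is an assumption; the actual patch compares XORG_VERSION_CURRENT against XORG_VERSION_NUMERIC(1,20,0,0,0) inline.

```c
#include <stdbool.h>

/* Hypothetical helper: pack major.minor.patch into one comparable number.
 * The packing is an assumption made for this sketch only. */
static unsigned int server_version(unsigned int major, unsigned int minor,
                                   unsigned int patch)
{
	return major * 10000000u + minor * 100000u + patch * 1000u;
}

/* True if xf86HandleColormaps() setup should be attempted: always at
 * <= 8 bits per channel, otherwise only on server 1.20 or later. */
static bool want_colormap_setup(int rgb_bits, unsigned int version)
{
	return rgb_bits <= 8 || version >= server_version(1, 20, 0);
}

/* Number of colormap slots that goes with rgb_bits (1 << rgbBits). */
static int colormap_slots(int rgb_bits)
{
	return 1 << rgb_bits;
}
```

With these helpers, depth 30 (10 bpc) on 1.19.6 skips the colormap setup, while the same depth on 1.20.0 sets it up with 1024 slots.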

This one successfully tested on sna and uxa with both server 1.20.0 and
server 1.19.6.

I assume this one will be replaced by Ville's ddx+kms work anyway soonish,
but until that is done, this one keeps things at least testable without
crashes and other problems. I use it with my own intel-kms 10 bit lut poc
hacks for measurements and to test that Mesa's depth 30 stuff doesn't
break.

Thanks,
-mario


Re: [Intel-gfx] [PATCH] sna/uxa: Fix colormap handling at screen depth 30.

2018-03-15 Thread Mario Kleiner
Oops, didn't reply yet, sorry!

On Thu, Mar 15, 2018 at 5:14 PM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> Quoting Ville Syrjälä (2018-03-15 16:02:42)
>> On Thu, Mar 15, 2018 at 03:28:18PM +0000, Chris Wilson wrote:
>> > Quoting Ville Syrjälä (2018-03-01 11:12:53)
>> > > On Thu, Mar 01, 2018 at 02:20:48AM +0100, Mario Kleiner wrote:
>> > > > The various clut handling functions like a setup
>> > > > consistent with the x-screen color depth. Otherwise
>> > > > we observe improper sampling in the gamma tables
>> > > > at depth 30.
>> > > >
>> > > > Therefore replace hard-coded bitsPerRGB = 8 by actual
>> > > > bits per channel scrn->rgbBits. Also use this for call
>> > > > to xf86HandleColormaps().
>> > > >
>> > > > Tested for uxa and sna at depths 8, 16, 24 and 30 on
>> > > > IvyBridge, and tested at depth 24 and 30 that xgamma
>> > > > and gamma table animations work, and with measurement
>> > > > equipment to make sure identity gamma ramps actually
>> > > > are identity mappings at the output.
>> > >
>> > > You mean identity mapping at 8bpc? We don't support higher precision
>> > > gamma on pre-bdw atm, and the ddx doesn't use the higher precision
>> > > stuff even on bdw+. I'm working on fixing both, but it turned out to
>> > > be a bit more work than I anticipated so will take a while.
>> > >
>> > > >
>> > > > Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
>> > > > ---
>> > > >  src/sna/sna_driver.c   | 5 +++--
>> > > >  src/uxa/intel_driver.c | 3 ++-
>> > > >  2 files changed, 5 insertions(+), 3 deletions(-)
>> > > >
>> > > > diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
>> > > > index 2643e6c..9c4bcd4 100644
>> > > > --- a/src/sna/sna_driver.c
>> > > > +++ b/src/sna/sna_driver.c
>> > > > @@ -1145,7 +1145,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
>> > > >   if (!miInitVisuals(, , , , 
>> > > > ,
>> > > >  ,
>> > > >  ((unsigned long)1 << (scrn->bitsPerPixel - 
>> > > > 1)),
>> > > > -8, -1))
>> > > > +scrn->rgbBits, -1))
>> > > >   return FALSE;
>> > > >
>> > > >   if (!miScreenInit(screen, NULL,
>> > > > @@ -1217,7 +1217,8 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
>> > > >   return FALSE;
>> > > >
>> > > >   if (sna->mode.num_real_crtc &&
>> > > > - !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
>> > > > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, 
>> > > > scrn->rgbBits,
>> > > > +  sna_load_palette, NULL,
>> > > >CMAP_RELOAD_ON_MODE_SWITCH |
>> > > >CMAP_PALETTED_TRUECOLOR))
>> > >
>> > > I already forgot what this does prior to your randr fix. IIRC bumping
>> > > the 8 alone would cause the thing to segfault, but I guess bumping both
>> > > was fine?
>> > >

We always need this fix for X-Screen depth 30, even on older servers.
With the current maxColors=256, bitsPerRGB=8 setting and color depth
30, the server does some out-of-bounds reads in its gamma handling
code for maxColors=256, and we get a crash at server startup. It's a
bit of a matter of luck to reproduce. I had it running for months
without problems, then after some Ubuntu system upgrade i got crashes
at server startup under sna and at server shutdown under uxa.

Without raising bitsPerRGB to 10, we get a bottleneck in the
way the server mushes together the old XF86VidMode gamma ramps
(per-x-screen) and the new RandR per-crtc gamma ramps, so there are
artifacts in the gamma table finally uploaded to the hw.

For DefaultDepth=24 this patch doesn't change anything.

On X-Server < 1.20 however, without my fix, at color depth 30 this
will get us stuck on an identity gamma ramp, as the update code in the
server effectively no-ops. Or so i think, because i tested so many
permutations of so many things on intel,amd,nouveau with different
mesa,server,ddx branches lately that i may misremember something. Ilia
reported some odd behavior on 1.19 with the correspond

[Intel-gfx] [PATCH] sna/uxa: Fix colormap handling at screen depth 30.

2018-02-28 Thread Mario Kleiner
The various clut handling functions like a setup
consistent with the x-screen color depth. Otherwise
we observe improper sampling in the gamma tables
at depth 30.

Therefore replace hard-coded bitsPerRGB = 8 by actual
bits per channel scrn->rgbBits. Also use this for call
to xf86HandleColormaps().

Tested for uxa and sna at depths 8, 16, 24 and 30 on
IvyBridge, and tested at depth 24 and 30 that xgamma
and gamma table animations work, and with measurement
equipment to make sure identity gamma ramps actually
are identity mappings at the output.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 src/sna/sna_driver.c   | 5 +++--
 src/uxa/intel_driver.c | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
index 2643e6c..9c4bcd4 100644
--- a/src/sna/sna_driver.c
+++ b/src/sna/sna_driver.c
@@ -1145,7 +1145,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
if (!miInitVisuals(, , , , ,
   ,
   ((unsigned long)1 << (scrn->bitsPerPixel - 1)),
-  8, -1))
+  scrn->rgbBits, -1))
return FALSE;
 
if (!miScreenInit(screen, NULL,
@@ -1217,7 +1217,8 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
return FALSE;
 
if (sna->mode.num_real_crtc &&
-   !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
+   !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+sna_load_palette, NULL,
 CMAP_RELOAD_ON_MODE_SWITCH |
 CMAP_PALETTED_TRUECOLOR))
return FALSE;
diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c
index 3703c41..88c749e 100644
--- a/src/uxa/intel_driver.c
+++ b/src/uxa/intel_driver.c
@@ -991,7 +991,8 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL)
if (!miCreateDefColormap(screen))
return FALSE;
 
-   if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL,
+   if (!xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+I830LoadPalette, NULL,
 CMAP_RELOAD_ON_MODE_SWITCH |
 CMAP_PALETTED_TRUECOLOR)) {
return FALSE;
-- 
2.7.4


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Add module parameter to en-/disable hw color correction.

2017-09-29 Thread Mario Kleiner

On 09/26/2017 07:05 AM, Daniel Vetter wrote:

On Fri, Sep 15, 2017 at 05:48:25PM +0200, Mario Kleiner wrote:

The new module parameter enable_hw_color_correction defaults to
true, to retain the current behaviour. If set to false, it will
disable all hardware color correction, like gamma/degamma and
csc.

This is useful for debugging gamma table / csc precision problems,
and to ensure unmodified pixel passthrough from framebuffer to
outputs, e.g., for scientific applications which critically depend
on perfect pixel passthrough. While i hope this switch generally
won't be needed, it provides extra peace-of-mind - an "airbag" for
color correction trouble.

Tested on Ironlake, IvyBridge, Haswell, Skylake.

One unexpected result during testing was that while this works on
all tested gpu's with a 8 bpc XR24 framebuffer as primary plane,
if a 10 bpc XR30 fb is active, then hw gamma tables seem to get
automatically bypassed on at least the tested IvyBridge and later
(but not on the tested Ironlake), regardless of hw programming,
at least for the legacy 256->8 bit luts and the 1024->10 bit
precision luts. However, the type of selected - but bypassed -
hw gamma table still determines output precision, ie. even an
auto-bypassed legacy 256 slot 8 bit lut in XR30 fb mode still
restricts the effective output precision to 8 bit, while an
auto-bypassed precision lut doesn't restrict precision.


Instead of a modparam I think the right thing to fix here is the driver
setup. Enabling the legacy gamma table is indeed documented to restrict
the pipe to 8bpc (the 2 additional bits for 10bpc are just padded).

Having driver options for "pls give me non-broken behaviour" doesn't make
any sense to me.
-Daniel



Hi Daniel,

this isn't meant as a permanent solution, but as a debugging aid, and as 
the equivalent of an air-bag in a car. You hope you won't need it, but 
it is good to have. In the past it would have been very handy for me to 
have a master-switch for this, debugging problems on users' machines 
related to "pixels don't appear on the outputs as specified in the 
OpenGL rendering code". When looking over the docs for color correction 
i just realized the hardware has an easy way to disable this part of the 
pipeline, so i thought this could make debugging so much easier - at 
least for me. I had the impression that many current i915 module 
parameters are of this nature.


The debug switch also provides a temporary workaround on production 
systems if a problem is related to color correction, not meant as a 
permanent solution. Many of my users are challenged already by the fact 
that Linux is not macOS, and editing a config file or installing a 
prebuilt kernel from a .deb is already borderline rocket science for 
them, that's why those module parameters would be nice to have.


My actual plan is to implement true 10 bit -> 12 bit gamma table 
support, hopefully still for the 4.15 kernel.


I have experimental patches for using the precision luts with 1024 slots 
and 10 bit output width, ie. 10 bit in -> 10 bit out on Ironlake and 
later. I'll send those out in their hacky state just for reference.


However the better plan i have in mind is to extend the code so that if 
(we are in single-lut mode (DEGAMMA_LUT == 0)) AND (the userspace 
provided input lut is monotonically increasing) we switch from the dual 
512 slot 10 bit luts to the 512 slot 12 bit lut. This would also be 
applicable to the 256 slot legacy gamma tables, which are always 
single-lut and can be upsampled from 256 slots to 512 slots.
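
The upsampling step mentioned above could look roughly like this. It is only a sketch, assuming plain linear interpolation between neighboring input slots and 16 bit entries; `upsample_256_to_512()` is a hypothetical helper, not existing i915 code.

```c
#include <stdint.h>

/* Hypothetical helper: stretch a 256 slot table to 512 slots.
 * Output slot i samples the input at position i * 255 / 511, so the
 * first and last slots of both tables line up exactly. */
static void upsample_256_to_512(const uint16_t in[256], uint16_t out[512])
{
	for (unsigned int i = 0; i < 512; i++) {
		unsigned int num = i * 255u;      /* position, scaled by 511 */
		unsigned int lo = num / 511u;     /* lower input slot */
		unsigned int rem = num % 511u;    /* fraction, in 1/511 steps */
		unsigned int hi = lo < 255u ? lo + 1u : 255u;
		int a = in[lo];
		int b = in[hi];

		out[i] = (uint16_t)(a + (b - a) * (int)rem / 511);
	}
}
```

A monotonically increasing input table stays monotonic after this step, which matters for the interpolated lut mode.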


The reason is that the dual-512 slot luts are not good enough to handle 
a 10 bit framebuffer. As far as i read the PRMs, a 10 bit fb value would 
simply get truncated to 9 bits to select one of the 512 slots, so we 
would lose 1 bit of precision, which makes 10 bit framebuffers mostly 
pointless, at least for scientific/medical/HDR applications.
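
To make the lost bit concrete, here is a tiny sketch of the slot selection as i read it from the PRMs. The exact hw behaviour is an assumption, and `slot_512_from_10bit()` is a made-up name for this example.

```c
#include <stdint.h>

/* Assumed behaviour: a 10 bit fb value selects one of 512 slots by
 * dropping its least significant bit. */
static unsigned int slot_512_from_10bit(uint16_t value10)
{
	return (value10 & 0x3ffu) >> 1;
}
```

Under that assumption the fb values 512 and 513 land in the same slot and therefore produce the same output, so only 9 bits of the framebuffer precision survive.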


The 512 slot 12 bit lut is perfect for such applications, as the PRMs 
say the hw will linearly interpolate between the nearest neighbor slots 
of the 512 slot lut for the given fb input value -> works with 10 bit 
fb's. Would also work with the 16 bit half-float fb format that is 
supported by current hw but currently unused - but could be handy for 
future HDR applications. Also 12 bit output precision is nice for better 
gamma correction on true 10-12 bit displays over DP/HDMI deep color.
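
A software model of that interpolation, for a 10 bit input, might look like the sketch below. It assumes a table of exactly 512 entries with the last slot clamped as its own upper neighbor; the real hw layout may differ, and `lut512_interp()` is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical model: the upper 9 bits of a 10 bit value pick the lower
 * slot, the lowest bit is the interpolation fraction (0 or 1/2). */
static uint16_t lut512_interp(const uint16_t lut[512], uint16_t value10)
{
	unsigned int v = value10 & 0x3ffu;
	unsigned int lo = v >> 1;
	unsigned int hi = lo < 511u ? lo + 1u : 511u;  /* clamp at the end */
	int frac = (int)(v & 1u);
	int a = lut[lo];
	int b = lut[hi];

	return (uint16_t)(a + (b - a) * frac / 2);
}
```

Odd fb values then land halfway between two neighboring slots instead of collapsing onto the lower one, which is what keeps all 10 bits useful.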


I will try to work on this within the next 1-2 weeks.

Now here's a catch i found while testing with the 1024 slot 10 bit luts, 
which i found very surprising:


- If i have a standard XR24 framebuffer on the primary plane, the 1024 
slot/10 bit lut's work exactly as expected, as verified on a XR24 fb via 
photometer measurements and tweaking the values in the gamma tables -- 
and the "force dithering on" module parameter patch, as i don't have a 
true 10 bit panel around atm.


- As soon as a XR30 fb is active (X11 Def

[Intel-gfx] [PATCH 1/2] drm/i915: Add module parameter to force en-/disable dithering.

2017-09-15 Thread Mario Kleiner
i915.enable_dithering allows forcing dithering on (=1) or off (=0)
on all outputs. The default is -1 for the current automatic
per-pipe selection.

This is useful for debugging and for special case scenarios,
e.g., providing simulated 10 bpc output on 8 bpc digital sinks
if a 10 bpc framebuffer + rendering is in use.

A more flexible solution would be connector properties, like
other drivers (radeon, amdgpu, nouveau) already provide. A
global override via module parameter is useful even with such
connector properties, e.g., for scientific applications which
require strict control over dithering, to have an override
for DE's which may not expose such properties via some standard
protocol in a user-controllable way, e.g., afaik all currently
existing Wayland compositors.

Tested on Ironlake, IvyBridge, Haswell, Skylake.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/i915/i915_params.c   | 5 +
 drivers/gpu/drm/i915/i915_params.h   | 1 +
 drivers/gpu/drm/i915/intel_display.c | 5 +
 3 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 8ab003dca113..07ec3a96457c 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -65,6 +65,7 @@ struct i915_params i915 __read_mostly = {
.inject_load_failure = 0,
.enable_dpcd_backlight = false,
.enable_gvt = false,
+   .enable_dithering = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -257,3 +258,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
 module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
 MODULE_PARM_DESC(enable_gvt,
"Enable support for Intel GVT-g graphics virtualization host 
support(default:false)");
+
+module_param_named(enable_dithering, i915.enable_dithering, int, 0644);
+MODULE_PARM_DESC(enable_dithering,
+   "Enable dithering (-1=auto [default], 0=force off on all outputs, 
1=force on on all outputs)");
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index ac844709c97e..7e365cd4fc91 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -54,6 +54,7 @@
func(int, edp_vswing); \
func(int, reset); \
func(unsigned int, inject_load_failure); \
+   func(int, enable_dithering); \
/* leave bools at the end to not create holes */ \
func(bool, alpha_support); \
func(bool, enable_cmd_parser); \
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 0e93ec201fe3..bea471a96820 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10978,6 +10978,11 @@ intel_modeset_pipe_config(struct drm_crtc *crtc,
 */
pipe_config->dither = (pipe_config->pipe_bpp == 6*3) &&
!pipe_config->dither_force_disable;
+
+   /* Override of auto-selected dither mode via module parameter? */
+   if (i915.enable_dithering != -1)
+   pipe_config->dither = i915.enable_dithering > 0 ? true : false;
+
DRM_DEBUG_KMS("hw max bpp: %i, pipe bpp: %i, dithering: %i\n",
  base_bpp, pipe_config->pipe_bpp, pipe_config->dither);
 
-- 
2.13.0.rc1.294.g07d810a77f
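
For reference, the override in the intel_modeset_pipe_config() hunk above reduces to a tri-state decision. A minimal standalone sketch; `resolve_dither()` and its arguments are hypothetical names for illustration, not i915 code:

```c
#include <stdbool.h>

/* auto_dither: what the driver would pick on its own for this pipe.
 * enable_dithering: module parameter value, -1 = auto, 0 = force off,
 * anything > 0 = force on. */
static bool resolve_dither(bool auto_dither, int enable_dithering)
{
	if (enable_dithering == -1)
		return auto_dither;

	return enable_dithering > 0;
}
```

Note that, as in the patch, any value other than -1 overrides the automatic choice, so a negative value like -2 would also force dithering off.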


[Intel-gfx] [PATCH 2/2] drm/i915: Add module parameter to en-/disable hw color correction.

2017-09-15 Thread Mario Kleiner
The new module parameter enable_hw_color_correction defaults to
true, to retain the current behaviour. If set to false, it will
disable all hardware color correction, like gamma/degamma and
csc.

This is useful for debugging gamma table / csc precision problems,
and to ensure unmodified pixel passthrough from framebuffer to
outputs, e.g., for scientific applications which critically depend
on perfect pixel passthrough. While i hope this switch generally
won't be needed, it provides extra peace-of-mind - an "airbag" for
color correction trouble.

Tested on Ironlake, IvyBridge, Haswell, Skylake.

One unexpected result during testing was that while this works on
all tested gpu's with a 8 bpc XR24 framebuffer as primary plane,
if a 10 bpc XR30 fb is active, then hw gamma tables seem to get
automatically bypassed on at least the tested IvyBridge and later
(but not on the tested Ironlake), regardless of hw programming,
at least for the legacy 256->8 bit luts and the 1024->10 bit
precision luts. However, the type of selected - but bypassed -
hw gamma table still determines output precision, ie. even an
auto-bypassed legacy 256 slot 8 bit lut in XR30 fb mode still
restricts the effective output precision to 8 bit, while an
auto-bypassed precision lut doesn't restrict precision.

Iow. this patch is needed even with XR30 fb's for actual 10
bit precision output, even though the hw seems to sort of ignore
the tested gamma tables for XR30 fb's.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/i915/i915_params.c   |  5 +
 drivers/gpu/drm/i915/i915_params.h   |  3 ++-
 drivers/gpu/drm/i915/intel_display.c | 26 +-
 drivers/gpu/drm/i915/intel_sprite.c  | 21 -
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 07ec3a96457c..8f6a176a97e1 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -66,6 +66,7 @@ struct i915_params i915 __read_mostly = {
.enable_dpcd_backlight = false,
.enable_gvt = false,
.enable_dithering = -1,
+   .enable_hw_color_correction = true,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -262,3 +263,7 @@ MODULE_PARM_DESC(enable_gvt,
 module_param_named(enable_dithering, i915.enable_dithering, int, 0644);
 MODULE_PARM_DESC(enable_dithering,
"Enable dithering (-1=auto [default], 0=force off on all outputs, 
1=force on on all outputs)");
+
+module_param_named(enable_hw_color_correction, 
i915.enable_hw_color_correction, bool, 0644);
+MODULE_PARM_DESC(enable_hw_color_correction,
+   "Enable hardware color correction like gamma luts and csc (default: 
true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index 7e365cd4fc91..f5c9163d2675 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -69,7 +69,8 @@
func(bool, nuclear_pageflip); \
func(bool, enable_dp_mst); \
func(bool, enable_dpcd_backlight); \
-   func(bool, enable_gvt)
+   func(bool, enable_gvt); \
+   func(bool, enable_hw_color_correction)
 
 #define MEMBER(T, member) T member
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bea471a96820..1e1b157353a9 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3184,13 +3184,17 @@ static u32 i9xx_plane_ctl(const struct intel_crtc_state 
*crtc_state,
unsigned int rotation = plane_state->base.rotation;
u32 dspcntr;
 
-   dspcntr = DISPLAY_PLANE_ENABLE | DISPPLANE_GAMMA_ENABLE;
+   dspcntr = DISPLAY_PLANE_ENABLE;
+
+   if (i915.enable_hw_color_correction)
+   dspcntr |= DISPPLANE_GAMMA_ENABLE;
 
if (IS_G4X(dev_priv) || IS_GEN5(dev_priv) ||
IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))
dspcntr |= DISPPLANE_TRICKLE_FEED_DISABLE;
 
-   if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
+   if ((IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) &&
+   i915.enable_hw_color_correction)
dspcntr |= DISPPLANE_PIPE_CSC_ENABLE;
 
if (INTEL_GEN(dev_priv) < 4)
@@ -3514,7 +3518,8 @@ u32 skl_plane_ctl(const struct intel_crtc_state 
*crtc_state,
 
plane_ctl = PLANE_CTL_ENABLE;
 
-   if (!IS_GEMINILAKE(dev_priv) && !IS_CANNONLAKE(dev_priv)) {
+   if (!IS_GEMINILAKE(dev_priv) && !IS_CANNONLAKE(dev_priv) &&
+   i915.enable_hw_color_correction) {
plane_ctl |=
PLANE_CTL_PIPE_GAMMA_ENABLE |
PLANE_CTL_PIPE_CSC_ENABLE |
@@ -3571,7 +3576,8 @@ static void skylake_update_primary_plane(struct 
intel_plane *plane,
 
spin_lock_irqsave(_priv->unco

[Intel-gfx] Module parameters to override color management/dithering.

2017-09-15 Thread Mario Kleiner
Hi,

so these two patches add i915 module parameters to globally override
how the driver handles dithering and gamma/csc conversion.

They serve two purposes: First as debug aid and "airbag" for working
around potential precision problems in getting pixels from rendering
to the display outputs. This mostly for applications that critically
depend on getting pixels untampered from the fb to the outputs, e.g.,
scientific neuro-science/vision research/medical applications. Having
the ability to bypass parts of the pipeline can help a lot in debugging
such problems on remote user machines, and to allow such users to
work around the problems until proper fixes are made. I expect this
to become especially useful when dealing with all the Wayland compositor
implementations, which so far don't have a standardized application/user
controllable equivalent to RandR protocol / xrandr tools.

The second, short-term purpose is to enable true 10 bit output from
rendering, so people with urgent 10 bit precision needs can benefit
from the Mesa patches i started working on for i965 (rev 1 on the
mailing-list, rev 2 to come soon).

I realize the merge window for Linux 4.14 is almost over, but wanted
to ask if it would be possible to slip these patches into 4.14 if
they aren't considered too intrusive?

These are tested on Ironlake, Ivybridge, Haswell and Skylake, also
with a photometer to see what actually comes out of the display for
different settings.

The bigger plan is to enhance the gamma table support, so we could
also use > 8 bit precision gamma tables on Ironlake and later, both
for the legacy gamma ioctl and the new color mgmt. method. I do have
proof of concept patches for using the 1024->10 precision luts on
Ironlake and later. Tweaking the gamma tables i upload via RandR and
measuring with photometer showed my poc patches work. However, as
described in the 2nd patch, at least the tested legacy luts and
1024->10 precision luts seem to get mostly ignored/bypassed in the hw
when a XR30 fb is attached to the primary plane. Not sure if some setup
is missing, or if this is some hardware quirk? Couldn't find anything
in the PRM's so far.

What i'd actually like to implement for Ironlake+ instead of the
1024->10 bit luts is this:

If in dual-gamma lut mode, or if the input gamma table is not
monotonically increasing, do what is done now (legacy luts for legacy
gamma ioctl, split 512->10 bit luts for new path).

If only a single gamma lut is requested (DEGAMMA_LUT == 0), and the
provided input lut is monotonically increasing, switch to the linearly
interpolated 512->12 bit lut instead, which exists on Ironlake+. Also
for the legacy gamma ioctl, so existing apps can benefit from the
higher precision. This would enable > 8 bit framebuffers to be output
properly and with high quality gamma correction.
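
The selection rule from the two paragraphs above, written out as a sketch. The enum, the function names and the reading of "monotonically increasing" as non-decreasing are all assumptions made for this example, nothing here exists in i915 yet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode names for this sketch. */
enum lut_mode {
	LUT_MODE_CURRENT,        /* legacy / split luts, as done now */
	LUT_MODE_INTERP_512_12   /* linearly interpolated 512->12 bit lut */
};

/* Assumption: "monotonically increasing" allows equal neighbors. */
static bool lut_is_monotonic(const uint16_t *lut, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (lut[i] < lut[i - 1])
			return false;
	return true;
}

/* degamma_size == 0 stands for "only a single gamma lut requested". */
static enum lut_mode pick_lut_mode(const uint16_t *gamma, size_t n,
                                   size_t degamma_size)
{
	if (degamma_size == 0 && lut_is_monotonic(gamma, n))
		return LUT_MODE_INTERP_512_12;

	return LUT_MODE_CURRENT;
}
```

Anything that is not a single, monotonic lut falls back to the current behaviour, so existing userspace with unusual tables would see no change.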

Thanks,
-mario


Re: [Intel-gfx] [PATCH] drm: Make the decision to keep vblank irq enabled earlier

2017-03-23 Thread Mario Kleiner

On 03/23/2017 02:26 PM, Ville Syrjälä wrote:

On Thu, Mar 23, 2017 at 07:51:06AM +0000, Chris Wilson wrote:

We want to provide the vblank irq shadow for pageflip events as well as
vblank queries. Such events are completed within the vblank interrupt
handler, and so the current check for disabling the irq will disable it
from within the same interrupt as the last pageflip event. If we move the
decision on whether to disable the irq (based on there being no
remaining vblank events, i.e. vblank->refcount == 0) to before we signal
the events, we will only disable the irq on the interrupt after the last
event was signaled. In the normal course of events, this will keep the
vblank irq enabled for the entire flip sequence whereas before it would
flip-flop around every interrupt.
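
The effect of the reordering can be shown with a toy model. This is a sketch, not kernel code: it assumes each pending event holds one vblank reference that is dropped when the event is signaled, and `struct vblank_sim` plus the three functions are hypothetical names.

```c
#include <stdbool.h>

/* Toy state: references held by pending events, and the irq state. */
struct vblank_sim {
	int refcount;
	int pending_events;
	bool irq_enabled;
};

/* Signaling the pending events drops the references they held. */
static void signal_events(struct vblank_sim *v)
{
	v->refcount -= v->pending_events;
	v->pending_events = 0;
}

/* Old order: signal first, then decide. The last event's interrupt
 * already sees refcount == 0 and disables the irq. */
static void handle_vblank_old(struct vblank_sim *v)
{
	signal_events(v);
	if (v->refcount == 0)
		v->irq_enabled = false;
}

/* New order: decide first, then signal. The irq is only disabled on
 * the interrupt after the last event was signaled. */
static void handle_vblank_new(struct vblank_sim *v)
{
	bool disable_irq = (v->refcount == 0);

	signal_events(v);
	if (disable_irq)
		v->irq_enabled = false;
}
```

Starting with one pending event, the old order turns the irq off in the same interrupt, while the new order keeps it enabled for one more vblank and disables it only if nothing new was queued in between.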

Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Cc: Daniel Vetter <dan...@ffwll.ch>
Cc: Michel Dänzer <mic...@daenzer.net>
Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com>
Cc: Dave Airlie <airl...@redhat.com>,
Cc: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/drm_irq.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 5b77057e91ca..1d6bcee3708f 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1741,6 +1741,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned 
int pipe)
 {
struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
unsigned long irqflags;
+   bool disable_irq;

if (WARN_ON_ONCE(!dev->num_crtcs))
return false;
@@ -1768,16 +1769,19 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned 
int pipe)
spin_unlock(&dev->vblank_time_lock);

wake_up(&vblank->queue);
-   drm_handle_vblank_events(dev, pipe);

/* With instant-off, we defer disabling the interrupt until after
-* we finish processing the following vblank. The disable has to
-* be last (after drm_handle_vblank_events) so that the timestamp
-* is always accurate.
+* we finish processing the following vblank after all events have
+* been signaled. The disable has to be last (after
+* drm_handle_vblank_events) so that the timestamp is always accurate.


We wouldn't actually do the disable as long as there's a reference still
held, so the timestamp should be fine in that case. And if there aren't
any references the timestamp shouldn't matter... I think. But it's
probably more clear to keep to the order you propose here anyway.

Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>



Looks good to me. As a further optimization, I think we could move the
vblank_disable_fn() call outside/below the spin_unlock_irqrestore for
event_lock, as vblank_disable_fn() doesn't need any locks held at call
time. That would slightly reduce the event_lock hold time. Don't know if
it is worth it.


In any case

Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com>

thanks,
-mario


Oh, and now that I think about this stuff again, I start to wonder why
I made the disable actually update the seq/ts. If the interrupt is
currently enabled the seq/ts should be reasonably up to date already
when we do disable the interrupt. Perhaps I was only thinking about
drm_vblank_off() when I made that change, or I decided that I didn't
want two different disable codepaths. Anyways, just an idea that
we might be able to make the vblank irq disable a little cheaper.


 	 */
-	if (dev->vblank_disable_immediate &&
-	    drm_vblank_offdelay > 0 &&
-	    !atomic_read(&vblank->refcount))
+	disable_irq = (dev->vblank_disable_immediate &&
+		       drm_vblank_offdelay > 0 &&
+		       !atomic_read(&vblank->refcount));
+
+	drm_handle_vblank_events(dev, pipe);
+
+	if (disable_irq)
 		vblank_disable_fn((unsigned long)vblank);

 	spin_unlock_irqrestore(&dev->event_lock, irqflags);
--
2.11.0



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/3] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)

2017-03-22 Thread Mario Kleiner

On 03/15/2017 10:00 PM, Ville Syrjälä wrote:

On Wed, Mar 15, 2017 at 08:40:25PM +, Chris Wilson wrote:

On vblank instant-off systems, we can get into a situation where the cost
of enabling and disabling the vblank IRQ around a drmWaitVblank query
dominates. And with the advent of even deeper hardware sleep states,
touching registers becomes ever more expensive.  However, we know that if
the user wants the current vblank counter, they are also very likely to
immediately queue a vblank wait and so we can keep the interrupt around
and only turn it off if we have no further vblank requests queued within
the interrupt interval.

After vblank event delivery, this patch adds a shadow of one vblank where
the interrupt is kept alive for the user to query and queue another vblank
event. Similarly, if the user is using blocking drmWaitVblanks, the
interrupt will be disabled on the IRQ following the wait completion.
However, if the user is simply querying the current vblank counter and
timestamp, the interrupt will be disabled after every IRQ and the user
will enable it again on the first query following the IRQ.

v2: Mario Kleiner -
After testing this, one more thing that would make sense is to move
the disable block at the end of drm_handle_vblank() instead of at the
top.

Turns out that if high precision timestamping is disabled or doesn't
work for some reason (as can be simulated by echo 0 >
/sys/module/drm/parameters/timestamp_precision_usec), then with your
delayed disable code at its current place, the vblank counter won't
increment anymore at all for instant queries, ie. with your other
"instant query" patches. Clients which repeatedly query the counter
and wait for it to progress will simply hang, spinning in an endless
query loop. There's that comment in vblank_disable_and_save:

"* Skip this step if there isn't any high precision timestamp
 * available. In that case we can't account for this and just
 * hope for the best.
 */

With the disable happening after leading edge of vblank (== hw counter
increment already happened) but before the vblank counter/timestamp
handling in drm_handle_vblank, that step is needed to keep the counter
progressing, so skipping it is bad.

Now without high precision timestamping support, a kms driver must not
set dev->vblank_disable_immediate = true, as this would cause problems
for clients, so this shouldn't matter, but it would be good to still
make this robust against a future kms driver which might have
unreliable high precision timestamping, e.g., high precision
timestamping that intermittently doesn't work.

v3: Patch before coffee needs extra coffee.

Testcase: igt/kms_vblank
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Cc: Daniel Vetter <dan...@ffwll.ch>
Cc: Michel Dänzer <mic...@daenzer.net>
Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com>
Cc: Dave Airlie <airl...@redhat.com>,
Cc: Mario Kleiner <mario.kleiner...@gmail.com>


Yep. This seems like a good idea to me. I just neglected to review it
last time around (and maybe even before that?) for some reason. Locks
seem to be taken in the right order, so it at least looks safe to me.

Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>



Hi,

as a followup to this one, maybe we should move the
drm_handle_vblank_events(dev, pipe); call down, immediately after Chris's
new delayed disable code?


The idea was to avoid lots of redundant enable->disable->enable... calls 
by having some 1 frame delay before disable. This works for pure vblank 
count/ts queries.


But both DRI2 and DRI3/Present use vblank events to trigger a 
pageflip-ioctl at the right target vblank. With the current ordering we 
may dispatch the vblank swap trigger event to the X-Server and drop the 
vblank refcount to zero due to the vblank_put inside 
drm_handle_vblank_events for the dispatched event, then detect in this 
patch that refcount == 0 and disable vblanks, but a few microseconds 
later the server will queue a pageflip ioctl which bumps the refcount 
and reenables vblank irqs, so we have a redundant disable->enable.
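
To make the ordering argument concrete, here is a toy userspace model, not kernel code. All names (toy_vblank, toy_irq, toy_run) are made up for illustration, and it assumes a client that re-queues one vblank event right after each event is delivered. It counts how often the simulated irq gets enabled when the refcount check runs after event dispatch versus before it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_vblank {
	int refcount;      /* references held by queued events */
	bool irq_enabled;  /* state of the simulated vblank irq */
	int enables;       /* number of off -> on transitions */
	int disables;      /* number of on -> off transitions */
};

/* Take a reference; the first reference turns the irq on if it is off. */
static void toy_get(struct toy_vblank *v)
{
	if (v->refcount++ == 0 && !v->irq_enabled) {
		v->irq_enabled = true;
		v->enables++;
	}
}

/* Drop the reference held by a delivered event. */
static void toy_put(struct toy_vblank *v)
{
	assert(v->refcount > 0);
	v->refcount--;
}

static void toy_disable(struct toy_vblank *v)
{
	if (v->irq_enabled) {
		v->irq_enabled = false;
		v->disables++;
	}
}

/* One simulated vblank interrupt. */
static void toy_irq(struct toy_vblank *v, bool decide_first)
{
	bool disable;

	if (!v->irq_enabled)
		return;

	if (decide_first) {
		/* proposed order: sample the refcount, then signal events */
		disable = (v->refcount == 0);
		if (v->refcount > 0)
			toy_put(v);
	} else {
		/* current order: signal events, then sample the refcount */
		if (v->refcount > 0)
			toy_put(v);
		disable = (v->refcount == 0);
	}

	if (disable)
		toy_disable(v);
}

/*
 * The client queues one event per frame for `frames` frames and re-queues
 * right after each delivery. Returns how often the irq had to be enabled.
 */
static int toy_run(int frames, bool decide_first)
{
	struct toy_vblank v = {0};

	assert(frames > 0);
	toy_get(&v);                    /* first queued event */
	for (int i = 0; i < frames; i++) {
		toy_irq(&v, decide_first);
		if (i + 1 < frames)
			toy_get(&v);    /* client re-queues after delivery */
	}
	toy_irq(&v, decide_first);      /* trailing irq, nothing queued */
	return v.enables;
}
```

In this model toy_run(3, false) enables the irq once per frame, while toy_run(3, true) enables it once for the whole sequence and only disables it on the interrupt after the last event.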


Also many kms drivers now use drm_crtc_arm_vblank_event() for pageflip 
completion handling at vblank, the pageflip completion events are also 
dispatched via drm_handle_vblank_events(). After a pageflip completes, 
it makes sense to have this "swap shadow" of 1 full frame, as animations 
would likely queue a new vblank query/event immediately for the next 
animation frame.


-mario


---
 drivers/gpu/drm/drm_irq.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 9bdca69f754c..e64b05ea95ea 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1198,9 +1198,9 @@ static void drm_vblank_put(struct drm_device *dev, 
unsigned int pipe)
  

Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6

2016-07-14 Thread Mario Kleiner
Ok, so legacy gamma table updates are completely broken for Intel on 
Linux-4.7-rc7, the final release candidate.


The good news is that applying Lionel's patch

"drm/i915: add missing condition for committing planes on crtc"

from

https://patchwork.freedesktop.org/patch/89111/

fixes it nicely. The patch currently applies cleanly to drm-fixes and 
drm-next and is


Reviewed-and-tested-by: Mario Kleiner <mario.kleiner...@gmail.com>


While we are at it, could somebody please look at the updated series of
my Displayport color depth fixes ("EDID/DP fixes for proper bpc
detection of displays.") I sent out a week ago?


Especially pulling patch 2/5, "[PATCH 2/5] drm/i915/dp: Revert
"drm/i915/dp: fall back to 18 bpp when sink capability is unknown"",
would be important. That bug introduced a regression for Intel + DP +
legacy DP converters into stable kernels, which is very serious for users
of scientific/medical display equipment: the failures can easily go
unnoticed during normal equipment tests, but would introduce the
equivalent of "silent data corruption" into their measured scientific
data. That is not a great experience, given that collecting such data can
easily cost half a year of work time and tens of thousands of euros of
research funding.


Patches 3 and 4 contain changes Daniel asked me to do, patch 5 would be
good to safeguard against similar issues in the future.


thanks,
-mario

On 07/12/2016 12:50 PM, Lionel Landwerlin wrote:

Hi Mario,

There were a couple of patches to fix this issue:

https://patchwork.freedesktop.org/series/5467/
https://patchwork.freedesktop.org/series/5466/

I tested this late last week on drm-intel-nightly, it seems a series of
reverts fixed most of the issues.

Cheers,

-
Lionel

On 12/07/16 11:33, Mario Kleiner wrote:

Updating legacy gamma tables, e.g., via RandR doesn't work at all
as of Linux 4.7-rc6.

Reason seems to be that the required call to
drm_atomic_helper_commit_planes_on_crtc is skipped in
intel_atomic_commit after userspace set new gamma tables,
because neither crtc->state->planes_changed nor
update_pipe (= pipe_config->update_pipe) are true.

Removing the check for planes_changed || update_pipe fixes
gamma table updates.

The code for Linux 4.8 drm-next has changed a lot in that area
wrt. 4.7, but the new code for 4.8 also removed those checks
and calls drm_atomic_helper_commit_planes_on_crtc unconditionally,
and legacy gamma lut updates work on drm-next, so this seems to be
the right solution.

Tested also shutdown/reboot, suspend/resume, (un-)plugging displays,
mode switches for resolution/refresh rate, display rotation, and
page-flipping/pageflip timing on Intel HD Ironlake to confirm the
fix apparently doesn't break anything under X11.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
---
  drivers/gpu/drm/i915/intel_display.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c
b/drivers/gpu/drm/i915/intel_display.c
index 04452cf..eb8fb36 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct
drm_device *dev,
  bool modeset = needs_modeset(crtc->state);
  struct intel_crtc_state *pipe_config =
  to_intel_crtc_state(crtc->state);
-bool update_pipe = !modeset && pipe_config->update_pipe;
  if (modeset && crtc->state->active) {
  update_scanline_offset(to_intel_crtc(crtc));
@@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct
drm_device *dev,
  drm_atomic_get_existing_plane_state(state, crtc->primary))
  intel_fbc_enable(intel_crtc);
-if (crtc->state->active &&
-(crtc->state->planes_changed || update_pipe))
+if (crtc->state->active)
  drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
  if (pipe_config->base.active && needs_vblank_wait(pipe_config))






Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6

2016-07-12 Thread Mario Kleiner

On 07/12/2016 05:02 PM, Lionel Landwerlin wrote:

On 12/07/16 13:11, Mario Kleiner wrote:

On 07/12/2016 12:50 PM, Lionel Landwerlin wrote:

Hi Mario,



Hi Lionel,


There were a couple of patches to fix this issue:

https://patchwork.freedesktop.org/series/5467/
https://patchwork.freedesktop.org/series/5466/



Looking at them they should fix the issue, but they seem to be stuck
in review?


I tested this late last week on drm-intel-nightly, it seems a series of
reverts fixed most of the issues.



You mean something else has fixed legacy gamma updates, as i can't
find above patches applied on drm-intel-nightly?


This revert on drm-intel-nightly seems to have fixed the problem :

https://cgit.freedesktop.org/drm-intel/commit/drivers/gpu/drm/i915?id=e42aeef1237b7c969a77b7f726c50f6cb832185f



Ok, with that intel-nightly looks like drm-next for 4.8 and that indeed 
has working lut updates in my testing. My own patch was motivated by the 
way the implementation is done in intel_atomic_commit_tail() from drm-next.






Are those fixes supposed to be already part of 4.7-rc7, the final rc
afaik?


I haven't seen it on 4.7-rc7.



I just checked Linus' tree for 4.7-rc7 and there the code in
intel_display.c hasn't received any updates for 13 days and looks like
the broken code from rc6, which according to my testing doesn't work.


So i'd assume legacy gamma table updates are broken in Linux 4.7 final 
rc atm. Couldn't test, because for some weird reason 4.7-rc7 doesn't 
even boot on my laptop :( - However i got that via a quick install from 
Ubuntu's mainline ppa so it could be some unrelated problem with their 
ppa builds.


I think either my patch would fix it, although it is untested wrt. nuclear
pageflip, or those two patches you referenced would, although they
apparently didn't move forward.


What now?
-mario



thanks,
-mario



Cheers,

-
Lionel

On 12/07/16 11:33, Mario Kleiner wrote:

Updating legacy gamma tables, e.g., via RandR doesn't work at all
as of Linux 4.7-rc6.

Reason seems to be that the required call to
drm_atomic_helper_commit_planes_on_crtc is skipped in
intel_atomic_commit after userspace set new gamma tables,
because neither crtc->state->planes_changed nor
update_pipe (= pipe_config->update_pipe) are true.

Removing the check for planes_changed || update_pipe fixes
gamma table updates.

The code for Linux 4.8 drm-next has changed a lot in that area
wrt. 4.7, but the new code for 4.8 also removed those checks
and calls drm_atomic_helper_commit_planes_on_crtc unconditionally,
and legacy gamma lut updates work on drm-next, so this seems to be
the right solution.

Tested also shutdown/reboot, suspend/resume, (un-)plugging displays,
mode switches for resolution/refresh rate, display rotation, and
page-flipping/pageflip timing on Intel HD Ironlake to confirm the
fix apparently doesn't break anything under X11.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
---
  drivers/gpu/drm/i915/intel_display.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c
b/drivers/gpu/drm/i915/intel_display.c
index 04452cf..eb8fb36 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct
drm_device *dev,
  bool modeset = needs_modeset(crtc->state);
  struct intel_crtc_state *pipe_config =
  to_intel_crtc_state(crtc->state);
-bool update_pipe = !modeset && pipe_config->update_pipe;
  if (modeset && crtc->state->active) {
  update_scanline_offset(to_intel_crtc(crtc));
@@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct
drm_device *dev,
  drm_atomic_get_existing_plane_state(state,
crtc->primary))
  intel_fbc_enable(intel_crtc);
-if (crtc->state->active &&
-(crtc->state->planes_changed || update_pipe))
+if (crtc->state->active)
drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
  if (pipe_config->base.active &&
needs_vblank_wait(pipe_config))










Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6

2016-07-12 Thread Mario Kleiner

On 07/12/2016 12:50 PM, Lionel Landwerlin wrote:

Hi Mario,



Hi Lionel,


There were a couple of patches to fix this issue:

https://patchwork.freedesktop.org/series/5467/
https://patchwork.freedesktop.org/series/5466/



Looking at them they should fix the issue, but they seem to be stuck in 
review?



I tested this late last week on drm-intel-nightly, it seems a series of
reverts fixed most of the issues.



You mean something else has fixed legacy gamma updates, as i can't find 
above patches applied on drm-intel-nightly?


Are those fixes supposed to be already part of 4.7-rc7, the final rc afaik?

thanks,
-mario



Cheers,

-
Lionel

On 12/07/16 11:33, Mario Kleiner wrote:

Updating legacy gamma tables, e.g., via RandR doesn't work at all
as of Linux 4.7-rc6.

Reason seems to be that the required call to
drm_atomic_helper_commit_planes_on_crtc is skipped in
intel_atomic_commit after userspace set new gamma tables,
because neither crtc->state->planes_changed nor
update_pipe (= pipe_config->update_pipe) are true.

Removing the check for planes_changed || update_pipe fixes
gamma table updates.

The code for Linux 4.8 drm-next has changed a lot in that area
wrt. 4.7, but the new code for 4.8 also removed those checks
and calls drm_atomic_helper_commit_planes_on_crtc unconditionally,
and legacy gamma lut updates work on drm-next, so this seems to be
the right solution.

Tested also shutdown/reboot, suspend/resume, (un-)plugging displays,
mode switches for resolution/refresh rate, display rotation, and
page-flipping/pageflip timing on Intel HD Ironlake to confirm the
fix apparently doesn't break anything under X11.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
---
  drivers/gpu/drm/i915/intel_display.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c
b/drivers/gpu/drm/i915/intel_display.c
index 04452cf..eb8fb36 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct
drm_device *dev,
  bool modeset = needs_modeset(crtc->state);
  struct intel_crtc_state *pipe_config =
  to_intel_crtc_state(crtc->state);
-bool update_pipe = !modeset && pipe_config->update_pipe;
  if (modeset && crtc->state->active) {
  update_scanline_offset(to_intel_crtc(crtc));
@@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct
drm_device *dev,
  drm_atomic_get_existing_plane_state(state, crtc->primary))
  intel_fbc_enable(intel_crtc);
-if (crtc->state->active &&
-(crtc->state->planes_changed || update_pipe))
+if (crtc->state->active)
  drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
  if (pipe_config->base.active && needs_vblank_wait(pipe_config))






[Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6

2016-07-12 Thread Mario Kleiner
Updating legacy gamma tables, e.g., via RandR doesn't work at all
as of Linux 4.7-rc6.

Reason seems to be that the required call to
drm_atomic_helper_commit_planes_on_crtc is skipped in
intel_atomic_commit after userspace set new gamma tables,
because neither crtc->state->planes_changed nor
update_pipe (= pipe_config->update_pipe) are true.

Removing the check for planes_changed || update_pipe fixes
gamma table updates.

The code for Linux 4.8 drm-next has changed a lot in that area
wrt. 4.7, but the new code for 4.8 also removed those checks
and calls drm_atomic_helper_commit_planes_on_crtc unconditionally,
and legacy gamma lut updates work on drm-next, so this seems to be
the right solution.

Tested also shutdown/reboot, suspend/resume, (un-)plugging displays,
mode switches for resolution/refresh rate, display rotation, and
page-flipping/pageflip timing on Intel HD Ironlake to confirm the
fix apparently doesn't break anything under X11.
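
As a rough illustration of why the old check skips a gamma-only update, here is a toy model of the two conditions. The struct and function names are hypothetical stand-ins, not the i915 code; the gamma_changed field exists only to mark the scenario and is deliberately not consulted by either check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few crtc state bits the check looks at. */
struct toy_crtc_state {
	bool active;
	bool planes_changed;
	bool update_pipe;
	bool gamma_changed; /* illustrative only: neither check reads it */
};

/* Condition as in 4.7-rc6: commit only if planes or the pipe changed. */
static bool commits_planes_old(const struct toy_crtc_state *s, bool modeset)
{
	bool update_pipe;

	assert(s);
	update_pipe = !modeset && s->update_pipe;
	return s->active && (s->planes_changed || update_pipe);
}

/* Condition after the patch: commit whenever the crtc is active. */
static bool commits_planes_new(const struct toy_crtc_state *s)
{
	assert(s);
	return s->active;
}
```

For a state with only active and gamma_changed set, commits_planes_old() returns false and commits_planes_new() returns true, which matches the observed behaviour that a gamma table update alone never reaches drm_atomic_helper_commit_planes_on_crtc.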

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 04452cf..eb8fb36 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct drm_device *dev,
bool modeset = needs_modeset(crtc->state);
struct intel_crtc_state *pipe_config =
to_intel_crtc_state(crtc->state);
-   bool update_pipe = !modeset && pipe_config->update_pipe;
 
if (modeset && crtc->state->active) {
update_scanline_offset(to_intel_crtc(crtc));
@@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct drm_device *dev,
drm_atomic_get_existing_plane_state(state, crtc->primary))
intel_fbc_enable(intel_crtc);
 
-   if (crtc->state->active &&
-   (crtc->state->planes_changed || update_pipe))
+   if (crtc->state->active)
drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
 
if (pipe_config->base.active && needs_vblank_wait(pipe_config))
-- 
2.7.0



Re: [Intel-gfx] Pageflipping bugs in drm-next on at least Ironlake and Ivybridge.

2016-07-06 Thread Mario Kleiner

On 07/06/2016 03:05 PM, Chris Wilson wrote:

On Wed, Jul 06, 2016 at 12:17:55PM +0200, Mario Kleiner wrote:

Since i pulled the current drm-next tree i see strong flicker and
visual corruption during pageflipping, both in my own app, but also
in KDE4 and KDE5 Plasma with desktop composition enabled. This
happens on both Intel HD Ironlake mobile (Apple MBP 2010) and HD-4000
Ivybridge mobile (Apple macMini 2012).

It looks like page flips are not waiting properly for rendering to
complete, showing partially rendered frames at flip time.

If i revert Daniel's commit that switches legacy pageflips from the
old code path to the atomic code, all problems disappear, so
apparently the atomic code for Intel is not quite ready at least on
those parts?


Exactly right, we've reverted the enabling patch for the time being.
Daniel Stone has spotted the likely problem, but we also want to review
the handling of state/old_state to see if the same problem has cropped
up elsewhere.
-Chris



Ah ok, now I see it in drm-intel-next-queued. I'm probably not adding
anything new here, but wrt. your crc based tests not catching it: while
it happens all the time for me under KDE, in my own fullscreen app it
only obviously happens for some tests, the more graphics-heavy ones, not
others, so it is probably (gfx-)load dependent? Maybe the tests don't put
enough work onto the gpu to keep it still rendering at flip time.


-mario


[Intel-gfx] Legacy gamma table updates broken in 4.7-rc4

2016-07-06 Thread Mario Kleiner
A strange one. In Linux 4.7-rc4, at least as built by the Ubuntu
mainline ppa, gamma table updates via RandR don't work. No errors are 
reported and the X-Server thinks everything went well, but on Intel 
Ironlake and Ivybridge the updates don't have any visual effect.


The same problem doesn't happen with current drm-next, so something was 
fixed. Looking at the new code in intel_color.c i can't see anything 
obvious that would break it on 4.7-rc but make it work on drm-next?


Are there some gamma fixes in drm-next that didn't make it into 4.7-rc yet?

Thanks,
-mario


[Intel-gfx] Pageflipping bugs in drm-next on at least Ironlake and Ivybridge.

2016-07-06 Thread Mario Kleiner
Since i pulled the current drm-next tree i see strong flicker and visual 
corruption during pageflipping, both in my own app, but also in KDE4 and 
KDE5 Plasma with desktop composition enabled. This happens on both Intel 
HD Ironlake mobile (Apple MBP 2010) and HD-4000 Ivybridge mobile (Apple
macMini 2012).


It looks like page flips are not waiting properly for rendering to 
complete, showing partially rendered frames at flip time.


If i revert Daniel's commit that switches legacy pageflips from the old 
code path to the atomic code, all problems disappear, so apparently the 
atomic code for Intel is not quite ready at least on those parts?


In case this helps: As i was also testing DRI3/Present + PRIME on the 
hybrid graphics MBP, if i use the Intel HD as display gpu and the 
NVidia/nouveau as render offload gpu i don't get any corruption/flicker 
even with the atomic pageflip code for legacy pageflips. Iow. the path 
using dmabuf fence wait in intel_prepare_plane_fb works fine.


thanks,
-mario


Re: [Intel-gfx] [PATCH] drm: use seqlocks for vblank time/count

2016-05-24 Thread Mario Kleiner

On 05/18/2016 05:10 PM, Matthew Auld wrote:

There's an updated version of this patch already on the ml [1], which
I Cc'd you in on. I take it that your @tuebingen.mpg.de is in fact an
old email address?

[1] https://patchwork.freedesktop.org/patch/86354/



Your patch looks good to me. I'd only keep the one dropped comment line
in drmP.h saying that the vblank counter and ts also need to be protected
by the vblank_time_lock in addition to the seqlock. This is still needed,
especially to get the _irqsave part of spin_lock_irqsave, as the write
seqlocks don't do the local irq disable. I'll give it a test later this
week.
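
For readers not familiar with the pattern, here is a minimal userspace sketch of a sequence-counter protected count/timestamp pair, written with C11 atomics rather than the kernel's seqlock API. The names (toy_vblank_stamp, toy_store, toy_read) are invented for illustration. It assumes the writer is already serialized by an outer lock taken with local interrupts disabled; presumably that matters because a reader running in an interrupt handler on the same CPU would otherwise spin forever on an odd sequence value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Toy stand-in for a vblank count/timestamp pair guarded by a seqcount. */
struct toy_vblank_stamp {
	atomic_uint seq;          /* even: stable, odd: write in progress */
	_Atomic uint32_t count;
	_Atomic int64_t time_ns;
};

static void toy_init(struct toy_vblank_stamp *s)
{
	atomic_init(&s->seq, 0);
	atomic_init(&s->count, 0);
	atomic_init(&s->time_ns, 0);
}

/*
 * Writer side. Assumption: the caller holds an outer spinlock with local
 * irqs disabled, so there is exactly one writer and no reader can
 * interrupt it on the same CPU.
 */
static void toy_store(struct toy_vblank_stamp *s, uint32_t inc, int64_t t_ns)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint32_t count = atomic_load_explicit(&s->count, memory_order_relaxed);

	assert((seq & 1) == 0);
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->time_ns, t_ns, memory_order_relaxed);
	atomic_store_explicit(&s->count, count + inc, memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: lockless, retries while a write is in progress or raced. */
static uint32_t toy_read(struct toy_vblank_stamp *s, int64_t *t_ns)
{
	unsigned int begin, end;
	uint32_t count;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		count = atomic_load_explicit(&s->count, memory_order_relaxed);
		*t_ns = atomic_load_explicit(&s->time_ns, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1) || begin != end);

	return count;
}
```

Because there is only a single count/timestamp pair here, the writer can bump the count by any increment in one step; a reader either sees the old pair or the new pair, never a mix.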


Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com>

Indeed the old inactive @tuebingen.mpg.de is only a forward to the gmail
address, probably with some botched mail filter rules, so mails sent
there can go unnoticed for quite a while.


thanks,
-mario


Re: [Intel-gfx] [PATCH] drm: use seqlocks for vblank time/count

2016-05-18 Thread Mario Kleiner

On 05/09/2016 08:11 PM, Daniel Vetter wrote:

On Mon, May 09, 2016 at 08:16:07PM +0300, Ville Syrjälä wrote:

On Mon, May 09, 2016 at 05:08:43PM +0100, Matthew Auld wrote:

This patch aims to replace the roll-your-own seqlock implementation with
full-blown seqlocks. We also remove the timestamp ring-buffer in favour
of single timestamp/count pair protected by a seqlock. In turn this
means we can now increment the vblank freely without the need for
clamping.


This will also change the behaviour to block new readers while the
writer has the lock, whereas the old code would allow readers to
proceed in parallel. We do the whole hw counter + scanout position
query while holding the lock so it's not exactly zero amount of work,
but I'm not sure that's a real problem.

I guess we could reduce the scope of the seqlock, but then maybe we'd
need to keep the vblank_time_lock spinlock as well. The details escape
me now, so I'd have to re-read the code again.

Ccing Mario too.


Yeah, my idea was to keep the spinlock, and only replace the stuff in
store_vblank and the few do {} while (cur_vblank != get_vblank_counter)
loops. Extending the seqlock stuff to everything seems indeed counter to
Mario's locking scheme.

So goal would be to really just replace the half-baked seqlock that we
have already, and leave all other locking unchanged.
-Daniel


+1 to that, for simplicity. I thought Ville already had a patch lying
around somewhere which essentially does this?


-mario







Cc: Daniel Vetter 
Cc: Ville Syrjälä 
Signed-off-by: Matthew Auld 
---
  drivers/gpu/drm/drm_irq.c | 111 +-
  include/drm/drmP.h|  14 ++
  2 files changed, 25 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f1..bfc6a8d 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -42,10 +42,6 @@
  #include 
  #include 

-/* Access macro for slots in vblank timestamp ringbuffer. */
-#define vblanktimestamp(dev, pipe, count) \
-   ((dev)->vblank[pipe].time[(count) % DRM_VBLANKTIME_RBSIZE])
-
  /* Retry timestamp calculation up to 3 times to satisfy
   * drm_timestamp_precision before giving up.
   */
@@ -82,29 +78,13 @@ static void store_vblank(struct drm_device *dev, unsigned 
int pipe,
 struct timeval *t_vblank, u32 last)
  {
struct drm_vblank_crtc *vblank = >vblank[pipe];
-   u32 tslot;

-   assert_spin_locked(>vblank_time_lock);
+   assert_spin_locked(>vblank_seqlock.lock);

vblank->last = last;

-   /* All writers hold the spinlock, but readers are serialized by
-* the latching of vblank->count below.
-*/
-   tslot = vblank->count + vblank_count_inc;
-   vblanktimestamp(dev, pipe, tslot) = *t_vblank;
-
-   /*
-* vblank timestamp updates are protected on the write side with
-* vblank_time_lock, but on the read side done locklessly using a
-* sequence-lock on the vblank counter. Ensure correct ordering using
-* memory barrriers. We need the barrier both before and also after the
-* counter update to synchronize with the next timestamp write.
-* The read-side barriers for this are in drm_vblank_count_and_time.
-*/
-   smp_wmb();
+   vblank->time = *t_vblank;
vblank->count += vblank_count_inc;
-   smp_wmb();
  }

  /**
@@ -127,7 +107,7 @@ static void drm_reset_vblank_timestamp(struct drm_device 
*dev, unsigned int pipe
struct timeval t_vblank;
int count = DRM_TIMESTAMP_MAXRETRIES;

-   spin_lock(>vblank_time_lock);
+   write_seqlock(>vblank_seqlock);

/*
 * sample the current counter to avoid random jumps
@@ -152,7 +132,7 @@ static void drm_reset_vblank_timestamp(struct drm_device 
*dev, unsigned int pipe
 */
store_vblank(dev, pipe, 1, _vblank, cur_vblank);

-   spin_unlock(>vblank_time_lock);
+   write_sequnlock(>vblank_seqlock);
  }

  /**
@@ -205,7 +185,7 @@ static void drm_update_vblank_count(struct drm_device *dev, 
unsigned int pipe,
const struct timeval *t_old;
u64 diff_ns;

-   t_old = (dev, pipe, vblank->count);
+   t_old = >time;
diff_ns = timeval_to_ns(_vblank) - timeval_to_ns(t_old);

/*
@@ -239,49 +219,6 @@ static void drm_update_vblank_count(struct drm_device 
*dev, unsigned int pipe,
diff = 1;
}

-   /*
-* FIMXE: Need to replace this hack with proper seqlocks.
-*
-* Restrict the bump of the software vblank counter to a safe maximum
-* value of +1 whenever there is the possibility that concurrent readers
-* of vblank timestamps could be active at the moment, as the current
-* implementation of the timestamp caching and updating is not safe
-   

Re: [Intel-gfx] [PATCH v2 01/21] drm/core: Add drm_accurate_vblank_count, v5.

2016-05-18 Thread Mario Kleiner
I'm fine with it. I assume the function will only be used by kms 
drivers, whose writers probably know when it is safe to call the 
function, ie. what kind of potential quirks the kms drivers timestamping 
implementation has.


Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com>

On 05/17/2016 03:07 PM, Maarten Lankhorst wrote:

This function is useful for gen2 intel devices which have no frame
counter, but need a way to determine the current vblank count without
racing with the vblank interrupt handler.

intel_pipe_update_start checks if no vblank interrupt will occur
during vblank evasion, but cannot check whether the vblank handler has
run to completion. This function uses the timestamps to determine
when the last vblank has happened, and interpolates from there.
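
The interpolation idea can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the DRM implementation: toy_accurate_count and its parameters are made-up names, timestamps are taken to be nanoseconds, and the timestamp of the most recent vblank is assumed to have been derived from the scanout position by the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy interpolation: given the count and timestamp stored at the last
 * handled vblank, plus the timestamp of the most recent vblank, estimate
 * the current count by rounding the elapsed time to whole frames.
 */
static uint32_t toy_accurate_count(uint32_t last_count, int64_t last_ns,
				   int64_t cur_vblank_ns, int64_t framedur_ns)
{
	uint64_t diff_ns, missed;

	assert(framedur_ns > 0);
	assert(cur_vblank_ns >= last_ns);

	diff_ns = (uint64_t)(cur_vblank_ns - last_ns);
	/* round to the closest whole number of frame durations */
	missed = (diff_ns + (uint64_t)framedur_ns / 2) / (uint64_t)framedur_ns;

	return last_count + (uint32_t)missed;
}
```

If the interrupt handler has already run for the current vblank, the two timestamps coincide and the count is returned unchanged; if it has not run yet, the elapsed time rounds to one frame and the returned count is one ahead of the stored one.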

Changes since v1:
- Take vblank_time_lock and don't use drm_vblank_count_and_time.
Changes since v2:
- Don't return time of last vblank.
Changes since v3:
- Change pipe to unsigned int. (Ville)
- Remove unused documentation for tv_ret. (kbuild)
Changes since v4:
- Add warning to docs when the function is useful.
- Add a WARN_ON when get_vblank_timestamp is unavailable.
- Use drm_vblank_count.

Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4
Acked-by: David Airlie <airl...@linux.ie> #irc, v4
---
  drivers/gpu/drm/drm_irq.c | 31 +++
  include/drm/drmP.h|  1 +
  2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..d3124b67f4a5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 	store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
  }

+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ *
+ * This is mostly useful for hardware that can obtain the scanout
+ * position, but doesn't have a frame counter.
+ */
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   unsigned int pipe = drm_crtc_index(crtc);
+   u32 vblank;
+   unsigned long flags;
+
+   WARN(!dev->driver->get_vblank_timestamp,
+        "This function requires support for accurate vblank timestamps.");
+
+   spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+   drm_update_vblank_count(dev, pipe, 0);
+   vblank = drm_vblank_count(dev, pipe);
+
+   spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+   return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
  /*
   * Disable vblank irq's on crtc, make sure that last vblank count
   * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 360b2a74e1ef..ed890384b938 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1002,6 +1002,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
  extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
  extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);

  extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v5.

2016-04-27 Thread Mario Kleiner
Anyway, although i would have liked the stricter check and warning docs, 
the v4 patch is ok with me:


Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com>

-mario

On 04/25/2016 08:32 AM, Maarten Lankhorst wrote:

This function is useful for gen2 intel devices which have no frame
counter, but need a way to determine the current vblank count without
racing with the vblank interrupt handler.

intel_pipe_update_start checks if no vblank interrupt will occur
during vblank evasion, but cannot check whether the vblank handler has
run to completion. This function uses the timestamps to determine
when the last vblank has happened, and interpolates from there.

Changes since v1:
- Take vblank_time_lock and don't use drm_vblank_count_and_time.
Changes since v2:
- Don't return time of last vblank.
Changes since v3:
- Change pipe to unsigned int. (Ville)
- Remove unused documentation for tv_ret. (kbuild)
Changes since v4:
- Add warning to docs when the function is useful.
- Add a WARN_ON when get_vblank_timestamp is unavailable.
- Use drm_vblank_count.

Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4
Acked-by: David Airlie <airl...@linux.ie> #irc, v4
---

Unfortunately WARN_ON(!dev->vblank_disable_immediate) doesn't work on gen2,
which is the reason this function was created. So I used
WARN_ON(!get_vblank_timestamp) instead.

  drivers/gpu/drm/drm_irq.c | 31 +++
  include/drm/drmP.h        |  1 +
  2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..d3124b67f4a5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
      store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
  }

+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ *
+ * This is mostly useful for hardware that can obtain the scanout
+ * position, but doesn't have a frame counter.
+ */
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   unsigned int pipe = drm_crtc_index(crtc);
+   u32 vblank;
+   unsigned long flags;
+
+   WARN(!dev->driver->get_vblank_timestamp,
+        "This function requires support for accurate vblank timestamps.");
+
+   spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+   drm_update_vblank_count(dev, pipe, 0);
+   vblank = drm_vblank_count(dev, pipe);
+
+   spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+   return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
  /*
   * Disable vblank irq's on crtc, make sure that last vblank count
   * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 005202ea5900..90527c41cd5a 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
  extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
  extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);

  extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,




Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v5.

2016-04-25 Thread Mario Kleiner

On 04/25/2016 08:32 AM, Maarten Lankhorst wrote:

This function is useful for gen2 intel devices which have no frame
counter, but need a way to determine the current vblank count without
racing with the vblank interrupt handler.

intel_pipe_update_start checks if no vblank interrupt will occur
during vblank evasion, but cannot check whether the vblank handler has
run to completion. This function uses the timestamps to determine
when the last vblank has happened, and interpolates from there.

Changes since v1:
- Take vblank_time_lock and don't use drm_vblank_count_and_time.
Changes since v2:
- Don't return time of last vblank.
Changes since v3:
- Change pipe to unsigned int. (Ville)
- Remove unused documentation for tv_ret. (kbuild)
Changes since v4:
- Add warning to docs when the function is useful.
- Add a WARN_ON when get_vblank_timestamp is unavailable.
- Use drm_vblank_count.

Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4
Acked-by: David Airlie <airl...@linux.ie> #irc, v4
---

Unfortunately WARN_ON(!dev->vblank_disable_immediate) doesn't work on gen2,
which is the reason this function was created. So I used
WARN_ON(!get_vblank_timestamp) instead.


That's a weaker warning. I'd like to have the WARN_ON and the doc text 
to be more frightening/restrictive to discourage abuse.


But can't you simply remove that !IS_GEN2 check now and always set 
dev->vblank_disable_immediate = true? The reason for that exception was 
that GEN2 doesn't have a hw vblank counter. But it has scanout pos based 
vblank timestamping, which i'd assume is well behaved. With the new 
scanout based vblank counter emulation in drm_update_vblank_count() 
since around Linux 4.4 you therefore essentially have a proper emulated 
vblank counter, so this should be safe. Ville will probably know. 
Otherwise you couldn't trust drm_accurate_vblank_count() here, because 
it depends on the same logic, no?
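
To illustrate the timestamp-based counter emulation mentioned above, here is a minimal userspace sketch in C. It is not the kernel code: the struct, the function names and the nanosecond fields are assumptions made purely for illustration. The idea is that the number of elapsed vblanks is estimated by dividing the time between the current and the last recorded vblank timestamp by the frame duration, rounded to the nearest integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of per-CRTC vblank state; not a kernel structure. */
struct vblank_model {
	uint32_t count;       /* software vblank counter */
	int64_t  last_ts_ns;  /* timestamp of the last counted vblank */
	int64_t  framedur_ns; /* duration of one video frame */
};

/* Round-to-nearest integer division for non-negative operands. */
static int64_t div_round_closest(int64_t num, int64_t den)
{
	return (num + den / 2) / den;
}

/*
 * Advance the counter using only timestamps: the number of vblanks that
 * passed is the elapsed time divided by the frame duration, rounded.
 * cur_ts_ns is assumed to be the timestamp of the most recent vblank.
 */
static uint32_t vblank_model_update(struct vblank_model *v, int64_t cur_ts_ns)
{
	int64_t elapsed = cur_ts_ns - v->last_ts_ns;
	uint32_t diff;

	assert(v->framedur_ns > 0 && elapsed >= 0);

	diff = (uint32_t)div_round_closest(elapsed, v->framedur_ns);
	if (diff > 0) {
		v->count += diff;
		v->last_ts_ns = cur_ts_ns;
	}
	return v->count;
}
```

Calling the update twice with the same timestamp leaves the counter unchanged, which is the property that makes an update from non-irq context harmless in this model. The result is only as good as the timestamps fed into it.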


-mario




  drivers/gpu/drm/drm_irq.c | 31 +++
  include/drm/drmP.h        |  1 +
  2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..d3124b67f4a5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
      store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
  }

+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ *
+ * This is mostly useful for hardware that can obtain the scanout
+ * position, but doesn't have a frame counter.
+ */
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   unsigned int pipe = drm_crtc_index(crtc);
+   u32 vblank;
+   unsigned long flags;
+
+   WARN(!dev->driver->get_vblank_timestamp,
+        "This function requires support for accurate vblank timestamps.");
+
+   spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+   drm_update_vblank_count(dev, pipe, 0);
+   vblank = drm_vblank_count(dev, pipe);
+
+   spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+   return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
  /*
   * Disable vblank irq's on crtc, make sure that last vblank count
   * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 005202ea5900..90527c41cd5a 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
  extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
  extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);

  extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,




Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v4.

2016-04-24 Thread Mario Kleiner

Sorry for the late review, but see below...

On 04/19/2016 09:52 AM, Maarten Lankhorst wrote:

This function is useful for gen2 intel devices which have no frame
counter, but need a way to determine the current vblank count without
racing with the vblank interrupt handler.

intel_pipe_update_start checks if no vblank interrupt will occur
during vblank evasion, but cannot check whether the vblank handler has
run to completion. This function uses the timestamps to determine
when the last vblank has happened, and interpolates from there.

Changes since v1:
- Take vblank_time_lock and don't use drm_vblank_count_and_time.
Changes since v2:
- Don't return time of last vblank.
Changes since v3:
- Change pipe to unsigned int. (Ville)
- Remove unused documentation for tv_ret. (kbuild)

Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
Acked-by: David Airlie <airl...@linux.ie> #irc
---
  drivers/gpu/drm/drm_irq.c | 26 ++
  include/drm/drmP.h        |  1 +
  2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..f1bda13562da 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,32 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
      store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
  }

+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ */
+
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+   struct drm_device *dev = crtc->dev;
+   unsigned int pipe = drm_crtc_index(crtc);
+   u32 vblank;
+   unsigned long flags;
+


This function is rather dangerous to use on any driver that doesn't have 
precise vblank timestamping, or doesn't have the guarantee that hw 
vblank counters (if there are any) and timestamps update exactly at 
leading edge of vblank, so i think we need some WARN() here and maybe 
much less encouraging docs to avoid this being called from incapable kms 
drivers or in general code.


- If the driver doesn't have precise scanoutpos based timestamping, each 
call into drm_update_vblank_count from non-irq context will reset the 
vblank timestamps to zero, so clients will only receive invalid 
timestamps if this is frequently used. It can also produce bogus vblank 
counts.


Atm. only i915 Intel, AMD, NVidia desktop for >= NV-50, maybe nouveau 
driven Tegra parts, and some modern Adrenos (msm/mdp-5 - i assume from 
the code?) support this reliably.


- If the driver's scanoutpos timestamps and/or vblank counter don't 
increment at leading edge we will get funny off-by-one problems with 
vblank counters. That's why we normally only call 
drm_update_vblank_count() from vblank irq on such parts - the only safe 
place to avoid off-by-one problems, and limit vblank disable/enable to 
only at most once every 5 seconds to reduce the problems caused by 
off-by-one errors.


Which restricts the list to only the above parts, maybe minus Adreno 
where i don't know if it obeys the "leading edge" rule or not.


So on most SoC's one must not use this function.

WARN(!dev->vblank_disable_immediate,
     "This function is unsafe on this driver.");


would probably prevent the worst abuse, unless drivers lie about 
vblank_disable_immediate. Not sure how much this was checked for msm / 
Adreno? At least drm_vblank_init() only allows vblank_disable_immediate 
if the driver at least implements proper timestamping.
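
The difference between the stricter check suggested here and a weaker check on the timestamping hook alone can be modelled with a small userspace sketch in C. Everything in it is hypothetical (the struct, the field layout and the function names are illustrative, not kernel API). It only encodes two statements from this thread: the init code keeps vblank_disable_immediate only if the driver implements timestamping, and a driver may implement timestamping without setting vblank_disable_immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two driver properties discussed here. */
struct drv_model {
	bool has_get_vblank_timestamp; /* driver implements the timestamp hook */
	bool vblank_disable_immediate; /* driver claims immediate disable is safe */
};

/* Model of the init-time rule: the flag survives only with timestamping. */
static void model_vblank_init(struct drv_model *d)
{
	assert(d != NULL);
	if (d->vblank_disable_immediate && !d->has_get_vblank_timestamp)
		d->vblank_disable_immediate = false;
}

/* Weaker check: only requires that the timestamp hook exists. */
static bool check_weak(const struct drv_model *d)
{
	return d->has_get_vblank_timestamp;
}

/* Stricter check: requires the driver to vouch for immediate disable. */
static bool check_strict(const struct drv_model *d)
{
	return d->vblank_disable_immediate;
}
```

In this model, a driver with timestamping but without the immediate-disable flag passes the weak check and fails the strict one, while a driver that sets the flag without timestamping fails both after init.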


Not sure how much general use this function will have outside Intel 
gen-2 with the restrictions on safe use?



+   spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+   drm_update_vblank_count(dev, pipe, 0);
+   vblank = dev->vblank[pipe].count;


Could do vblank = drm_vblank_count(dev, pipe); instead, given that we 
avoid open coding this in most places.


-mario


+
+   spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+   return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
  /*
   * Disable vblank irq's on crtc, make sure that last vblank count
   * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 005202ea5900..90527c41cd5a 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
  extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
  extern u32 drm_vblank_no_h

Re: [Intel-gfx] [PATCH] drm/i915: Only dither on 6bpc panels

2015-08-12 Thread Mario Kleiner

Thanks for the quick fix! Comments below...

On 08/12/2015 11:43 AM, Daniel Vetter wrote:

In

commit d328c9d78d64ca11e744fe227096990430a88477
Author: Daniel Vetter <daniel.vet...@ffwll.ch>
Date:   Fri Apr 10 16:22:37 2015 +0200

    drm/i915: Select starting pipe bpp irrespective or the primary plane

we started to select the pipe bpp from sink capabilities and not from
the primary framebuffer - that one might change (and we don't want to
incur a modeset) and sprites might contain higher bpp content too.

Problem is that now if you have a 10bpc screen and display a 24bpp rgb
primary then we select dithering, and apparently that mangles even the high
8 bits (even though you'd expect dithering only to affect how
12bpc gets mapped into 10bpc). And that mangling upsets certain users.



Probably doesn't matter, but your explanation of the former problem here 
is slightly off. We also selected dithering on an 8 bpc screen displaying 
a 24bpp rgb primary, because pipe_bpp is 24 for such a typical 8 bpc 
sink, but since the commit mentioned above, base_bpp is always the 
absolute maximum supported by the hardware, e.g., 36 bpp on my Ironlake 
chip. IOW, the only way to not get dithering would have been to connect 
a deep color 12 bpc display, so that pipe_bpp == 36 == base_bpp.
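
The two dithering rules can be compared side by side in a small standalone C sketch. The function names are made up for illustration; the two expressions are the ones from the patch in this mail (pipe_bpp != base_bpp before, pipe_bpp == 6*3 after), and the bpp values are the ones from this discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Rule before the fix: dither whenever pipe bpp differs from base bpp. */
static bool dither_old_rule(int pipe_bpp, int base_bpp)
{
	assert(pipe_bpp > 0 && base_bpp > 0);
	return pipe_bpp != base_bpp;
}

/* Rule from the patch: dither only when the pipe runs at 6 bpc (18 bpp). */
static bool dither_new_rule(int pipe_bpp)
{
	assert(pipe_bpp > 0);
	return pipe_bpp == 6 * 3;
}
```

With base_bpp fixed at 36, the old rule enables dithering for a 24 bpp pipe (8 bpc sink) and only leaves it off for a 36 bpp pipe, whereas the new rule enables it for an 18 bpp pipe only.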



Hence only enable dithering on 6bpc screens where we definitely and
always want it.



Other than that, i tested the patch on both 8 bpc output with my 
measurement equipment and on the internal laptop 6 bpc panel, and 
everything is fine now - No banding on the 6 bpc panel, no banding or 
equipment failure on the external 8 bpc output. Life is good again :)


Reviewed-and-tested-by: Mario Kleiner <mario.kleiner...@gmail.com>

thanks,
-mario


Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Reported-by: Mario Kleiner <mario.kleiner...@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vet...@intel.com>
---
  drivers/gpu/drm/i915/intel_display.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9a2f229a1c3a..128462e0a0b5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12186,7 +12186,9 @@ encoder_retry:
        goto encoder_retry;
    }

-   pipe_config->dither = pipe_config->pipe_bpp != base_bpp;
+   /* Dithering seems to not pass-through bits correctly when it should, so
+    * only enable it on 6bpc panels. */
+   pipe_config->dither = pipe_config->pipe_bpp == 6*3;
    DRM_DEBUG_KMS("plane bpp: %i, pipe bpp: %i, dithering: %i\n",
                  base_bpp, pipe_config->pipe_bpp, pipe_config->dither);





Re: [Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.

2015-08-11 Thread Mario Kleiner

On 08/07/2015 09:14 AM, Daniel Vetter wrote:

On Fri, Aug 07, 2015 at 12:45:52AM +0200, Mario Kleiner wrote:

On 08/07/2015 12:12 AM, Daniel Vetter wrote:

On Thu, Aug 6, 2015 at 11:56 PM, Mario Kleiner
<mario.kleiner...@gmail.com> wrote:

Hi Daniel and all,

since Linux 4.2 (tested with rc4), i think this commit
d328c9d78d64ca11e744fe227096990430a88477
drm/i915: Select starting pipe bpp irrespective or the primary plane

causes trouble for me and my users, as tested on Intel HD Ironlake and Ivy
Bridge with MiniDP->Singlelink-DVI adapter -> Measurement device.

Afaics it causes dithering to always be enabled on a regular 8bpc
framebuffer, even when outputting to a 8 bpc DVI-D output, and that
dithering causes my display measurement equipment and other special display
devices used for neuro-science and medical applications to fail. This
equipment requires an identity passthrough of 8 bpc framebuffer pixels to
the digital outputs, iow. dithering off.

Log output on Linux 4.1 (good):

Aug  1 06:39:26 twisty kernel: [  154.175394]
[drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for sink
bpp constrains
Aug  1 06:39:26 twisty kernel: [  154.175396]
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:39:26 twisty kernel: [  154.175397]
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:39:26 twisty kernel: [  154.175400] [drm:ironlake_check_fdi_lanes]
checking fdi config on pipe A, lanes 1
Aug  1 06:39:26 twisty kernel: [  154.175402]
[drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0
Aug  1 06:39:26 twisty kernel: [  154.175403] [drm:intel_dump_pipe_config]
[CRTC:20][modeset] config for pipe A
Aug  1 06:39:26 twisty kernel: [  154.175404] [drm:intel_dump_pipe_config]
cpu_transcoder: A
Aug  1 06:39:26 twisty kernel: [  154.175405] [drm:intel_dump_pipe_config]
pipe bpp: 24, dithering: 0

Log output on Linux 4.2-rc4 (bad):

Aug  1 06:21:31 twisty kernel: [  200.924831]
[drm:connected_sink_compute_bpp] [CONNECTOR:36:HDMI-A-1] checking for sink
bpp constrains
Aug  1 06:21:31 twisty kernel: [  200.924832]
[drm:connected_sink_compute_bpp] clamping display bpp (was 36) to default
limit of 24
Aug  1 06:21:31 twisty kernel: [  200.924834]
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:21:31 twisty kernel: [  200.924835]
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:21:31 twisty kernel: [  200.924838] [drm:ironlake_check_fdi_lanes]
checking fdi config on pipe A, lanes 1
Aug  1 06:21:31 twisty kernel: [  200.924840]
[drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1
Aug  1 06:21:31 twisty kernel: [  200.924841] [drm:intel_dump_pipe_config]
[CRTC:21][modeset] config 880131a5c800 for pipe A
Aug  1 06:21:31 twisty kernel: [  200.924842] [drm:intel_dump_pipe_config]
cpu_transcoder: A
Aug  1 06:21:31 twisty kernel: [  200.924843] [drm:intel_dump_pipe_config]
pipe bpp: 24, dithering: 1

Ideas what to do about this?


Well I somehow assumed the dither bit would be sane and not wreak
havoc with the lower bits when they would fit into the final bpc pipe
mode ... Can you confirm with your equipment that we seem to be doing
8bpc->6bpc dithering on the 8bpc sink?



It will need a bit of work to find this out when i'm back in the lab. So far
i just know something bad is happening to the signal and i assume it's the
dithering, because the visual error pattern of messiness looks like that
caused by dithering. E.g., on a static framebuffer i see some repeating
pattern over the screen, but the pattern changes with every OpenGL
bufferswap, even if i swap to the same fb content, as if the swap triggers
some change of the spatial dither pattern (assuming PIPECONF_DITHER_TYPE_SP
= spatial dithering?)


If that's the case we simply limit to only ever dither when the sink
is 6bpc, and not in any other case.
-Daniel



That would be an improvement for my immediate problem if that works. But
assuming we have 10 bpc framebuffers at some point, dithering 10 bpc -> 8
bpc would also have some practical use.

Probably some dynamic check would be good, a la if there is a mismatch
between the max(bpc) over all active planes and the supported depth of the
sink then dither?

It's not clear to me where the dithering happens on intel hw. I'd expected
that with a 24 bpp framebuffer feeding into a 24 bpp pipe, dithering simply
wouldn't do anything even if enabled.


Yeah my assumption was that if you run the pipe at a given bpc it will
just pass through anything that fits and only dither the additional bits.
But obviously that's not how the hardware works ...

The problem with adaptive schemes is that we have multiple planes nowadays
and they might all run at different depths. And dither seems to be
happening at the pipe/overall level (at least there's only one bit). Of
course this wouldn't be a problem if the thing wouldn't mangle bits which
should pass!

Anyway if we can confirm this I think

[Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.

2015-08-06 Thread Mario Kleiner

Hi Daniel and all,

since Linux 4.2 (tested with rc4), i think this commit 
d328c9d78d64ca11e744fe227096990430a88477

drm/i915: Select starting pipe bpp irrespective or the primary plane

causes trouble for me and my users, as tested on Intel HD Ironlake and 
Ivy Bridge with MiniDP->Singlelink-DVI adapter -> Measurement device.


Afaics it causes dithering to always be enabled on a regular 8bpc 
framebuffer, even when outputting to a 8 bpc DVI-D output, and that 
dithering causes my display measurement equipment and other special 
display devices used for neuro-science and medical applications to fail. 
This equipment requires an identity passthrough of 8 bpc framebuffer 
pixels to the digital outputs, iow. dithering off.


Log output on Linux 4.1 (good):

Aug  1 06:39:26 twisty kernel: [  154.175394] 
[drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for 
sink bpp constrains
Aug  1 06:39:26 twisty kernel: [  154.175396] 
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:39:26 twisty kernel: [  154.175397] 
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:39:26 twisty kernel: [  154.175400] 
[drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1
Aug  1 06:39:26 twisty kernel: [  154.175402] 
[drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0
Aug  1 06:39:26 twisty kernel: [  154.175403] 
[drm:intel_dump_pipe_config] [CRTC:20][modeset] config for pipe A
Aug  1 06:39:26 twisty kernel: [  154.175404] 
[drm:intel_dump_pipe_config] cpu_transcoder: A
Aug  1 06:39:26 twisty kernel: [  154.175405] 
[drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 0


Log output on Linux 4.2-rc4 (bad):

Aug  1 06:21:31 twisty kernel: [  200.924831] 
[drm:connected_sink_compute_bpp] [CONNECTOR:36:HDMI-A-1] checking for 
sink bpp constrains
Aug  1 06:21:31 twisty kernel: [  200.924832] 
[drm:connected_sink_compute_bpp] clamping display bpp (was 36) to 
default limit of 24
Aug  1 06:21:31 twisty kernel: [  200.924834] 
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:21:31 twisty kernel: [  200.924835] 
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:21:31 twisty kernel: [  200.924838] 
[drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1
Aug  1 06:21:31 twisty kernel: [  200.924840] 
[drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1
Aug  1 06:21:31 twisty kernel: [  200.924841] 
[drm:intel_dump_pipe_config] [CRTC:21][modeset] config 880131a5c800 
for pipe A
Aug  1 06:21:31 twisty kernel: [  200.924842] 
[drm:intel_dump_pipe_config] cpu_transcoder: A
Aug  1 06:21:31 twisty kernel: [  200.924843] 
[drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 1


Ideas what to do about this?

thanks,
-mario


Re: [Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.

2015-08-06 Thread Mario Kleiner

On 08/07/2015 12:12 AM, Daniel Vetter wrote:

On Thu, Aug 6, 2015 at 11:56 PM, Mario Kleiner
<mario.kleiner...@gmail.com> wrote:

Hi Daniel and all,

since Linux 4.2 (tested with rc4), i think this commit
d328c9d78d64ca11e744fe227096990430a88477
drm/i915: Select starting pipe bpp irrespective or the primary plane

causes trouble for me and my users, as tested on Intel HD Ironlake and Ivy
Bridge with MiniDP->Singlelink-DVI adapter -> Measurement device.

Afaics it causes dithering to always be enabled on a regular 8bpc
framebuffer, even when outputting to a 8 bpc DVI-D output, and that
dithering causes my display measurement equipment and other special display
devices used for neuro-science and medical applications to fail. This
equipment requires an identity passthrough of 8 bpc framebuffer pixels to
the digital outputs, iow. dithering off.

Log output on Linux 4.1 (good):

Aug  1 06:39:26 twisty kernel: [  154.175394]
[drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for sink
bpp constrains
Aug  1 06:39:26 twisty kernel: [  154.175396]
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:39:26 twisty kernel: [  154.175397]
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:39:26 twisty kernel: [  154.175400] [drm:ironlake_check_fdi_lanes]
checking fdi config on pipe A, lanes 1
Aug  1 06:39:26 twisty kernel: [  154.175402]
[drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0
Aug  1 06:39:26 twisty kernel: [  154.175403] [drm:intel_dump_pipe_config]
[CRTC:20][modeset] config for pipe A
Aug  1 06:39:26 twisty kernel: [  154.175404] [drm:intel_dump_pipe_config]
cpu_transcoder: A
Aug  1 06:39:26 twisty kernel: [  154.175405] [drm:intel_dump_pipe_config]
pipe bpp: 24, dithering: 0

Log output on Linux 4.2-rc4 (bad):

Aug  1 06:21:31 twisty kernel: [  200.924831]
[drm:connected_sink_compute_bpp] [CONNECTOR:36:HDMI-A-1] checking for sink
bpp constrains
Aug  1 06:21:31 twisty kernel: [  200.924832]
[drm:connected_sink_compute_bpp] clamping display bpp (was 36) to default
limit of 24
Aug  1 06:21:31 twisty kernel: [  200.924834]
[drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug  1 06:21:31 twisty kernel: [  200.924835]
[drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug  1 06:21:31 twisty kernel: [  200.924838] [drm:ironlake_check_fdi_lanes]
checking fdi config on pipe A, lanes 1
Aug  1 06:21:31 twisty kernel: [  200.924840]
[drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1
Aug  1 06:21:31 twisty kernel: [  200.924841] [drm:intel_dump_pipe_config]
[CRTC:21][modeset] config 880131a5c800 for pipe A
Aug  1 06:21:31 twisty kernel: [  200.924842] [drm:intel_dump_pipe_config]
cpu_transcoder: A
Aug  1 06:21:31 twisty kernel: [  200.924843] [drm:intel_dump_pipe_config]
pipe bpp: 24, dithering: 1

Ideas what to do about this?


Well I somehow assumed the dither bit would be sane and not wreak
havoc with the lower bits when they would fit into the final bpc pipe
mode ... Can you confirm with your equipment that we seem to be doing
8bpc->6bpc dithering on the 8bpc sink?



It will need a bit of work to find this out when i'm back in the lab. So 
far i just know something bad is happening to the signal and i assume 
it's the dithering, because the visual error pattern of messiness looks 
like that caused by dithering. E.g., on a static framebuffer i see some 
repeating pattern over the screen, but the pattern changes with every 
OpenGL bufferswap, even if i swap to the same fb content, as if the swap 
triggers some change of the spatial dither pattern (assuming 
PIPECONF_DITHER_TYPE_SP = spatial dithering?)



If that's the case we simply limit to only ever dither when the sink
is 6bpc, and not in any other case.
-Daniel



That would be an improvement for my immediate problem if that works. But 
assuming we have 10 bpc framebuffers at some point, dithering 10 bpc -> 
8 bpc would also have some practical use.


Probably some dynamic check would be good, a la if there is a mismatch 
between the max(bpc) over all active planes and the supported depth of 
the sink then dither?
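
A rough sketch of such a dynamic check, as standalone C rather than driver code, could look like the following. All names are hypothetical, and it assumes that "mismatch" means the deepest active plane has more bits per channel than the sink supports. It only models the decision, not what the hardware does to the pixel data once dithering is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest bits-per-channel value among the active planes (0 if none). */
static int max_plane_bpc(const int *plane_bpc, size_t n)
{
	int max = 0;

	for (size_t i = 0; i < n; i++)
		if (plane_bpc[i] > max)
			max = plane_bpc[i];
	return max;
}

/* Dither only if some active plane is deeper than the sink can show. */
static bool want_dither(const int *plane_bpc, size_t n, int sink_bpc)
{
	assert(sink_bpc > 0);
	return max_plane_bpc(plane_bpc, n) > sink_bpc;
}
```

Under this rule an 8 bpc framebuffer on an 8 bpc sink would not be dithered, while the same framebuffer on a 6 bpc panel, or a 10 bpc plane on an 8 bpc sink, would be.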


It's not clear to me where the dithering happens on intel hw. I'd 
expected that with a 24 bpp framebuffer feeding into a 24 bpp pipe, 
dithering simply wouldn't do anything even if enabled.


-mario


Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently

2015-06-05 Thread Mario Kleiner

On 05/29/2015 07:35 PM, Daniel Vetter wrote:

On Fri, May 29, 2015 at 07:23:35PM +0200, Mario Kleiner wrote:



On 05/29/2015 07:19 PM, Daniel Vetter wrote:

On Fri, May 29, 2015 at 06:50:06PM +0200, Mario Kleiner wrote:

On 05/27/2015 11:04 AM, Daniel Vetter wrote:

In

commit 9cba5efab5a8145ae6c52ea273553f069c294482
Author: Mario Kleiner <mario.kleiner...@gmail.com>
Date:   Tue Jul 29 02:36:44 2014 +0200

    drm/nouveau: Dis/Enable vblank irqs during suspend/resume

drm_vblank_on/off calls were added around suspend/resume to make sure
vblank state doesn't go boom over that transition. But nouveau already
used drm_vblank_pre/post_modeset over modesets. Instead use
drm_vblank_on/off everywhere. The slight change here is that after
_off drm_vblank_get will refuse to work right away, but nouveau
doesn't seem to depend upon that anywhere outside of the pageflip
paths.

The longer-term plan here is to switch all kms drivers to
drm_vblank_on/off so that common code like pending event cleanup can
be done there, while drm_vblank_pre/post_modeset will be purely
drm internal for the old UMS ioctl.

Note that the drm_vblank_off still seems required in the suspend path
since nouveau doesn't explicitly disable crtcs. But on the resume side
drm_helper_resume_force_mode should end up calling drm_vblank_on
through the nouveau crtc hooks already. Hence remove the call in the
resume code.

Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ben Skeggs bske...@redhat.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/nouveau/dispnv04/crtc.c   | 4 ++--
  drivers/gpu/drm/nouveau/nouveau_display.c | 4 
  2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 3d96b49fe662..dab24066fa21 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc)
if (nv_two_heads(dev))
NVSetOwner(dev, nv_crtc->index);

-   drm_vblank_pre_modeset(dev, nv_crtc->index);
+   drm_vblank_off(dev, nv_crtc->index);
funcs->dpms(crtc, DRM_MODE_DPMS_OFF);

NVBlankScreen(dev, nv_crtc->index, true);
@@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc)
  #endif

funcs->dpms(crtc, DRM_MODE_DPMS_ON);
-   drm_vblank_post_modeset(dev, nv_crtc->index);
+   drm_vblank_on(dev, nv_crtc->index);
  }


The above hunk is probably correct, but I couldn't test it without
sufficiently old pre-nv50 hardware.



  static void nv_crtc_destroy(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c 
b/drivers/gpu/drm/nouveau/nouveau_display.c
index 8670d90cdc11..d824023f9fc6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool 
runtime)
nv_crtc->lut.depth = 0;
}

-   /* Make sure that drm and hw vblank irqs get resumed if needed. */
-   for (head = 0; head < dev->mode_config.num_crtc; head++)
-       drm_vblank_on(dev, head);
-
/* This should ensure we don't hit a locking problem when someone
 * wakes us up via a connector.  We should never go into suspend
 * while the display is on anyways.



Tested this one and this hunk breaks suspend/resume. After a suspend/resume
cycle, all OpenGL apps and composited desktop are dead, as the core can't
get any vblank irq's enabled anymore.

So the drm_vblank_on() is still needed here.


Hm that's very surprising. As mentioned above the force_mode_restore
should be calling nv_crtc_prepare already and fix this all up for us. I
guess I need to dig out my nv card and trace what's really going on here.

Enabling interrupts when the crtc is off isn't a good idea.
-Daniel



I think the nv_crtc_prepare() path modified in your first hunk is only for
the original nv04 display engine for very old cards. nv50+ (GeForce-8 and
later) take different paths.


Oh right totally missed the nv50+ code. I only grepped for
pre/post_modeset ...

Below untested diff should help. I also realized that the pre-nv50 code
lacks drm_vblank_on/off in the dpms callback, so there's more work to do
anyway for this one here.

Thanks, Daniel



The diff on top of your patch is now tested and helps. suspend-resume 
is now fine on nv50. In your patch, nouveau_display_resume() would also 
need to get a now unused int head removed to make the compiler happy.


-mario


diff --git a/drivers/gpu/drm/nouveau/nv50_display.c 
b/drivers/gpu/drm/nouveau/nv50_display.c
index 7da7958556a3..a16c37d8f7e1 100644
--- a/drivers/gpu/drm/nouveau/nv50_display.c
+++ b/drivers/gpu/drm/nouveau/nv50_display.c
@@ -997,6 +997,10 @@ nv50_crtc_cursor_show_hide(struct nouveau_crtc *nv_crtc, 
bool show, bool update)
  static void
  nv50_crtc_dpms(struct drm_crtc *crtc, int mode

Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently

2015-05-29 Thread Mario Kleiner

On 05/27/2015 11:04 AM, Daniel Vetter wrote:

In

commit 9cba5efab5a8145ae6c52ea273553f069c294482
Author: Mario Kleiner mario.kleiner...@gmail.com
Date:   Tue Jul 29 02:36:44 2014 +0200

 drm/nouveau: Dis/Enable vblank irqs during suspend/resume

drm_vblank_on/off calls were added around suspend/resume to make sure
vblank state doesn't go boom over that transition. But nouveau already
used drm_vblank_pre/post_modeset over modesets. Instead use
drm_vblank_on/off everywhere. The slight change here is that after
_off drm_vblank_get will refuse to work right away, but nouveau
doesn't seem to depend upon that anywhere outside of the pageflip
paths.

The longer-term plan here is to switch all kms drivers to
drm_vblank_on/off so that common code like pending event cleanup can
be done there, while drm_vblank_pre/post_modeset will be purely
drm internal for the old UMS ioctl.

Note that the drm_vblank_off still seems required in the suspend path
since nouveau doesn't explicitly disable crtcs. But on the resume side
drm_helper_resume_force_mode should end up calling drm_vblank_on
through the nouveau crtc hooks already. Hence remove the call in the
resume code.

Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ben Skeggs bske...@redhat.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/nouveau/dispnv04/crtc.c   | 4 ++--
  drivers/gpu/drm/nouveau/nouveau_display.c | 4 
  2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 3d96b49fe662..dab24066fa21 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc)
if (nv_two_heads(dev))
NVSetOwner(dev, nv_crtc->index);

-   drm_vblank_pre_modeset(dev, nv_crtc->index);
+   drm_vblank_off(dev, nv_crtc->index);
funcs->dpms(crtc, DRM_MODE_DPMS_OFF);

NVBlankScreen(dev, nv_crtc->index, true);
@@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc)
  #endif

funcs->dpms(crtc, DRM_MODE_DPMS_ON);
-   drm_vblank_post_modeset(dev, nv_crtc->index);
+   drm_vblank_on(dev, nv_crtc->index);
  }


The above hunk is probably correct, but I couldn't test it without
sufficiently old pre-nv50 hardware.




  static void nv_crtc_destroy(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c 
b/drivers/gpu/drm/nouveau/nouveau_display.c
index 8670d90cdc11..d824023f9fc6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool 
runtime)
nv_crtc->lut.depth = 0;
}

-   /* Make sure that drm and hw vblank irqs get resumed if needed. */
-   for (head = 0; head < dev->mode_config.num_crtc; head++)
-       drm_vblank_on(dev, head);
-
/* This should ensure we don't hit a locking problem when someone
 * wakes us up via a connector.  We should never go into suspend
 * while the display is on anyways.



Tested this one and this hunk breaks suspend/resume. After a 
suspend/resume cycle, all OpenGL apps and composited desktop are dead, 
as the core can't get any vblank irq's enabled anymore.


So the drm_vblank_on() is still needed here.

thanks,
-mario


Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently

2015-05-29 Thread Mario Kleiner



On 05/29/2015 07:19 PM, Daniel Vetter wrote:

On Fri, May 29, 2015 at 06:50:06PM +0200, Mario Kleiner wrote:

On 05/27/2015 11:04 AM, Daniel Vetter wrote:

In

commit 9cba5efab5a8145ae6c52ea273553f069c294482
Author: Mario Kleiner mario.kleiner...@gmail.com
Date:   Tue Jul 29 02:36:44 2014 +0200

 drm/nouveau: Dis/Enable vblank irqs during suspend/resume

drm_vblank_on/off calls were added around suspend/resume to make sure
vblank state doesn't go boom over that transition. But nouveau already
used drm_vblank_pre/post_modeset over modesets. Instead use
drm_vblank_on/off everywhere. The slight change here is that after
_off drm_vblank_get will refuse to work right away, but nouveau
doesn't seem to depend upon that anywhere outside of the pageflip
paths.

The longer-term plan here is to switch all kms drivers to
drm_vblank_on/off so that common code like pending event cleanup can
be done there, while drm_vblank_pre/post_modeset will be purely
drm internal for the old UMS ioctl.

Note that the drm_vblank_off still seems required in the suspend path
since nouveau doesn't explicitly disable crtcs. But on the resume side
drm_helper_resume_force_mode should end up calling drm_vblank_on
through the nouveau crtc hooks already. Hence remove the call in the
resume code.

Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ben Skeggs bske...@redhat.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/nouveau/dispnv04/crtc.c   | 4 ++--
  drivers/gpu/drm/nouveau/nouveau_display.c | 4 
  2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 3d96b49fe662..dab24066fa21 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc)
if (nv_two_heads(dev))
NVSetOwner(dev, nv_crtc->index);

-   drm_vblank_pre_modeset(dev, nv_crtc->index);
+   drm_vblank_off(dev, nv_crtc->index);
funcs->dpms(crtc, DRM_MODE_DPMS_OFF);

NVBlankScreen(dev, nv_crtc->index, true);
@@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc)
  #endif

funcs->dpms(crtc, DRM_MODE_DPMS_ON);
-   drm_vblank_post_modeset(dev, nv_crtc->index);
+   drm_vblank_on(dev, nv_crtc->index);
  }


The above hunk is probably correct, but I couldn't test it without
sufficiently old pre-nv50 hardware.



  static void nv_crtc_destroy(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c 
b/drivers/gpu/drm/nouveau/nouveau_display.c
index 8670d90cdc11..d824023f9fc6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool 
runtime)
nv_crtc->lut.depth = 0;
}

-   /* Make sure that drm and hw vblank irqs get resumed if needed. */
-   for (head = 0; head < dev->mode_config.num_crtc; head++)
-       drm_vblank_on(dev, head);
-
/* This should ensure we don't hit a locking problem when someone
 * wakes us up via a connector.  We should never go into suspend
 * while the display is on anyways.



Tested this one and this hunk breaks suspend/resume. After a suspend/resume
cycle, all OpenGL apps and composited desktop are dead, as the core can't
get any vblank irq's enabled anymore.

So the drm_vblank_on() is still needed here.


Hm that's very surprising. As mentioned above the force_mode_restore
should be calling nv_crtc_prepare already and fix this all up for us. I
guess I need to dig out my nv card and trace what's really going on here.

Enabling interrupts when the crtc is off isn't a good idea.
-Daniel



I think the nv_crtc_prepare() path modified in your first hunk is only 
for the original nv04 display engine for very old cards. nv50+ 
(GeForce-8 and later) take different paths.


-mario


Re: [Intel-gfx] [PATCH] drm/plane-helper: Adapt cursor hack to transitional helpers

2015-05-21 Thread Mario Kleiner

On 05/20/2015 10:36 AM, Daniel Vetter wrote:

In

commit f02ad907cd9e7fe3a6405d2d005840912f1ed258
Author: Daniel Vetter daniel.vet...@ffwll.ch
Date:   Thu Jan 22 16:36:23 2015 +0100

 drm/atomic-helpers: Recover full cursor plane behaviour

we've added a hack to atomic helpers to never do vblank waits for
cursor updates through the legacy apis since that's what X expects.
Unfortunately we've (again) forgotten to adjust the transitional
helpers. Do this now.

This fixes regressions for drivers only partially converted over to
atomic (like i915).

Reported-by: Pekka Paalanen ppaala...@gmail.com
Cc: Pekka Paalanen ppaala...@gmail.com
Cc: sta...@vger.kernel.org
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/drm_plane_helper.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_plane_helper.c 
b/drivers/gpu/drm/drm_plane_helper.c
index 40c1db9ad7c3..2f0ed11024eb 100644
--- a/drivers/gpu/drm/drm_plane_helper.c
+++ b/drivers/gpu/drm/drm_plane_helper.c
@@ -465,6 +465,9 @@ int drm_plane_helper_commit(struct drm_plane *plane,
if (!crtc[i])
continue;

+   if (crtc[i]->cursor == plane)
+       continue;
+
/* There's no other way to figure out whether the crtc is running. */
ret = drm_crtc_vblank_get(crtc[i]);
if (ret == 0) {



This one is

Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com

I was looking into Weston performance and the cursor problem, so I had the
necessary tracing in place to test this. I can confirm that the
cursor-related blocking in Weston's drm-backend is gone with this patch
applied, whereas it is still present when using hardware overlays on
Intel, as expected.


So hardware cursors should be fine again once the patch also lands in
stable kernels.
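
A minimal userspace sketch of the rule the patch adds, assuming simplified stand-ins for the DRM plane and CRTC structures (all names here are hypothetical): a legacy update of a CRTC's cursor plane never counts as something to wait on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct drm_plane and struct drm_crtc. */
struct plane { int id; };
struct crtc {
	const struct plane *cursor;	/* the CRTC's cursor plane, if any */
	bool running;			/* true if a vblank reference can be taken */
};

/*
 * Count the CRTCs a legacy plane update would wait on. A CRTC whose
 * cursor plane is the updated plane is skipped, mirroring the
 * "continue" added by the patch.
 */
static size_t crtcs_to_wait_on(struct crtc **crtcs, size_t n,
			       const struct plane *plane)
{
	size_t waits = 0;

	assert(plane != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!crtcs[i])
			continue;
		if (crtcs[i]->cursor == plane)
			continue;	/* never block cursor updates on vblank */
		if (crtcs[i]->running)
			waits++;
	}
	return waits;
}
```

The same CRTC list yields zero waits for the cursor plane and one wait for any other plane on a running CRTC, which is the behaviour X and Weston rely on for cursor movement.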


thanks,
-mario


Re: [Intel-gfx] Breakage for Ironlake due to some watermarks changes in Linux 4.0+?

2015-05-18 Thread Mario Kleiner

On 05/15/2015 11:00 AM, Jani Nikula wrote:

On Fri, 15 May 2015, Mario Kleiner mario.kleiner...@gmail.com wrote:

Hi all,

since Linux 4.0 I have been seeing a massive display flicker problem on my
Intel HD Ironlake mobile (2010 MacBookPro6,2) under Wayland's reference
compositor Weston.

- Only happens on Linux >= 4.0 on intel-kms with the Intel HD, not under
nouveau-kms with the discrete NVidia gpu. Strangely on Linux 4.1-rc it
happens all the time, whereas on Linux 4.0 it can work normally for
quite a while, but once the problem starts only a reboot can cure it.

- Almost only happens on Weston, but only very rarely under the XServer.
VT switching from Weston to XOrg makes the problem disappear, switching
back to Weston and it starts again immediately.

- Only happens if a hardware cursor is displayed - hiding the cursor
stops the flicker immediately, showing the cursor starts the flicker.

- The drm and desktop are completely idle during this - drm.debug=15
shows no activity while this happens.

Symptom:

Up to the scanline where the cursor is located, the desktop image is
displayed, but jumps horizontally left and right by some random number
of pixels, maybe in the range 0 - 200 pixels with high frequency, making
the content unreadable. Starting with the scanline where scanout of the
cursor starts, the display goes blank, as if some display controller
fifo would underflow and the controller blanks the display in response.
Seems having to scanout the cursor plane in addition to the primary
plane is just enough to push it over some limit?

I also see CPU and PCH pipe A FIFO underruns reported by the underflow
irq handlers.

I saw there were many changes around Linux 4.0 in the kms driver wrt.
watermark calculations, so this might be related?


Please try http://patchwork.freedesktop.org/patch/49314 and report back.

BR,
Jani.



The patch fixes my flicker problem nicely. Thanks! If you want, you can 
add a


Tested-by: Mario Kleiner mario.kleiner...@gmail.com

best,
-mario


[Intel-gfx] Breakage for Ironlake due to some watermarks changes in Linux 4.0+?

2015-05-14 Thread Mario Kleiner

Hi all,

since Linux 4.0 I have been seeing a massive display flicker problem on my
Intel HD Ironlake mobile (2010 MacBookPro6,2) under Wayland's reference
compositor Weston.


- Only happens on Linux >= 4.0 on intel-kms with the Intel HD, not under
nouveau-kms with the discrete NVidia gpu. Strangely on Linux 4.1-rc it 
happens all the time, whereas on Linux 4.0 it can work normally for 
quite a while, but once the problem starts only a reboot can cure it.


- Almost only happens on Weston, but only very rarely under the XServer. 
VT switching from Weston to XOrg makes the problem disappear, switching 
back to Weston and it starts again immediately.


- Only happens if a hardware cursor is displayed - hiding the cursor 
stops the flicker immediately, showing the cursor starts the flicker.


- The drm and desktop are completely idle during this - drm.debug=15
shows no activity while this happens.


Symptom:

Up to the scanline where the cursor is located, the desktop image is 
displayed, but jumps horizontally left and right by some random number 
of pixels, maybe in the range 0 - 200 pixels with high frequency, making 
the content unreadable. Starting with the scanline where scanout of the 
cursor starts, the display goes blank, as if some display controller 
fifo would underflow and the controller blanks the display in response. 
Seems having to scanout the cursor plane in addition to the primary 
plane is just enough to push it over some limit?


I also see CPU and PCH pipe A FIFO underruns reported by the underflow
irq handlers.


I saw there were many changes around Linux 4.0 in the kms driver wrt. 
watermark calculations, so this might be related?


thanks,
-mario


Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers

2015-05-07 Thread Mario Kleiner

On 05/07/2015 01:56 PM, Peter Hurley wrote:

On 05/06/2015 04:56 AM, Daniel Vetter wrote:

On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote:

On 05/05/2015 11:42 AM, Daniel Vetter wrote:

On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote:

On 05/04/2015 12:52 AM, Mario Kleiner wrote:

On 04/16/2015 03:03 PM, Daniel Vetter wrote:

On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:

On 04/15/2015 01:31 PM, Daniel Vetter wrote:

On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:

Hi Daniel,

On 04/15/2015 03:17 AM, Daniel Vetter wrote:

This was a bit too much cargo-culted, so let's make it solid:
- vblank->count doesn't need to be an atomic, writes are always done
under the protection of dev->vblank_time_lock. Switch to an unsigned
long instead and update comments. Note that atomic_read is just a
normal read of a volatile variable, so no need to audit all the
read-side access specifically.

- The barriers for the vblank counter seqlock weren't complete: The
read-side was missing the first barrier between the counter read and
the timestamp read, it only had a barrier between the ts and the
counter read. We need both.

- Barriers weren't properly documented. Since barriers only work if
you have them on both sides of the transaction it's prudent to
reference where the other side is. To avoid duplicating the
write-side comment 3 times extract a little store_vblank() helper.
In that helper also assert that we do indeed hold
dev->vblank_time_lock, since in some cases the lock is acquired a
few functions up in the callchain.

Spotted while reviewing a patch from Chris Wilson to add a fastpath to
the vblank_wait ioctl.

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Michel Dänzer mic...@daenzer.net
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/drm_irq.c | 92 ---
  include/drm/drmP.h        |  8 +++--
  2 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..23bfbc61a494 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 
0600);
   module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 
0600);
   module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);

+static void store_vblank(struct drm_device *dev, int crtc,
+ unsigned vblank_count_inc,
+ struct timeval *t_vblank)
+{
+   struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
+   u32 tslot;
+
+   assert_spin_locked(&dev->vblank_time_lock);
+
+   if (t_vblank) {
+       tslot = vblank->count + vblank_count_inc;
+       vblanktimestamp(dev, crtc, tslot) = *t_vblank;
+   }
+
+   /*
+    * vblank timestamp updates are protected on the write side with
+    * vblank_time_lock, but on the read side done locklessly using a
+    * sequence-lock on the vblank counter. Ensure correct ordering using
+    * memory barriers. We need the barrier both before and also after the
+    * counter update to synchronize with the next timestamp write.
+    * The read-side barriers for this are in drm_vblank_count_and_time.
+    */
+   smp_wmb();
+   vblank->count += vblank_count_inc;
+   smp_wmb();


The comment and the code are each self-contradictory.

If vblank->count writes are always protected by vblank_time_lock (something I
did not verify but that the comment above asserts), then the trailing write
barrier is not required (and the assertion that it is in the comment is
incorrect).

A spin unlock operation is always a write barrier.


Hm yeah. Otoh to me that's bordering on code too clever for my own good.
That the spinlock is held I can assure. That no one goes around and does
multiple vblank updates (because somehow that code raced with the hw
itself) I can't easily assure with a simple assert or something similar.
It's not the case right now, but that can change.


The algorithm would be broken if multiple updates for the same vblank
count were allowed; that's why it checks to see if the vblank count has
not advanced before storing a new timestamp.

Otherwise, the read side would not be able to determine that the
timestamp is valid by double-checking that the vblank count has not
changed.

And besides, even if the code looped without dropping the spinlock,
the correct write order would still be observed because it would still
be executing on the same cpu.

My objection to the write memory barrier is not about optimization;
it's about correct code.
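
A userspace sketch of the ordering under discussion, written with C11 atomics instead of the kernel's smp_wmb()/smp_rmb(). The struct, the ring size and the function names are hypothetical, and the write-side lock is not modelled. Because there is no spin_unlock here to act as a barrier, the sketch keeps an explicit trailing release fence; whether the kernel code needs the equivalent is exactly what the thread is debating.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TS_RING_SIZE 2	/* hypothetical ring size for this sketch */

struct vblank_state {
	atomic_uint count;			/* vblank counter, doubles as sequence */
	_Atomic int64_t ts_usec[TS_RING_SIZE];	/* timestamp ring, indexed by count */
};

/* Write side: the caller is assumed to serialize writers (lock not modelled). */
static void store_vblank_sketch(struct vblank_state *v, unsigned int inc,
				int64_t t_usec)
{
	unsigned int cur = atomic_load_explicit(&v->count, memory_order_relaxed);

	assert(inc != 0);	/* an update must advance the counter */

	/* Fill the slot that the new counter value will point at. */
	atomic_store_explicit(&v->ts_usec[(cur + inc) % TS_RING_SIZE], t_usec,
			      memory_order_relaxed);
	/* Order the timestamp store before the counter update. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&v->count, cur + inc, memory_order_relaxed);
	/* Order the counter update before the next timestamp store. */
	atomic_thread_fence(memory_order_release);
}

/* Read side: lockless, retries if the counter moved during the read. */
static unsigned int read_vblank_sketch(struct vblank_state *v, int64_t *t_usec)
{
	unsigned int cur;

	do {
		cur = atomic_load_explicit(&v->count, memory_order_relaxed);
		/* First barrier: counter read before timestamp read. */
		atomic_thread_fence(memory_order_acquire);
		*t_usec = atomic_load_explicit(&v->ts_usec[cur % TS_RING_SIZE],
					       memory_order_relaxed);
		/* Second barrier: timestamp read before counter re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (cur != atomic_load_explicit(&v->count, memory_order_relaxed));

	return cur;
}
```

The reader only trusts a timestamp if the counter is unchanged after reading it, which is why an update that leaves the counter where it was would defeat the scheme.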


Well diff=0 is not allowed, I guess I could enforce this with some
WARN_ON. And I still think my point of non-local correctness is solid.
With the smp_wmb() removed the following still works correctly:

spin_lock

Re: [Intel-gfx] [PATCH] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)

2015-05-03 Thread Mario Kleiner



On 04/15/2015 03:03 AM, Mario Kleiner wrote:

On 04/02/2015 01:34 PM, Chris Wilson wrote:

On vblank instant-off systems, we can get into a situation where the cost
of enabling and disabling the vblank IRQ around a drmWaitVblank query
dominates. However, we know that if the user wants the current vblank
counter, they are also very likely to immediately queue a vblank wait
and so we can keep the interrupt around and only turn it off if we have
no further vblank requests in the interrupt interval.

After vblank event delivery there is a shadow of one vblank where the
interrupt is kept alive for the user to query and queue another vblank
event. Similarly, if the user is using blocking drmWaitVblanks, the
interrupt will be disabled on the IRQ following the wait completion.
However, if the user is simply querying the current vblank counter and
timestamp, the interrupt will be disabled after every IRQ and the user
will enable it again on the first query following the IRQ.

Testcase: igt/kms_vblank
Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Daniel Vetter dan...@ffwll.ch
Cc: Michel Dänzer mic...@daenzer.net
Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: Dave Airlie airl...@redhat.com,
Cc: Mario Kleiner mario.kleiner...@gmail.com
---
  drivers/gpu/drm/drm_irq.c | 15 +--
  1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..6f5dc18779e2 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1091,9 +1091,9 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
     if (atomic_dec_and_test(&vblank->refcount)) {
         if (drm_vblank_offdelay == 0)
             return;
-        else if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
+        else if (drm_vblank_offdelay < 0)
             vblank_disable_fn((unsigned long)vblank);
-        else
+        else if (!dev->vblank_disable_immediate)
             mod_timer(&vblank->disable_timer,
                       jiffies + ((drm_vblank_offdelay * HZ)/1000));
     }
@@ -1697,6 +1697,17 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc)

     spin_lock_irqsave(&dev->event_lock, irqflags);



You could move the code before the spin_lock_irqsave(&dev->event_lock,
irqflags); I think it doesn't need that lock?


+    if (dev->vblank_disable_immediate &&
+        !atomic_read(&vblank->refcount)) {


Also check for (drm_vblank_offdelay > 0) to make sure we have a way out
of instant disable here, and the same meaning of drm_vblank_offdelay
as we have in the current implementation.

This hunk ...


+        unsigned long vbl_lock_irqflags;
+
+        spin_lock_irqsave(&dev->vbl_lock, vbl_lock_irqflags);
+        if (atomic_read(&vblank->refcount) == 0 && vblank->enabled) {
+            DRM_DEBUG("disabling vblank on crtc %d\n", crtc);
+            vblank_disable_and_save(dev, crtc);
+        }
+        spin_unlock_irqrestore(&dev->vbl_lock, vbl_lock_irqflags);


... is the same as a call to vblank_disable_fn((unsigned long) vblank);
Maybe replace by that call?

You could also return here already, as the code below will just take a
lock, realize vblanks are now disabled and then release the locks and exit.


+}
+
  /* Need timestamp lock to prevent concurrent execution with
   * vblank enable/disable, as this would cause inconsistent
   * or corrupted timestamps and vblank counts.



I think the logic itself is fine and at least basic testing of the patch
on an Intel HD Ironlake didn't show problems, so with the above taken
into account it would have my slightly uneasy reviewed-by.

One thing that worries me a little bit about the disable inside the vblank
irq is the potential for races between the disable code and the display
engine, which could cause really bad off-by-one errors for clients on an
imperfect driver. These races can only happen if vblank enable or
disable happens close to or inside the vblank. This approach lets the
instant disable happen exactly inside vblank, when there is the highest
chance of triggering that condition.

This doesn't seem to be a problem for intel kms, but other drivers don't
have instant disable yet, so we don't know how well we could do it
there. Additionally, things like dynamic power management tend to operate
inside vblank, sometimes with funny side effects on other stuff, e.g.,
dpm on AMD, as I remember from a long debug session with Michel and
Alex last summer where dpm played a role. Therefore it seems safer
to me to avoid actions inside vblank that could be done outside. E.g.,
instead of doing the disable inside the vblank irq, one could maybe just
schedule an exact timer to do the disable a few milliseconds later in
the middle of active scanout to avoid these potential issues?
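
The decision logic of the patch, with the suggested drm_vblank_offdelay > 0 guard added on the irq side, can be sketched as two pure functions. The enum and function names are hypothetical and only mirror the branches quoted above; this is not the drm_irq.c code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* What to do once the last vblank reference is dropped. */
enum vblank_put_action {
	VBL_KEEP_ENABLED,		/* offdelay == 0: never disable */
	VBL_DISABLE_NOW,		/* offdelay < 0: disable immediately */
	VBL_DISABLE_AT_NEXT_IRQ,	/* instant-off driver: defer to next vblank irq */
	VBL_ARM_TIMER,			/* otherwise: disable after offdelay milliseconds */
};

static enum vblank_put_action
vblank_put_policy(int offdelay, bool disable_immediate)
{
	if (offdelay == 0)
		return VBL_KEEP_ENABLED;
	if (offdelay < 0)
		return VBL_DISABLE_NOW;
	if (disable_immediate)
		return VBL_DISABLE_AT_NEXT_IRQ;
	return VBL_ARM_TIMER;
}

/* irq-side check, including the suggested offdelay > 0 guard. */
static bool irq_should_disable(int offdelay, bool disable_immediate,
			       int refcount)
{
	assert(refcount >= 0);
	return disable_immediate && offdelay > 0 && refcount == 0;
}
```

With the guard in place, setting the offdelay module parameter to 0 keeps vblank irqs enabled on instant-off drivers too, which preserves its current meaning.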

-mario


After testing this, one more thing that would make sense is to move the
disable block to the end of drm_handle_vblank() instead of the top.


Turns out

Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers

2015-05-03 Thread Mario Kleiner

On 04/16/2015 03:03 PM, Daniel Vetter wrote:

On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:

On 04/15/2015 01:31 PM, Daniel Vetter wrote:

On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:

Hi Daniel,

On 04/15/2015 03:17 AM, Daniel Vetter wrote:

This was a bit too much cargo-culted, so let's make it solid:
- vblank->count doesn't need to be an atomic, writes are always done
   under the protection of dev->vblank_time_lock. Switch to an unsigned
   long instead and update comments. Note that atomic_read is just a
   normal read of a volatile variable, so no need to audit all the
   read-side access specifically.

- The barriers for the vblank counter seqlock weren't complete: The
   read-side was missing the first barrier between the counter read and
   the timestamp read, it only had a barrier between the ts and the
   counter read. We need both.

- Barriers weren't properly documented. Since barriers only work if
   you have them on both sides of the transaction it's prudent to
   reference where the other side is. To avoid duplicating the
   write-side comment 3 times extract a little store_vblank() helper.
   In that helper also assert that we do indeed hold
   dev->vblank_time_lock, since in some cases the lock is acquired a
   few functions up in the callchain.

Spotted while reviewing a patch from Chris Wilson to add a fastpath to
the vblank_wait ioctl.

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Michel Dänzer mic...@daenzer.net
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
  drivers/gpu/drm/drm_irq.c | 92 ---
  include/drm/drmP.h|  8 +++--
  2 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..23bfbc61a494 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 
0600);
  module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 
0600);
  module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);

+static void store_vblank(struct drm_device *dev, int crtc,
+unsigned vblank_count_inc,
+struct timeval *t_vblank)
+{
+   struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
+   u32 tslot;
+
+   assert_spin_locked(&dev->vblank_time_lock);
+
+   if (t_vblank) {
+       tslot = vblank->count + vblank_count_inc;
+       vblanktimestamp(dev, crtc, tslot) = *t_vblank;
+   }
+
+   /*
+    * vblank timestamp updates are protected on the write side with
+    * vblank_time_lock, but on the read side done locklessly using a
+    * sequence-lock on the vblank counter. Ensure correct ordering using
+    * memory barriers. We need the barrier both before and also after the
+    * counter update to synchronize with the next timestamp write.
+    * The read-side barriers for this are in drm_vblank_count_and_time.
+    */
+   smp_wmb();
+   vblank->count += vblank_count_inc;
+   smp_wmb();


The comment and the code are each self-contradictory.

If vblank->count writes are always protected by vblank_time_lock (something I
did not verify but that the comment above asserts), then the trailing write
barrier is not required (and the assertion that it is in the comment is
incorrect).

A spin unlock operation is always a write barrier.


Hm yeah. Otoh to me that's bordering on code too clever for my own good.
That the spinlock is held I can assure. That no one goes around and does
multiple vblank updates (because somehow that code raced with the hw
itself) I can't easily assure with a simple assert or something similar.
It's not the case right now, but that can change.


The algorithm would be broken if multiple updates for the same vblank
count were allowed; that's why it checks to see if the vblank count has
not advanced before storing a new timestamp.

Otherwise, the read side would not be able to determine that the
timestamp is valid by double-checking that the vblank count has not
changed.

And besides, even if the code looped without dropping the spinlock,
the correct write order would still be observed because it would still
be executing on the same cpu.

My objection to the write memory barrier is not about optimization;
it's about correct code.


Well diff=0 is not allowed, I guess I could enforce this with some
WARN_ON. And I still think my point of non-local correctness is solid.
With the smp_wmb() removed the following still works correctly:

spin_lock(vblank_time_lock);
store_vblank(dev, crtc, 1, ts1);
spin_unlock(vblank_time_lock);

spin_lock(vblank_time_lock);
store_vblank(dev, crtc, 1, ts2);
spin_unlock(vblank_time_lock);

But with the smp_wmb(); removed the following would be broken

Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers

2015-04-16 Thread Mario Kleiner

On 04/16/2015 03:29 AM, Peter Hurley wrote:

On 04/15/2015 05:26 PM, Mario Kleiner wrote:

A couple of questions to educate me and one review comment.

On 04/15/2015 07:34 PM, Daniel Vetter wrote:

This was a bit too much cargo-culted, so let's make it solid:
- vblank->count doesn't need to be an atomic, writes are always done
  under the protection of dev->vblank_time_lock. Switch to an unsigned
  long instead and update comments. Note that atomic_read is just a
  normal read of a volatile variable, so no need to audit all the
  read-side access specifically.

- The barriers for the vblank counter seqlock weren't complete: The
  read-side was missing the first barrier between the counter read and
  the timestamp read, it only had a barrier between the ts and the
  counter read. We need both.

- Barriers weren't properly documented. Since barriers only work if
  you have them on both sides of the transaction it's prudent to
  reference where the other side is. To avoid duplicating the
  write-side comment 3 times extract a little store_vblank() helper.
  In that helper also assert that we do indeed hold
  dev->vblank_time_lock, since in some cases the lock is acquired a
  few functions up in the callchain.

Spotted while reviewing a patch from Chris Wilson to add a fastpath to
the vblank_wait ioctl.

v2: Add comment to better explain how store_vblank works, suggested by
Chris.

v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the
implicit barrier in the spin_unlock. But that can only be proven by
auditing all callers and my point in extracting this little helper was
to localize all the locking into just one place. Hence I think that
additional optimization is too risky.

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Michel Dänzer mic...@daenzer.net
Cc: Peter Hurley pe...@hurleysoftware.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
 drivers/gpu/drm/drm_irq.c | 95 +--
 include/drm/drmP.h        |  8 +++-
 2 files changed, 57 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..8694b77d0002 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
 module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
 module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);

+static void store_vblank(struct drm_device *dev, int crtc,
+                         unsigned vblank_count_inc,
+                         struct timeval *t_vblank)
+{
+    struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
+    u32 tslot;
+
+    assert_spin_locked(&dev->vblank_time_lock);
+
+    if (t_vblank) {
+        /* All writers hold the spinlock, but readers are serialized by
+         * the latching of vblank->count below.
+         */
+        tslot = vblank->count + vblank_count_inc;
+        vblanktimestamp(dev, crtc, tslot) = *t_vblank;
+    }
+
+    /*
+     * vblank timestamp updates are protected on the write side with
+     * vblank_time_lock, but on the read side done locklessly using a
+     * sequence-lock on the vblank counter. Ensure correct ordering using
+     * memory barriers. We need the barrier both before and also after the
+     * counter update to synchronize with the next timestamp write.
+     * The read-side barriers for this are in drm_vblank_count_and_time.
+     */
+    smp_wmb();
+    vblank->count += vblank_count_inc;
+    smp_wmb();
+}
+
 /**
  * drm_update_vblank_count - update the master vblank counter
  * @dev: DRM device
@@ -93,7 +123,7 @@ module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
 static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 {
     struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
-    u32 cur_vblank, diff, tslot;
+    u32 cur_vblank, diff;
     bool rc;
     struct timeval t_vblank;

@@ -129,18 +159,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
     if (diff == 0)
         return;

-    /* Reinitialize corresponding vblank timestamp if high-precision query
-     * available. Skip this step if query unsupported or failed. Will
-     * reinitialize delayed at next vblank interrupt in that case.
+    /*
+     * Only reinitialize corresponding vblank timestamp if high-precision query
+     * available and didn't fail. Will reinitialize delayed at next vblank
+     * interrupt in that case.
      */
-    if (rc) {
-        tslot = atomic_read(&vblank->count) + diff;
-        vblanktimestamp(dev, crtc, tslot) = t_vblank;
-    }
-
-    smp_mb__before_atomic();
-    atomic_add(diff, &vblank->count);
-    smp_mb__after_atomic();
+    store_vblank(dev, crtc, diff, rc ? &t_vblank : NULL);
 }

   /*
@@ -218,7 +242,7 @@ static

Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers

2015-04-15 Thread Mario Kleiner

A couple of questions to educate me and one review comment.

On 04/15/2015 07:34 PM, Daniel Vetter wrote:

This was a bit too much cargo-culted, so let's make it solid:
- vblank->count doesn't need to be an atomic, writes are always done
   under the protection of dev->vblank_time_lock. Switch to an unsigned
   long instead and update comments. Note that atomic_read is just a
   normal read of a volatile variable, so no need to audit all the
   read-side access specifically.

- The barriers for the vblank counter seqlock weren't complete: The
   read-side was missing the first barrier between the counter read and
   the timestamp read, it only had a barrier between the ts and the
   counter read. We need both.

- Barriers weren't properly documented. Since barriers only work if
   you have them on both sides of the transaction it's prudent to
   reference where the other side is. To avoid duplicating the
   write-side comment 3 times extract a little store_vblank() helper.
   In that helper also assert that we do indeed hold
   dev->vblank_time_lock, since in some cases the lock is acquired a
   few functions up in the callchain.

Spotted while reviewing a patch from Chris Wilson to add a fastpath to
the vblank_wait ioctl.

v2: Add comment to better explain how store_vblank works, suggested by
Chris.

v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the
implicit barrier in the spin_unlock. But that can only be proven by
auditing all callers and my point in extracting this little helper was
to localize all the locking into just one place. Hence I think that
additional optimization is too risky.

Cc: Chris Wilson ch...@chris-wilson.co.uk
Cc: Mario Kleiner mario.kleiner...@gmail.com
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Michel Dänzer mic...@daenzer.net
Cc: Peter Hurley pe...@hurleysoftware.com
Signed-off-by: Daniel Vetter daniel.vet...@intel.com
---
 drivers/gpu/drm/drm_irq.c | 95 +--
 include/drm/drmP.h        |  8 +++-
 2 files changed, 57 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..8694b77d0002 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
 module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
 module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);

+static void store_vblank(struct drm_device *dev, int crtc,
+                         unsigned vblank_count_inc,
+                         struct timeval *t_vblank)
+{
+    struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
+    u32 tslot;
+
+    assert_spin_locked(&dev->vblank_time_lock);
+
+    if (t_vblank) {
+        /* All writers hold the spinlock, but readers are serialized by
+         * the latching of vblank->count below.
+         */
+        tslot = vblank->count + vblank_count_inc;
+        vblanktimestamp(dev, crtc, tslot) = *t_vblank;
+    }
+
+    /*
+     * vblank timestamp updates are protected on the write side with
+     * vblank_time_lock, but on the read side done locklessly using a
+     * sequence-lock on the vblank counter. Ensure correct ordering using
+     * memory barriers. We need the barrier both before and also after the
+     * counter update to synchronize with the next timestamp write.
+     * The read-side barriers for this are in drm_vblank_count_and_time.
+     */
+    smp_wmb();
+    vblank->count += vblank_count_inc;
+    smp_wmb();
+}
+
 /**
  * drm_update_vblank_count - update the master vblank counter
  * @dev: DRM device
@@ -93,7 +123,7 @@ module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
 static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 {
     struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
-    u32 cur_vblank, diff, tslot;
+    u32 cur_vblank, diff;
     bool rc;
     struct timeval t_vblank;

@@ -129,18 +159,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
     if (diff == 0)
         return;

-    /* Reinitialize corresponding vblank timestamp if high-precision query
-     * available. Skip this step if query unsupported or failed. Will
-     * reinitialize delayed at next vblank interrupt in that case.
+    /*
+     * Only reinitialize corresponding vblank timestamp if high-precision query
+     * available and didn't fail. Will reinitialize delayed at next vblank
+     * interrupt in that case.
      */
-    if (rc) {
-        tslot = atomic_read(&vblank->count) + diff;
-        vblanktimestamp(dev, crtc, tslot) = t_vblank;
-    }
-
-    smp_mb__before_atomic();
-    atomic_add(diff, &vblank->count);
-    smp_mb__after_atomic();
+   store_vblank(dev, crtc, diff, rc

Re: [Intel-gfx] [PATCH] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)

2015-04-14 Thread Mario Kleiner

On 04/02/2015 01:34 PM, Chris Wilson wrote:

On vblank instant-off systems, we can get into a situation where the cost
of enabling and disabling the vblank IRQ around a drmWaitVblank query
dominates. However, we know that if the user wants the current vblank
counter, they are also very likely to immediately queue a vblank wait
and so we can keep the interrupt around and only turn it off if we have
no further vblank requests in the interrupt interval.

After vblank event delivery there is a shadow of one vblank where the
interrupt is kept alive for the user to query and queue another vblank
event. Similarly, if the user is using blocking drmWaitVblanks, the
interrupt will be disabled on the IRQ following the wait completion.
However, if the user is simply querying the current vblank counter and
timestamp, the interrupt will be disabled after every IRQ and the user
will enable it again on the first query following the IRQ.

Testcase: igt/kms_vblank
Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Ville Syrjälä ville.syrj...@linux.intel.com
Cc: Daniel Vetter dan...@ffwll.ch
Cc: Michel Dänzer mic...@daenzer.net
Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: Dave Airlie airl...@redhat.com,
Cc: Mario Kleiner mario.kleiner...@gmail.com
---
  drivers/gpu/drm/drm_irq.c | 15 +--
  1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c8a34476570a..6f5dc18779e2 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1091,9 +1091,9 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
     if (atomic_dec_and_test(&vblank->refcount)) {
         if (drm_vblank_offdelay == 0)
             return;
-        else if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
+        else if (drm_vblank_offdelay < 0)
             vblank_disable_fn((unsigned long)vblank);
-        else
+        else if (!dev->vblank_disable_immediate)
             mod_timer(&vblank->disable_timer,
                       jiffies + ((drm_vblank_offdelay * HZ)/1000));
     }
@@ -1697,6 +1697,17 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc)

     spin_lock_irqsave(&dev->event_lock, irqflags);



You could move the code before the spin_lock_irqsave(&dev->event_lock,
irqflags); i think it doesn't need that lock?



+    if (dev->vblank_disable_immediate && !atomic_read(&vblank->refcount)) {


Also check for (drm_vblank_offdelay > 0) to make sure we have a way out
of instant disable here, and the same meaning of drm_vblank_offdelay
as we have in the current implementation.


This hunk ...


+        unsigned long vbl_lock_irqflags;
+
+        spin_lock_irqsave(&dev->vbl_lock, vbl_lock_irqflags);
+        if (atomic_read(&vblank->refcount) == 0 && vblank->enabled) {
+            DRM_DEBUG("disabling vblank on crtc %d\n", crtc);
+            vblank_disable_and_save(dev, crtc);
+        }
+        spin_unlock_irqrestore(&dev->vbl_lock, vbl_lock_irqflags);


... is the same as a call to vblank_disable_fn((unsigned long) vblank);
Maybe replace by that call?

You could also return here already, as the code below will just take a 
lock, realize vblanks are now disabled and then release the locks and exit.



+   }
+
/* Need timestamp lock to prevent concurrent execution with
 * vblank enable/disable, as this would cause inconsistent
 * or corrupted timestamps and vblank counts.



I think the logic itself is fine and at least basic testing of the patch 
on an Intel HD Ironlake didn't show problems, so with the above taken 
into account it would have my slightly uneasy reviewed-by.


One thing that worries me a little bit about the disable inside vblank 
irq are the potential races between the disable code and the display 
engine which could cause really bad off-by-one errors for clients on an 
imperfect driver. These races can only happen if vblank enable or 
disable happens close to or inside the vblank. This approach lets the 
instant disable happen exactly inside vblank when there is the highest 
chance of triggering that condition.


This doesn't seem to be a problem for intel kms, but other drivers don't 
have instant disable yet, so we don't know how well we could do it 
there. Additionally things like dynamic power management tend to operate 
inside vblank, sometimes with funny side effects to other stuff, e.g., 
dpm on AMD, as i remember from some long debug session with Michel and 
Alex last summer where dpm played a role. Therefore it seems safer 
to me to avoid actions inside vblank that could be done outside. E.g., 
instead of doing the disable inside the vblank irq one could maybe just 
schedule an exact timer to do the disable a few milliseconds later in 
the middle of active scanout to avoid these potential issues?


-mario

Re: [Intel-gfx] [PATCH] drm: Return current vblank value for drmWaitVBlank queries

2015-03-19 Thread Mario Kleiner

On 03/19/2015 04:04 PM, Ville Syrjälä wrote:

On Thu, Mar 19, 2015 at 03:33:11PM +0100, Daniel Vetter wrote:

On Wed, Mar 18, 2015 at 03:52:56PM +0100, Mario Kleiner wrote:

On 03/18/2015 10:30 AM, Chris Wilson wrote:

On Wed, Mar 18, 2015 at 11:53:16AM +0900, Michel Dänzer wrote:

drm_vblank_count_and_time() doesn't return the correct sequence number
while the vblank interrupt is disabled, does it? It returns the sequence
number from the last time vblank_disable_and_save() was called (when the
vblank interrupt was disabled). That's why drm_vblank_get() is needed here.


Ville enlightened me as well. I thought the value was cooked so that
time did not pass whilst the IRQ was disabled. Hopefully, I can impress
upon the Intel folks, at least, that enabling/disabling the interrupts
just to read the current hw counter is interesting to say the least and
sits at the top of the profiles when benchmarking Present.
-Chris



drm_wait_vblank() not only gets the counter but also the corresponding
vblank timestamp. Counters are recalculated in vblank_disable_and_save() for
irq off, then in the vblank irq on path, and every refresh in
drm_handle_vblank at vblank irq time.

The timestamps can be recalculated at any time iff the driver supports high
precision timestamping, which currently intel kms, radeon kms, and nouveau
kms do. But for other parts, like most SoC's, afaik you only get a valid
timestamp by sampling system time in the vblank irq handler, so there you'd
have a problem.

There are also some races around the enable/disable path which require a lot
of care and exact knowledge of when each hardware fires its vblanks, updates
its hardware counters etc. to get rid of them. Ville did that - successfully
as far as my tests go - for Intel kms, but other drivers would be less
forgiving.

Our current method is to:

a) Only disable vblank irqs after a default idle period of 5 seconds, so we
don't get races frequent/likely enough to cause problems for clients. And we
save the overhead for all the vblank irq on/off.

b) On drivers which have high precision timestamping and have been carefully
checked to be race free (== intel kms only atm.) we have instant disable, so
things like blinking cursors don't keep vblank irq on forever.

If b) causes so much overhead, maybe we could change the instant disable
into a disable after a very short time, e.g., lowering the timeout from
5000 msecs to 2-3 video refresh durations ~ 50 msecs? That would still
disable vblank irqs for power saving if the desktop is really idle, but
avoid on/off storms for the various drm_wait_vblank's that happen when
preparing a swap.


Yeah I think we could add code which only gets run for drivers which
support instant disable (i915 doesn't do that on gen2 because the hw is
lacking). There we should be able to update the vblank counter/timestamp
correctly without enabling interrupts temporarily. Ofc we need to make
sure we have enough nasty igt testcase to ensure there's not going to be
jumps and missed frame numbers in that case.


I'd rather go for the very simple fast disable with short timeout 
method. That would only be a tiny almost one-liner patch that reuses the 
existing timer for the default slow case, and we'd know already that it 
will work reliably on instant off capable drivers - no extra tests 
required. Those drm_vblank_get/put calls usually come in short bursts 
which should be covered by a timeout of maybe 1 to max. 3 refresh durations.


When we query the hw timestamps, we always have a little bit of 
unavoidable noise, even if it's often only +/- 1 usec on modern hw, so 
clients querying the timestamp for the same vblank would get slightly 
different results on repeated queries. On hw which only allows scanline 
granularity for queries, we can get variability up to 1 scanline 
duration. If the caller does things like delta calculations on those 
results (dT = currentts - lastts) it can get confusing results like time 
going backwards by a few microseconds. That's why the current code 
caches the last vblank ts, to save overhead and to make sure that 
repeated queries of the same vblank give identical results.
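
The caching argument above can be sketched in a few lines of C. This is only an illustration, not the DRM implementation: ts_cache, cached_vblank_ts and noisy_query are hypothetical names, and noisy_query merely simulates a hardware query that jitters by about 1 usec per call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache of the last computed vblank timestamp. */
struct ts_cache {
    uint32_t seq;  /* vblank count the cached timestamp belongs to */
    int64_t ts_us; /* cached timestamp in microseconds */
    bool valid;
};

typedef int64_t (*hw_query_fn)(void *ctx);

/* Simulated hardware query: jitters by -1, 0 or +1 usec on each call.
 * ctx points to an int that counts how often the "hardware" was asked. */
static int64_t noisy_query(void *ctx)
{
    int *calls = ctx;
    int64_t jitter = (*calls % 3) - 1;

    (*calls)++;
    return 1000000 + jitter;
}

/* Query the hardware only once per vblank count; repeated queries for the
 * same count return the identical cached value. */
static int64_t cached_vblank_ts(struct ts_cache *c, uint32_t seq,
                                hw_query_fn query, void *ctx)
{
    if (!c->valid || c->seq != seq) {
        c->ts_us = query(ctx);
        c->seq = seq;
        c->valid = true;
    }
    return c->ts_us;
}
```

With the cache in place, a client computing dT = currentts - lastts for the same vblank always gets 0 rather than a small positive or negative value.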




Is enabling the interrupts the expensive part, or is it the actual
double timestamp read + scanout pos read? Or is it due to the several
spinlocks we have in this code?



The timestamp/scanout pos read itself is not that expensive iirc, 
usually 1-3 usecs depending on hw, from some testing i did a year ago. 
The machinery for irq on/off + all the reinitializing of vblank counts 
and matching timestamps etc. is probably not that cheap.



Also why is userspace reading the vblank counter in the first place? Due
to the crazy OML_whatever stuff perhaps? In the simple swap interval case
you shouldn't really need to read it. And if we actually made the page
flip/atomic ioctl take a target vblank count and let the kernel deal
with it we wouldn't need to call the vblank ioctl at all.


I object to the crazy, extensions have

Re: [Intel-gfx] [PATCH] drm: Return current vblank value for drmWaitVBlank queries

2015-03-18 Thread Mario Kleiner

On 03/18/2015 10:30 AM, Chris Wilson wrote:

On Wed, Mar 18, 2015 at 11:53:16AM +0900, Michel Dänzer wrote:

drm_vblank_count_and_time() doesn't return the correct sequence number
while the vblank interrupt is disabled, does it? It returns the sequence
number from the last time vblank_disable_and_save() was called (when the
vblank interrupt was disabled). That's why drm_vblank_get() is needed here.


Ville enlightened me as well. I thought the value was cooked so that
time did not pass whilst the IRQ was disabled. Hopefully, I can impress
upon the Intel folks, at least, that enabling/disabling the interrupts
just to read the current hw counter is interesting to say the least and
sits at the top of the profiles when benchmarking Present.
-Chris



drm_wait_vblank() not only gets the counter but also the corresponding 
vblank timestamp. Counters are recalculated in vblank_disable_and_save() 
for irq off, then in the vblank irq on path, and every refresh in 
drm_handle_vblank at vblank irq time.


The timestamps can be recalculated at any time iff the driver supports 
high precision timestamping, which currently intel kms, radeon kms, and 
nouveau kms do. But for other parts, like most SoC's, afaik you only get 
a valid timestamp by sampling system time in the vblank irq handler, so 
there you'd have a problem.


There are also some races around the enable/disable path which require a 
lot of care and exact knowledge of when each hardware fires its vblanks, 
updates its hardware counters etc. to get rid of them. Ville did that - 
successfully as far as my tests go - for Intel kms, but other drivers 
would be less forgiving.


Our current method is to:

a) Only disable vblank irqs after a default idle period of 5 seconds, so 
we don't get races frequent/likely enough to cause problems for clients. 
And we save the overhead for all the vblank irq on/off.


b) On drivers which have high precision timestamping and have been 
carefully checked to be race free (== intel kms only atm.) we have 
instant disable, so things like blinking cursors don't keep vblank irq 
on forever.


If b) causes so much overhead, maybe we could change the instant 
disable into a disable after a very short time, e.g., lowering the 
timeout from 5000 msecs to 2-3 video refresh durations ~ 50 msecs? That 
would still disable vblank irqs for power saving if the desktop is 
really idle, but avoid on/off storms for the various drm_wait_vblank's 
that happen when preparing a swap.
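
As a rough check of the numbers in this proposal, the short timeout can be derived from the refresh rate. The helper below is a hypothetical sketch (vblank_off_timeout_ms is not a DRM function); the 5000 ms fallback mirrors the default idle period mentioned above.

```c
/* Hypothetical helper: disable timeout in milliseconds covering `frames`
 * refresh durations at `refresh_hz`, rounded up to a whole millisecond. */
static unsigned int vblank_off_timeout_ms(unsigned int refresh_hz,
                                          unsigned int frames)
{
    if (refresh_hz == 0)
        return 5000; /* unknown refresh rate: keep the 5 second default */

    return (frames * 1000u + refresh_hz - 1) / refresh_hz;
}
```

At 60 Hz this gives 34 ms for two refresh durations and 50 ms for three, which matches the "~ 50 msecs" estimate.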


-mario
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] sna: Also fix ZaphodHeads on Linux kernels older than 3.19

2014-12-06 Thread Mario Kleiner
Only 3.19 will have O_NONBLOCK for drm_read(), so the
current ddx will still stutter in ZaphodHead mode on
current kernels. Fix the problem by adding a poll() on
the drm fd before potentially blocking on read().

The logic is directly transplanted from the uxa backend
intel_mode_read_drm_events() function.

Fixes fdo bug #84744 on older kernels.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/sna/sna_display.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index 163..a7ad6cc 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -7390,7 +7390,16 @@ fixup_flip:
 void sna_mode_wakeup(struct sna *sna)
 {
     char buffer[1024];
-    int len, i;
+    int len, i, r;
+    struct pollfd p = { .fd = sna->kgem.fd, .events = POLLIN };
+
+    /* DRM read is blocking on old kernels, so poll first to avoid it. */
+    do {
+        r = poll(&p, 1, 0);
+    } while (r == -1 && (errno == EINTR || errno == EAGAIN));
+
+    if (r <= 0)
+        return;
 
/* The DRM read semantics guarantees that we always get only
 * complete events.
-- 
1.9.1
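
The poll-before-read idea in this patch can be tried outside the driver with any file descriptor. The following is a minimal standalone sketch, assuming a POSIX system; read_if_ready is a hypothetical helper name, not part of the ddx.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Read from fd only if data is already pending, so the caller never blocks.
 * Returns 0 when nothing is ready, otherwise whatever read() returns. */
static ssize_t read_if_ready(int fd, void *buf, size_t len)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r;

    /* Zero timeout: just ask whether the fd is readable right now. */
    do {
        r = poll(&p, 1, 0);
    } while (r == -1 && (errno == EINTR || errno == EAGAIN));

    if (r <= 0)
        return 0;

    return read(fd, buf, len);
}
```

On an empty pipe the helper returns 0 immediately instead of blocking; once a byte has been written it returns that byte.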



Re: [Intel-gfx] [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)

2014-12-05 Thread Mario Kleiner

On 12/05/2014 12:56 AM, Eric Anholt wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


Pageflips for Pixmap presents were not synchronized to vblank on
drivers with support for PresentCapabilityAsync, due to some
missing init for vblank->sync_flips. The PresentOptionAsync
flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this
always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking
commit 2051514652481a83bd7cf22e57cb0fcd40333f33
which trivially fixes lack of support for protocol option
PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  present/present.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
     vblank->notifies = notifies;
     vblank->num_notifies = num_notifies;

-    if (!screen_priv->info || !(screen_priv->info->capabilities & PresentCapabilityAsync))
+    if (!(options & PresentOptionAsync))
         vblank->sync_flip = TRUE;

I think I'd like to see a hunk like this in with this patch, so that
each driver doesn't need to have the cap check:

diff --git a/present/present.c b/present/present.c
index a9f2214..ed0d734 100644
--- a/present/present.c
+++ b/present/present.c
@@ -838,6 +838,9 @@ present_pixmap(WindowPtr window,
         vblank->sync_flip = TRUE;

     if (!(options & PresentOptionCopy) &&
+        !((options & PresentOptionAsync) &&
+          (!screen_priv->info ||
+           !(screen_priv->info->capabilities & PresentCapabilityAsync))) &&
         pixmap != NULL &&
         present_check_flip (target_crtc, window, pixmap, vblank->sync_flip, valid, x_off, y_off))
     {

Seem reasonable?  If you wanted to squash this in, then this is:


I'm not sure if drivers will really avoid the cap check, as i assume the 
definition of the check_flip() function requires them to implement it 
anyway? Does some spec somewhere require them to do it? Do driver 
writers check all server implementations to see if they can get away 
with less?


But then having this hunk in doesn't hurt either, and it would keep the 
current intel-ddx uxa backends working, so i'll integrate it - after 
some urgently needed sleep.
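
The combined effect of the patch and the suggested extra hunk can be written out as a small decision function. This is a sketch only: decide_flip and struct flip_decision are hypothetical, and the OPT_* and CAP_* constants are stand-ins for PresentOptionAsync, PresentOptionCopy and PresentCapabilityAsync rather than the real protocol values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for illustration only. */
#define OPT_ASYNC (1u << 0)
#define OPT_COPY  (1u << 1)
#define CAP_ASYNC (1u << 0)

struct flip_decision {
    bool sync_flip; /* flip should be synchronized to vblank */
    bool may_flip;  /* a page flip may be attempted at all */
};

/* have_info models a non-NULL screen_priv->info; caps its capabilities. */
static struct flip_decision decide_flip(uint32_t options, bool have_info,
                                        uint32_t caps)
{
    struct flip_decision d;
    bool want_async = (options & OPT_ASYNC) != 0;
    bool can_async = have_info && (caps & CAP_ASYNC) != 0;

    /* Patch 2/2: vsync unless the client explicitly asked for async. */
    d.sync_flip = !want_async;

    /* Suggested hunk: never flip for copy presents, and refuse async
     * flips when the driver lacks the async capability. */
    d.may_flip = !(options & OPT_COPY) && !(want_async && !can_async);

    return d;
}
```

In the server the flip additionally depends on pixmap != NULL and on the driver's present_check_flip() result, which this sketch leaves out.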


Thanks for the review. These server patches are actually the critical 
ones for me. Without them in XOrg 1.16+, all the mesa fixes would be 
utterly useless for my kind of applications.


-mario


Reviewed-by: Eric Anholt e...@anholt.net

(So's patch 1/2, regardless).




[Intel-gfx] DRI3/Present fixes for XServers 1.16 and 1.17rc - (v2)

2014-12-02 Thread Mario Kleiner
Hi,

an updated set of patches to fix the bugs i found in the
xserver dri3/present implementation and one bug in intel-ddx
uxa/dri3/present implementation. Axel Davys comments made me
rethink my original xserver patch and the new solution is
simple and better and afaics how this was actually intended
to work in the server, the server properly using the present_check_flip
ddx driver function.

Patch 1/2 fixes and slightly improves DebugPresent() macros for
the server to avoid crashes at logout, compositor en/disable or
closing windows while flips are pending when the server is compiled
with debug macros on.

Patch 2/2 fixes the use of PresentOptionAsync for page-flipped present,
and makes Present working on nouveau without horrible tearing.

These patches apply to master, 1.17rc and 1.16.2. They were tested
on top of 1.16.2 with the dri3/present backends of nouveau master
(glamor and exa) and intel master (sna and fixed uxa) on single-display
and dual-display, also ran through my hardware timing test equipment.

Patch uxa/present is a required fix for intel-ddx uxa backend, so
intel_present_check_flip no longer lies to the server about its
capabilities.

Can the x-server patches please also be included into the 1.16 series?

thanks,
-mario



[Intel-gfx] [PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.

2014-12-02 Thread Mario Kleiner
DebugPresent() crashed the server when a dri3 drawable
was closed while a pageflipped present was still pending,
due to vblank->window-> Null-Ptr deref, so debug builds
caused new problems to debug.

E.g.,

glXSwapBuffers(...);
glXDestroyWindow(...);
-> Pageflip for non-existent window completes -> boom.

Also often happens when switching desktop compositor on/off
due to Present unflips, or when logging out of session.

Also add info if a Present is queued for copyswap or pageflip,
if the present is vsynced, and the serial no of the Present
request, to aid debugging of pageflip and vsync issues. The
serial number is useful as Mesa's dri3/present backend encodes
its sendSBC in the serial number, so one can easily correlate
server debug output with Mesa and with the SBC values returned
to actual OpenGL client applications via OML_sync_control and
INTEL_swap_events extension, makes debugging quite a bit more
easy.

Please also cherry-pick this for a 1.16.x stable update.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/present/present.c b/present/present.c
index ac9047e..e5d3fd5 100644
--- a/present/present.c
+++ b/present/present.c
@@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc)
     DebugPresent(("\tn %lld %p %8lld: %08lx -> %08lx\n",
                   vblank->event_id, vblank, vblank->target_msc,
                   vblank->pixmap ? vblank->pixmap->drawable.id : 0,
-                  vblank->window->drawable.id));
+                  vblank->window ? vblank->window->drawable.id : 0));

     assert (vblank == screen_priv->flip_pending);

@@ -859,10 +859,10 @@ present_pixmap(WindowPtr window,
     }

     if (pixmap)
-        DebugPresent(("q %lld %p %8lld: %08lx -> %08lx (crtc %p)\n",
+        DebugPresent(("q %lld %p %8lld: %08lx -> %08lx (crtc %p) flip %d vsync %d serial %d\n",
                       vblank->event_id, vblank, target_msc,
                       vblank->pixmap->drawable.id, vblank->window->drawable.id,
-                      target_crtc));
+                      target_crtc, vblank->flip, vblank->sync_flip, vblank->serial));

     xorg_list_add(&vblank->event_queue, &present_exec_queue);
     vblank->queued = TRUE;
@@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank)
     DebugPresent(("\td %lld %p %8lld: %08lx -> %08lx\n",
                   vblank->event_id, vblank, vblank->target_msc,
                   vblank->pixmap ? vblank->pixmap->drawable.id : 0,
-                  vblank->window->drawable.id));
+                  vblank->window ? vblank->window->drawable.id : 0));

     /* Drop pixmap reference */
     if (vblank->pixmap)
-- 
1.9.1



[Intel-gfx] [PATCH] uxa/present: Handle sync_flip flag in intel_present_check_flip()

2014-12-02 Thread Mario Kleiner
Make sure we reject async flips if we don't support
async flips.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/uxa/intel_present.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/uxa/intel_present.c b/src/uxa/intel_present.c
index d20043f..d2aa9ee 100644
--- a/src/uxa/intel_present.c
+++ b/src/uxa/intel_present.c
@@ -58,6 +58,8 @@ struct intel_present_vblank_event {
     uint64_t event_id;
 };

+static Bool intel_present_has_async_flip(ScreenPtr screen);
+
 static uint32_t pipe_select(int pipe)
 {
     if (pipe > 1)
@@ -266,6 +268,9 @@ intel_present_check_flip(RRCrtcPtr  crtc,
     if (!bo)
         return FALSE;

+    if (!sync_flip && !intel_present_has_async_flip(screen))
+        return FALSE;
+
     return TRUE;
 }
 
-- 
1.9.1



[Intel-gfx] [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)

2014-12-02 Thread Mario Kleiner
Pageflips for Pixmap presents were not synchronized to vblank on
drivers with support for PresentCapabilityAsync, due to some
missing init for vblank->sync_flip. The PresentOptionAsync
flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this
always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking
commit 2051514652481a83bd7cf22e57cb0fcd40333f33
which trivially fixes lack of support for protocol option
PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
 vblank->notifies = notifies;
 vblank->num_notifies = num_notifies;
 
-if (!screen_priv->info || !(screen_priv->info->capabilities & PresentCapabilityAsync))
+if (!(options & PresentOptionAsync))
 vblank->sync_flip = TRUE;
 
 if (!(options & PresentOptionCopy) &&
-- 
1.9.1
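
For readers following along, the interplay between this patch and the uxa/present check_flip patch can be sketched roughly as below. This is only a minimal model, not the server or driver code: OPT_ASYNC and OPT_COPY are hypothetical stand-ins for the Present option bits, and want_sync_flip() and driver_check_flip() are illustrative names that exist in neither tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Present protocol option bits; the real
 * definitions live in the Present protocol headers. */
#define OPT_ASYNC (1u << 0)
#define OPT_COPY  (1u << 1)

/* Server side (present_pixmap): a flip is vsynced unless the client
 * explicitly asked for an async present. */
static bool want_sync_flip(uint32_t options)
{
    return !(options & OPT_ASYNC);
}

/* Driver side (check_flip hook): refuse the flip when there is no buffer
 * object, or when an async flip is requested but not supported. */
static bool driver_check_flip(bool have_bo, bool sync_flip, bool has_async_flip)
{
    if (!have_bo)
        return false;
    if (!sync_flip && !has_async_flip)
        return false;
    return true;
}
```

With this split the server decides per request whether a flip must be vsynced, and the driver only vetoes what it cannot do, independent of which capabilities it advertises.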



Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change

2014-09-23 Thread Mario Kleiner

On 23/09/14 15:51, Daniel Vetter wrote:

On Tue, Sep 23, 2014 at 03:48:25PM +0300, Jani Nikula wrote:

On Mon, 15 Sep 2014, Daniel Vetter dan...@ffwll.ch wrote:

On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:

The current drm-next misses Ville's original Patch 14/19, the one i first
objected to, then objected to my own objection. It is needed to avoid actual
regressions. Attached is a trivially rebased (v2) of Ville's patch to go on top
of drm-next, also as tgz in case my e-mail client mangles the patch again,
because it's one of those "email hates me" weeks.


Oh dear, I've made a decent mess of all of this really. Picked up to make
sure it doesn't get lost again.


After all this nice ping pong our QA has reported a bisected regression
on this commit: https://bugs.freedesktop.org/show_bug.cgi?id=84161


Looks like a minuscule timing change which resulted in us detecting a fifo
underrun. Or at least I don't see any other related information that would
indicate otherwise ...
-Daniel



There's nothing in that code path which could cause this - except for 
altered execution timing. I've seen that warning as well on my Intel HD 
Ironlake Mobile (MBP 2010), but only spuriously when plugging/unplugging 
an external display into the laptop iirc, so i thought it would be 
unrelated.


-mario



Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change

2014-09-13 Thread Mario Kleiner
The current drm-next misses Ville's original Patch 14/19, the one i 
first objected to, then objected to my own objection. It is needed to avoid 
actual regressions. Attached is a trivially rebased (v2) of Ville's patch 
to go on top of drm-next, also as tgz in case my e-mail client mangles 
the patch again, because it's one of those "email hates me" weeks.


-mario



On 08/06/2014 01:49 PM, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

If we already have a timestamp for the current vblank counter, don't
update it with a new timestamp. Small errors can creep in between two
timestamp queries for the same vblank count, which could be confusing to
userspace when it queries the timestamp for the same vblank sequence
number twice.

This problem gets exposed when the vblank disable timer is not used
(or is set to expire quickly) and thus we can get multiple vblank
disable->enable transitions during the same frame which would all
attempt to update the timestamp with the latest estimate.

Testcase: igt/kms_flip/flip-vs-expired-vblank
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
  drivers/gpu/drm/drm_irq.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index af33df1..0523f5b 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 	DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n",
 		  crtc, diff);
 
+	if (diff == 0)
+		return;
+
 	/* Reinitialize corresponding vblank timestamp if high-precision query
 	 * available. Skip this step if query unsupported or failed. Will
 	 * reinitialize delayed at next vblank interrupt in that case.


From c0a5228a7fc43d4c3615a471c340b68bcb2caa16 Mon Sep 17 00:00:00 2001
From: Ville Syrjälä ville.syrj...@linux.intel.com
Date: Wed, 6 Aug 2014 14:49:57 +0300
Subject: [PATCH] drm: Don't update vblank timestamp when the counter didn't
 change (v2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If we already have a timestamp for the current vblank counter, don't
update it with a new timestamp. Small errors can creep in between two
timestamp queries for the same vblank count, which could be confusing to
userspace when it queries the timestamp for the same vblank sequence
number twice.

This problem gets exposed when the vblank disable timer is not used
(or is set to expire quickly) and thus we can get multiple vblank
disable->enable transitions during the same frame which would all
attempt to update the timestamp with the latest estimate.

Testcase: igt/kms_flip/flip-vs-expired-vblank
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com

v2:Mario: Trivial rebase on top of current drm-next (13-Sep-2014)
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 drivers/gpu/drm/drm_irq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 80ff94a..e73cbda 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -126,6 +126,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 	DRM_DEBUG("updating vblank count on crtc %d, missed %d\n",
 		  crtc, diff);
 
+	if (diff == 0)
+		return;
+
 	/* Reinitialize corresponding vblank timestamp if high-precision query
 	 * available. Skip this step if query unsupported or failed. Will
 	 * reinitialize delayed at next vblank interrupt in that case.
-- 
1.9.1
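
The effect of the early return can be illustrated with a small standalone model. This is a sketch under assumptions, not the kernel code: struct vbl_state and vbl_update() are hypothetical names, and the hardware counter is assumed to be a full 32 bits wide so that plain unsigned subtraction handles wraparound.

```c
#include <stdint.h>

/* Hypothetical, simplified per-CRTC state; the real struct has more fields. */
struct vbl_state {
    uint32_t last_hw_count;   /* hardware counter at the last update */
    uint32_t count;           /* software vblank count seen by userspace */
    int64_t  timestamp_ns;    /* timestamp belonging to 'count' */
};

/* Returns the number of vblanks accounted for by this update. */
static uint32_t vbl_update(struct vbl_state *s, uint32_t hw_count,
                           int64_t now_ts_ns)
{
    /* Unsigned subtraction wraps correctly for a 32-bit counter. */
    uint32_t diff = hw_count - s->last_hw_count;

    if (diff == 0)
        return 0;             /* same vblank: keep the stored timestamp */

    s->last_hw_count = hw_count;
    s->count += diff;
    s->timestamp_ns = now_ts_ns;
    return diff;
}
```

Calling vbl_update() twice within the same frame leaves the count and timestamp untouched on the second call, so userspace sees one stable timestamp per vblank sequence number.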



0001-drm-Don-t-update-vblank-timestamp-when-the-counter-d.patch.tar.gz
Description: application/gzip


Re: [Intel-gfx] [PULL] topic/vblank-rework

2014-09-10 Thread Mario Kleiner
Hmm, not quite an ack from my side for the pull in its current form. I
said if the two remaining issues i mentioned are addressed, then i'm
happy with it and can have my reviewed/acked-by. Looking at the code
they haven't been addressed.

However, this is easily fixable on top of the current patches:

1. A vblank_disable_timeout module parameter of zero should always
leave vblank irqs enabled and also override the driver's choice,
otherwise a user can't override the driver on a broken driver/gpu
combo, which is the only use case for having that module parameter.
Currently the disable_immediately flag overrides the user's override -
Ouch.

So in drm_vblank_put():

...

/* Last user schedules interrupt disable */
if (atomic_dec_and_test(&vblank->refcount)) {
   /* Insert zero -> opt-out check */
   if (drm_vblank_offdelay == 0)
      return;
   /* Remaining code continues */
   if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
      vblank_disable_fn((unsigned long)vblank);
   else if (drm_vblank_offdelay > 0)
      mod_timer(&vblank->disable_timer, jiffies +
                ((drm_vblank_offdelay * HZ)/1000));

...

2. For the "drm: Have the vblank counter account for the time ..."
patch, we must opt-out of that last timestamp/counter update/bump if
the driver doesn't support high-precision vblank timestamping,
otherwise the vblank count and timestamp will be inconsistent with
each other - or outright wrong in case of the timestamp. Rather
deliver a slightly outdated, but correct count+timestamp pair to
userspace, which is still usable for practical purposes, than a pair
that's outright wrong and will definitely confuse clients.

A simple fix in static void vblank_disable_and_save() would be to
replace the new...

if (!vblank->enabled) {

... check by ...

if (!vblank->enabled &&
) {
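
The policy proposed in point 1 can be written down as a small decision function. This is only a sketch of the intended behaviour, not the drm_vblank_put() code: the enum, vbl_put_policy() and vbl_delay_jiffies() are hypothetical names, and the hz argument stands in for the kernel's HZ.

```c
#include <stdbool.h>

enum vbl_put_action {
    VBL_KEEP_ENABLED,     /* offdelay == 0: user opted out of disabling */
    VBL_DISABLE_NOW,      /* driver flag set, or offdelay < 0 */
    VBL_DISABLE_DELAYED   /* offdelay > 0: arm the disable timer */
};

/* What the last drm_vblank_put() should do, given the module parameter
 * (in milliseconds) and the driver's disable-immediately flag. */
static enum vbl_put_action vbl_put_policy(int offdelay_ms, bool disable_immediate)
{
    if (offdelay_ms == 0)
        return VBL_KEEP_ENABLED;   /* the user's choice wins over the driver */
    if (disable_immediate || offdelay_ms < 0)
        return VBL_DISABLE_NOW;
    return VBL_DISABLE_DELAYED;
}

/* Timer delay for the delayed case, mirroring (offdelay * HZ) / 1000. */
static long vbl_delay_jiffies(int offdelay_ms, int hz)
{
    return ((long)offdelay_ms * hz) / 1000;
}
```

The important property is that the zero check comes first, so a user on a broken driver/gpu combo can always force vblank irqs to stay on.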


On Wed, Sep 10, 2014 at 2:05 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
 Hi Dave,

 So here's the final bits of Ville's vblank rework with a bit of cleanup
 from Mario on top.

 The neat thing this finally allows is to immediately disable the vblank
 interrupt on the last drm_vblank_put if the hardware has perfectly
 accurate vblank counter and timestamp readout support. On i915 that
 required piles of small adjustments from Ville since depending upon the
 platform and port the vblank happens at different scanout lines.

 Of course this is fully opt-in and per-device (we need that since gen2
 doesn't have a hw vblank counter).

 Mario reviewed the entire pile too and after some initial hesitation
 (about drivers without accurate timestamp support) acked it.

 Cheers, Daniel


 The following changes since commit 21d70354bba9965a098382fc4d7fb17e138111f3:

   drm: move drm_stub.c to drm_drv.c (2014-08-06 19:10:44 +1000)

 are available in the git repository at:

   git://anongit.freedesktop.org/drm-intel tags/topic/vblank-rework-2014-09-10

 for you to fetch changes up to 2368ffb18b1d2b04eb80478d225676caa7a3c4c8:

   drm: Use vblank_disable_and_save in drm_vblank_cleanup() (2014-09-10 09:41:29 +0200)

 
 Mario Kleiner (2):
   drm: Remove drm_vblank_cleanup from drm_vblank_init error path.
   drm: Use vblank_disable_and_save in drm_vblank_cleanup()

 Ville Syrjälä (16):
   drm: Always reject drm_vblank_get() after drm_vblank_off()
   drm/i915: Warn if drm_vblank_get() still works after drm_vblank_off()
   drm: Don't clear vblank timestamps when vblank interrupt is disabled
   drm: Move drm_update_vblank_count()
   drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()
   drm: Avoid random vblank counter jumps if the hardware counter has been reset
   drm: Reduce the amount of dev->vblank[crtc] in the code
   drm: Fix deadlock between event_lock and vbl_lock/vblank_time_lock
   drm: Fix race between drm_vblank_off() and drm_queue_vblank_event()
   drm: Disable vblank interrupt immediately when drm_vblank_offdelay<0
   drm: Add dev->vblank_disable_immediate flag
   drm/i915: Opt out of vblank disable timer on gen2
   drm: Kick start vblank interrupts at drm_vblank_on()
   drm/i915: Update scanline_offset only for active crtcs
   drm: Fix confusing debug message in drm_update_vblank_count()
   drm: Store the vblank timestamp when adjusting the counter during disable

  Documentation/DocBook/drm.tmpl       |   7 +
  drivers/gpu/drm/drm_drv.c            |   4 +-
  drivers/gpu/drm/drm_irq.c            | 345 ++-
  drivers/gpu/drm/i915/i915_irq.c      |   8 +
  drivers/gpu/drm/i915/intel_display.c |  17 +-
  include/drm/drmP.h                   |  12 +-
  6 files changed, 256 insertions(+), 137 deletions(-)

 --
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org

Re: [Intel-gfx] [PULL] topic/vblank-rework

2014-09-10 Thread Mario Kleiner
e-mail snafu, sent it too early by accident, and from a gmail web
interface which i'm apparently incapable of using properly...

The second fix should look like this:

 A simple fix in static void vblank_disable_and_save() would be to
 replace the new...

 if (!vblank->enabled) {

 ... check by ...

if (!vblank->enabled &&
   drm_get_last_vbltimestamp(dev, crtc, &tvblank, 0)) {

... We need to make sure timestamp queries work and are actually
locked to the vblank, otherwise we can't do that last update there in
vblank_disable_and_save().


With these two fixes or similar applied i'd be happy, otherwise it
will inflict pain and real bugs on real users.

thanks,
-mario
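
To make the second fix concrete, here is a rough standalone model of the gating. It is a sketch under assumptions, not the vblank_disable_and_save() code: struct vbl_pair, struct ts_query and vbl_final_update() are hypothetical, and the query result is passed in as plain data instead of calling into a driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* The count+timestamp pair that userspace gets to see. */
struct vbl_pair {
    uint32_t count;
    int64_t  timestamp_ns;
};

/* Hypothetical result of a high-precision timestamp query. */
struct ts_query {
    bool     ok;            /* query supported and locked to the vblank */
    uint32_t missed;        /* vblanks elapsed since the irq was disabled */
    int64_t  timestamp_ns;  /* timestamp of the most recent vblank */
};

/* Models only the "irq already disabled" branch: bump the pair one last
 * time, but only if the timestamp query can be trusted. Returns true if
 * the pair was updated. */
static bool vbl_final_update(struct vbl_pair *p, bool irq_enabled,
                             struct ts_query q)
{
    if (irq_enabled || !q.ok)
        return false;       /* keep the older but self-consistent pair */

    p->count += q.missed;
    p->timestamp_ns = q.timestamp_ns;
    return true;
}
```

Without a reliable query the pair stays slightly outdated but consistent, which is the behaviour argued for above.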




Re: [Intel-gfx] [PULL] topic/vblank-rework

2014-09-10 Thread Mario Kleiner
On Wed, Sep 10, 2014 at 5:29 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
 On Wed, Sep 10, 2014 at 4:19 PM, Mario Kleiner
 mario.kleiner...@gmail.com wrote:
 Hmm, not quite an ack from my side for the pull in its current form. I
 said if the two remaining issues i mentioned are addressed, then i'm
 happy with it and can have my reviewed/acked-by. Looking at the code
 they haven't been addressed.

 Sorry about the confusion, I've somehow thought that you've retracted
 those comments in Message-ID:
 caesyxygk4foqhky1wcerak_hybex2ogpftjyhu_zfhlbx46...@mail.gmail.com

 But I've missed that that was about just one of the issues.


Thought so. That one patch turns out to be crucial. My own software
immediately complained loudly about broken vblank irqs and switched to
lower performance fallbacks when that patch was missing.

I'll test the patches on a few more cards in the next days - but so
far things look good at least as far as my special test cases go.

 However, this is easily fixable on top of the current patches:

 1. A vblank_disable_timeout module parameter of zero should always
 leave vblank irqs enabled and also override the driver's choice,
 otherwise a user can't override the driver on a broken driver/gpu
 combo, which is the only use case for having that module parameter.
 Currently the disable_immediately flag overrides the user's override -
 Ouch.

 So in drm_vblank_put():

 ...

 /* Last user schedules interrupt disable */
 if (atomic_dec_and_test(&vblank->refcount)) {
    /* Insert zero -> opt-out check */
    if (drm_vblank_offdelay == 0)
       return;
    /* Remaining code continues */
    if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
       vblank_disable_fn((unsigned long)vblank);
    else if (drm_vblank_offdelay > 0)
       mod_timer(&vblank->disable_timer, jiffies +
                 ((drm_vblank_offdelay * HZ)/1000));

 Yeah, I guess that makes sense. I'm not really a fan of giving users
 too powerful module options to hack around driver bugs since often
 that means they'll never report the bug :( But we have the support now
 to mark certain module options as debug-only and they'll taint the
 kernel if set, so this is fixable.

 I'll follow up with the patch you've suggested.


Thanks. I think the module parameters i usually care about will get
proper testing and reporting, because while my software and users are
good at detecting such problems, they wouldn't know how to fix them
themselves, and at the same time they crucially depend on this stuff
working, so this gets reported to me quickly and i can give them the
module param workaround in private e-mail and take it from there with
proper bug reports or patches.

 ...

 2. For the "drm: Have the vblank counter account for the time ..."
 patch, we must opt-out of that last timestamp/counter update/bump if
 the driver doesn't support high-precision vblank timestamping,
 otherwise the vblank count and timestamp will be inconsistent with
 each other - or outright wrong in case of the timestamp. Rather
 deliver a slightly outdated, but correct count+timestamp pair to
 userspace, which is still usable for practical purposes, than a pair
 that's outright wrong and will definitely confuse clients.

 A simple fix in static void vblank_disable_and_save() would be to
 replace the new...

 if (!vblank->enabled) {

 ... check by ...

 if (!vblank->enabled &&
 ) {

 Yeah, makes sense (well the follow-up one ofc). I'll do a patch which
 adds this and adds a comment. Aside I think it would be useful to add
 a #define for the 0 return value, since the magic checks all over are
 imo fairly hard to understand.

 I'll also float a patch for rfc about that.


Good!

thanks,
-mario

 Thanks for your comments and again my apologies for missing that
 there's still outstanding work left to do on this.

 Cheers, Daniel




Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change

2014-09-04 Thread Mario Kleiner
I thought about this one again and, contrary to my previous comment, now think
it's fine, also for drivers without hw vblank counter queries.

-mario



On Wed, Aug 6, 2014 at 1:49 PM, ville.syrj...@linux.intel.com wrote:

 From: Ville Syrjälä ville.syrj...@linux.intel.com

 If we already have a timestamp for the current vblank counter, don't
 update it with a new timestamp. Small errors can creep in between two
 timestamp queries for the same vblank count, which could be confusing to
 userspace when it queries the timestamp for the same vblank sequence
 number twice.

 This problem gets exposed when the vblank disable timer is not used
 (or is set to expire quickly) and thus we can get multiple vblank
 disable->enable transitions during the same frame which would all
 attempt to update the timestamp with the latest estimate.

 Testcase: igt/kms_flip/flip-vs-expired-vblank
 Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
 ---
  drivers/gpu/drm/drm_irq.c | 3 +++
  1 file changed, 3 insertions(+)

 diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
 index af33df1..0523f5b 100644
 --- a/drivers/gpu/drm/drm_irq.c
 +++ b/drivers/gpu/drm/drm_irq.c
 @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
 DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n",
   crtc, diff);

 +   if (diff == 0)
 +   return;
 +
 /* Reinitialize corresponding vblank timestamp if high-precision query
  * available. Skip this step if query unsupported or failed. Will
  * reinitialize delayed at next vblank interrupt in that case.
 --
 1.8.5.5




Re: [Intel-gfx] [PATCH 05/19] drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()

2014-09-02 Thread Mario Kleiner
-by: Mario Kleiner mario.kleiner...@gmail.com

for the whole series, if you want.

thanks,
-mario

On 08/06/2014 01:49 PM, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

If the vblank irq has already been disabled (via the disable timer) when
we call drm_vblank_off() sample the counter and timestamp one last time.
This will make sure that the user space visible counter will account
for the time between vblank irq disable and drm_vblank_off().

Reviewed-by: Matt Roper matthew.d.ro...@intel.com
Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
  drivers/gpu/drm/drm_irq.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index af96517..1f86f6c 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc)
 	 */
 	spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
 
+	/*
+	 * If the vblank interrupt was already disabled update the count
+	 * and timestamp to maintain the appearance that the counter
+	 * has been ticking all along until this time. This makes the
+	 * count account for the entire time between drm_vblank_on() and
+	 * drm_vblank_off().
+	 */
+	if (!dev->vblank[crtc].enabled) {
+		drm_update_vblank_count(dev, crtc);
+		spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
+		return;
+	}
+
 	dev->driver->disable_vblank(dev, crtc);
 	dev->vblank[crtc].enabled = false;
  




Re: [Intel-gfx] [PATCH 00/14] drm: Some more vblank timestamping changes

2014-01-12 Thread Mario Kleiner

On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:
 So I took another look at the vblank timestamping code, and got a bit
 excited. The result is this patchset.

 Summary of changes:
 - kill crtc->hwmode dependency
 - eliminate a bunch of 64bit math
 - fix timestamps for stereo and interlaced modes (on i915 at least)
 - move the early vbl irq hack into radeon code
 - add a similar hack to i915, but make it as finely targeted
   as possible to minimize the chance of accidentally
   applying it in the wrong place

 The s/clock/crtc_clock change could use some radeon people to verify
 whether changing radeon_atom_get_tv_timings() is enough to make
 crtc_clock always populated.

 This series applies on top of Mario's
 Vblank timestamping improvements/fixes for Linux drm. series.

 Ville Syrjälä (14):
drm: Pass the display mode to drm_calc_timestamping_constants()
drm: Pass the display mode to drm_calc_vbltimestamp_from_scanoutpos()
drm/i915: Kill hwmode save/restore
drm/i915: Call drm_calc_timestamping_constants() earlier
drm: Improve drm_calc_timestamping_constants() documentation
drm: Simplify the math in drm_calc_timestamping_constants()
drm/radeon: Populate crtc_clock in radeon_atom_get_tv_timings()
drm: Use crtc_clock in drm_calc_timestamping_constants()
drm: Change {pixel,line,frame}dur_ns from s64 to int
drm/i915: Fix scanoutpos calculations for interlaced modes
drm: Fix vblank timestamping constants for interlaced modes
drm: Pass 'flags' from the caller to .get_scanout_position()
drm/radeon: Move the early vblank IRQ fixup to radeon_get_crtc_scanoutpos()
drm/i915: Add a kludge for DSL incrementing too late and ISR 
not working
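
As background for the timestamping constants this series touches, the arithmetic can be sketched roughly as follows. This is a simplified model under assumptions, not the drm_irq.c implementation: struct ts_consts, calc_ts_consts() and vblank_ts_ns() are hypothetical names, the pixel clock is assumed to be given in kHz, and the interlaced case is assumed to simply halve the frame duration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Durations in nanoseconds, small enough to fit in an int. */
struct ts_consts {
    int pixeldur_ns;
    int linedur_ns;
    int framedur_ns;
};

/* Derive the durations from the mode timings; clock is in kHz. */
static struct ts_consts calc_ts_consts(int clock_khz, int htotal, int vtotal,
                                       bool interlaced)
{
    struct ts_consts c = { 0, 0, 0 };
    uint64_t frame_size;

    if (clock_khz <= 0)
        return c;               /* no valid dotclock: leave everything at 0 */

    frame_size = (uint64_t)htotal * vtotal;
    if (interlaced)
        frame_size /= 2;        /* assumption: one field per vblank */

    c.pixeldur_ns = 1000000 / clock_khz;
    c.linedur_ns  = (int)(((uint64_t)htotal * 1000000) / clock_khz);
    c.framedur_ns = (int)((frame_size * 1000000) / clock_khz);
    return c;
}

/* Convert a scanout position sampled at query_ns into a vblank timestamp.
 * vpos counts scanlines since the start of active scanout; a negative vpos
 * means the query happened while still inside the vblank. */
static int64_t vblank_ts_ns(struct ts_consts c, int vpos, int hpos,
                            int64_t query_ns)
{
    int64_t delta_ns = (int64_t)vpos * c.linedur_ns +
                       (int64_t)hpos * c.pixeldur_ns;
    return query_ns - delta_ns;
}
```

For example, with a 100000 kHz clock, htotal 1000 and vtotal 500, a line lasts 10000 ns and a frame 5000000 ns, so a query at scanline 10, pixel 100 is corrected backwards by 101000 ns.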



Hi Ville,

sorry this took way longer than expected. I've reviewed all of your 
patches. Nice cleanups, nice improvements!


You can add a ...

Reviewed-by: mario.kleiner...@gmail.com

... to all of them.

Patches 0 - 11 and 14 are fine as they are. Only tiny formatting/comment 
fixes needed so they apply cleanly against the current drm-next.


Patches 12 and 13 need some small fixes, after applying those i'm fine 
with them. I'll send separate e-mails for those.


As far as testing goes, i had more encounters with Murphy's law in the 
last weeks than ever before, hence the long delay. You can add


Tested-by: mario.kleiner...@gmail.com

to the drm core and intel patches with the following restrictions:

I was able to sort of test the patchset on Intel GMA-950 (Gen-3 hw).

- I didn't test if your interlaced scanout patches 10 and 11 work as 
expected, because i was testing the patches first, then reviewing them, 
so i didn't realize at that point testing interlaced mode would be 
necessary. The patches look correct to me though. I no longer have easy 
access to that machine.


- My photodiode test equipment, which i need for Intel testing 
malfunctioned. Not sure if my testing hardware is dying, or if it is a 
bug in the kernel's usb or serial/tty stack, or some kernel 
misconfiguration wrt. low-latency, but there was so much timing noise in 
my equipment that i couldn't test with it.


- As a workaround I ran the kms-timestamping for regular non-interlaced 
mode against the original userspace implementation of the same code in 
my own toolkit Psychtoolbox, which itself was verified with testing 
equipment to do the right thing on that GMA-950 netbook earlier this 
year. The difference was less than 40 microseconds, and more likely caused 
by userspace noisiness and off-by-one errors in Psychtoolbox than by 
your code, so i assume that your code is essentially correct at least 
for non-interlaced scanout, and that the DRM core changes are therefore 
also correct. If you or somebody would want to try this test yourself i 
can guide you through the steps. Psychtoolbox is easily apt-get'able for 
Debian and at least Ubuntu.


- The next limitation of my testing is wrt. your early vbl irq 
handling improvements (patch 14). I currently only have Gen3 hardware 
which doesn't exercise those code paths at all, so while the patch looks 
correct, it's not really tested by me.


As far as Radeon testing goes, i can't test it at all atm. After already 
not working very stably for the last half year, my last machine 
with an AMD card died during bootup for this test, but not without 
trying to corrupt the filesystem on my development drive as a little 
post-christmas gift to me. If somebody has an AMD card and wants to test 
this, it could be tested against the Psychtoolbox userspace reference 
implementation, which was verified with very precise external hardware 
last time a couple of months ago. However, patch 13 needs some fixes or 
it would crash. The now dead PC wasn't mine, but i still have the AMD card.


I will try to hunt for a new PC soon, and hopefully will get your 
patches better tested during the -rc phase if they get merged into 3.14.


Apart from a 

Re: [Intel-gfx] [PATCH 12/14] drm: Pass 'flags' from the caller to .get_scanout_position()

2014-01-12 Thread Mario Kleiner

On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

Preparation for moving the early vblank IRQ logic into
radeon_get_crtc_scanoutpos().

Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com


Tiny compile fix needed for this one. The function prototype for 
radeon_get_crtc_scanoutpos() is also defined in radeon_drv.c, so it 
needs the same update as the one in radeon_mode.h.


Other than that

Reviewed-by: mario.kleiner...@gmail.com

-mario



---
  drivers/gpu/drm/drm_irq.c               | 2 +-
  drivers/gpu/drm/i915/i915_irq.c         | 3 ++-
  drivers/gpu/drm/radeon/radeon_display.c | 7 ---
  drivers/gpu/drm/radeon/radeon_mode.h    | 1 +
  drivers/gpu/drm/radeon/radeon_pm.c      | 2 +-
  include/drm/drmP.h                      | 2 ++
  6 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index b5c4d42..b39255f 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -585,7 +585,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 		/* Get vertical and horizontal scanout position vpos, hpos,
 		 * and bounding timestamps stime, etime, pre/post query.
 		 */
-		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
+		vbl_status = dev->driver->get_scanout_position(dev, crtc, flags, &vpos,
 							       &hpos, &stime, &etime);

/* Get correction for CLOCK_MONOTONIC - CLOCK_REALTIME if
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index f6b3206..70daf3c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -657,7 +657,8 @@ static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
 }

 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
-                                    int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
+                                    unsigned int flags, int *vpos, int *hpos,
+                                    ktime_t *stime, ktime_t *etime)
 {
        struct drm_i915_private *dev_priv = dev->dev_private;
        struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index ccd8751..3581570 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -305,7 +305,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
         * to complete in this vblank?
         */
        if (update_pending &&
-           (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
+           (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id, 0,
                                                               &vpos, &hpos, NULL, NULL)) &&
            ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
             (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
@@ -1544,6 +1544,7 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
   *
   * \param dev Device to query.
   * \param crtc Crtc to query.
+ * \param flags Flags from caller (DRM_CALLED_FROM_VBLIRQ or 0).
   * \param *vpos Location where vertical scanout position should be stored.
   * \param *hpos Location where horizontal scanout position should go.
   * \param *stime Target location for timestamp taken immediately before
@@ -1565,8 +1566,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
   * unknown small number of scanlines wrt. real scanout position.
   *
   */
-int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
-                              ktime_t *stime, ktime_t *etime)
+int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, unsigned int flags,
+                              int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
  {
u32 stat_crtc = 0, vbl = 0, position = 0;
int vbl_start, vbl_end, vtotal, ret = 0;
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h 
b/drivers/gpu/drm/radeon/radeon_mode.h
index 3bfa910..c4016dc 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -758,6 +758,7 @@ extern int radeon_crtc_cursor_move(struct drm_crtc *crtc,
   int x, int y);

  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
+ unsigned int flags,
  int *vpos, int *hpos, ktime_t *stime,
  ktime_t *etime);

diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
b/drivers/gpu/drm/radeon/radeon_pm.c
index 98bf63b..a394049 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -1468,7 +1468,7 @@ static bool 

Re: [Intel-gfx] [PATCH 13/14] drm/radeon: Move the early vblank IRQ fixup to radeon_get_crtc_scanoutpos()

2014-01-12 Thread Mario Kleiner

On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

i915 doesn't need this kludge for most platforms. Although we do
appear to need something similar on certain platforms, but we can
be more accurate when we apply the adjustment since we know exactly
why the scanline counter doesn't always quite match the vblank
status.

Also the current code doesn't handle interlaced modes correctly,
and we already deal with interlaced modes in i915 code.

So let's just move the current code to radeon_get_crtc_scanoutpos()
since that's why it was added. For i915 we'll add a more finely
targeted variant.
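
For readers following the thread, the fixup being moved can be sketched in isolation. This is only a user-space sketch under stated assumptions, not the kernel code: the function name early_vblirq_fixup() and the constant SKETCH_CALLED_FROM_VBLIRQ are hypothetical stand-ins, and the 1/100 threshold is taken from the comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's DRM_CALLED_FROM_VBLIRQ flag. */
#define SKETCH_CALLED_FROM_VBLIRQ 0x1u

/*
 * If we were called from the vblank IRQ, are not yet inside vblank, and
 * the scanout position is less than 1/100 of a frame height before the
 * start of vblank, treat the IRQ as having fired slightly early and
 * subtract one full frame height from the reported position.
 * Returns true if the correction was applied.
 */
static bool early_vblirq_fixup(unsigned int flags, bool in_vbl,
                               int vbl_start, int vtotal, int *vpos)
{
    assert(vpos != NULL);

    if (!(flags & SKETCH_CALLED_FROM_VBLIRQ) || in_vbl)
        return false;

    if (vbl_start - *vpos < vtotal / 100) {
        *vpos -= vtotal;    /* adjust the value, not the pointer */
        return true;
    }

    return false;
}
```

With a hypothetical 1080-line mode (vtotal 1125), a position of 1078 is two lines before vblank start, which is below the threshold of 11 lines, so it would be reported as 1078 - 1125 = -47.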



The logic itself looks correct and should work, although I couldn't test
it because of the dying PC.


But see below for one bug fix and a small nit-pick.

Other than that

Reviewed-by: mario.kleiner...@gmail.com



Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
  drivers/gpu/drm/drm_irq.c   | 25 ++---
  drivers/gpu/drm/radeon/radeon_display.c | 22 ++
  2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index b39255f..a1cc1a3 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -542,7 +542,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
  {
ktime_t stime, etime, mono_time_offset;
struct timeval tv_etime;
-   int vbl_status, vtotal, vdisplay;
+   int vbl_status;
int vpos, hpos, i;
int framedur_ns, linedur_ns, pixeldur_ns, delta_ns, duration_ns;
bool invbl;
@@ -558,9 +558,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
return -EIO;
}

-       vtotal = mode->crtc_vtotal;
-       vdisplay = mode->crtc_vdisplay;
-
        /* Durations of frames, lines, pixels in nanoseconds. */
        framedur_ns = refcrtc->framedur_ns;
        linedur_ns  = refcrtc->linedur_ns;
@@ -569,7 +566,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
/* If mode timing undefined, just return as no-op:
 * Happens during initial modesetting of a crtc.
 */
-       if (vtotal <= 0 || vdisplay <= 0 || framedur_ns == 0) {
+       if (framedur_ns == 0) {
                DRM_DEBUG("crtc %d: Noop due to uninitialized mode.\n", crtc);
return -EAGAIN;
}
@@ -631,24 +628,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct 
drm_device *dev, int crtc,
 */
delta_ns = vpos * linedur_ns + hpos * pixeldur_ns;

-   /* Is vpos outside nominal vblank area, but less than
-* 1/100 of a frame height away from start of vblank?
-* If so, assume this isn't a massively delayed vblank
-* interrupt, but a vblank interrupt that fired a few
-* microseconds before true start of vblank. Compensate
-* by adding a full frame duration to the final timestamp.
-* Happens, e.g., on ATI R500, R600.
-*
-* We only do this if DRM_CALLED_FROM_VBLIRQ.
-*/
-       if ((flags & DRM_CALLED_FROM_VBLIRQ) && !invbl &&
-           ((vdisplay - vpos) < vtotal / 100)) {
-   delta_ns = delta_ns - framedur_ns;
-
-   /* Signal this correction as applied. */
-   vbl_status |= 0x8;
-   }
-
if (!drm_timestamp_monotonic)
etime = ktime_sub(etime, mono_time_offset);

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 3581570..9d02fa7 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -1709,5 +1709,27 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, 
int crtc, unsigned int fl
if (in_vbl)
ret |= DRM_SCANOUTPOS_INVBL;

+   /* Is vpos outside nominal vblank area, but less than
+* 1/100 of a frame height away from start of vblank?
+* If so, assume this isn't a massively delayed vblank
+* interrupt, but a vblank interrupt that fired a few
+* microseconds before true start of vblank. Compensate
+* by adding a full frame duration to the final timestamp.
+* Happens, e.g., on ATI R500, R600.
+*
+* We only do this if DRM_CALLED_FROM_VBLIRQ.
+*/
+       if ((flags & DRM_CALLED_FROM_VBLIRQ) && !in_vbl) {
+               vbl_start = rdev->mode_info.crtcs[crtc]->base.hwmode.crtc_vdisplay;


vbl_start is already initialized by the code above, so the vbl_start
assignment here shouldn't be necessary. Only the vtotal assignment
below is really needed.



+               vtotal = rdev->mode_info.crtcs[crtc]->base.hwmode.crtc_vtotal;
+
+               if (vbl_start - *vpos < vtotal / 100) {
+   vpos -= vtotal;


Here vpos is an int*, so the following line will corrupt kernel memory 
and die. Obviously then this


 +  
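
The pointer issue raised in this review can be shown with a small, safe user-space sketch. It is only an illustration: adjust_pointer() and adjust_value() are hypothetical names, and the demonstration stays inside a local array so that nothing is actually corrupted. In the patch, subtracting vtotal from an int * would move the pointer by that many int-sized elements, so any later write through it would land in unrelated memory.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy form: moves the pointer itself back by `lines` elements. */
static int *adjust_pointer(int *vpos, int lines)
{
    vpos -= lines;
    return vpos;
}

/* Intended form: changes the integer the pointer refers to. */
static void adjust_value(int *vpos, int lines)
{
    assert(vpos != NULL);
    *vpos -= lines;
}
```

Given int slots[4] = {10, 20, 30, 40}, adjust_pointer(&slots[3], 2) returns &slots[1] and leaves slots[3] at 40, whereas adjust_value(&slots[3], 2) changes slots[3] to 38.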

Re: [Intel-gfx] [PATCH 00/14] drm: Some more vblank timestampi changes

2013-11-30 Thread Mario Kleiner

On 29/11/13 14:36, Ville Syrjälä wrote:

On Wed, Nov 06, 2013 at 01:46:41PM +1000, Dave Airlie wrote:

On Wed, Oct 30, 2013 at 4:06 AM,  ville.syrj...@linux.intel.com wrote:

So I took another look at the vblank timestamping code, and got a bit
excited. The result is this patchset.


I'd like to merge this, I was hoping Mario could ack it at least as it
seems mostly sane to my eyes.


So we missed that boat, but maybe we'll get the next one...

Pinging Mario. Any chance you can take a look at this stuff at some
point?



I will, including testing. Hopefully within the coming week, but
definitely safely before Christmas.



Hmm. Do I have the wrong email address for Mario? Adding the other one
too just to make sure...



Both work, but the tuebingen.mpg.de one will probably soon turn into a 
pure forward to the gmail one.


-mario
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/4] drm: Push latency sensitive bits of vblank scanoutpos timestamping into kms drivers.

2013-10-29 Thread Mario Kleiner
A change in locking of some kms drivers (currently intel-kms) makes
the old approach too inaccurate and also incompatible with the
PREEMPT_RT realtime kernel patchset.

The driver->get_scanout_position() method of intel-kms now needs
to acquire a spinlock, which clashes badly with the former
preempt_disable() calls in the drm, and it also introduces larger
delays and timing uncertainty on a contended lock than acceptable.

This patch changes the prototype of driver->get_scanout_position()
to require/allow kms drivers to perform the ktime_get() system time
queries which go along with actual scanout position readout in a way
that provides maximum precision and to return those timestamps to
the drm. kms driver implementations of get_scanout_position() are
asked to implement timestamping and scanoutpos readout in a way
that is as precise as possible and compatible with preempt_disable()
on a PREEMPT_RT kernel. A driver should follow this pattern in
get_scanout_position() for precision and compatibility:

spin_lock...(...);
preempt_disable_rt(); // On a PREEMPT_RT kernel, otherwise omit.
if (stime) *stime = ktime_get();
... Minimum amount of MMIO register reads to get scanout position ...
... no taking of locks allowed here! ...
if (etime) *etime = ktime_get();
preempt_enable_rt(); // On PREEMPT_RT kernel, otherwise omit.
spin_unlock...(...);
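
The pattern above can be mimicked in user space to make the ordering concrete. This is a rough sketch under several assumptions, not driver code: a C11 atomic_flag stands in for the spinlock, a plain variable stands in for the MMIO position register, clock_gettime(CLOCK_MONOTONIC) stands in for ktime_get(), and every sketch_* name is hypothetical. The 0x1fff decode mirrors the radeon code in this series and is used only for illustration. There is no user-space equivalent of preempt_disable_rt(), so that step is marked by comments only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the driver lock and the position register. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;
static volatile uint32_t sketch_position_reg = (300u << 16) | 42u;

static int64_t sketch_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* stime/etime may be NULL to skip timestamping, as in the new prototype. */
static void sketch_get_scanout_position(int *vpos, int *hpos,
                                        int64_t *stime, int64_t *etime)
{
    uint32_t position;

    assert(vpos != NULL && hpos != NULL);

    /* spin_lock...() analogue: take the lock before the timed window. */
    while (atomic_flag_test_and_set_explicit(&sketch_lock,
                                             memory_order_acquire))
        ;

    /* preempt_disable_rt() would go here on a PREEMPT_RT kernel. */
    if (stime)
        *stime = sketch_now_ns();

    /* Minimum amount of "register" reads, no locks taken in here. */
    position = sketch_position_reg;

    if (etime)
        *etime = sketch_now_ns();
    /* preempt_enable_rt() would go here on a PREEMPT_RT kernel. */

    atomic_flag_clear_explicit(&sketch_lock, memory_order_release);

    /* Decode outside the timed window. */
    *vpos = (int)(position & 0x1fff);
    *hpos = (int)((position >> 16) & 0x1fff);
}
```

The point of the ordering is that possible waiting on a contended lock happens before the first timestamp, so it does not widen the stime/etime window.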

v2: Fix formatting of new multi-line code comments.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/drm_irq.c |   20 
 include/drm/drmP.h|   10 --
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 33ee515..d80d952 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -219,7 +219,7 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
        for (i = 0; i < num_crtcs; i++)
                init_waitqueue_head(&dev->vblank[i].queue);
 
-       DRM_INFO("Supports vblank timestamp caching Rev 1 (10.10.2010).\n");
+       DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
 
        /* Driver specific high-precision vblank timestamping supported? */
        if (dev->driver->get_vblank_timestamp)
@@ -586,14 +586,17 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
         * code gets preempted or delayed for some reason.
         */
        for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-               /* Get system timestamp before query. */
-               stime = ktime_get();
-
-               /* Get vertical and horizontal scanout pos. vpos, hpos. */
-               vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos, &hpos);
+               /*
+                * Get vertical and horizontal scanout position vpos, hpos,
+                * and bounding timestamps stime, etime, pre/post query.
+                */
+               vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
+                                                              &hpos, &stime, &etime);
 
-   /* Get system timestamp after query. */
-   etime = ktime_get();
+   /*
+* Get correction for CLOCK_MONOTONIC - CLOCK_REALTIME if
+* CLOCK_REALTIME is requested.
+*/
if (!drm_timestamp_monotonic)
mono_time_offset = ktime_get_monotonic_offset();
 
@@ -604,6 +607,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
return -EIO;
}
 
+               /* Compute uncertainty in timestamp of scanout position query. */
                duration_ns = ktime_to_ns(etime) - ktime_to_ns(stime);
 
                /* Accept result with <  max_error nsecs timing uncertainty. */
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2b954ad..48d15f0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -835,12 +835,17 @@ struct drm_driver {
/**
 * Called by vblank timestamping code.
 *
-* Return the current display scanout position from a crtc.
+* Return the current display scanout position from a crtc, and an
+* optional accurate ktime_get timestamp of when position was measured.
 *
 * \param dev  DRM device.
 * \param crtc Id of the crtc to query.
 * \param *vpos Target location for current vertical scanout position.
 * \param *hpos Target location for current horizontal scanout position.
+* \param *stime Target location for timestamp taken immediately before
+*   scanout position query. Can be NULL to skip timestamp.
+* \param *etime Target location for timestamp taken immediately after
+*   scanout position query. Can be NULL to skip timestamp

[Intel-gfx] Vblank timestamping improvements/fixes for Linux drm. [v2]

2013-10-29 Thread Mario Kleiner
Hi Dave,

this is v2 of the patch set for improving/restoring accuracy and
robustness of vblank timestamping and for fixing incompatibilities
with the PREEMPT_RT patches.

Could you please merge this for the next kernel? Would be good to have
the old accuracy restored as soon as possible. Thanks.

v2: Added the reviewed-by's of Ville and Alex, thanks for the review!
Fixed multi-line code formatting as suggested by Ville.

Successfully tested on Intel and AMD Radeon hardware.

thanks,
-mario



[Intel-gfx] [PATCH 4/4] drm/intel: Push get_scanout_position() timestamping into kms driver.

2013-10-29 Thread Mario Kleiner
Move the ktime_get() clock readouts and potential preempt_disable()
calls from drm core into kms driver to make it compatible with the
api changes in the drm core.

The intel-kms driver needs to take the uncore.lock inside
i915_get_crtc_scanoutpos() and intel_pipe_in_vblank().
This is incompatible with the preempt_disable() on a
PREEMPT_RT patched kernel, as regular spin locks must not
be taken within a preempt_disable'd section. Lock contention
on the uncore.lock also introduced too much uncertainty in vblank
timestamps.

Push the ktime_get() timestamping for scanoutpos queries and
potential preempt_disable_rt() into i915_get_crtc_scanoutpos(),
so these problems can be avoided:

1. First lock the uncore.lock (might sleep on a PREEMPT_RT kernel).
2. preempt_disable_rt() (will be added by the rt-linux folks).
3. ktime_get() a timestamp before scanout pos query.
4. Do all mmio reads as fast as possible without grabbing any new locks!
5. ktime_get() a post-query timestamp.
6. preempt_enable_rt()
7. Unlock the uncore.lock.

This reduces timestamp uncertainty on a low-end HP Atom Mini netbook
with Intel GMA-950 nicely:

Before: 3-8 usecs with spikes > 20 usecs, triggering query retries.
After : Typically 1 usec (98% of all samples), occasionally 2 usecs
(2% of all samples), with maximum of 3 usecs (a handful).
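
To illustrate why wide query windows trigger retries, here is a small sketch of the acceptance test applied to the bracketing timestamps. It is a simplification under stated assumptions: struct sketch_sample and sketch_pick_sample() are hypothetical names, the samples are passed in as an array instead of being measured live, and returning -1 after the retries are exhausted is a choice of this sketch, which may differ from what the drm core does in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_sample {
    int64_t stime_ns;   /* timestamp taken just before the position read */
    int64_t etime_ns;   /* timestamp taken just after the position read */
};

/*
 * Return the index of the first sample whose bracketing window is no
 * wider than max_error_ns, looking at no more than max_retries samples.
 * Returns -1 if none of them qualifies.
 */
static int sketch_pick_sample(const struct sketch_sample *s, size_t n,
                              size_t max_retries, int64_t max_error_ns)
{
    size_t limit = n < max_retries ? n : max_retries;

    assert(s != NULL || n == 0);

    for (size_t i = 0; i < limit; i++) {
        /* Uncertainty of this query: time spent between the two stamps. */
        int64_t duration_ns = s[i].etime_ns - s[i].stime_ns;

        if (duration_ns <= max_error_ns)
            return (int)i;
    }

    return -1;
}
```

Assuming a 20 usec bound, a first query that took 25 usecs would be rejected and a second one that took 1 usec would be accepted, which is the retry behaviour the numbers above refer to.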

v2: Fix formatting of new multi-line code comments.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/i915/i915_irq.c |   54 +++
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 156a1a4..7cafe64 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -599,35 +599,40 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
        return I915_READ(reg);
 }
 
-static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
+/* raw reads, only for fast reads of display block, no need for forcewake etc. */
+#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
+#define __raw_i915_read16(dev_priv__, reg__) readw((dev_priv__)->regs + (reg__))
+
+static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
 {
        struct drm_i915_private *dev_priv = dev->dev_private;
        uint32_t status;
+       int reg;
 
        if (IS_VALLEYVIEW(dev)) {
                status = pipe == PIPE_A ?
                        I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
                        I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-               return I915_READ(VLV_ISR) & status;
+               reg = VLV_ISR;
        } else if (IS_GEN2(dev)) {
                status = pipe == PIPE_A ?
                        I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
                        I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-               return I915_READ16(ISR) & status;
+               reg = ISR;
        } else if (INTEL_INFO(dev)->gen < 5) {
                status = pipe == PIPE_A ?
                        I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
                        I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-               return I915_READ(ISR) & status;
+               reg = ISR;
        } else if (INTEL_INFO(dev)->gen < 7) {
                status = pipe == PIPE_A ?
                        DE_PIPEA_VBLANK :
                        DE_PIPEB_VBLANK;
 
-               return I915_READ(DEISR) & status;
+               reg = DEISR;
} else {
switch (pipe) {
default:
@@ -642,12 +647,17 @@ static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
                        break;
                }
 
-               return I915_READ(DEISR) & status;
+               reg = DEISR;
        }
+
+       if (IS_GEN2(dev))
+               return __raw_i915_read16(dev_priv, reg) & status;
+       else
+               return __raw_i915_read32(dev_priv, reg) & status;
 }
 
 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
-                                    int *vpos, int *hpos)
+                                    int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
 {
        struct drm_i915_private *dev_priv = dev->dev_private;
        struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
@@ -657,6 +667,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
int vbl_start, vbl_end, htotal, vtotal;
bool in_vbl = true;
int ret = 0;
+   unsigned long irqflags;
 
if (!intel_crtc-active) {
DRM_DEBUG_DRIVER(trying to get scanoutpos for disabled 
@@ -671,14 +682,27 @@ static int i915_get_crtc_scanoutpos(struct drm_device 
*dev, int pipe,
 
ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
 
+   /*
+* Lock uncore.lock, as we will do multiple

[Intel-gfx] [PATCH 3/4] drm/radeon: Push get_scanout_position() timestamping into kms driver.

2013-10-29 Thread Mario Kleiner
Move the ktime_get() clock readouts and potential preempt_disable()
calls from drm core into kms driver to make it compatible with the
api changes in the drm core.

This should not introduce any change in functionality or behaviour
in radeon-kms, just a reshuffling of code.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_display.c |   24 +---
 drivers/gpu/drm/radeon/radeon_drv.c |3 ++-
 drivers/gpu/drm/radeon/radeon_mode.h|3 ++-
 drivers/gpu/drm/radeon/radeon_pm.c  |2 +-
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 0d1aa05..ccd8751 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -306,7 +306,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
         */
        if (update_pending &&
            (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
-                                                              &vpos, &hpos)) &&
+                                                              &vpos, &hpos, NULL, NULL)) &&
            ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
             (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
/* crtc didn't flip in this target vblank interval,
@@ -1539,12 +1539,17 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc 
*crtc,
 }
 
 /*
- * Retrieve current video scanout position of crtc on a given gpu.
+ * Retrieve current video scanout position of crtc on a given gpu, and
+ * an optional accurate timestamp of when query happened.
  *
  * \param dev Device to query.
  * \param crtc Crtc to query.
  * \param *vpos Location where vertical scanout position should be stored.
  * \param *hpos Location where horizontal scanout position should go.
+ * \param *stime Target location for timestamp taken immediately before
+ *   scanout position query. Can be NULL to skip timestamp.
+ * \param *etime Target location for timestamp taken immediately after
+ *   scanout position query. Can be NULL to skip timestamp.
  *
  * Returns vpos as a positive number while in active scanout area.
  * Returns vpos as a negative number inside vblank, counting the number
@@ -1560,7 +1565,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
  * unknown small number of scanlines wrt. real scanout position.
  *
  */
-int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos)
+int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
+                              ktime_t *stime, ktime_t *etime)
 {
u32 stat_crtc = 0, vbl = 0, position = 0;
int vbl_start, vbl_end, vtotal, ret = 0;
@@ -1568,6 +1574,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, 
int crtc, int *vpos, int
 
        struct radeon_device *rdev = dev->dev_private;
 
+   /* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
+
+   /* Get optional system timestamp before query. */
+   if (stime)
+   *stime = ktime_get();
+
if (ASIC_IS_DCE4(rdev)) {
if (crtc == 0) {
vbl = RREG32(EVERGREEN_CRTC_V_BLANK_START_END +
@@ -1650,6 +1662,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, 
int crtc, int *vpos, int
}
}
 
+   /* Get optional system timestamp after query. */
+   if (etime)
+   *etime = ktime_get();
+
+   /* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
+
/* Decode into vertical and horizontal scanout position. */
        *vpos = position & 0x1fff;
        *hpos = (position >> 16) & 0x1fff;
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 22f6858..101e7c0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -106,7 +106,8 @@ int radeon_gem_object_open(struct drm_gem_object *obj,
 void radeon_gem_object_close(struct drm_gem_object *obj,
struct drm_file *file_priv);
 extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
- int *vpos, int *hpos);
+ int *vpos, int *hpos, ktime_t *stime,
+ ktime_t *etime);
 extern const struct drm_ioctl_desc radeon_ioctls_kms[];
 extern int radeon_max_kms_ioctl;
 int radeon_mmap(struct file *filp, struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h 
b/drivers/gpu/drm/radeon/radeon_mode.h
index ef63d3f..3bfa910 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -758,7 +758,8 @@ extern

[Intel-gfx] [PATCH 1/4] drm: Remove preempt_disable() from vblank timestamping code.

2013-10-29 Thread Mario Kleiner
Preemption handling will get pushed into the kms
drivers in followup patches, to make timestamping
more robust and PREEMPT_RT friendly.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/drm_irq.c |7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index f9af048..33ee515 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -586,11 +586,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct 
drm_device *dev, int crtc,
 * code gets preempted or delayed for some reason.
 */
        for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-   /* Disable preemption to make it very likely to
-* succeed in the first iteration even on PREEMPT_RT kernel.
-*/
-   preempt_disable();
-
/* Get system timestamp before query. */
stime = ktime_get();
 
@@ -602,8 +597,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
if (!drm_timestamp_monotonic)
mono_time_offset = ktime_get_monotonic_offset();
 
-   preempt_enable();
-
/* Return as no-op if scanout query unsupported or failed. */
                if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
                        DRM_DEBUG("crtc %d : scanoutpos query failed [%d].\n",
-- 
1.7.10.4



[Intel-gfx] Vblank timestamping improvements/fixes for Linux drm.

2013-10-26 Thread Mario Kleiner
Hi all,

this patch set for the kernel pushes the latency sensitive bits of
vblank scanoutpos timestamping from the drm core into the kms drivers.

A change in the locking of the intel-kms driver for Linux 3.11 made
the old approach too inaccurate and also incompatible with the
PREEMPT_RT realtime kernel patch set. These patches fix that problem
and restore the old level of precision and reliability.

The patch set changes the prototype of driver->get_scanout_position()
to require/allow kms drivers to perform the ktime_get() system time
queries which go along with actual scanout position readout in a way
that provides maximum precision and to return those timestamps to
the drm. It also converts the only two kms drivers which use this api
so far (intel-kms and radeon-kms) to the new api and improves precision
and reliability of the intel-kms a lot.
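
As a rough illustration of what the drm core does with the returned values, the sketch below converts a scanout position plus the post-query timestamp into a timestamp estimate, following the delta_ns = vpos * linedur_ns + hpos * pixeldur_ns computation in drm_calc_vbltimestamp_from_scanoutpos(). The function name sketch_vblank_time_ns() and its plain nanosecond arguments are hypothetical; the real code works on ktime_t and struct timeval.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Subtract the time corresponding to the scanout position from the
 * timestamp taken right after the position query. A negative vpos
 * yields a negative delta, which moves the result later than etime_ns.
 */
static int64_t sketch_vblank_time_ns(int64_t etime_ns, int vpos, int hpos,
                                     int64_t linedur_ns, int64_t pixeldur_ns)
{
    int64_t delta_ns = (int64_t)vpos * linedur_ns +
                       (int64_t)hpos * pixeldur_ns;

    return etime_ns - delta_ns;
}
```

Any error in etime carries straight through to the result, which is why the series moves the ktime_get() calls as close to the register reads as possible.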

Patches have been tested on Intel and AMD Radeon hardware and the Intel
bits have received some review and feedback by Ville Syrjälä.

Please review and apply if possible.

Thanks,
-mario



[Intel-gfx] [PATCH 1/4] drm: Remove preempt_disable() from vblank timestamping code.

2013-10-26 Thread Mario Kleiner
Preemption handling will get pushed into the kms
drivers in followup patches, to make timestamping
more robust and PREEMPT_RT friendly.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 drivers/gpu/drm/drm_irq.c |7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index f9af048..33ee515 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -586,11 +586,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct 
drm_device *dev, int crtc,
 * code gets preempted or delayed for some reason.
 */
        for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-   /* Disable preemption to make it very likely to
-* succeed in the first iteration even on PREEMPT_RT kernel.
-*/
-   preempt_disable();
-
/* Get system timestamp before query. */
stime = ktime_get();
 
@@ -602,8 +597,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
if (!drm_timestamp_monotonic)
mono_time_offset = ktime_get_monotonic_offset();
 
-   preempt_enable();
-
/* Return as no-op if scanout query unsupported or failed. */
                if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
                        DRM_DEBUG("crtc %d : scanoutpos query failed [%d].\n",
-- 
1.7.10.4



[Intel-gfx] [PATCH 2/4] drm: Push latency sensitive bits of vblank scanoutpos timestamping into kms drivers.

2013-10-26 Thread Mario Kleiner
A change in locking of some kms drivers (currently intel-kms) makes
the old approach too inaccurate and also incompatible with the
PREEMPT_RT realtime kernel patchset.

The driver->get_scanout_position() method of intel-kms now needs
to acquire a spinlock, which clashes badly with the former
preempt_disable() calls in the drm, and it also introduces larger
delays and timing uncertainty on a contended lock than acceptable.

This patch changes the prototype of driver->get_scanout_position()
to require/allow kms drivers to perform the ktime_get() system time
queries which go along with actual scanout position readout in a way
that provides maximum precision and to return those timestamps to
the drm. kms driver implementations of get_scanout_position() are
asked to implement timestamping and scanoutpos readout in a way
that is as precise as possible and compatible with preempt_disable()
on a PREEMPT_RT kernel. A driver should follow this pattern in
get_scanout_position() for precision and compatibility:

spin_lock...(...);
preempt_disable_rt(); // On a PREEMPT_RT kernel, otherwise omit.
if (stime) *stime = ktime_get();
... Minimum amount of MMIO register reads to get scanout position ...
... no taking of locks allowed here! ...
if (etime) *etime = ktime_get();
preempt_enable_rt(); // On PREEMPT_RT kernel, otherwise omit.
spin_unlock...(...);

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 drivers/gpu/drm/drm_irq.c |   18 ++
 include/drm/drmP.h|   10 --
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 33ee515..2250724 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -219,7 +219,7 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
        for (i = 0; i < num_crtcs; i++)
                init_waitqueue_head(&dev->vblank[i].queue);
 
-       DRM_INFO("Supports vblank timestamp caching Rev 1 (10.10.2010).\n");
+       DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
 
        /* Driver specific high-precision vblank timestamping supported? */
        if (dev->driver->get_vblank_timestamp)
@@ -586,14 +586,15 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
         * code gets preempted or delayed for some reason.
         */
        for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-               /* Get system timestamp before query. */
-               stime = ktime_get();
-
-               /* Get vertical and horizontal scanout pos. vpos, hpos. */
-               vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos, &hpos);
+               /* Get vertical and horizontal scanout position vpos, hpos,
+                * and bounding timestamps stime, etime, pre/post query.
+                */
+               vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
+                                                              &hpos, &stime, &etime);
 
-   /* Get system timestamp after query. */
-   etime = ktime_get();
+   /* Get correction for CLOCK_MONOTONIC - CLOCK_REALTIME if
+* CLOCK_REALTIME is requested.
+*/
if (!drm_timestamp_monotonic)
mono_time_offset = ktime_get_monotonic_offset();
 
@@ -604,6 +605,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device 
*dev, int crtc,
return -EIO;
}
 
+               /* Compute uncertainty in timestamp of scanout position query. */
                duration_ns = ktime_to_ns(etime) - ktime_to_ns(stime);
 
                /* Accept result with <  max_error nsecs timing uncertainty. */
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2b954ad..48d15f0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -835,12 +835,17 @@ struct drm_driver {
/**
 * Called by vblank timestamping code.
 *
-* Return the current display scanout position from a crtc.
+* Return the current display scanout position from a crtc, and an
+* optional accurate ktime_get timestamp of when position was measured.
 *
 * \param dev  DRM device.
 * \param crtc Id of the crtc to query.
 * \param *vpos Target location for current vertical scanout position.
 * \param *hpos Target location for current horizontal scanout position.
+* \param *stime Target location for timestamp taken immediately before
+*   scanout position query. Can be NULL to skip timestamp.
+* \param *etime Target location for timestamp taken immediately after
+*   scanout position query. Can be NULL to skip timestamp.
 *
 * Returns vpos as a positive number while in active scanout area.
 * Returns vpos as a negative number inside vblank, counting the number
@@ -857,7 +862,8 @@ struct drm_driver

[Intel-gfx] [PATCH 3/4] drm/radeon: Push get_scanout_position() timestamping into kms driver.

2013-10-26 Thread Mario Kleiner
Move the ktime_get() clock readouts and potential preempt_disable()
calls from drm core into kms driver to make it compatible with the
api changes in the drm core.

This should not introduce any change in functionality or behaviour
in radeon-kms, just a reshuffling of code.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 drivers/gpu/drm/radeon/radeon_display.c |   24 +---
 drivers/gpu/drm/radeon/radeon_drv.c |3 ++-
 drivers/gpu/drm/radeon/radeon_mode.h|3 ++-
 drivers/gpu/drm/radeon/radeon_pm.c  |2 +-
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 0d1aa05..ccd8751 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -306,7 +306,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
 */
if (update_pending &&
(DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
-  &vpos, &hpos)) &&
+  &vpos, &hpos, NULL, NULL)) &&
((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
 (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
/* crtc didn't flip in this target vblank interval,
@@ -1539,12 +1539,17 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc 
*crtc,
 }
 
 /*
- * Retrieve current video scanout position of crtc on a given gpu.
+ * Retrieve current video scanout position of crtc on a given gpu, and
+ * an optional accurate timestamp of when query happened.
  *
  * \param dev Device to query.
  * \param crtc Crtc to query.
  * \param *vpos Location where vertical scanout position should be stored.
  * \param *hpos Location where horizontal scanout position should go.
+ * \param *stime Target location for timestamp taken immediately before
+ *   scanout position query. Can be NULL to skip timestamp.
+ * \param *etime Target location for timestamp taken immediately after
+ *   scanout position query. Can be NULL to skip timestamp.
  *
  * Returns vpos as a positive number while in active scanout area.
  * Returns vpos as a negative number inside vblank, counting the number
@@ -1560,7 +1565,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
  * unknown small number of scanlines wrt. real scanout position.
  *
  */
-int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos)
+int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
+  ktime_t *stime, ktime_t *etime)
 {
u32 stat_crtc = 0, vbl = 0, position = 0;
int vbl_start, vbl_end, vtotal, ret = 0;
@@ -1568,6 +1574,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int
 
struct radeon_device *rdev = dev->dev_private;
 
+   /* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
+
+   /* Get optional system timestamp before query. */
+   if (stime)
+   *stime = ktime_get();
+
if (ASIC_IS_DCE4(rdev)) {
if (crtc == 0) {
vbl = RREG32(EVERGREEN_CRTC_V_BLANK_START_END +
@@ -1650,6 +1662,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, 
int crtc, int *vpos, int
}
}
 
+   /* Get optional system timestamp after query. */
+   if (etime)
+   *etime = ktime_get();
+
+   /* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
+
/* Decode into vertical and horizontal scanout position. */
*vpos = position & 0x1fff;
*hpos = (position >> 16) & 0x1fff;
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 22f6858..101e7c0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -106,7 +106,8 @@ int radeon_gem_object_open(struct drm_gem_object *obj,
 void radeon_gem_object_close(struct drm_gem_object *obj,
struct drm_file *file_priv);
 extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
- int *vpos, int *hpos);
+ int *vpos, int *hpos, ktime_t *stime,
+ ktime_t *etime);
 extern const struct drm_ioctl_desc radeon_ioctls_kms[];
 extern int radeon_max_kms_ioctl;
 int radeon_mmap(struct file *filp, struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h 
b/drivers/gpu/drm/radeon/radeon_mode.h
index ef63d3f..3bfa910 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -758,7 +758,8 @@ extern int radeon_crtc_cursor_move(struct drm_crtc *crtc,
   int x, int y

[Intel-gfx] [PATCH 4/4] drm/intel: Push get_scanout_position() timestamping into kms driver.

2013-10-26 Thread Mario Kleiner
Move the ktime_get() clock readouts and potential preempt_disable()
calls from drm core into kms driver to make it compatible with the
api changes in the drm core.

The intel-kms driver needs to take the uncore.lock inside
i915_get_crtc_scanoutpos() and intel_pipe_in_vblank().
This is incompatible with the preempt_disable() on a
PREEMPT_RT patched kernel, as regular spin locks must not
be taken within a preempt_disable'd section. Lock contention
on the uncore.lock also introduced too much uncertainty in vblank
timestamps.

Push the ktime_get() timestamping for scanoutpos queries and
potential preempt_disable_rt() into i915_get_crtc_scanoutpos(),
so these problems can be avoided:

1. First lock the uncore.lock (might sleep on a PREEMPT_RT kernel).
2. preempt_disable_rt() (will be added by the rt-linux folks).
3. ktime_get() a timestamp before scanout pos query.
4. Do all mmio reads as fast as possible without grabbing any new locks!
5. ktime_get() a post-query timestamp.
6. preempt_enable_rt()
7. Unlock the uncore.lock.

This reduces timestamp uncertainty on a low-end HP Atom Mini netbook
with Intel GMA-950 nicely:

Before: 3-8 usecs with spikes > 20 usecs, triggering query retries.
After : Typically 1 usec (98% of all samples), occasionally 2 usecs
(2% of all samples), with maximum of 3 usecs (a handful).
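
A userspace sketch of the timestamp bracketing in steps 3 to 5 above, for illustration only. It is not the kernel code from this patch: clock_gettime(CLOCK_MONOTONIC) stands in for ktime_get(), the locking and preemption steps are left out, and the names bracketed_query, fake_scanline and now_ns are hypothetical. The retry loop mirrors the "query retries" mentioned above, where a sample is only accepted if the time between the two clock reads stays within a given bound.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the hardware scanout position query. */
typedef int (*query_fn)(void *ctx);

struct bracketed_sample {
	int     value;     /* result of the query */
	int64_t stime_ns;  /* clock read taken just before the query */
	int64_t etime_ns;  /* clock read taken just after the query */
	int     tries;     /* number of attempts used */
	int     accurate;  /* 1 if etime - stime stayed within the bound */
};

/* Monotonic clock in nanoseconds (userspace substitute for ktime_get()). */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Example query that just returns the integer it is pointed at. */
static int fake_scanline(void *ctx)
{
	return *(int *)ctx;
}

/*
 * Bracket the query with two clock reads and retry while the
 * bracket is wider than max_error_ns, up to max_tries attempts.
 */
static struct bracketed_sample
bracketed_query(query_fn query, void *ctx, int64_t max_error_ns, int max_tries)
{
	struct bracketed_sample s = {0};

	assert(query && max_tries > 0);
	for (int i = 0; i < max_tries; i++) {
		s.tries = i + 1;
		s.stime_ns = now_ns();
		s.value = query(ctx);
		s.etime_ns = now_ns();
		if (s.etime_ns - s.stime_ns <= max_error_ns) {
			s.accurate = 1;
			break;
		}
	}
	return s;
}
```

The narrower the window between the two clock reads, the less often a retry is needed, which is why the sequence above keeps lock acquisition outside the bracket.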

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 drivers/gpu/drm/i915/i915_irq.c |   53 +++
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 156a1a4..a3e41d3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -599,35 +599,40 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
return I915_READ(reg);
 }
 
-static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
+/* raw reads, only for fast reads of display block, no need for forcewake etc. */
+#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
+#define __raw_i915_read16(dev_priv__, reg__) readw((dev_priv__)->regs + (reg__))
+
+static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
uint32_t status;
+   int reg;
 
if (IS_VALLEYVIEW(dev)) {
status = pipe == PIPE_A ?
I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-   return I915_READ(VLV_ISR) & status;
+   reg = VLV_ISR;
} else if (IS_GEN2(dev)) {
status = pipe == PIPE_A ?
I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-   return I915_READ16(ISR) & status;
+   reg = ISR;
} else if (INTEL_INFO(dev)->gen < 5) {
status = pipe == PIPE_A ?
I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-   return I915_READ(ISR) & status;
+   reg = ISR;
} else if (INTEL_INFO(dev)->gen < 7) {
status = pipe == PIPE_A ?
DE_PIPEA_VBLANK :
DE_PIPEB_VBLANK;
 
-   return I915_READ(DEISR) & status;
+   reg = DEISR;
} else {
switch (pipe) {
default:
@@ -642,12 +647,17 @@ static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
break;
}
 
-   return I915_READ(DEISR) & status;
+   reg = DEISR;
}
+
+   if (IS_GEN2(dev))
+   return __raw_i915_read16(dev_priv, reg) & status;
+   else
+   return __raw_i915_read32(dev_priv, reg) & status;
 }
 
 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
-int *vpos, int *hpos)
+int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
@@ -657,6 +667,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
int vbl_start, vbl_end, htotal, vtotal;
bool in_vbl = true;
int ret = 0;
+   unsigned long irqflags;
 
if (!intel_crtc->active) {
DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled "
@@ -671,14 +682,26 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 
ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
 
+   /* Lock uncore.lock, as we will do multiple timing critical raw
+* register reads, potentially with preemption disabled, so the
+* following code must not block on uncore.lock

Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7

2013-10-11 Thread Mario Kleiner

On 10/11/2013 03:30 PM, Sebastian Andrzej Siewior wrote:

On 10/11/2013 02:37 PM, Steven Rostedt wrote:

On Fri, 11 Oct 2013 12:18:00 +0200
Sebastian Andrzej Siewior bige...@linutronix.de wrote:


* Mario Kleiner | 2013-09-26 18:16:47 [+0200]:


Good! I will do that. Thanks for clarifying the irq and constraints
on raw locks in the other thread.


Are there any suggestions for now?  preempt_disable_nort() like Luis
suggested?



The preempt_disable_nort() is rather pointless, because the
preempt_disable() was added specifically for -rt. When PREEMPT_RT is
not enabled, preemption is disabled there already by the previous calls
to spin_lock().


Either way. Then I remove the preempt_enable/disable call. Any
objections?



Good with me. I'm currently working on a replacement.
-mario


-- Steve


Sebastian



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/3] drm/i915: Fix scanoutpos calculations

2013-10-11 Thread Mario Kleiner

Daniel, Ville,

i tested Ville's patch series for the scanoutpos improvements on a 
GMA-950, on top of airlied's current drm-next branch.


There's one issue: The variable position in i915_get_crtc_scanoutpos() 
must be turned from a u32 into an int, otherwise funny sign errors happen 
and we end up with *vpos being off by multiple million scanlines and 
timestamps being off by over 60 seconds.
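
A small standalone sketch of why the type matters, under assumed example mode timings (htotal = 1344, vblank ending at line 806); the function names are hypothetical and this is not the driver code. Subtracting vbl_end from a position that lies inside vblank should give a negative number, but in an unsigned 32-bit variable the result wraps around, and the later division by htotal turns it into a scanline count in the millions.

```c
#include <assert.h>
#include <stdint.h>

/* Position held in an unsigned 32-bit variable, as before the fix. */
static int vpos_with_u32(uint32_t position, int vbl_end, int htotal)
{
	assert(htotal > 0);
	position -= vbl_end;             /* wraps instead of going negative */
	return (int)(position / htotal); /* unsigned division of a huge value */
}

/* Same arithmetic with a signed position. */
static int vpos_with_int(int position, int vbl_end, int htotal)
{
	assert(htotal > 0);
	position -= vbl_end;             /* negative while inside vblank */
	return position / htotal;
}
```

With a pixel count of 780 * 1344 + 10 and vbl_end = 806 * 1344, the signed variant yields a small negative line number, while the unsigned variant yields a value above three million.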


Other than that looks good. Execution time is now better:

Before uncore.lock addition: 3-4 usecs execution time for the scanoutpos 
query on my machine. After uncore.lock addition (3.12.0-rc3) 9-20 usecs, 
sometimes repetition of the timing loop triggered. After Ville's patches 
down to typically 3-8 usecs, occasionally spiking to almost 20 usecs.


I'll make my patches for the realtime kernel + increased accuracy on top 
of drm-next + Ville's patches.


thanks,
-mario

On 09/23/2013 12:02 PM, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

The reported scanout position must be relative to the end of vblank.
Currently we manage to fumble that in a few ways.

First we don't consider the case when vtotal != vbl_end. While that
isn't very common (happens maybe only w/ old panel fitting hardware),
we can fix it easily enough.

The second issue is that on pre-CTG hardware we convert the pixel count
to horizontal/vertical components at the very beginning, and then forget
to adjust the horizontal component to be relative to vbl_end. So instead
we should keep our numbers in the pixel count domain while we're
adjusting the position to be relative to vbl_end. Then when we do the
conversion in the end, both vertical _and_ horizontal components will
come out correct.
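
A standalone restatement of the approach described above, as a sketch rather than the patch itself: the struct and function names are hypothetical, the position is taken as a signed pixel count since start of frame, and all line-based limits are scaled to pixel counts before the adjustment, so that the split into vertical and horizontal components happens only at the end.

```c
#include <assert.h>

struct scanpos {
	int vpos; /* lines relative to vbl_end, negative inside vblank */
	int hpos; /* remaining pixels within that line */
};

/*
 * position:  pixel count since start of frame
 * htotal:    pixels per line
 * vtotal, vbl_start, vbl_end: given in lines
 */
static struct scanpos
relative_scanpos(int position, int htotal, int vtotal,
		 int vbl_start, int vbl_end)
{
	struct scanpos p;

	assert(htotal > 0);

	/* Work in the pixel count domain. */
	vbl_start *= htotal;
	vbl_end *= htotal;
	vtotal *= htotal;

	/* Make the position relative to the end of vblank. */
	if (position >= vbl_start)
		position -= vbl_end;
	else
		position += vtotal - vbl_end;

	/* Only now split into vertical and horizontal components. */
	p.vpos = position / htotal;
	p.hpos = position - (p.vpos * htotal);
	return p;
}
```

The third case in a quick check, vtotal = 52 with vbl_end = 50, exercises the vtotal != vbl_end situation mentioned in the first paragraph of the commit message.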

Cc: Mario Kleiner mario.klei...@tuebingen.mpg.de
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
  drivers/gpu/drm/i915/i915_irq.c | 37 -
  1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 697d62c..4f74f0c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -615,13 +615,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
/* No obvious pixelcount register. Only query vertical
 * scanout position from Display scan line register.
 */
-   position = I915_READ(PIPEDSL(pipe));
-
-   /* Decode into vertical scanout position. Don't have
-* horizontal scanout position.
-*/
-   *vpos = position & 0x1fff;
-   *hpos = 0;
+   position = I915_READ(PIPEDSL(pipe)) & 0x1fff;
} else {
/* Have access to pixelcount since start of frame.
 * We can split this into vertical and horizontal
@@ -629,15 +623,32 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 */
position = (I915_READ(PIPEFRAMEPIXEL(pipe)) & PIPE_PIXEL_MASK) >>
 PIPE_PIXEL_SHIFT;
  
-   *vpos = position / htotal;
-   *hpos = position - (*vpos * htotal);
+   /* convert to pixel counts */
+   vbl_start *= htotal;
+   vbl_end *= htotal;
+   vtotal *= htotal;
}
  
-   in_vbl = *vpos >= vbl_start && *vpos < vbl_end;
+   in_vbl = position >= vbl_start && position < vbl_end;
  
-   /* Inside upper part of vblank area? Apply corrective offset: */
-   if (in_vbl && (*vpos >= vbl_start))
-   *vpos = *vpos - vtotal;
+   /*
+* While in vblank, position will be negative
+* counting up towards 0 at vbl_end. And outside
+* vblank, position will be positive counting
+* up since vbl_end.
+*/
+   if (position >= vbl_start)
+   position -= vbl_end;
+   else
+   position += vtotal - vbl_end;
+
+   if (IS_G4X(dev) || INTEL_INFO(dev)->gen >= 5) {
+   *vpos = position;
+   *hpos = 0;
+   } else {
+   *vpos = position / htotal;
+   *hpos = position - (*vpos * htotal);
+   }
  
/* In vblank? */
if (in_vbl)




Re: [Intel-gfx] [PATCH 2/3] drm/i915: Fix scanoutpos calculations

2013-10-11 Thread Mario Kleiner
Yes.
On Oct 12, 2013 1:18 AM, Daniel Vetter dan...@ffwll.ch wrote:

 On Fri, Oct 11, 2013 at 04:31:38PM +0200, Mario Kleiner wrote:
  Daniel, Ville,
 
  i tested Ville's patch series for the scanoutpos improvements on a
  GMA-950, on top of airlied's current drm-next branch.
 
  There's one issue: The variable position in
  i915_get_crtc_scanoutpos() must be turned from a u32 into an int,
  otherwise funny sign errors happen and we end up with *vpos being
  off by multiple million scanlines and timestamps being off by over
  60 seconds.
 
  Other than that looks good. Execution time is now better:
 
  Before uncore.lock addition: 3-4 usecs execution time for the
  scanoutpos query on my machine. After uncore.lock addition
  (3.12.0-rc3) 9-20 usecs, sometimes repetition of the timing loop
  triggered. After Ville's patches down to typically 3-8 usecs,
  occasionally spiking to almost 20 usecs.
 
  I'll make my patches for the realtime kernel + increased accuracy on
  top of drm-next + Ville's patches.

 So official reviewed-by/tested-by from you on Ville's latest patches in
 this thread?

Yes.
-mario

 -Daniel
 --
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Skip register reads in i915_get_crtc_scanoutpos()

2013-09-26 Thread Mario Kleiner

On 25.09.13 10:14, Ville Syrjälä wrote:

On Wed, Sep 25, 2013 at 04:35:56AM +0200, Mario Kleiner wrote:

On 23.09.13 13:48, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

We have all the information we need in the mode structure, so going and
reading it from the hardware is pointless, and slower.

We never populated ->get_vblank_timestamp() in the UMS case, and as that
is the only way we'd ever call ->get_scanout_position(), we can
completely ignore UMS in i915_get_crtc_scanoutpos().

Also reorganize intel_irq_init() a bit to clarify the KMS vs. UMS
situation.

v2: Drop UMS code

Cc: Mario Kleiner mario.klei...@tuebingen.mpg.de
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
   drivers/gpu/drm/i915/i915_irq.c | 43 
-
   1 file changed, 17 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b356dc1..058f099 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -570,24 +570,29 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
   static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 int *vpos, int *hpos)
   {
-   drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-   u32 vbl = 0, position = 0;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
+   struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+   const struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
+   u32 position;
int vbl_start, vbl_end, htotal, vtotal;
bool in_vbl = true;
int ret = 0;
-   enum transcoder cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv,
- pipe);

-   if (!i915_pipe_enabled(dev, pipe)) {
+   if (!intel_crtc->active) {
DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled "
 "pipe %c\n", pipe_name(pipe));
return 0;
}

-   /* Get vtotal. */
-   vtotal = 1 + ((I915_READ(VTOTAL(cpu_transcoder)) >> 16) & 0x1fff);
+   htotal = mode->crtc_htotal;
+   vtotal = mode->crtc_vtotal;
+   vbl_start = mode->crtc_vblank_start;
+   vbl_end = mode->crtc_vblank_end;

-   if (INTEL_INFO(dev)->gen >= 4) {
+   ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
+
+   if (IS_G4X(dev) || INTEL_INFO(dev)->gen >= 5) {
/* No obvious pixelcount register. Only query vertical
 * scanout position from Display scan line register.
 */
@@ -605,29 +610,16 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 */
position = (I915_READ(PIPEFRAMEPIXEL(pipe)) & PIPE_PIXEL_MASK) >>
 PIPE_PIXEL_SHIFT;

-   htotal = 1 + ((I915_READ(HTOTAL(cpu_transcoder)) >> 16) & 0x1fff);
*vpos = position / htotal;
*hpos = position - (*vpos * htotal);
}

-   /* Query vblank area. */
-   vbl = I915_READ(VBLANK(cpu_transcoder));
-
-   /* Test position against vblank region. */
-   vbl_start = vbl & 0x1fff;
-   vbl_end = (vbl >> 16) & 0x1fff;
-
-   if ((*vpos < vbl_start) || (*vpos > vbl_end))
-   in_vbl = false;
+   in_vbl = *vpos >= vbl_start && *vpos < vbl_end;


I think this should be a <= instead of < in *vpos < vbl_end, if it is
meant to be equal to the line it replaces (not > is <=), unless the
original comparison was off-by-one?


Yeah, I think the original was wrong, in more ways than one. It forgot
to add +1 to vbl_start/end, and then it did the comparison wrong as
well.
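
To make the boundary case under discussion concrete, here is a small sketch with hypothetical helper names. It assumes, as the reply above suggests but does not spell out, that vbl_end names the first line after vblank, so the half-open test is the intended one; the closed test reproduces what the removed code computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Closed interval, as in the removed code: vbl_end itself counts as vblank. */
static bool in_vblank_closed(int vpos, int vbl_start, int vbl_end)
{
	assert(vbl_start <= vbl_end);
	return vpos >= vbl_start && vpos <= vbl_end;
}

/* Half-open interval, as in the patch: vbl_end is already outside vblank. */
static bool in_vblank_half_open(int vpos, int vbl_start, int vbl_end)
{
	assert(vbl_start <= vbl_end);
	return vpos >= vbl_start && vpos < vbl_end;
}
```

The two tests differ only for vpos == vbl_end, which is exactly the one-line discrepancy raised in the review question.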



Ah ok, that's possible. Then you have my blessing :).

On the Intel side i only had and have sporadic access to an old Intel 
GMA-950 (Gen-3?) when writing that function, so i could only really test 
one half of the code-path in that function. Also that card only has a 
VGA output, which limits my actual measurements to use of a photo-diode 
attached to a CRT monitor. That means i can only verify the accuracy of 
timestamping down to about 0.2 msecs variability and 0.5 msecs bias due 
to the limitations/noise of the measurement setup (depending how close i 
get the photo-diode to the corner of the monitor, how dark it is, etc.). 
So i know that the jitter in the timestamps is very low, less than 1 
usec standard deviation iirc, and that the absolute error wrt. reality 
is lower than 0.2 msecs, but i wouldn't be able to detect absolute 
errors of a few scanlines.


-mario



   +in_vbl = *vpos >= vbl_start && *vpos <= vbl_end;

Other than that, it looks good.

Reviewed-by: mario.kleiner...@gmail.com



/* Inside upper part of vblank area? Apply corrective offset: */
if (in_vbl && (*vpos >= vbl_start))
*vpos = *vpos

Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7

2013-09-26 Thread Mario Kleiner

On 25.09.13 16:13, Steven Rostedt wrote:

On Wed, 25 Sep 2013 06:32:10 +0200
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:



But given the new situation, your proposal is great! If we push the
clock readouts into the get_scanoutpos routine, we can make this robust
without causing grief for the rt people and without the need for a new
separate lock for display regs in intel-kms.

E.g., for intel-kms:

i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
{
...
spin_lock_irqsave(...uncore.lock);
preempt_disable();
*stime = ktime_get();
position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
*etime = ktime_get();
preempt_enable();
spin_unlock_irqrestore(...uncore.lock)
...
}

With your patchset to reduce the amount of register reads needed in that
function, and given that forcewake handling isn't needed for these
registers, this should make it robust again and wouldn't need new locks.

Unless ktime_get is also a bad thing to do in a preempt disabled section?


ktime_get() works fine in preempt_disable sections, although it may add
some latencies, but you shouldn't need to worry about it.

I like this solution the best too, but if it does go in, I would ask to
send us the patch for adding the preempt_disable() and we can add the
preempt_disable_rt() to it. Why make mainline have a little more
overhead?

-- Steve


Good! I will do that. Thanks for clarifying the irq and constraints on 
raw locks in the other thread.


-mario




Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7

2013-09-26 Thread Mario Kleiner

On 25.09.13 09:49, Ville Syrjälä wrote:

On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:

On 23.09.13 10:38, Ville Syrjälä wrote:

On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:

On 09/17/2013 10:55 PM, Daniel Vetter wrote:

On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley pe...@hurleysoftware.com wrote:

On 09/11/2013 03:31 PM, Peter Hurley wrote:


[+cc dri-devel]

On 09/11/2013 11:38 AM, Steven Rostedt wrote:


On Wed, 11 Sep 2013 11:16:43 -0400
Peter Hurley pe...@hurleysoftware.com wrote:


The funny part is, there's a comment there that shows that this was
done even for PREEMPT_RT. Unfortunately, the call to
get_scanout_position() can call functions that use the rt-mutex
sleeping spin locks and it breaks there.

I guess we need to ask the authors of the mainline patch exactly why
that preempt_disable() is needed?



The drm core associates a timestamp with each vertical blank frame #.
Drm drivers can optionally support a 'high resolution' hw timestamp.
The vblank frame #/timestamp tuple is user-space visible.

The i915 drm driver supports a hw timestamp via this drm helper function
which computes the timestamp from the crtc scan position (based on the
pixel clock).
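
As a rough illustration of how a timestamp can be derived from the scan position and the pixel clock, here is a simplified sketch. It is not the drm helper itself: the function name and parameters are hypothetical, and it assumes the scan position is expressed relative to the start of active scanout (negative inside vblank), so that the result is the time at which active scanout starts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * query_time_ns:  time at which the scanout position was sampled
 * vpos, hpos:     scan position relative to start of active scanout
 * htotal:         pixels per line, including blanking
 * pixel_clock_hz: pixel clock in Hz
 */
static int64_t vblank_timestamp_ns(int64_t query_time_ns, int vpos, int hpos,
				   int htotal, int64_t pixel_clock_hz)
{
	assert(htotal > 0 && pixel_clock_hz > 0);

	/* Pixels scanned out since (or remaining until) start of scanout. */
	int64_t pixels = (int64_t)vpos * htotal + hpos;

	/* Convert the pixel count into a duration via the pixel clock. */
	int64_t delta_ns = pixels * 1000000000LL / pixel_clock_hz;

	/* Step back (or forward, if negative) from the sample time. */
	return query_time_ns - delta_ns;
}
```

Any delay between reading the registers and reading the clock shifts query_time_ns directly, which is why the thread is concerned with sleeping or servicing interrupts in between.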

For mainline, the preempt_disable/_enable() isn't actually necessary
because every call tree that leads here already has preemption disabled.

For -RT, maybe the i915 register spinlock (uncore.lock) should be raw?



No, it should not. Note, any other lock that can be held when it is
held would also need to be raw.



By that, you mean any other lock that might be claimed would also need
to be raw?  Hopefully not any other lock already held?


And by taking a quick audit of the code, I see this:

   spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);

   /* Reset the chip */

   /* GEN6_GDRST is not in the gt power well, no need to check
* for fifo space for the write or forcewake the chip for
* the read
*/
   __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);

   /* Spin waiting for the device to ack the reset request */
   ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);

That spin is unacceptable in RT with preemption and interrupts disabled.



Yep. That would be bad.

AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
in the force-wake set, so raw reads of the registers would
probably be acceptable (thus obviating the need for claiming the
uncore.lock).

Except that _ALL_ register access is disabled with the uncore.lock
during a gpu reset. Not sure if that's meant to include crtc registers
or not, or what other synchronization/serialization issues are being
handled/hidden by forcing all register accesses to wait during a gpu
reset.

Hopefully an i915 expert can weigh in here?




Daniel,

Can you shed some light on whether the i915+ crtc registers (specifically
those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
read as part of the vblank counter/timestamp handling need to
be prevented during gpu reset?


The dependency here in the locking is a recent addition:

commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
Author: Chris Wilson ch...@chris-wilson.co.uk
Date:   Fri Jul 19 20:36:51 2013 +0100

   drm/i915: Serialize almost all register access

It's a (slightly) oversized hammer to work around a hardware issue -
we could break it down to register blocks, which can be accessed
concurrently, but that tends to be more fragile. But the chip really
dies if you access (even just reads) the same block concurrently :(

We could try break the spinlock protected section a bit in the reset
handler - register access on a hung gpu tends to be ill-defined
anyway.


The implied wait with preemption and interrupts disabled is causing grief
in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.


Oops, the magic code in wait_for which is just there to make the imo
totally misguided kgdb support work papered over the awfully long wait
in atomic context ever since we've added this in

commit b6e45f866465f42b53d803b0c574da0fc508a0e9
Author: Keith Packard kei...@keithp.com
Date:   Fri Jan 6 11:34:04 2012 -0800

   drm/i915: Move reset forcewake processing to gen6_do_reset

Reverting this change should be enough (code moved obviously a bit).

Cheers, Daniel



Regards,
Peter Hurley




What's the real issue here?



That the vblank timestamp needs to be an accurate measurement of a
realtime event. Sleeping/servicing interrupts while reading
the registers necessary to compute the timestamp would be bad too.

(edit: which hopefully Mario Kleiner clarified in his reply)

My point earlier was three-fold:
1. Don't need the preempt_disable() for mainline: all callers are already
   holding interrupt-disabling spinlocks.
2. -RT still needs to prevent scheduling there.
3. the problem is i915-specific.

[update: the radeon driver should also BUG like the i915 driver but
probably
