[Intel-gfx] Canceled event: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (intel-gfx@lists.freedesktop.org)
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:CANCEL
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170848Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=etna...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=xorg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=amd-gfx list;X-NUM-GUESTS=0:mailto:amd-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=intel-gfx;X-NUM-GUESTS=0:mailto:intel-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=Nouveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;CN=mario.kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=boa...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=libre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-dev@lists.libre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=ML mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=memb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=freedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=droidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=wayland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-devel@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=dri-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=sigles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=eve...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;X-NUM-GUESTS=0:mailto:bibby.hs...@mediatek.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN="Garg, Rohan";X-NUM-GUESTS=0:mailto:rohan.g...@intel.com
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for XDC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc2023.x.org\n \nAs usual\, the conference is free of charge and open to the general\npublic. If you plan on attending\, please make sure to register as early\nas possible!\n \nIn order to register as attendee\, you will therefore need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/event/4/registrations/\n \nIn addition to registration\, the CfP is now open for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\nJoin with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more about Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease do not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170847Z
LOCATION:
SEQUENCE:1
STATUS:CANCELLED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
END:VEVENT
END:VCALENDAR

invite.ics
Description: application/ics
[Intel-gfx] Invitation: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (intel-gfx@lists.freedesktop.org)
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170311Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=etna...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=xorg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=amd-gfx list;X-NUM-GUESTS=0:mailto:amd-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=intel-gfx;X-NUM-GUESTS=0:mailto:intel-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=Nouveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;RSVP=TRUE;CN=mario.kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=bo...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=libre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-de...@lists.libre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=ML mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=memb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=freedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=droidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=wayland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=dri-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=sigles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN=eve...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
X-MICROSOFT-CDO-OWNERAPPTID:148915568
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for XDC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc2023.x.org\n \nAs usual\, the conference is free of charge and open to the general\npublic. If you plan on attending\, please make sure to register as early\nas possible!\n \nIn order to register as attendee\, you will therefore need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/event/4/registrations/\n \nIn addition to registration\, the CfP is now open for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\nJoin with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more about Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease do not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170310Z
LOCATION:
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:intel-gfx@lists.freedesktop.org
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:This is an event reminder
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:intel-gfx@lists.freedesktop.org
TRIGGER:-P0DT7H30M0S
END:VALARM
END:VEVENT
END:VCALENDAR

invite.ics
Description: application/ics
Re: [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended
On Fri, Feb 11, 2022 at 9:44 AM Sebastian Andrzej Siewior wrote:

> On 2022-01-27 00:29:37 [+0100], Mario Kleiner wrote:
> > Hi, first thank you for implementing these preempt disables according to
>
> Hi Mario,
>
> > the markers i left long ago. And sorry for the rather late reply.
> >
> > I had a look at the code, as of Linux 5.16, and did also a little test run
> > (of a standard kernel, not with PREEMPT_RT, only
> > CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:
> >
> > > The area covers only register reads and writes. The part that worries me
> > > is:
> > > - __intel_get_crtc_scanline() the worst case is 100us if no match is
> > >   found.
> >
> > This one can be a problem indeed on (maybe all?) modern Intel gpu's since
> > Haswell, ie. the last ~10 years. I was able to reproduce it on my Kabylake
> > Intel gpu.
> >
> > Most of the time that for-loop with up to 100 repetitions (~ 100
> > udelay(1) + one mmio register read) (cfe.
> > https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
> > will not execute, because most of the time that function gets called from
> > the vblank irq handler and then that trigger condition (if
> > (HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
> > as part of power-saving on behalf of userspace context, whenever the
> > desktop graphics goes idle for two video refresh cycles. If the desktop
> > shows graphics activity again, and vblank interrupts need to get reenabled,
> > the probability of hitting that case is then ~1-4% depending on video mode.
> > How many loops it runs also varies.
> >
> > On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
> > desktop, I observed about one hit every couple of seconds of regular use,
> > and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
> > can take a bit longer than 1 usec?
>
> it should get very close to this. Maybe something else extended the time
> depending on what you observe.

Probably all the other stuff in that for-loop adds a microsecond. I don't
have a good feeling how long a typical mmio register read is expected to
take, except for quite a bit less than 1 usec from my experience.

> > So that's too much for preempt-rt. What one could do is the following:
> >
> > 1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
> > before the udelay(1); and a preempt_disable() again after it. Or
> > potentially around the whole for-loop if the overhead of
> > preempt_en/disable() is significant?
>
> It is very optimized on x86 ;)

Good! So adding a disable/enable pair into each of those loop iterations
won't hurt.

> > 2. In intel_get_crtc_scanline() also wrap the call to
> > __intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
> > so we can be sure that __intel_get_crtc_scanline() always gets called with
> > preemption disabled.
> >
> > Why should this work ok'ish? The point of the original preempt disable
> > inside i915_get_crtc_scanoutpos
> > <https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/i915_get_crtc_scanoutpos>
> > is that those two *stime = ktime_get() and *etime = ktime_get() clock
> > queries happen as close to the scanout position query as possible to get a
> > small confidence interval for when exactly the scanoutpos was
> > read/determined from the display hardware. error = (etime - stime) is the
> > error margin. If that margin becomes greater than 20 usecs, then the
> > higher-level code will consider the measurement invalid and repeat the
> > whole procedure up to 3 times before giving up.
>
> The preempt-disable is needed then? The task is preemptible here on
> PREEMPT_RT but it _may_ not come to this. The difference vs !RT is that
> an interrupt will preempt this code without it.

Yes, it is needed, as that chunk of code between the two ktime_get()
queries should ideally not get interrupted by anything. The "try up to
three times" higher level logic in calling code is just to cover the
hopefully rare cases where something still preempts, e.g., a NMI or such.
I have not ever tested this on a PREEMPT_RT kernel in at least a decade,
but on regular kernels, e.g., Ubuntu generic or Ubuntu low-latency kernels
I haven't observed more than one retry when it mattered, and usually the
code executes in 0-2 usecs on my test machines, way below the limit of 20
usecs at which a measurement is considered failed and then retried. So the
retries are sufficient as long as all preventable preemption is prevented.
Hence the preempt_disable() ann
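The two loop changes agreed on in this exchange (open a preemption window
around each udelay(1), and have the caller hold preemption disabled around
the whole query) could be sketched roughly as below. This is a userspace
model and not the i915 code: preempt_disable(), preempt_enable() and
udelay() are bookkeeping stand-ins for the kernel primitives, and struct
fake_hw, fake_read_scanline(), poll_scanline() and get_scanline() are
hypothetical names invented for the illustration.

```c
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel primitives, so that the sketch
 * compiles and runs outside the kernel. They only do bookkeeping.
 */
static int preempt_depth;       /* > 0 means "preemption disabled" */
static int preemptible_delays;  /* delays that ran with preemption enabled */

static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { preempt_depth--; }

static void udelay(unsigned int usecs)
{
    (void)usecs;                /* no real delay in this sketch */
    if (preempt_depth == 0)
        preemptible_delays++;
}

/* Hypothetical scanline register: reads 0 for a while, then a value. */
struct fake_hw {
    int stuck_reads;
    uint32_t value;
};

static uint32_t fake_read_scanline(struct fake_hw *hw)
{
    if (hw->stuck_reads > 0) {
        hw->stuck_reads--;
        return 0;
    }
    return hw->value;
}

/* Expects preemption disabled on entry and restores that state on exit. */
static uint32_t poll_scanline(struct fake_hw *hw, uint32_t position)
{
    for (int i = 0; i < 100; i++) {
        preempt_enable();       /* suggestion 1: preemptible during the delay */
        udelay(1);
        preempt_disable();

        uint32_t temp = fake_read_scanline(hw);
        if (temp != position)
            return temp;
    }
    return position;
}

/* Suggestion 2: the caller guarantees preemption is off around the query. */
static uint32_t get_scanline(struct fake_hw *hw)
{
    preempt_disable();
    uint32_t position = fake_read_scanline(hw);
    if (position == 0)
        position = poll_scanline(hw, position);
    preempt_enable();
    return position;
}
```

Calling get_scanline() on a fake_hw that reads 0 a few times returns the
first non-zero value, with every delay counted as preemptible and the
preemption depth balanced on return.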
Re: [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended
On Tue, Dec 14, 2021 at 3:03 PM Sebastian Andrzej Siewior
<bige...@linutronix.de> wrote:

> From: Mike Galbraith
>
> Mario Kleiner suggest in commit
> ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into
> kms driver.")
>
> a spots where preemption should be disabled on PREEMPT_RT. The
> difference is that on PREEMPT_RT the intel_uncore::lock disables neither
> preemption nor interrupts and so region remains preemptible.

Hi, first thank you for implementing these preempt disables according to
the markers I left long ago. And sorry for the rather late reply.

I had a look at the code, as of Linux 5.16, and did also a little test run
(of a standard kernel, not with PREEMPT_RT, only
CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:

> The area covers only register reads and writes. The part that worries me
> is:
> - __intel_get_crtc_scanline() the worst case is 100us if no match is
>   found.

This one can be a problem indeed on (maybe all?) modern Intel gpu's since
Haswell, ie. the last ~10 years. I was able to reproduce it on my Kabylake
Intel gpu.

Most of the time that for-loop with up to 100 repetitions (~ 100
udelay(1) + one mmio register read) (cf.
https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
will not execute, because most of the time that function gets called from
the vblank irq handler and then that trigger condition (if
(HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
as part of power-saving on behalf of userspace context, whenever the
desktop graphics goes idle for two video refresh cycles. If the desktop
shows graphics activity again, and vblank interrupts need to get reenabled,
the probability of hitting that case is then ~1-4% depending on video mode.
How many loops it runs also varies.

On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
desktop, I observed about one hit every couple of seconds of regular use,
and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
can take a bit longer than 1 usec?

So that's too much for preempt-rt. What one could do is the following:

1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
before the udelay(1); and a preempt_disable() again after it. Or
potentially around the whole for-loop if the overhead of
preempt_en/disable() is significant?

2. In intel_get_crtc_scanline() also wrap the call to
__intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
so we can be sure that __intel_get_crtc_scanline() always gets called with
preemption disabled.

Why should this work ok'ish? The point of the original preempt disable
inside i915_get_crtc_scanoutpos
<https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/i915_get_crtc_scanoutpos>
is that those two *stime = ktime_get() and *etime = ktime_get() clock
queries happen as close to the scanout position query as possible to get a
small confidence interval for when exactly the scanoutpos was
read/determined from the display hardware. error = (etime - stime) is the
error margin. If that margin becomes greater than 20 usecs, then the
higher-level code will consider the measurement invalid and repeat the
whole procedure up to 3 times before giving up.

Normally, in my experience with different graphics chips, one would observe
error < 3 usecs, so the measurement almost always succeeds at first try,
only very rarely takes two attempts. The preempt disable is meant to make
sure that this stays the case on a PREEMPT_RT kernel.

The problem here is the relatively rare cases where we hit that up to 100
iterations for-loop. Here even on a regular kernel, due to hardware quirks,
we already exceed the 20 usecs tolerance by a huge amount of more than 100
usecs, leading to a retry of the measurement. And my tests showed that
often the two succeeding retries also fail, because hardware quirks can
apparently create a blackout situation approaching 1 msec, so we lose
anyway, regardless if we get preempted on a RT kernel or not. That's why
enabling preemption on RT again during that for-loop should not make the
situation worse and at least keep RT as real-time as intended.

I would also expect that this failure case is the one least likely to
impair userspace applications greatly in practice. The cases that mostly
matter are the ones executed during vblank hardware irq, where the for-loop
never executes and error margin and preempt off time is only about 1 usec.
My own software which depends on very precise timestamps from the mechanism
never reported >> 20 usecs errors during startup tests or runtime tests.

> - intel_crtc_scanlines_since_frame_timestamp() not sure how long this
>   may take in the worst case.

intel_crtc_scanlines_since_frame_timestamp() should be harmless. That
do-while loop just tries to make sure that two
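The stime/etime bracketing and the retry rule described in this message can
be illustrated with a small userspace model. It is only a sketch under
stated assumptions: "up to 3 times" is read here as three attempts in
total, the 20 usecs tolerance is hard-coded, and struct scripted_hw,
fake_ktime_get(), fake_query_position() and sample_scanout() are
hypothetical names that replace the real clock and display hardware with
scripted values.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ATTEMPTS  3      /* assumption: three attempts in total */
#define MAX_ERROR_NS  20000  /* 20 usecs tolerance from the text above */

/* Hypothetical scripted clock and hardware, so that runs are repeatable. */
struct scripted_hw {
    const int64_t *times_ns; /* successive clock readings */
    int next_time;
    int position;            /* scanout position the "hardware" reports */
};

static int64_t fake_ktime_get(struct scripted_hw *hw)
{
    return hw->times_ns[hw->next_time++];
}

static int fake_query_position(struct scripted_hw *hw)
{
    return hw->position;
}

struct scanout_sample {
    int position;
    int64_t stime_ns;
    int64_t etime_ns;
    int attempts;
};

/*
 * Bracket the position query between two clock reads. The sample is
 * accepted only if error = etime - stime stays within the tolerance,
 * otherwise the whole procedure is repeated.
 */
static bool sample_scanout(struct scripted_hw *hw, struct scanout_sample *out)
{
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        int64_t stime = fake_ktime_get(hw);
        int position = fake_query_position(hw);
        int64_t etime = fake_ktime_get(hw);

        out->position = position;
        out->stime_ns = stime;
        out->etime_ns = etime;
        out->attempts = attempt;

        if (etime - stime <= MAX_ERROR_NS)
            return true;
    }
    return false;            /* every attempt exceeded the tolerance */
}
```

With a script in which the first bracket takes 150 usecs and the second
1 usec, the sample is accepted on the second attempt. If all three brackets
exceed 20 usecs, the function reports failure, which corresponds to the
"we lose anyway" case above.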
Re: [Intel-gfx] [PATCH v2 2/7] drm/uAPI: Add "active bpc" as feedback channel for "max bpc" drm property
On Thu, Jun 10, 2021 at 9:55 AM Pekka Paalanen wrote:

> On Tue, 8 Jun 2021 19:43:15 +0200
> Werner Sembach wrote:
>
> > Add a new general drm property "active bpc" which can be used by graphic
> > drivers
> > to report the applied bit depth per pixel back to userspace.

Maybe "bit depth per pixel" -> "bit depth per pixel color component" for
slightly more clarity?

> > While "max bpc" can be used to change the color depth, there was no way to
> > check
> > which one actually got used. While in theory the driver chooses the
> > best/highest
> > color depth within the max bpc setting a user might not be fully aware what
> > his
> > hardware is or isn't capable off. This is meant as a quick way to double
> > check
> > the setup.
> >
> > In the future, automatic color calibration for screens might also depend on
> > this
> > information being available.
> >
> > Signed-off-by: Werner Sembach
> > ---
> >  drivers/gpu/drm/drm_atomic_uapi.c |  2 ++
> >  drivers/gpu/drm/drm_connector.c   | 41 +++
> >  include/drm/drm_connector.h       | 15 +++
> >  3 files changed, 58 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> > index 268bb69c2e2f..7ae4e40936b5 100644
> > --- a/drivers/gpu/drm/drm_atomic_uapi.c
> > +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> > @@ -873,6 +873,8 @@ drm_atomic_connector_get_property(struct drm_connector *connector,
> >          *val = 0;
> >      } else if (property == connector->max_bpc_property) {
> >          *val = state->max_requested_bpc;
> > +    } else if (property == connector->active_bpc_property) {
> > +        *val = state->active_bpc;
> >      } else if (connector->funcs->atomic_get_property) {
> >          return connector->funcs->atomic_get_property(connector,
> >                  state, property, val);
> > diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> > index 7631f76e7f34..c0c3c09bfed0 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -1195,6 +1195,14 @@ static const struct drm_prop_enum_list dp_colorspaces[] = {
> >   * drm_connector_attach_max_bpc_property() to create and attach the
> >   * property to the connector during initialization.
> >   *
> > + * active bpc:
> > + * This read-only range property tells userspace the pixel color bit depth
> > + * actually used by the hardware display engine on "the cable" on a
> > + * connector. The chosen value depends on hardware capabilities, both
> > + * display engine and connected monitor, and the "max bpc" property.
> > + * Drivers shall use drm_connector_attach_active_bpc_property() to install
> > + * this property.
> > + *
>
> This description is now clear to me, but I wonder, is it also how
> others understand it wrt. dithering?
>
> Dithering done on monitor is irrelevant, because we are talking about
> "on the cable" pixels. But since we are talking about "on the cable"
> pixels, also dithering done by the display engine must not factor in.
> Should the dithering done by display engine result in higher "active
> bpc" number than what is actually transmitted on the cable?
>
> I cannot guess what userspace would want exactly. I think the
> strict "on the cable" interpretation is a safe bet, because it then
> gives a lower limit on observed bpc. Dithering settings should be
> exposed with other KMS properties, so userspace can factor those in.
> But to be absolutely sure, we'd have to ask some color management
> experts.
>
> Cc'ing Mario in case he has an opinion.

Thanks. I like this a lot, in fact such a connector property was on my todo
list / wish list for something like that!

I agree with the "active bpc" definition here in this patch and Pekka's
comments. I want what goes out over the cable, not including any effects of
dithering. At least AMD's amdgpu-kms driver exposes "active bpc" already as
a per-connector property in debugfs, and I use reported output from there a
lot to debug problems with respect to HDR display or high color precision
output, and to verify I'm not fooling myself wrt. what goes out, compared
to what dithering may "fake" on top of it.

Software like mine would greatly benefit from getting this directly off the
connector, ie. as a RandR output property, just like with "max bpc", as
mapping X11 output names to driver output names is a guessing game,
directing regular users to those debugfs files is tedious and error prone,
and many regular users don't have root permissions anyway.

Sometimes one wants to prioritize "active bpc" over resolution or refresh
rate, and especially on now more common HDR displays, and actual bit depth
also changes depending on bandwidth requirements vs. availability, and how
well DP link training went with a flaky or loose cable, like only getting
10 bpc for HDR-10 when running on less than maximum
Re: [Intel-gfx] [PATCH 18/18] drm/i915/display13: Enabling dithering after the CC1 pipe
On Fri, Feb 19, 2021 at 4:22 AM Mario Kleiner wrote:

> On Thu, Feb 11, 2021 at 1:29 PM Ville Syrjälä <
> ville.syrj...@linux.intel.com> wrote:
>
>> On Thu, Jan 28, 2021 at 11:24:13AM -0800, Matt Roper wrote:
>> > From: Nischal Varide
>> >
>> > If the panel is 12bpc then Dithering is not enabled in the Legacy
>> > dithering block , instead its Enabled after the C1 CC1 pipe post
>> > color space conversion.For a 6bpc pannel Dithering is enabled in
>> > Legacy block.
>>
>> Dithering is probably going to require a whole uapi bikeshed.
>> Not sure we can just enable it unilaterally.
>>
>> Ccing dri-devel, and Mario who had issues with dithering in the
>> past...
>>
> Thanks for the cc Ville!
>
> The problem with dithering on Intel is that various tested Intel gpu's
> (Ironlake, IvyBridge, Haswell, Skylake iirc.) are dithering when they
> shouldn't. If one has a standard 8 bpc framebuffer feeding into a standard
> (legacy) 256 slots, 8 bit wide lut which was loaded with an identity
> mapping, feeding into a standard 8 bpc video output (DVI/HDMI/DP), the
> expected result is that pixels rendered into the framebuffer show up
> unmodified at the video output. What happens instead is that some dithering
> is needlessly applied. This is bad for various neuroscience/medical
> research equipment that requires pixels to pass unmodified in a pure 8 bpc
> configuration, e.g., because some digital info is color-encoded in-band in
> the rendered image to control research hardware, a la "if rgb pixel (123,
> 12, 23) is detected in the digital video stream, emit some trigger signal,
> or timestamp that moment with a hw clock, or start or stop some scientific
> recording equipment". Also there exist specialized visual stimulators to
> drive special displays with more than 12 bpc, e.g., 16 bpc, and so they
> encode the 8MSB of 16 bpc color values in pixels in even columns, and the
> 8LSB in the odd columns of the framebuffer. Unexpected dithering makes such
> equipment completely unusable. By now I must have spent months of my life,
> just trying to deal with dithering induced problems on different gpu's due
> to hw quirks or bugs somewhere in the graphics stack.
>
> Atm. the intel kms driver disables dithering for anything with >= 8 bpc as
> a fix for this harmful hardware quirk.
>
> Ideally we'd have uapi that makes dithering controllable per connector
> (on/off/auto, selectable depth), also in a way that those controls are
> exposed as RandR output properties, easily controllable by X clients. And
> some safe default in case the client can't access the properties (like I'd
> expect to happen with the dozens of Wayland compositors under the sun).
> Various drivers had this over time, e.g., AMD classic kms path (if i don't
> misremember) and nouveau, but some of it also got lost in the new atomic
> kms variants, and Intel never exposed this.
>
> Or maybe some method that checks the values actually stored in the hw
> lut's, CTM etc. and if the values suggest no dithering should be needed,
> disable the dithering. E.g., if output depth is 8 bpc, one only needs
> dithering if the slots in the final active hw lut do have any meaningful
> values in the lower bits below the top 8 MSB, ie. if the content is
> actually > 8 bpc net bit depth.
>
> -mario

One cup of coffee later...

I think this specific patch should be ok wrt. my use cases. The majority of
the above mentioned research devices are single/dual-link DVI receivers,
ie. 8 bpc video sinks. I'm only aware of one recent device that has a
DisplayPort receiver that could act as a > 8 bpc video sink. See the
following link for advanced examples of such devices:

https://vpixx.com/our-products/video-i-o-hub/

I cannot think of a use case that would require more than 8 bits for inband
signalling given that that was good enough for the last 20 years, or for
encoding very high color precision content -- the 16 bpc precision that one
can get out of the current even/odd pixel = 8 MSB + 8 LSB encoding scheme
should be enough for the foreseeable future. Therefore dithering shouldn't
pose a problem if it leaves the 8 MSB of each pixel color component intact,
and spatial dithering as employed here usually only touches the least
significant bit (or maybe the 2 LSB's?). As this patch only enables
dithering on 12 bpc video sinks, if I understand pipe_bpp correctly, it
could only "corrupt" one bit and leave at least the 10-11 MSB's intact,
right? pipe_bpp == 24 is the case that would really hurt a lot of
researchers if dithering would be enabled without providing good uapi or
other mechanisms to prevent it.

So:

Acked-by: Mario Kleiner

One suggestion: It would be good to also add a bit of
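To make the even/odd encoding scheme mentioned above concrete, here is a
minimal sketch of how a 16 bpc value could be split across two 8 bpc pixel
components, and what a single-LSB disturbance of the even (MSB) pixel does
to the decoded value in a pure 8 bpc pipeline. The function names
encode16() and decode16() are hypothetical, and the layout (high byte in
the even column, low byte in the odd column) is taken from the description
in this thread, not from any particular device's documentation.

```c
#include <stdint.h>

/* Split a 16 bpc component: 8 MSB go to the even column, 8 LSB to the odd. */
static void encode16(uint16_t value, uint8_t *even_px, uint8_t *odd_px)
{
    *even_px = (uint8_t)(value >> 8);
    *odd_px  = (uint8_t)(value & 0xff);
}

/* What the receiving device would reassemble from the two columns. */
static uint16_t decode16(uint8_t even_px, uint8_t odd_px)
{
    return (uint16_t)(((uint16_t)even_px << 8) | odd_px);
}
```

For example, 0x1234 encodes to even = 0x12, odd = 0x34. If dithering bumps
the even pixel by just one 8 bpc step, the device decodes 0x1334, an error
of 256 out of 65536 levels, which is why unwanted dithering at
pipe_bpp == 24 is so damaging for this kind of equipment.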
Re: [Intel-gfx] [PATCH 18/18] drm/i915/display13: Enabling dithering after the CC1 pipe
On Thu, Feb 11, 2021 at 1:29 PM Ville Syrjälä wrote:

> On Thu, Jan 28, 2021 at 11:24:13AM -0800, Matt Roper wrote:
> > From: Nischal Varide
> >
> > If the panel is 12bpc then Dithering is not enabled in the Legacy
> > dithering block , instead its Enabled after the C1 CC1 pipe post
> > color space conversion.For a 6bpc pannel Dithering is enabled in
> > Legacy block.
>
> Dithering is probably going to require a whole uapi bikeshed.
> Not sure we can just enable it unilaterally.
>
> Ccing dri-devel, and Mario who had issues with dithering in the
> past...

Thanks for the cc Ville!

The problem with dithering on Intel is that various tested Intel gpu's
(Ironlake, IvyBridge, Haswell, Skylake iirc.) are dithering when they
shouldn't. If one has a standard 8 bpc framebuffer feeding into a standard
(legacy) 256 slots, 8 bit wide lut which was loaded with an identity
mapping, feeding into a standard 8 bpc video output (DVI/HDMI/DP), the
expected result is that pixels rendered into the framebuffer show up
unmodified at the video output. What happens instead is that some dithering
is needlessly applied. This is bad for various neuroscience/medical
research equipment that requires pixels to pass unmodified in a pure 8 bpc
configuration, e.g., because some digital info is color-encoded in-band in
the rendered image to control research hardware, a la "if rgb pixel (123,
12, 23) is detected in the digital video stream, emit some trigger signal,
or timestamp that moment with a hw clock, or start or stop some scientific
recording equipment". Also there exist specialized visual stimulators to
drive special displays with more than 12 bpc, e.g., 16 bpc, and so they
encode the 8MSB of 16 bpc color values in pixels in even columns, and the
8LSB in the odd columns of the framebuffer. Unexpected dithering makes such
equipment completely unusable. By now I must have spent months of my life,
just trying to deal with dithering induced problems on different gpu's due
to hw quirks or bugs somewhere in the graphics stack.

Atm. the intel kms driver disables dithering for anything with >= 8 bpc as
a fix for this harmful hardware quirk.

Ideally we'd have uapi that makes dithering controllable per connector
(on/off/auto, selectable depth), also in a way that those controls are
exposed as RandR output properties, easily controllable by X clients. And
some safe default in case the client can't access the properties (like I'd
expect to happen with the dozens of Wayland compositors under the sun).
Various drivers had this over time, e.g., AMD classic kms path (if i don't
misremember) and nouveau, but some of it also got lost in the new atomic
kms variants, and Intel never exposed this.

Or maybe some method that checks the values actually stored in the hw
lut's, CTM etc. and if the values suggest no dithering should be needed,
disable the dithering. E.g., if output depth is 8 bpc, one only needs
dithering if the slots in the final active hw lut do have any meaningful
values in the lower bits below the top 8 MSB, ie. if the content is
actually > 8 bpc net bit depth.

-mario

> > Cc: Uma Shankar
> > Signed-off-by: Nischal Varide
> > Signed-off-by: Bhanuprakash Modem
> > Signed-off-by: Matt Roper
> > ---
> >  drivers/gpu/drm/i915/display/intel_color.c   | 16
> >  drivers/gpu/drm/i915/display/intel_display.c |  9 -
> >  drivers/gpu/drm/i915/i915_reg.h              |  3 ++-
> >  3 files changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
> > index ff7dcb7088bf..9a0572bbc5db 100644
> > --- a/drivers/gpu/drm/i915/display/intel_color.c
> > +++ b/drivers/gpu/drm/i915/display/intel_color.c
> > @@ -1604,6 +1604,20 @@ static u32 icl_csc_mode(const struct intel_crtc_state *crtc_state)
> >      return csc_mode;
> >  }
> >
> > +static u32 dither_after_cc1_12bpc(const struct intel_crtc_state *crtc_state)
> > +{
> > +    u32 gamma_mode = crtc_state->gamma_mode;
> > +    struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
> > +
> > +    if (HAS_DISPLAY13(i915)) {
> > +        if (!crtc_state->dither_force_disable &&
> > +            (crtc_state->pipe_bpp == 36))
> > +            gamma_mode |= GAMMA_MODE_DITHER_AFTER_CC1;
> > +    }
> > +
> > +    return gamma_mode;
> > +}
> > +
> >  static int icl_color_check(struct intel_crtc_state *crtc_state)
> >  {
> >      int ret;
> > @@ -1614,6 +1628,8 @@ static int icl_color_check(struct intel_crtc_state *crtc_state)
> >
> >      crtc_state->gamma_mode = icl_gamma_mode(crtc_state);
> >
> > +    crtc_state->gamma_mode = dither_after_cc1_12bpc(crtc_state);
> > +
> >      crtc_state->csc_mode = icl_csc_mode(crtc_state);
> >
> >      crtc_state->preload_luts = intel_can_preload_luts(crtc_state);
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index
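The LUT-based check floated earlier in this message (only dither if the
active hw lut actually carries information below the bits that reach the
output) might look roughly like the sketch below. It is not driver code.
lut_needs_dither() is a hypothetical helper, and it assumes each LUT entry
is an unsigned value of lut_bits precision whose unused low bits are
zero-padded. LUTs that are filled by bit replication (e.g. 0xff expanded to
0xffff) would need a different comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if any LUT entry has non-zero bits below the top out_bpc
 * bits, i.e. if the content has more net bit depth than the output can
 * carry and dithering could therefore be meaningful.
 *
 * Assumption: entries are lut_bits wide (at most 16), zero-padded in the
 * low bits when the source data has less precision.
 */
static bool lut_needs_dither(const uint16_t *lut, size_t n,
                             unsigned int lut_bits, unsigned int out_bpc)
{
    if (lut_bits > 16 || out_bpc >= lut_bits)
        return false;        /* nothing below the output precision */

    uint16_t low_mask = (uint16_t)((1u << (lut_bits - out_bpc)) - 1u);

    for (size_t i = 0; i < n; i++) {
        if (lut[i] & low_mask)
            return true;
    }
    return false;
}
```

With 16-bit entries, an identity-style table such as { 0x0000, 0x0100,
0x8000, 0xff00 } reports no need for dithering at 8 bpc output, while a
table containing 0x0180 does. The same 0x0180 entry no longer triggers the
check for a 10 bpc output, since its extra precision fits within the top 10
bits.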
[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017 (v3)
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
edp setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

v2: Follow Jani's proposal of defining quirk_rates[] instead
    of just appending 324000. This for better clarity.

v3: Rebased onto current drm-tip, as of 16-March-2020. Adapt
    to new edid_quirks parameter of drm_dp_has_quirk().

Signed-off-by: Mario Kleiner
Tested-by: Mario Kleiner
Cc: Jani Nikula
---
 drivers/gpu/drm/drm_dp_helper.c         |  2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 11 +++
 include/drm/drm_dp_helper.h             |  7 +++
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index c6fbe6e6bc9d..8ba4531e808d 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1238,6 +1238,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
 	{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
 	/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
 	{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+	/* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low DP_MAX_LINK_RATE */
+	{ OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };

 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 0a417cd2af2b..ef2e06e292d5 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -164,6 +164,17 @@ static void intel_dp_set_sink_rates(struct intel_dp *intel_dp)
 	};
 	int i, max_rate;

+	if (drm_dp_has_quirk(&intel_dp->desc, 0,
+			     DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+		/* Needed, e.g., for Apple MBP 2017, 15 inch eDP Retina panel */
+		static const int quirk_rates[] = { 162000, 270000, 324000 };
+
+		memcpy(intel_dp->sink_rates, quirk_rates, sizeof(quirk_rates));
+		intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
+
+		return;
+	}
+
 	max_rate = drm_dp_bw_code_to_link_rate(intel_dp->dpcd[DP_MAX_LINK_RATE]);

 	for (i = 0; i < ARRAY_SIZE(dp_rates); i++) {
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index c6119e4c169a..9d87cdf2740a 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1548,6 +1548,13 @@ enum drm_dp_quirk {
 	 * capabilities advertised.
 	 */
 	DP_QUIRK_FORCE_DPCD_BACKLIGHT,
+	/**
+	 * @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+	 *
+	 * The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+	 * the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+	 */
+	DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };

 /**
--
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
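For readers following the numbers in the commit message: the "multiplier" values and the rates in the sink rate table are related by a fixed step of 0.27 Gbps per code. The helper below is a standalone illustration of that arithmetic with a hypothetical name; it is not the kernel's drm_dp_bw_code_to_link_rate() and ignores any special cases that function may handle.

```c
#include <stdint.h>

/*
 * Illustration only (hypothetical name). One DPCD link rate code
 * step is 0.27 Gbps per lane; the result is in the same unit as
 * the sink rate table entries (162000, 270000, 324000).
 */
static int example_bw_code_to_rate(uint8_t bw_code)
{
	return bw_code * 27000;
}
```

So 0x6 gives 162000, 0xa gives 270000 (what the panel's DP_MAX_LINK_RATE reports), and 0xc gives 324000 (what the firmware actually programs).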
Re: [Intel-gfx] [PATCH v4 2/2] drm/dp: Add function to parse EDID descriptors for adaptive sync limits
Just as a comment, u8 for max_vfreq in struct drm_adaptive_sync_info might not be very future-proof? I just read that ASUS announced a "TUF Gaming VG259QM" monitor which seems to have an adaptive sync range of 48 Hz to 280 Hz, exceeding the max 255 Hz of u8?

-mario

On Fri, Mar 6, 2020 at 4:02 PM Kazlauskas, Nicholas wrote:

>
> On 2020-03-05 8:42 p.m., Manasi Navare wrote:
> > Adaptive Sync is a VESA feature so add a DRM core helper to parse
> > the EDID's detailed descriptors to obtain the adaptive sync monitor range.
> > Store this info as part of drm_display_info so it can be used
> > across all drivers.
> > This part of the code is stripped out of amdgpu's function
> > amdgpu_dm_update_freesync_caps() to make it generic and be used
> > across all DRM drivers
> >
> > v4:
> > * Use is_display_descriptor() (Ville)
> > * Name the monitor range flags (Ville)
> > v3:
> > * Remove the edid parsing restriction for just DP (Nicholas)
> > * Use drm_for_each_detailed_block (Ville)
> > * Make the drm_get_adaptive_sync_range function static (Harry, Jani)
> > v2:
> > * Change vmin and vmax to use u8 (Ville)
> > * Dont store pixel clock since that is just a max dotclock
> > and not related to VRR mode (Manasi)
> >
> > Cc: Ville Syrjälä
> > Cc: Harry Wentland
> > Cc: Clinton A Taylor
> > Cc: Kazlauskas Nicholas
> > Signed-off-by: Manasi Navare
>
> Looks good to me now. I'm fine with whether we want to rename the flags
> or not, I don't have much of a preference either way.
>
> Series is:
>
> Reviewed-by: Nicholas Kazlauskas
>
> Regards,
> Nicholas Kazlauskas
>
> > ---
> >  drivers/gpu/drm/drm_edid.c  | 44 +
> >  include/drm/drm_connector.h | 22 +++
> >  2 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index ad41764a4ebe..61ed544d9535 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -4938,6 +4938,47 @@ static void drm_parse_cea_ext(struct drm_connector *connector,
> >  	}
> >  }
> >
> > +static
> > +void get_adaptive_sync_range(struct detailed_timing *timing,
> > +			     void *info_adaptive_sync)
> > +{
> > +	struct drm_adaptive_sync_info *adaptive_sync = info_adaptive_sync;
> > +	const struct detailed_non_pixel *data = &timing->data.other_data;
> > +	const struct detailed_data_monitor_range *range = &data->data.range;
> > +
> > +	if (!is_display_descriptor((const u8 *)timing, EDID_DETAIL_MONITOR_RANGE))
> > +		return;
> > +
> > +	/*
> > +	 * Check for flag range limits only. If flag == 1 then
> > +	 * no additional timing information provided.
> > +	 * Default GTF, GTF Secondary curve and CVT are not
> > +	 * supported
> > +	 */
> > +	if (range->flags != EDID_RANGE_LIMITS_ONLY_FLAG)
> > +		return;
> > +
> > +	adaptive_sync->min_vfreq = range->min_vfreq;
> > +	adaptive_sync->max_vfreq = range->max_vfreq;
> > +}
> > +
> > +static
> > +void drm_get_adaptive_sync_range(struct drm_connector *connector,
> > +				 const struct edid *edid)
> > +{
> > +	struct drm_display_info *info = &connector->display_info;
> > +
> > +	if (!version_greater(edid, 1, 1))
> > +		return;
> > +
> > +	drm_for_each_detailed_block((u8 *)edid, get_adaptive_sync_range,
> > +				    &info->adaptive_sync);
> > +
> > +	DRM_DEBUG_KMS("Adaptive Sync refresh rate range is %d Hz - %d Hz\n",
> > +		      info->adaptive_sync.min_vfreq,
> > +		      info->adaptive_sync.max_vfreq);
> > +}
> > +
> >  /* A connector has no EDID information, so we've got no EDID to compute quirks from. Reset
> >   * all of the values which would have been set from EDID
> >   */
> > @@ -4960,6 +5001,7 @@ drm_reset_display_info(struct drm_connector *connector)
> >  	memset(&info->hdmi, 0, sizeof(info->hdmi));
> >
> >  	info->non_desktop = 0;
> > +	memset(&info->adaptive_sync, 0, sizeof(info->adaptive_sync));
> >  }
> >
> >  u32 drm_add_display_info(struct drm_connector *connector, const struct edid *edid)
> > @@ -4975,6 +5017,8 @@ u32 drm_add_display_info(struct drm_connector *connector, const struct edid *edi
> >
> >  	info->non_desktop = !!(quirks & EDID_QUIRK_NON_DESKTOP);
> >
> > +	drm_get_adaptive_sync_range(connector, edid);
> > +
> >  	DRM_DEBUG_KMS("non_desktop set to %d\n", info->non_desktop);
> >
> >  	if (edid->revision < 3)
> > diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> > index 0df7a95ca5d9..2b22c0fa42c4 100644
> > --- a/include/drm/drm_connector.h
> > +++ b/include/drm/drm_connector.h
> > @@ -254,6 +254,23 @@ enum drm_panel_orientation {
> >  	DRM_MODE_PANEL_ORIENTATION_RIGHT_UP,
> >  };
> >
> > +/**
> > + * struct drm_adaptive_sync_info - Panel's Adaptive Sync capabilities for
> > + * &drm_display_info
> > + *
> > + * This struct is used to store a Panel's
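To make the u8 concern at the top of this reply concrete: storing 280 in an 8-bit field silently wraps. The structs and helpers below are hypothetical stand-ins written for this illustration, not the definitions from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the struct under discussion. */
struct example_range_u8 {
	uint8_t min_vfreq;
	uint8_t max_vfreq;
};

struct example_range_u16 {
	uint16_t min_vfreq;
	uint16_t max_vfreq;
};

/* Store a refresh range in 8-bit fields; values above 255 wrap. */
static struct example_range_u8 example_store_u8(unsigned int min_hz,
						unsigned int max_hz)
{
	struct example_range_u8 r = { (uint8_t)min_hz, (uint8_t)max_hz };

	return r;
}

/* Store the same range in 16-bit fields; 280 Hz fits. */
static struct example_range_u16 example_store_u16(unsigned int min_hz,
						  unsigned int max_hz)
{
	struct example_range_u16 r = { (uint16_t)min_hz, (uint16_t)max_hz };

	return r;
}
```

For the 48 Hz to 280 Hz range mentioned above, the 8-bit variant ends up with max_vfreq == 24 (280 modulo 256), which is below min_vfreq, while the 16-bit variant keeps 280.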
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017
On Wed, Mar 4, 2020 at 4:32 PM Jani Nikula wrote:

>
> On Sat, 29 Feb 2020, Mario Kleiner wrote:
> > This fixes a problem found on the MacBookPro 2017 Retina panel.
> >
> > The panel reports 10 bpc color depth in its EDID, and the
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
> > but the DP_MAX_LINK_RATE dpcd register only reports
> > 2.7 Gbps (multiplier value 0xa) as possible, in direct
> > contradiction of what the firmware successfully set up.
> >
> > This restricts the panel to 8 bpc, not providing the full
> > color depth of the panel.
> >
> > This patch adds a quirk specific to the MBP 2017 15" Retina
> > panel to add the additional 324000 kbps link rate during
> > edp setup.
> >
> > Link to previous discussion of a different attempted fix
> > with Ville and Jani:
> >
> > https://patchwork.kernel.org/patch/11325935/
> >
> > Signed-off-by: Mario Kleiner
> > Cc: Ville Syrjälä
> > Cc: Jani Nikula
> > ---
> >  drivers/gpu/drm/drm_dp_helper.c         | 2 ++
> >  drivers/gpu/drm/i915/display/intel_dp.c | 7 +++
> >  include/drm/drm_dp_helper.h             | 7 +++
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> > index 5a103e9b3c86..36a371c016cb 100644
> > --- a/drivers/gpu/drm/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > @@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
> >  	{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
> >  	/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
> >  	{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
> > +	/* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low DP_MAX_LINK_RATE */
> > +	{ OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
> >  };
> >
> >  #undef OUI
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 4074d83b1a5f..1f6bd659ad41 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -178,6 +178,13 @@ static void intel_dp_set_sink_rates(struct intel_dp *intel_dp)
> >  	}
> >
> >  	intel_dp->num_sink_rates = i;
> > +
> > +	if (drm_dp_has_quirk(&intel_dp->desc,
> > +			     DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
> > +		/* Needed for Apple MBP 2017, 15 inch eDP Retina panel */
> > +		intel_dp->sink_rates[i] = 324000;
> > +		intel_dp->num_sink_rates++;
> > +	}
>
> If we can isolate the quirk to this one function, I'll be happy. \o/
>

Me too \o/ - Patch v2 is out, following your proposal, retested on the machine, works. cat ... i915_display_info reports a pipe depth of 30 bpp, instead of 24 bpp.

I didn't add a stable tag, but wonder if a cc stable tag could be added by you, if you think it is minimal enough, to get it also into the kernels for the spring distro updates.

In any case, case closed. Thanks for the review,
-mario

> However, even if this might work on said machine, I'd prefer it if we
> didn't give the idea that you could just append a value in sink_rates
> (it must be sorted). How about putting something like this in the
> beginning of the function, to be a bit more explicit:
>
>	if (quirk) {
>		static const int quirk_rates[] = { 162000, 270000, 324000 };
>
>		memcpy(intel_dp->sink_rates, quirk_rates,
>		       sizeof(quirk_rates));
>		intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
>
>		return;
>	}
>
> BR,
> Jani.
>
> >  }
> >
> >  /* Get length of rates array potentially limited by max_rate. */
> > diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> > index 262faf9e5e94..4b86a1f2a559 100644
> > --- a/include/drm/drm_dp_helper.h
> > +++ b/include/drm/drm_dp_helper.h
> > @@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
> >  	 * The DSC caps can be read from the physical aux instead.
> >  	 */
> >  	DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
> > +	/**
> > +	 * @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
> > +	 *
> > +	 * The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
> > +	 * the DP_MAX_LINK_RATE register reporting a lower max multiplier.
> > +	 */
> > +	DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
> >  };
> >
> >  /**
>
> --
> Jani Nikula, Intel Open Source Graphics Center
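Jani's point in this exchange is that sink_rates has to stay sorted, so a bare append is only correct when the new rate is larger than every existing entry. The patch takes the fixed-table route he suggests. Purely for comparison, a generic sorted insert might look like the sketch below; `example_insert_rate_sorted()` is a hypothetical helper, not something from the patch or the driver.

```c
#include <stddef.h>

/*
 * Hypothetical helper: insert 'rate' into the ascending array
 * 'rates' holding 'n' entries with room for 'cap' entries.
 * Returns the new entry count. Duplicates and a full array
 * leave the array unchanged.
 */
static size_t example_insert_rate_sorted(int *rates, size_t n,
					 size_t cap, int rate)
{
	size_t i, pos = 0;

	/* Find the first entry that is not smaller than the new rate. */
	while (pos < n && rates[pos] < rate)
		pos++;

	if (pos < n && rates[pos] == rate)
		return n;	/* already present */
	if (n >= cap)
		return n;	/* no room left */

	/* Shift the tail up by one slot and drop the new rate in. */
	for (i = n; i > pos; i--)
		rates[i] = rates[i - 1];
	rates[pos] = rate;

	return n + 1;
}
```

The fixed table is simpler to review for a one-panel quirk, since it sidesteps any question about array capacity or ordering.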
[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017 (v2)
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
edp setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

v2: Follow Jani's proposal of defining quirk_rates[] instead
    of just appending 324000. This for better clarity.

Signed-off-by: Mario Kleiner
Tested-by: Mario Kleiner
Cc: Ville Syrjälä
Cc: Jani Nikula
---
 drivers/gpu/drm/drm_dp_helper.c         |  2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 11 +++
 include/drm/drm_dp_helper.h             |  7 +++
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 5a103e9b3c86..36a371c016cb 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
 	{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
 	/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
 	{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+	/* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low DP_MAX_LINK_RATE */
+	{ OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };

 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 4074d83b1a5f..c0d2c70b04fb 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -169,6 +169,17 @@ static void intel_dp_set_sink_rates(struct intel_dp *intel_dp)
 	};
 	int i, max_rate;

+	if (drm_dp_has_quirk(&intel_dp->desc,
+			     DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+		/* Needed, e.g., for Apple MBP 2017, 15 inch eDP Retina panel */
+		static const int quirk_rates[] = { 162000, 270000, 324000 };
+
+		memcpy(intel_dp->sink_rates, quirk_rates, sizeof(quirk_rates));
+		intel_dp->num_sink_rates = ARRAY_SIZE(quirk_rates);
+
+		return;
+	}
+
 	max_rate = drm_dp_bw_code_to_link_rate(intel_dp->dpcd[DP_MAX_LINK_RATE]);

 	for (i = 0; i < ARRAY_SIZE(dp_rates); i++) {
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 262faf9e5e94..4b86a1f2a559 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
 	 * The DSC caps can be read from the physical aux instead.
 	 */
 	DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
+	/**
+	 * @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+	 *
+	 * The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+	 * the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+	 */
+	DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };

 /**
--
2.20.1
[Intel-gfx] [PATCH] drm/i915/dp: Add dpcd link_rate quirk for Apple 15" MBP 2017
This fixes a problem found on the MacBookPro 2017 Retina panel.

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec = multiplier 0xc),
but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to add the additional 324000 kbps link rate during
edp setup.

Link to previous discussion of a different attempted fix
with Ville and Jani:

https://patchwork.kernel.org/patch/11325935/

Signed-off-by: Mario Kleiner
Cc: Ville Syrjälä
Cc: Jani Nikula
---
 drivers/gpu/drm/drm_dp_helper.c         | 2 ++
 drivers/gpu/drm/i915/display/intel_dp.c | 7 +++
 include/drm/drm_dp_helper.h             | 7 +++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 5a103e9b3c86..36a371c016cb 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1179,6 +1179,8 @@ static const struct dpcd_quirk dpcd_quirk_list[] = {
 	{ OUI(0x00, 0x00, 0x00), DEVICE_ID('C', 'H', '7', '5', '1', '1'), false, BIT(DP_DPCD_QUIRK_NO_SINK_COUNT) },
 	/* Synaptics DP1.4 MST hubs can support DSC without virtual DPCD */
 	{ OUI(0x90, 0xCC, 0x24), DEVICE_ID_ANY, true, BIT(DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD) },
+	/* Apple MacBookPro 2017 15 inch eDP Retina panel reports too low DP_MAX_LINK_RATE */
+	{ OUI(0x00, 0x10, 0xfa), DEVICE_ID(101, 68, 21, 101, 98, 97), false, BIT(DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS) },
 };

 #undef OUI
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 4074d83b1a5f..1f6bd659ad41 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -178,6 +178,13 @@ static void intel_dp_set_sink_rates(struct intel_dp *intel_dp)
 	}

 	intel_dp->num_sink_rates = i;
+
+	if (drm_dp_has_quirk(&intel_dp->desc,
+			     DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS)) {
+		/* Needed for Apple MBP 2017, 15 inch eDP Retina panel */
+		intel_dp->sink_rates[i] = 324000;
+		intel_dp->num_sink_rates++;
+	}
 }

 /* Get length of rates array potentially limited by max_rate. */
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 262faf9e5e94..4b86a1f2a559 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1532,6 +1532,13 @@ enum drm_dp_quirk {
 	 * The DSC caps can be read from the physical aux instead.
 	 */
 	DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD,
+	/**
+	 * @DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS:
+	 *
+	 * The device supports a link rate of 3.24 Gbps (multiplier 0xc) despite
+	 * the DP_MAX_LINK_RATE register reporting a lower max multiplier.
+	 */
+	DP_DPCD_QUIRK_CAN_DO_MAX_LINK_RATE_3_24_GBPS,
 };

 /**
--
2.20.1
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 10:26 PM Harry Wentland wrote:

>
>
> On 2020-01-09 4:04 p.m., Mario Kleiner wrote:
>
> On Thu, Jan 9, 2020 at 8:49 PM Alex Deucher wrote:
>
>> On Thu, Jan 9, 2020 at 11:47 AM Mario Kleiner wrote:
>> >
>> > On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher wrote:
>> >>
>> >> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner wrote:
>> >> >
>> As Harry mentioned in the other thread, won't this only work if the
>> display was brought up by the vbios? In the suspend/resume case,
>> won't we just fall back to 2.7Gbps?
>>
>> Alex
>>
>>
> Adding Harry to cc...
>
> The code is only executed for eDP. On the Intel side, it seems that
> intel_edp_init_dpcd() gets only called during driver load / modesetting
> init, so not on resume.
>
> On the AMD DC side, dc_link_detect_helper() has this early no-op return at
> the beginning:
>
> if ((link->connector_signal == SIGNAL_TYPE_LVDS ||
>      link->connector_signal == SIGNAL_TYPE_EDP) &&
>     link->local_sink)
>         return true;
>
>
> So i guess if link->local_sink doesn't get NULL'ed during a suspend/resume
> cycle, then we never reach the setup code that would overwrite with non
> vbios settings?
>
> Sounds reasonable to me, given that eDP panels are usually fixed internal
> panels, nothing that gets hot(un-)plugged?
>
> I can't test, because suspend/resume with the Polaris gpu on the MBP 2017
> is totally broken atm., just as vgaswitcheroo can't do its job. Looks like
> powering down the gpu works, but powering up doesn't. And also modesetting
> at vgaswitcheroo switch time is no-go, because the DDC/AUX lines apparently
> can't be switched on that Apple gmux, and handover of that data seems to be
> not implemented in current vgaswitcheroo. At the moment switching between
> AMD only or Intel+AMD Prime setup is quite a pita...
>
>
> I haven't followed the entire discussion on the i915 thread but for the
> amdgpu dc patch I would prefer a DPCD quirk to override the reported link
> settings with the correct link rate.
>
> Harry
>

Ok, as you wish. How do i do that? Is there already some official DP-related mechanism, or do i just add some if-statement to detect_edp_sink_caps() that matches on a new EDID quirk to be defined for that panel in drm_edid etc., and then

    if (EDID quirk for that panel) dpcd[DP_MAX_LINK_RATE] = 0xc;

The other question would be if we should do it for this panel on AMD DC at all? I see my original patch more as something to fix other odd (Apple?) panels than for this specific one. As mentioned above, photometer testing on AMD DC with a Polaris on the MBP 2017 suggests that the default 2.7 Gbps 8 bit mode + AMD's spatial dithering provides higher quality results for >= 10 bpc framebuffers than actually running the panel at 10 bit without dithering.

As a little side-note, for squeezing out more precision than the 10 bpc framebuffers we officially have in Mesa/OpenGL, my software Psychtoolbox has some special hacks, playing funny tricks with resizing X-Screens, applying bit-twiddling shaders to images and MMIO programming the gpu "behind the back" of the driver, to get the gpu into RGBA16161616 linear scanout mode. That gives up to 12 bpc precision on that panel according to photometer measurements. While AMD's dithering with the panel in 8 bit + 4 bit spatial dithering gives pretty good results, the panel at 10 bit + 2 bit spatial dithering has some artifacts. And even with a normal 10 bit framebuffer, the 8 bit panel + 2 bit dithering seems to give better results than 10 bit panel mode.

-mario
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Fri, Jan 10, 2020 at 2:32 PM Ville Syrjälä wrote:

> On Thu, Jan 09, 2020 at 09:19:07PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 7:24 PM Ville Syrjälä <ville.syrj...@linux.intel.com>
> > wrote:
> >
> > > On Thu, Jan 09, 2020 at 06:57:14PM +0100, Mario Kleiner wrote:
> > > > On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > wrote:
> > > >
> > > > > On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > > > > > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > > > wrote:
> > > > > >
> > > >
> > > > wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC
> > > > identified it as DP 1.1, eDP 1.3, and these extended caps seem to be only
> > > > part of DP 1.3+ if i understand the comments in
> > > > intel_dp_extended_receiver_capabilities() correctly.
> > > >
> >
> > Ok, looking at previous debug output logs shows that those extended caps
> > are not present on the systems, ie. that extended caps bit is not set. So
> > dpcd[0x1] == 0xa.
> >
> >
> > > Yeah, but you never know how creative they've been with the DPCD in
> > > such a proprietary machine. A full DPCD dump from /dev/drm_dp_aux* would
> > > be nice. Can you file a bug and attach the DPCD dump there so we have a
> > > good reference on what we're talking about (also for future if/when
> > > someone eventually starts to wonder why we have such hacks in the
> > > code)?
> > >
> > >
> > True, it's Apple which likes to "Think different..." :/
> >
> > Will do. But is there a proper/better way to do the /dev/drm_dp_aux0 dump?
> > I used cat /dev/drm_dp_aux0 > dump, and that hangs, but if i interrupt it
> > after a few seconds, i get a dump file of 512k size, which seems excessive?
> > On AMD DC atm., in case that matters.
>
> It can take a while to dump the whole thing. If there are errors in some
> parts (against the spec but some devices simply don't care about the
> spec) you may need to use ddrescue/etc. to dump everything that can be
> dumped.
>

Ok, it is bug 206157 on the kernel Bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=206157

I attached the first ~ 5000 Bytes of the DPCD dump, as there is a 5k file size limit. The total dump is 512 kB, mostly zeros.

-mario

> --
> Ville Syrjälä
> Intel
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 8:49 PM Alex Deucher wrote:

> On Thu, Jan 9, 2020 at 11:47 AM Mario Kleiner wrote:
> >
> > On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher wrote:
> >>
> >> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner wrote:
> >> >
>
> As Harry mentioned in the other thread, won't this only work if the
> display was brought up by the vbios? In the suspend/resume case,
> won't we just fall back to 2.7Gbps?
>
> Alex
>
>

Adding Harry to cc...

The code is only executed for eDP. On the Intel side, it seems that intel_edp_init_dpcd() only gets called during driver load / modesetting init, so not on resume.

On the AMD DC side, dc_link_detect_helper() has this early no-op return at the beginning:

    if ((link->connector_signal == SIGNAL_TYPE_LVDS ||
         link->connector_signal == SIGNAL_TYPE_EDP) &&
        link->local_sink)
            return true;

So i guess if link->local_sink doesn't get NULL'ed during a suspend/resume cycle, then we never reach the setup code that would overwrite with non vbios settings?

Sounds reasonable to me, given that eDP panels are usually fixed internal panels, nothing that gets hot(un-)plugged?

I can't test, because suspend/resume with the Polaris gpu on the MBP 2017 is totally broken atm., just as vgaswitcheroo can't do its job. Looks like powering down the gpu works, but powering up doesn't. And also modesetting at vgaswitcheroo switch time is a no-go, because the DDC/AUX lines apparently can't be switched on that Apple gmux, and handover of that data seems to be not implemented in current vgaswitcheroo. At the moment switching between AMD only or Intel+AMD Prime setup is quite a pita...

-mario
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 7:24 PM Ville Syrjälä wrote:

> On Thu, Jan 09, 2020 at 06:57:14PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä <ville.syrj...@linux.intel.com>
> > wrote:
> >
> > > On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > > > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > wrote:
> > > >
> > wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC
> > identified it as DP 1.1, eDP 1.3, and these extended caps seem to be only
> > part of DP 1.3+ if i understand the comments in
> > intel_dp_extended_receiver_capabilities() correctly.
> >

Ok, looking at previous debug output logs shows that those extended caps are not present on the systems, ie. that extended caps bit is not set. So dpcd[0x1] == 0xa.

> Yeah, but you never know how creative they've been with the DPCD in
> such a proprietary machine. A full DPCD dump from /dev/drm_dp_aux* would
> be nice. Can you file a bug and attach the DPCD dump there so we have a
> good reference on what we're talking about (also for future if/when
> someone eventually starts to wonder why we have such hacks in the
> code)?
>

True, it's Apple which likes to "Think different..." :/

Will do. But is there a proper/better way to do the /dev/drm_dp_aux0 dump? I used cat /dev/drm_dp_aux0 > dump, and that hangs, but if i interrupt it after a few seconds, i get a dump file of 512k size, which seems excessive? On AMD DC atm., in case that matters.

However, the file shows DPCD_REV 1.1, maximum 0xa and no extended caps (DP_TRAINING_AUX_RD_INTERVAL aka [0xe] == 0x00).

-mario

> --
> Ville Syrjälä
> Intel
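The three values read out of the dump in this message can be decoded from the first bytes of the DPCD roughly as sketched below. The macro and function names are local to this example (hypothetical); the offsets 0x000, 0x001 and 0x00e are the ones referred to in the thread, and treating bit 7 of 0x00e as the "extended receiver caps present" flag is an assumption based on the DP 1.3+ behaviour discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for this sketch; offsets as discussed in the thread. */
#define EX_DPCD_REV			0x000
#define EX_MAX_LINK_RATE		0x001
#define EX_TRAINING_AUX_RD_INTERVAL	0x00e
/* Assumed: bit 7 flags the DP 1.3+ extended receiver cap field. */
#define EX_EXT_CAP_PRESENT		(1u << 7)

struct example_dpcd_summary {
	unsigned int rev_major;
	unsigned int rev_minor;
	uint8_t max_link_rate_code;
	bool extended_caps;
};

/* 'dpcd' must hold at least the first 15 bytes of the DPCD. */
static struct example_dpcd_summary example_dpcd_summarize(const uint8_t *dpcd)
{
	struct example_dpcd_summary s;

	/* Revision is encoded as major in the high nibble, minor in the low. */
	s.rev_major = dpcd[EX_DPCD_REV] >> 4;
	s.rev_minor = dpcd[EX_DPCD_REV] & 0x0f;
	s.max_link_rate_code = dpcd[EX_MAX_LINK_RATE];
	s.extended_caps =
		(dpcd[EX_TRAINING_AUX_RD_INTERVAL] & EX_EXT_CAP_PRESENT) != 0;

	return s;
}
```

Fed with the leading bytes of a dump file like the one produced by the cat command above, the values reported here would correspond to revision 1.1, link rate code 0xa and extended_caps == false.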
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 5:47 PM Ville Syrjälä wrote:
> On Thu, Jan 09, 2020 at 05:30:05PM +0100, Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä <ville.syrj...@linux.intel.com> wrote:
> > > On Thu, Jan 09, 2020 at 05:26:57PM +0200, Ville Syrjälä wrote:
> > > > On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote:
> > > > > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > > > > firmware chooses link settings at boot which support enough
> > > > > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > > > > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > >
> > > Does it actually or do we just ignore the fact that it reports 3.24Gbps?
> > >
> > > If it really reports 3.24 then we should be able to just add that to
> > > dp_rates[] in intel_dp_set_sink_rates() and be done with it.
> > >
> > > Although we'd likely want to skip 3.24 unless it really is reported
> > > as the max so as to not use that non-standard rate on other displays.
> > > So would require a bit fancier logic for that.
> >
> > Was also my initial thought, but the DP_MAX_LINK_RATE reg reports 2.7 Gbps
> > as maximum.
>
> So dpcd[0x1] == 0xa ?

Yes. [*]

> What about the magic second version of DP_MAX_LINK_RATE at 0x2201 ?
> Hmm. I guess we should already be reading that via
> intel_dp_extended_receiver_capabilities().

Yes, you do.

[*] Well, i have to recheck on the machine. I started this work on the AMD side and checked what AMD DC gave me, and haven't rechecked under i915 the things i already knew from AMD.

Comparing the implementations, there are some peculiar differences that may matter: intel_dp_extended_receiver_capabilities() is more "paranoid" than AMD DC's retrieve_link_cap() function in deciding if the extended receiver caps are valid. Intel's implementation copies only the first 6 Bytes of extended receiver caps into the dpcd[] arrays, whereas AMD copies 16 Bytes. Not sure about the differences, but one of you may wanna check why this is, and if it matters somehow.

Btw. your proposed

/* blah */
if (max_rate > ...)

wouldn't work if dpcd[0x1] == 0xa, which it likely is [*]. AMD DC identified it as DP 1.1, eDP 1.3, and these extended caps seem to be only part of DP 1.3+ if i understand the comments in intel_dp_extended_receiver_capabilities() correctly.

-mario

> --
> Ville Syrjälä
> Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
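For readers following the register values in this exchange: a DPCD bandwidth code is the per-lane link rate in units of 0.27 Gbps, so 0xa corresponds to 2.7 Gbps and the firmware's 3.24 Gbps setting corresponds to 0xc. A minimal sketch of that conversion, using the same kHz-style units the patch prints (270000 for 2.7 Gbps). The function names are hypothetical and are not the kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane link rate for a DPCD bandwidth code: code * 0.27 Gbps,
 * expressed in the kHz-style units used in the thread (0x0a -> 270000). */
static int bw_code_to_link_rate_khz(uint8_t bw_code)
{
	return bw_code * 27000;
}

/* Inverse mapping; only defined for rates that are exact multiples of
 * 27000 and that fit into the 8 bit register. */
static uint8_t link_rate_khz_to_bw_code(int rate_khz)
{
	assert(rate_khz > 0 && rate_khz % 27000 == 0 && rate_khz / 27000 <= 0xff);
	return (uint8_t)(rate_khz / 27000);
}
```

With this mapping, a sink that reports 0xa in DP_MAX_LINK_RATE while the link is running at code 0xc is advertising less than the firmware actually programmed, which is the mismatch discussed above.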
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 4:40 PM Alex Deucher wrote:
> On Thu, Jan 9, 2020 at 10:08 AM Mario Kleiner wrote:
> >
> > If the current eDP link rate, as read from hw, provides a
> > higher bandwidth than the standard link rates, then add the
> > current link rate to the link_rates array for consideration
> > in future mode-sets.
> >
> > These initial current eDP link settings have been set up by
> > firmware during boot, so they should work on the eDP panel.
> > Therefore use them if the firmware thinks they are good and
> > they provide higher link bandwidth, e.g., to enable higher
> > resolutions / color depths.
> >
> > This fixes a problem found on the MacBookPro 2017 Retina panel:
> >
> > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > so intel_dp_set_sink_rates() would cap at that. This restricts
> > achievable color depth to 8 bpc, not providing the full color
> > depth of the panel. With this commit, we can use firmware setting
> > and get the full 10 bpc advertised by the Retina panel.
>
> Would it make more sense to just add a quirk for this particular
> panel? Would there be cases where the link was programmed wrong and
> then we end up using that additional link speed as supported?
>
> Alex

Not sure. This MBP 2017 is the only non-ancient laptop i now have. I'd assume many other Apple Retina panels would behave similarly. The panel's dpcd regs report DP 1.1 and eDP 1.3, so the flexible table with additional modes from eDP 1.4+ does not exist. According to Wikipedia, eDP 1.4 was introduced in February 2013 and this is a mid 2017 machine, so Apple seems to be quite behind. Therefore i assume we'd need a lot of quirks over time.

That said:

1. The logic in amdgpu's DC for the same purpose is a bit different than on the Intel side.

2. DC allows overriding DP link settings, that's how i initially tested this, so one could do the "quirk" via something like that in a bootup script. So on AMD one could work around the lack of the patch and of quirks.

3. I spent a lot of time with a photometer, testing the quality of the 10 bit: It turns out that running the panel at 8 bit + AMD's spatial dithering that kicks in gives better results than running the panel in native 10 bit. Maybe the panel is not really a 10 bit one, but just pretends to be and then uses its own dithering to achieve 10 bit. So at least on AMD one is better off precision-wise with the 8 bit panel default with this specific panel.

On Intel however, we don't do dithering for > 6 bpc panels atm., so using the panel at 10 bpc is the only way to get 10 bit display atm. And we don't use dithering on Intel at > 6 bpc panels atm., because there are some oddities in the way Intel hw dithers at higher bit depths - it also dithers pixel values where it shouldn't. That makes it impossible to get an identity passthrough of an 8 bpc framebuffer to the outputs, which kills all kinds of special display equipment that needs that identity passthrough to work.
-mario > > > Signed-off-by: Mario Kleiner > > Cc: Daniel Vetter > > --- > > drivers/gpu/drm/i915/display/intel_dp.c | 23 +++ > > 1 file changed, 23 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > b/drivers/gpu/drm/i915/display/intel_dp.c > > index 2f31d226c6eb..aa3e0b5108c6 100644 > > --- a/drivers/gpu/drm/i915/display/intel_dp.c > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c > > @@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp) > > { > > struct drm_i915_private *dev_priv = > > to_i915(dp_to_dig_port(intel_dp)->base.base.dev); > > + int max_rate; > > + u8 link_bw; > > > > /* this function is meant to be called only once */ > > WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0); > > @@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp) > > else > > intel_dp_set_sink_rates(intel_dp); > > > > + /* > > +* If the firmware programmed a rate higher than the standard > sink rates > > +* during boot, then add that rate as a valid sink rate, as fw > knows > > +* this is a good rate and we get extra bandwidth. > > +* > > +* Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which > is only > > +* eDP 1.1, but supports the unusual rate of 324000 kHz at > bootup, for > > +* 10 bpc
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 4:38 PM Ville Syrjälä wrote: > On Thu, Jan 09, 2020 at 05:26:57PM +0200, Ville Syrjälä wrote: > > On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote: > > > The panel reports 10 bpc color depth in its EDID, and the UEFI > > > firmware chooses link settings at boot which support enough > > > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the > > > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible, > > Does it actually or do we just ignore the fact that it reports 3.24Gbps? > > If it really reports 3.24 then we should be able to just add that to > dp_rates[] in intel_dp_set_sink_rates() and be done with it. > > Although we'd likely want to skip 3.24 unless it really is reported > as the max so as to not use that non-standard rate on other displays. > So would require a bit fancier logic for that. > > Was also my initial thought, but the DP_MAX_LINK_RATE reg reports 2.7 Gbps as maximum. -mario ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
On Thu, Jan 9, 2020 at 4:27 PM Ville Syrjälä wrote:
> On Thu, Jan 09, 2020 at 04:07:52PM +0100, Mario Kleiner wrote:
> > If the current eDP link rate, as read from hw, provides a
> > higher bandwidth than the standard link rates, then add the
> > current link rate to the link_rates array for consideration
> > in future mode-sets.
> >
> > These initial current eDP link settings have been set up by
> > firmware during boot, so they should work on the eDP panel.
> > Therefore use them if the firmware thinks they are good and
> > they provide higher link bandwidth, e.g., to enable higher
> > resolutions / color depths.
> >
> > This fixes a problem found on the MacBookPro 2017 Retina panel:
> >
> > The panel reports 10 bpc color depth in its EDID, and the UEFI
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
> > DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
> > so intel_dp_set_sink_rates() would cap at that. This restricts
> > achievable color depth to 8 bpc, not providing the full color
> > depth of the panel. With this commit, we can use firmware setting
> > and get the full 10 bpc advertised by the Retina panel.
> >
> > Signed-off-by: Mario Kleiner
> > Cc: Daniel Vetter
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 2f31d226c6eb..aa3e0b5108c6 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >  {
> >  	struct drm_i915_private *dev_priv =
> >  		to_i915(dp_to_dig_port(intel_dp)->base.base.dev);
> > +	int max_rate;
> > +	u8 link_bw;
> >
> >  	/* this function is meant to be called only once */
> >  	WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0);
> > @@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >  	else
> >  		intel_dp_set_sink_rates(intel_dp);
> >
> > +	/*
> > +	 * If the firmware programmed a rate higher than the standard sink rates
> > +	 * during boot, then add that rate as a valid sink rate, as fw knows
> > +	 * this is a good rate and we get extra bandwidth.
> > +	 *
> > +	 * Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which is only
> > +	 * eDP 1.1, but supports the unusual rate of 324000 kHz at bootup, for
> > +	 * 10 bpc / 30 bit color depth.
> > +	 */
> > +	if (!intel_dp->use_rate_select &&
> > +	    (drm_dp_dpcd_read(&intel_dp->aux, DP_LINK_BW_SET, &link_bw, 1) == 1) &&
> > +	    (link_bw > 0) && (intel_dp->num_sink_rates < DP_MAX_SUPPORTED_RATES)) {
> > +		max_rate = drm_dp_bw_code_to_link_rate(link_bw);
> > +		if (max_rate > intel_dp->sink_rates[intel_dp->num_sink_rates - 1]) {
> > +			intel_dp->sink_rates[intel_dp->num_sink_rates] = max_rate;
> > +			intel_dp->num_sink_rates++;
> > +			DRM_DEBUG_KMS("Adding max bandwidth eDP rate %d kHz.\n",
> > +				      max_rate);
> > +		}
>
> Hmm. I guess we could do this. But please put it into a separate
> function so we don't end up with that super ugly if condition.

Ok. Does static void intel_edp_add_bootup_rate() sound good to you? Or intel_edp_add_fw_rate()?

> The debug message should probably be a bit more explicit. Eg.
> something like:
> "Firmware using non-standard link rate %d kHz. Including it in sink rates.\n"

Ok.

> I'm also wondering if we shouldn't just add the link rate to the sink
> rates regardless of whether it's the highest rate or not...

I tried to be conservative, and simple, but yes, one could add it anyway. Would need to preserve the order in the sink_rates[] array. Your choice, you're the expert :)

> > +	}
> > +
> >  	intel_dp_set_common_rates(intel_dp);
> >
> >  	/* Read the eDP DSC DPCD registers */
> > --
> > 2.24.0
> >
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Ville Syrjälä
> Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
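The two review points in this reply (move the logic into its own function, and possibly add the firmware rate even when it is not the highest) can be sketched outside the driver as an ordered insert. This is a standalone illustration under stated assumptions, not the patch that was eventually written: `struct rate_table`, `MAX_RATES` and `rate_table_add()` are hypothetical stand-ins for the driver's sink_rates[] array, DP_MAX_SUPPORTED_RATES and the proposed helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_RATES 8 /* hypothetical stand-in for DP_MAX_SUPPORTED_RATES */

struct rate_table {
	int rates[MAX_RATES]; /* ascending order, same units as the patch */
	int num;
};

/*
 * Insert rate_khz so that rates[] stays sorted in ascending order.
 * Returns true if the rate was added, false if it was already present
 * or the table is full.
 */
static bool rate_table_add(struct rate_table *t, int rate_khz)
{
	int pos = 0;

	assert(t->num >= 0 && t->num <= MAX_RATES);

	/* Find the first slot whose rate is not below the new one. */
	while (pos < t->num && t->rates[pos] < rate_khz)
		pos++;

	if (pos < t->num && t->rates[pos] == rate_khz)
		return false; /* already a known rate */
	if (t->num == MAX_RATES)
		return false; /* no room left */

	/* Shift the tail up by one entry and drop the new rate in. */
	memmove(&t->rates[pos + 1], &t->rates[pos],
		(size_t)(t->num - pos) * sizeof(t->rates[0]));
	t->rates[pos] = rate_khz;
	t->num++;
	return true;
}
```

Keeping the array sorted means the highest entry is still the last one, so code that reads the maximum from the end of the array keeps working whether or not the firmware rate happens to be the largest.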
[Intel-gfx] [PATCH] drm/i915/dp: Add current maximum eDP link rate to sink_rate array.
If the current eDP link rate, as read from hw, provides a
higher bandwidth than the standard link rates, then add the
current link rate to the link_rates array for consideration
in future mode-sets.

These initial current eDP link settings have been set up by
firmware during boot, so they should work on the eDP panel.
Therefore use them if the firmware thinks they are good and
they provide higher link bandwidth, e.g., to enable higher
resolutions / color depths.

This fixes a problem found on the MacBookPro 2017 Retina panel:

The panel reports 10 bpc color depth in its EDID, and the UEFI
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec to be precise), but the
DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps as possible,
so intel_dp_set_sink_rates() would cap at that. This restricts
achievable color depth to 8 bpc, not providing the full color
depth of the panel. With this commit, we can use firmware setting
and get the full 10 bpc advertised by the Retina panel.

Signed-off-by: Mario Kleiner
Cc: Daniel Vetter
---
 drivers/gpu/drm/i915/display/intel_dp.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 2f31d226c6eb..aa3e0b5108c6 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4368,6 +4368,8 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv =
 		to_i915(dp_to_dig_port(intel_dp)->base.base.dev);
+	int max_rate;
+	u8 link_bw;
 
 	/* this function is meant to be called only once */
 	WARN_ON(intel_dp->dpcd[DP_DPCD_REV] != 0);
@@ -4433,6 +4435,27 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
 	else
 		intel_dp_set_sink_rates(intel_dp);
 
+	/*
+	 * If the firmware programmed a rate higher than the standard sink rates
+	 * during boot, then add that rate as a valid sink rate, as fw knows
+	 * this is a good rate and we get extra bandwidth.
+	 *
+	 * Helps, e.g., on the Apple MacBookPro 2017 Retina panel, which is only
+	 * eDP 1.1, but supports the unusual rate of 324000 kHz at bootup, for
+	 * 10 bpc / 30 bit color depth.
+	 */
+	if (!intel_dp->use_rate_select &&
+	    (drm_dp_dpcd_read(&intel_dp->aux, DP_LINK_BW_SET, &link_bw, 1) == 1) &&
+	    (link_bw > 0) && (intel_dp->num_sink_rates < DP_MAX_SUPPORTED_RATES)) {
+		max_rate = drm_dp_bw_code_to_link_rate(link_bw);
+		if (max_rate > intel_dp->sink_rates[intel_dp->num_sink_rates - 1]) {
+			intel_dp->sink_rates[intel_dp->num_sink_rates] = max_rate;
+			intel_dp->num_sink_rates++;
+			DRM_DEBUG_KMS("Adding max bandwidth eDP rate %d kHz.\n",
+				      max_rate);
+		}
+	}
+
 	intel_dp_set_common_rates(intel_dp);
 
 	/* Read the eDP DSC DPCD registers */
-- 
2.24.0
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
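To see why capping at 2.7 Gbps can force 8 bpc while 3.24 Gbps allows 10 bpc, the usual 8b/10b link arithmetic can be written out. This is a sketch of the generic calculation, not the driver's code. The 300000 kHz pixel clock and the 4-lane link used in the usage note are illustrative assumptions only; the commit message does not state the panel's real timings or lane count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Payload capacity in kB/s. With 8b/10b channel coding each lane carries
 * one data byte per link symbol, so a link quoted as "270000" moves
 * 270000 kB/s of pixel data per lane.
 */
static uint32_t link_capacity_kbytes(uint32_t link_rate_khz, uint32_t lanes)
{
	return link_rate_khz * lanes;
}

/* Bandwidth a mode needs in kB/s, rounded up to whole kB/s. */
static uint32_t mode_required_kbytes(uint32_t pixel_clock_khz, uint32_t bpp)
{
	return (pixel_clock_khz * bpp + 7) / 8;
}

/* True if the mode's pixel data fits into the link's payload capacity. */
static bool mode_fits(uint32_t pixel_clock_khz, uint32_t bpp,
		      uint32_t link_rate_khz, uint32_t lanes)
{
	return mode_required_kbytes(pixel_clock_khz, bpp) <=
	       link_capacity_kbytes(link_rate_khz, lanes);
}
```

For the assumed 300000 kHz mode on 4 lanes, 24 bpp needs 900000 kB/s and fits the 1080000 kB/s available at 270000, 30 bpp needs 1125000 kB/s and does not, and at 324000 the capacity rises to 1296000 kB/s, which is enough for 30 bpp.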
Re: [Intel-gfx] [PATCH 3/3] drm/atomic: Rename crtc_state->pageflip_flags to async_flip
On Wed, Sep 4, 2019 at 2:57 PM Kazlauskas, Nicholas wrote: > > On 2019-09-03 3:06 p.m., Daniel Vetter wrote: > > It's the only flag anyone actually cares about. Plus if we're unlucky, > > the atomic ioctl might need a different flag for async flips. So > > better to abstract this away from the uapi a bit. > > > > Cc: Maarten Lankhorst > > Cc: Michel Dänzer > > Cc: Alex Deucher > > Cc: Adam Jackson > > Cc: Sean Paul > > Cc: David Airlie > > Signed-off-by: Daniel Vetter > > Cc: Maxime Ripard > > Cc: Daniel Vetter > > Cc: Nicholas Kazlauskas > > Cc: Leo Li > > Cc: Harry Wentland > > Cc: David Francis > > Cc: Mario Kleiner > > Cc: Bhawanpreet Lakha > > Cc: Ben Skeggs > > Cc: "Christian König" > > Cc: Ilia Mirkin > > Cc: Sam Ravnborg > > Cc: Chris Wilson > > --- > > Series is: > > Reviewed-by: Nicholas Kazlauskas > > I would like to see a new flag eventually show up for atomic as well, > but the existing one is effectively broken at this point and I would > hope that no userspace is setting it expecting that it actually does > something. You mean it is generally broken? My software uses non-vsync'ed flips for diagnostic purpose and iirc some gpu + driver combo didn't work as expected anymore. But i thought that was one specific driver bug (maybe on AMD + DC)? > > At this point we don't really gain anything from enabling atomic in DDX > I think, most drivers already make use of DRM helpers to map these > legacy IOCTLs to atomic anyway. > One thing i wanted to try, once i hopefully find some time in late 2019 / early 2020 (if nobody else starts working on such a thing earlier), would be to add the ability to pass in a target flip time to the pageflip ioctl for use with VRR. 
For that i thought adding a new pageflip flag a la DRM_MODE_PAGE_FLIP_TARGETTIME) would be a good way to reuse the existing page_flip_target ioctl and redefine the "uint32 sequence" field of struct drm_mode_crtc_page_flip_target to pass in the target time - or at least the lower 32 bits of a target time. So that would be one more page flip flag for the future. I'd like this to be workable from X11, and the current DDX don't use the atomic interface, apart from the modesetting DDX where it just got disabled by default in xserver master due to various unresolved bugs afaik? thanks, -mario > Nicholas Kazlauskas > > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 ++--- > > drivers/gpu/drm/drm_atomic_helper.c | 2 +- > > drivers/gpu/drm/drm_atomic_state_helper.c | 2 +- > > drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 ++-- > > include/drm/drm_crtc.h| 8 > > 5 files changed, 10 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > index 0a71ed1e7762..2f0ef0820f00 100644 > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > @@ -5756,8 +5756,7 @@ static void amdgpu_dm_commit_planes(struct > > drm_atomic_state *state, > >* change FB pitch, DCC state, rotation or mirroing. 
> >*/ > > bundle->flip_addrs[planes_count].flip_immediate = > > - (crtc->state->pageflip_flags & > > - DRM_MODE_PAGE_FLIP_ASYNC) != 0 && > > + crtc->state->async_flip && > > acrtc_state->update_type == UPDATE_TYPE_FAST; > > > > timestamp_ns = ktime_get_ns(); > > @@ -6334,7 +6333,7 @@ static void amdgpu_dm_atomic_commit_tail(struct > > drm_atomic_state *state) > > amdgpu_dm_enable_crtc_interrupts(dev, state, true); > > > > for_each_new_crtc_in_state(state, crtc, new_crtc_state, j) > > - if (new_crtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC) > > + if (new_crtc_state->async_flip) > > wait_for_vblank = false; > > > > /* update planes when needed per crtc*/ > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > > b/drivers/gpu/drm/drm_atomic_helper.c > > index e9c6112e7f73..1e5293eb66e3 100644 > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > @@ -3263,7 +3263,7 @@ static int page_flip_common(struct drm_atomic_state > > *state, > > return PTR_ERR(crtc_state); > > > > c
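The rename in this patch amounts to translating the page-flip ioctl flags into a single boolean once, so that drivers test `async_flip` instead of masking a uapi flag themselves. A minimal standalone sketch of that idea follows. The struct and function names are hypothetical, and the two flag macros are local stand-ins whose values are assumed to match the DRM uapi definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; values assumed to mirror the DRM uapi page-flip flags. */
#define PAGE_FLIP_EVENT 0x01u
#define PAGE_FLIP_ASYNC 0x02u

struct crtc_state_sketch {
	bool async_flip; /* replaces a raw copy of the ioctl flags */
};

/* Translate legacy page-flip ioctl flags into driver-facing state once. */
static void crtc_state_set_flip_flags(struct crtc_state_sketch *state,
				      uint32_t ioctl_flags)
{
	assert(state);
	state->async_flip = (ioctl_flags & PAGE_FLIP_ASYNC) != 0;
}
```

A different ioctl could later set the same boolean from a different flag without touching any driver, which is the decoupling the commit message argues for.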
Re: [Intel-gfx] [PATCH xf86-video-intel v3 2/2] sna: Support 10bpc gamma via the GAMMA_LUT crtc property
Hi Ville, now somebody just needs to merge these two 10 bit gamma lut patches into intel-ddx? thanks, -mario On Fri, May 17, 2019 at 3:51 PM Ville Syrjala wrote: > > From: Ville Syrjälä > > Probe the GAMMA_LUT/GAMMA_LUT_SIZE props and utilize them when > the running with > 8bpc. > > v2: s/sna_crtc_id/__sna_crtc_id/ in DBG since we have a sna_crtc > v3: Fix the vg "bluered" typo (Mario) > This time I even build tested with vg support > > Cc: Mario Kleiner > Signed-off-by: Ville Syrjälä > Reviewed-and-tested-by: Mario Kleiner > --- > src/sna/sna_display.c | 247 +++--- > 1 file changed, 208 insertions(+), 39 deletions(-) > > diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c > index 41edfec12839..d6210cc7bbc8 100644 > --- a/src/sna/sna_display.c > +++ b/src/sna/sna_display.c > @@ -127,6 +127,7 @@ struct local_mode_obj_get_properties { > uint32_t obj_type; > uint32_t pad; > }; > +#define LOCAL_MODE_OBJECT_CRTC 0x > #define LOCAL_MODE_OBJECT_PLANE 0x > > struct local_mode_set_plane { > @@ -229,6 +230,11 @@ struct sna_crtc { > } primary; > struct list sprites; > > + struct drm_color_lut *gamma_lut; > + uint64_t gamma_lut_prop; > + uint64_t gamma_lut_blob; > + uint32_t gamma_lut_size; > + > uint32_t mode_serial, flip_serial; > > uint32_t last_seq, wrap_seq; > @@ -317,6 +323,9 @@ static void __sna_output_dpms(xf86OutputPtr output, int > dpms, int fixup); > static void sna_crtc_disable_cursor(struct sna *sna, struct sna_crtc *crtc); > static bool sna_crtc_flip(struct sna *sna, struct sna_crtc *crtc, > struct kgem_bo *bo, int x, int y); > +static void sna_crtc_gamma_set(xf86CrtcPtr crtc, > + CARD16 *red, CARD16 *green, > + CARD16 *blue, int size); > > static bool is_zaphod(ScrnInfoPtr scrn) > { > @@ -3150,11 +3159,9 @@ sna_crtc_set_mode_major(xf86CrtcPtr crtc, > DisplayModePtr mode, >mode->VDisplay <= sna->mode.max_crtc_height); > > #if HAS_GAMMA > - drmModeCrtcSetGamma(sna->kgem.fd, __sna_crtc_id(sna_crtc), > - crtc->gamma_size, > - crtc->gamma_red, > - 
crtc->gamma_green, > - crtc->gamma_blue); > + sna_crtc_gamma_set(crtc, > + crtc->gamma_red, crtc->gamma_green, > + crtc->gamma_blue, crtc->gamma_size); > #endif > > saved_kmode = sna_crtc->kmode; > @@ -3212,12 +3219,44 @@ void sna_mode_adjust_frame(struct sna *sna, int x, > int y) > > static void > sna_crtc_gamma_set(xf86CrtcPtr crtc, > - CARD16 *red, CARD16 *green, CARD16 *blue, int size) > + CARD16 *red, CARD16 *green, CARD16 *blue, int size) > { > - assert(to_sna_crtc(crtc)); > - drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd, > - sna_crtc_id(crtc), > - size, red, green, blue); > + struct sna *sna = to_sna(crtc->scrn); > + struct sna_crtc *sna_crtc = to_sna_crtc(crtc); > + struct drm_color_lut *lut = sna_crtc->gamma_lut; > + uint32_t blob_size = size * sizeof(lut[0]); > + uint32_t blob_id; > + int ret, i; > + > + DBG(("%s: gamma_size %d\n", __FUNCTION__, size)); > + > + if (!lut) { > + assert(size == 256); > + > + drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd, > + sna_crtc_id(crtc), > + size, red, green, blue); > + return; > + } > + > + assert(size == sna_crtc->gamma_lut_size); > + > + for (i = 0; i < size; i++) { > + lut[i].red = red[i]; > + lut[i].green = green[i]; > + lut[i].blue = blue[i]; > + } > + > + ret = drmModeCreatePropertyBlob(sna->kgem.fd, lut, blob_size, > _id); > + if (ret) > + return; > + > + ret = drmModeObjectSetProperty(sna->kgem.fd, > + sna_crtc->id, DRM_MODE_OBJECT_CRTC, > + sna_crtc->gamma_lut_prop, > + blob_id); > + > + drmModeDestroyPropertyBlob(sna->kgem.fd, blob_id); > } > > static void > @@ -3229,6 +3268,8 @@ sna_crtc_destroy(xf86CrtcPtr crtc) > if (sna_crtc == NULL) >
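The core of the new sna_crtc_gamma_set() path is packing the three per-channel ramps from the X server into one array of LUT entries before handing it to the kernel as a property blob. A standalone sketch of just the packing step, with an identity ramp for testing. `struct color_lut_entry` is a local mirror of what the kernel's struct drm_color_lut is assumed to look like (four 16 bit fields), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the assumed struct drm_color_lut layout. */
struct color_lut_entry {
	uint16_t red;
	uint16_t green;
	uint16_t blue;
	uint16_t reserved;
};

/* Pack three per-channel 16 bit ramps into one array of LUT entries. */
static void pack_gamma_lut(struct color_lut_entry *lut,
			   const uint16_t *red, const uint16_t *green,
			   const uint16_t *blue, size_t size)
{
	for (size_t i = 0; i < size; i++) {
		lut[i].red = red[i];
		lut[i].green = green[i];
		lut[i].blue = blue[i];
		lut[i].reserved = 0;
	}
}

/* Fill ramp[] with a linear identity ramp spanning 0..0xffff. */
static void identity_ramp(uint16_t *ramp, size_t size)
{
	assert(size >= 2);
	for (size_t i = 0; i < size; i++)
		ramp[i] = (uint16_t)((i * 0xffffu) / (size - 1));
}
```

The blob size passed to the kernel is then simply the entry count times the entry size, matching the `size * sizeof(lut[0])` computation in the patch.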
Re: [Intel-gfx] [PATCH xf86-video-intel v2 2/2] sna: Support 10bpc gamma via the GAMMA_LUT crtc property
On Fri, Apr 26, 2019 at 6:32 PM Ville Syrjala wrote: > > From: Ville Syrjälä > > Probe the GAMMA_LUT/GAMMA_LUT_SIZE props and utilize them when > the running with > 8bpc. > > v2: s/sna_crtc_id/__sna_crtc_id/ in DBG since we have a sna_crtc > > Cc: Mario Kleiner > Signed-off-by: Ville Syrjälä > --- > src/sna/sna_display.c | 245 +++--- > 1 file changed, 207 insertions(+), 38 deletions(-) > > diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c > index 41edfec12839..6d671dce8c14 100644 > --- a/src/sna/sna_display.c > +++ b/src/sna/sna_display.c > @@ -127,6 +127,7 @@ struct local_mode_obj_get_properties { > uint32_t obj_type; > uint32_t pad; > }; > +#define LOCAL_MODE_OBJECT_CRTC 0x > #define LOCAL_MODE_OBJECT_PLANE 0x > > struct local_mode_set_plane { > @@ -229,6 +230,11 @@ struct sna_crtc { > } primary; > struct list sprites; > > + struct drm_color_lut *gamma_lut; > + uint64_t gamma_lut_prop; > + uint64_t gamma_lut_blob; > + uint32_t gamma_lut_size; > + > uint32_t mode_serial, flip_serial; > > uint32_t last_seq, wrap_seq; > @@ -317,6 +323,9 @@ static void __sna_output_dpms(xf86OutputPtr output, int > dpms, int fixup); > static void sna_crtc_disable_cursor(struct sna *sna, struct sna_crtc *crtc); > static bool sna_crtc_flip(struct sna *sna, struct sna_crtc *crtc, > struct kgem_bo *bo, int x, int y); > +static void sna_crtc_gamma_set(xf86CrtcPtr crtc, > + CARD16 *red, CARD16 *green, > + CARD16 *blue, int size); > > static bool is_zaphod(ScrnInfoPtr scrn) > { > @@ -3150,11 +3159,9 @@ sna_crtc_set_mode_major(xf86CrtcPtr crtc, > DisplayModePtr mode, >mode->VDisplay <= sna->mode.max_crtc_height); > > #if HAS_GAMMA > - drmModeCrtcSetGamma(sna->kgem.fd, __sna_crtc_id(sna_crtc), > - crtc->gamma_size, > - crtc->gamma_red, > - crtc->gamma_green, > - crtc->gamma_blue); > + sna_crtc_gamma_set(crtc, > + crtc->gamma_red, crtc->gamma_green, > + crtc->gamma_blue, crtc->gamma_size); > #endif > > saved_kmode = sna_crtc->kmode; > @@ -3212,12 +3219,44 @@ void 
sna_mode_adjust_frame(struct sna *sna, int x, > int y) > > static void > sna_crtc_gamma_set(xf86CrtcPtr crtc, > - CARD16 *red, CARD16 *green, CARD16 *blue, int size) > + CARD16 *red, CARD16 *green, CARD16 *blue, int size) > { > - assert(to_sna_crtc(crtc)); > - drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd, > - sna_crtc_id(crtc), > - size, red, green, blue); > + struct sna *sna = to_sna(crtc->scrn); > + struct sna_crtc *sna_crtc = to_sna_crtc(crtc); > + struct drm_color_lut *lut = sna_crtc->gamma_lut; > + uint32_t blob_size = size * sizeof(lut[0]); > + uint32_t blob_id; > + int ret, i; > + > + DBG(("%s: gamma_size %d\n", __FUNCTION__, size)); > + > + if (!lut) { > + assert(size == 256); > + > + drmModeCrtcSetGamma(to_sna(crtc->scrn)->kgem.fd, > + sna_crtc_id(crtc), > + size, red, green, blue); > + return; > + } > + > + assert(size == sna_crtc->gamma_lut_size); > + > + for (i = 0; i < size; i++) { > + lut[i].red = red[i]; > + lut[i].green = green[i]; > + lut[i].blue = blue[i]; > + } > + > + ret = drmModeCreatePropertyBlob(sna->kgem.fd, lut, blob_size, > _id); > + if (ret) > + return; > + > + ret = drmModeObjectSetProperty(sna->kgem.fd, > + sna_crtc->id, DRM_MODE_OBJECT_CRTC, > + sna_crtc->gamma_lut_prop, > + blob_id); > + > + drmModeDestroyPropertyBlob(sna->kgem.fd, blob_id); > } > > static void > @@ -3229,6 +3268,8 @@ sna_crtc_destroy(xf86CrtcPtr crtc) > if (sna_crtc == NULL) > return; > > + free(sna_crtc->gamma_lut); > + > list_for_each_entry_safe(sprite, sn, _crtc->sprites, link) > free(sprite); > > @@ -3663,6 +3704,55 @@ b
Re: [Intel-gfx] [PATCH xf86-video-intel v2 1/2] sna: Refactor property parsing
On Fri, Apr 26, 2019 at 6:32 PM Ville Syrjala wrote: > > From: Ville Syrjälä > > Generalize the code that parses the plane properties to be useable > for crtc (or any kms object) properties as well. > > v2: plane 'type' prop is enum not range! > > Cc: Mario Kleiner > Signed-off-by: Ville Syrjälä > --- This patch is Reviewed-and-tested-by: Mario Kleiner -mario > src/sna/sna_display.c | 69 ++- > 1 file changed, 49 insertions(+), 20 deletions(-) > > diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c > index 119ea981d243..41edfec12839 100644 > --- a/src/sna/sna_display.c > +++ b/src/sna/sna_display.c > @@ -215,6 +215,7 @@ struct sna_crtc { > uint32_t rotation; > struct plane { > uint32_t id; > + uint32_t type; > struct { > uint32_t prop; > uint32_t supported; > @@ -3391,33 +3392,40 @@ void sna_crtc_set_sprite_colorspace(xf86CrtcPtr crtc, > p->color_encoding.values[colorspace]); > } > > -static int plane_details(struct sna *sna, struct plane *p) > +typedef void (*parse_prop_func)(struct sna *sna, > + struct drm_mode_get_property *prop, > + uint64_t value, > + void *data); > +static void parse_props(struct sna *sna, > + uint32_t obj_type, uint32_t obj_id, > + parse_prop_func parse_prop, > + void *data) > { > #define N_STACK_PROPS 32 /* must be a multiple of 2 */ > struct local_mode_obj_get_properties arg; > uint64_t stack[N_STACK_PROPS + N_STACK_PROPS/2]; > uint64_t *values = stack; > uint32_t *props = (uint32_t *)(values + N_STACK_PROPS); > - int i, type = DRM_PLANE_TYPE_OVERLAY; > + int i; > > memset(, 0, sizeof(struct local_mode_obj_get_properties)); > - arg.obj_id = p->id; > - arg.obj_type = LOCAL_MODE_OBJECT_PLANE; > + arg.obj_id = obj_id; > + arg.obj_type = obj_type; > > arg.props_ptr = (uintptr_t)props; > arg.prop_values_ptr = (uintptr_t)values; > arg.count_props = N_STACK_PROPS; > > if (drmIoctl(sna->kgem.fd, LOCAL_IOCTL_MODE_OBJ_GETPROPERTIES, )) > - return -1; > + return; > > DBG(("%s: object %d (type %x) has %d props\n", __FUNCTION__, > -p->id, 
LOCAL_MODE_OBJECT_PLANE, arg.count_props)); > +obj_id, obj_type, arg.count_props)); > > if (arg.count_props > N_STACK_PROPS) { > values = malloc(2*sizeof(uint64_t)*arg.count_props); > if (values == NULL) > - return -1; > + return; > > props = (uint32_t *)(values + arg.count_props); > > @@ -3444,27 +3452,48 @@ static int plane_details(struct sna *sna, struct > plane *p) > DBG(("%s: prop[%d] .id=%ld, .name=%s, .flags=%x, > .value=%ld\n", __FUNCTION__, i, > (long)props[i], prop.name, (unsigned)prop.flags, > (long)values[i])); > > - if (strcmp(prop.name, "type") == 0) { > - type = values[i]; > - } else if (prop_is_rotation()) { > - parse_rotation_prop(sna, p, , values[i]); > - } else if (prop_is_color_encoding()) { > - parse_color_encoding_prop(sna, p, , values[i]); > - } > + parse_prop(sna, , values[i], data); > } > > - p->rotation.supported &= DBG_NATIVE_ROTATION; > - if (!xf86ReturnOptValBool(sna->Options, OPTION_ROTATION, TRUE)) > - p->rotation.supported = RR_Rotate_0; > - > if (values != stack) > free(values); > > - DBG(("%s: plane=%d type=%d\n", __FUNCTION__, p->id, type)); > - return type; > #undef N_STACK_PROPS > } > > +static bool prop_is_type(const struct drm_mode_get_property *prop) > +{ > + return prop_has_type_and_name(prop, 3, "type"); > +} > + > +static void plane_parse_prop(struct sna *sna, > +struct drm_mode_get_property *prop, > +uint64_t value, void *data) > +{ > + struct plane *p = data; > + > + if (prop_is_type(prop)) > + p->type = value; > + else if (prop_is_rotation(prop)) > + parse_rotation_prop(sna, p, prop, value); > + else if (prop_is
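The refactor in this patch separates "walk all properties of a KMS object" from "decide what a given property means" by passing a callback plus an opaque data pointer. A standalone sketch of that pattern follows, with the ioctl and property lookup replaced by a plain array. All type and function names here are hypothetical and only mirror the shape of parse_props() and plane_parse_prop() in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of one KMS property: name plus value. */
struct prop_sketch {
	const char *name;
	uint64_t value;
};

typedef void (*parse_prop_fn)(const struct prop_sketch *prop, void *data);

/* Generic walker: knows how to iterate, not what any property means. */
static void parse_props_sketch(const struct prop_sketch *props, size_t count,
			       parse_prop_fn parse_prop, void *data)
{
	for (size_t i = 0; i < count; i++)
		parse_prop(&props[i], data);
}

struct plane_sketch { uint32_t type; };
struct crtc_sketch { uint32_t gamma_lut_size; };

/* Plane-specific interpretation, selected by passing this callback. */
static void plane_parse_prop_sketch(const struct prop_sketch *prop, void *data)
{
	struct plane_sketch *p = data;

	if (strcmp(prop->name, "type") == 0)
		p->type = (uint32_t)prop->value;
}

/* CRTC-specific interpretation reusing the same walker. */
static void crtc_parse_prop_sketch(const struct prop_sketch *prop, void *data)
{
	struct crtc_sketch *c = data;

	if (strcmp(prop->name, "GAMMA_LUT_SIZE") == 0)
		c->gamma_lut_size = (uint32_t)prop->value;
}
```

Adding support for another object type then means writing one more callback rather than duplicating the property enumeration.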
Re: [Intel-gfx] [PATCH xf86-video-intel] sna/uxa: Fix colormap handling at screen depth 30. (v2)
On Mon, Oct 15, 2018 at 6:21 PM Ville Syrjälä wrote:
> On Tue, Jun 12, 2018 at 06:20:35PM +0200, Mario Kleiner wrote:
> > The various clut handling functions like a setup
> > consistent with the x-screen color depth. Otherwise
> > we observe improper sampling in the gamma tables
> > at depth 30.
> >
> > Therefore replace hard-coded bitsPerRGB = 8 by actual
> > bits per channel scrn->rgbBits. Also use this for call
> > to xf86HandleColormaps().
> >
> > Tested for uxa and sna at depths 8, 16, 24 and 30 on
> > IvyBridge, and tested at depth 24 and 30 that xgamma
> > and gamma table animations work, and with measurement
> > equipment to make sure identity gamma ramps actually
> > are identity mappings at the output.
> >
> > v2: Also deal with X-Server 1.19 and earlier, which as of
> > v1.19.6 lack a fix to color palette handling and can
> > not deal with depths/bpc > 24/8 bpc. On < 1.20 we skip
> > xf86HandleColormaps() setup at > 8 bpc. This disables
> > color palette handling on such servers at > 8 bpc, but
> > still keeps RandR gamma table handling intact.
> >
> > Tested on 1.19.6 and 1.20.0 to do the right thing.
> >
> > Signed-off-by: Mario Kleiner
>
> Forgot this didn't get applied. It did make sense to me at the
> time when I was looking at the explosions with depth 30.
> Still seems to do the trick on 1.19, and redshift still works
> so
>
> Reviewed-by: Ville Syrjälä

Thanks Ville! Now it just needs to get merged, please. Chris?

One last missing piece is support for 1024 slot gamma tables in i965-kms, or gamma table bypass for such high bit depth framebuffers to make them actually useful. Ville, i think you mentioned working on that around spring last year?
Thanks, -mario > --- > > src/sna/sna_driver.c | 9 ++--- > > src/uxa/intel_driver.c | 6 +- > > 2 files changed, 11 insertions(+), 4 deletions(-) > > > > diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c > > index 2007e354..8c79d43b 100644 > > --- a/src/sna/sna_driver.c > > +++ b/src/sna/sna_driver.c > > @@ -1152,7 +1152,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL) > > if (!miInitVisuals(, , , , > , > > , > > ((unsigned long)1 << (scrn->bitsPerPixel - 1)), > > -8, -1)) > > +scrn->rgbBits, -1)) > > return FALSE; > > > > if (!miScreenInit(screen, NULL, > > @@ -1223,8 +1223,11 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL) > > if (!miCreateDefColormap(screen)) > > return FALSE; > > > > - if (sna->mode.num_real_crtc && > > - !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL, > > + /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */ > > + if (sna->mode.num_real_crtc && (scrn->rgbBits <= 8 || > > + XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) && > > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits, > > + sna_load_palette, NULL, > >CMAP_RELOAD_ON_MODE_SWITCH | > >CMAP_PALETTED_TRUECOLOR)) > > return FALSE; > > diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c > > index 3703c412..77c0dc00 100644 > > --- a/src/uxa/intel_driver.c > > +++ b/src/uxa/intel_driver.c > > @@ -991,7 +991,11 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL) > > if (!miCreateDefColormap(screen)) > > return FALSE; > > > > - if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL, > > + /* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. 
*/ > > + if ((scrn->rgbBits <= 8 || > > + XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) && > > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits, > > + I830LoadPalette, NULL, > >CMAP_RELOAD_ON_MODE_SWITCH | > >CMAP_PALETTED_TRUECOLOR)) { > > return FALSE; > > -- > > 2.17.1 > > -- > Ville Syrjälä > Intel > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH xf86-video-intel] sna/uxa: Fix colormap handling at screen depth 30. (v2)
The various clut handling functions like a setup
consistent with the x-screen color depth. Otherwise
we observe improper sampling in the gamma tables
at depth 30.

Therefore replace hard-coded bitsPerRGB = 8 by actual
bits per channel scrn->rgbBits. Also use this for call
to xf86HandleColormaps().

Tested for uxa and sna at depths 8, 16, 24 and 30 on
IvyBridge, and tested at depth 24 and 30 that xgamma
and gamma table animations work, and with measurement
equipment to make sure identity gamma ramps actually
are identity mappings at the output.

v2: Also deal with X-Server 1.19 and earlier, which as of v1.19.6
    lack a fix to color palette handling and can not deal with
    depths/bpc > 24/8 bpc. On < 1.20 we skip xf86HandleColormaps()
    setup at > 8 bpc. This disables color palette handling on such
    servers at > 8 bpc, but still keeps RandR gamma table handling
    intact.

    Tested on 1.19.6 and 1.20.0 to do the right thing.

Signed-off-by: Mario Kleiner
---
 src/sna/sna_driver.c   | 9 ++++++---
 src/uxa/intel_driver.c | 6 +++++-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
index 2007e354..8c79d43b 100644
--- a/src/sna/sna_driver.c
+++ b/src/sna/sna_driver.c
@@ -1152,7 +1152,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
 	if (!miInitVisuals(, , , ,
 			   , ,
 			   ((unsigned long)1 << (scrn->bitsPerPixel - 1)),
-			   8, -1))
+			   scrn->rgbBits, -1))
 		return FALSE;
 
 	if (!miScreenInit(screen, NULL,
@@ -1223,8 +1223,11 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
 	if (!miCreateDefColormap(screen))
 		return FALSE;
 
-	if (sna->mode.num_real_crtc &&
-	    !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
+	/* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
+	if (sna->mode.num_real_crtc && (scrn->rgbBits <= 8 ||
+	    XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
+	    !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+				 sna_load_palette, NULL,
 				 CMAP_RELOAD_ON_MODE_SWITCH |
 				 CMAP_PALETTED_TRUECOLOR))
 		return FALSE;
diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c
index 3703c412..77c0dc00 100644
--- a/src/uxa/intel_driver.c
+++ b/src/uxa/intel_driver.c
@@ -991,7 +991,11 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL)
 	if (!miCreateDefColormap(screen))
 		return FALSE;
 
-	if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL,
+	/* X-Server < 1.20 mishandles > 256 slots / > 8 bpc color maps. */
+	if ((scrn->rgbBits <= 8 ||
+	    XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,20,0,0,0)) &&
+	    !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+				 I830LoadPalette, NULL,
 				 CMAP_RELOAD_ON_MODE_SWITCH |
 				 CMAP_PALETTED_TRUECOLOR)) {
 		return FALSE;
-- 
2.17.1
[Intel-gfx] Depth 30 colormap handling fixes for servers 1.20+ and < 1.20
Hi,

here is an updated patch for depth 30 that now works both on server 1.20, with the full colormap + gamma table handling, and on servers < 1.20, with the RandR gamma tables working fine and the colormap processing skipped. It was successfully tested on sna and uxa with both server 1.20.0 and server 1.19.6.

I assume this one will be replaced by Ville's ddx + kms work soonish anyway, but until that is done it keeps things at least testable without crashes and other problems. I use it with my own intel-kms 10 bit lut proof-of-concept hacks for measurements and to test that Mesa's depth 30 stuff doesn't break.

Thanks,
-mario
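For readers who want the gating logic in isolation, here is a minimal standalone C sketch. It is only an illustration under assumptions: `choose_colormap_setup()`, `struct colormap_setup` and the `SKETCH_VERSION()` packing are hypothetical names made up for this example and are not part of the driver. The actual patch does the version test at compile time with XORG_VERSION_CURRENT and XORG_VERSION_NUMERIC(1,20,0,0,0) and passes the sizes straight to xf86HandleColormaps().

```c
#include <stdbool.h>

/* Hypothetical packing of major.minor.patch, for illustration only;
 * it is NOT the encoding used by XORG_VERSION_NUMERIC(). */
#define SKETCH_VERSION(major, minor, patch) \
	(((major) * 10000) + ((minor) * 100) + (patch))

/* Hypothetical result type: what the driver would pass on. */
struct colormap_setup {
	bool handle_colormaps; /* set up colormap handling at all? */
	int max_colors;        /* number of palette slots, 1 << rgb_bits */
	int sig_rgb_bits;      /* bits per color channel */
};

static struct colormap_setup
choose_colormap_setup(int rgb_bits, int server_version)
{
	struct colormap_setup s;

	/* Servers before 1.20 mishandle > 256 slots / > 8 bpc color maps,
	 * so colormap handling is skipped there at > 8 bpc. */
	s.handle_colormaps = rgb_bits <= 8 ||
			     server_version >= SKETCH_VERSION(1, 20, 0);
	s.max_colors = 1 << rgb_bits;
	s.sig_rgb_bits = rgb_bits;
	return s;
}
```

At depth 24 (8 bits per channel) this yields the previously hard-coded 256 slots on any server; at depth 30 it yields 1024 slots, and colormap handling is only requested when the server is 1.20 or newer.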
Re: [Intel-gfx] [PATCH] sna/uxa: Fix colormap handling at screen depth 30.
Oops, didn't reply yet, sorry! On Thu, Mar 15, 2018 at 5:14 PM, Chris Wilson <ch...@chris-wilson.co.uk> wrote: > Quoting Ville Syrjälä (2018-03-15 16:02:42) >> On Thu, Mar 15, 2018 at 03:28:18PM +, Chris Wilson wrote: >> > Quoting Ville Syrjälä (2018-03-01 11:12:53) >> > > On Thu, Mar 01, 2018 at 02:20:48AM +0100, Mario Kleiner wrote: >> > > > The various clut handling functions like a setup >> > > > consistent with the x-screen color depth. Otherwise >> > > > we observe improper sampling in the gamma tables >> > > > at depth 30. >> > > > >> > > > Therefore replace hard-coded bitsPerRGB = 8 by actual >> > > > bits per channel scrn->rgbBits. Also use this for call >> > > > to xf86HandleColormaps(). >> > > > >> > > > Tested for uxa and sna at depths 8, 16, 24 and 30 on >> > > > IvyBridge, and tested at depth 24 and 30 that xgamma >> > > > and gamma table animations work, and with measurement >> > > > equipment to make sure identity gamma ramps actually >> > > > are identity mappings at the output. >> > > >> > > You mean identity mapping at 8bpc? We don't support higher precision >> > > gamma on pre-bdw atm, and the ddx doesn't use the higher precision >> > > stuff even on bdw+. I'm working on fixing both, but it turned out to >> > > be a bit more work than I anticipated so will take a while. 
>> > > >> > > > >> > > > Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com> >> > > > --- >> > > > src/sna/sna_driver.c | 5 +++-- >> > > > src/uxa/intel_driver.c | 3 ++- >> > > > 2 files changed, 5 insertions(+), 3 deletions(-) >> > > > >> > > > diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c >> > > > index 2643e6c..9c4bcd4 100644 >> > > > --- a/src/sna/sna_driver.c >> > > > +++ b/src/sna/sna_driver.c >> > > > @@ -1145,7 +1145,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL) >> > > > if (!miInitVisuals(, , , , >> > > > , >> > > > , >> > > > ((unsigned long)1 << (scrn->bitsPerPixel - >> > > > 1)), >> > > > -8, -1)) >> > > > +scrn->rgbBits, -1)) >> > > > return FALSE; >> > > > >> > > > if (!miScreenInit(screen, NULL, >> > > > @@ -1217,7 +1217,8 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL) >> > > > return FALSE; >> > > > >> > > > if (sna->mode.num_real_crtc && >> > > > - !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL, >> > > > + !xf86HandleColormaps(screen, 1 << scrn->rgbBits, >> > > > scrn->rgbBits, >> > > > + sna_load_palette, NULL, >> > > >CMAP_RELOAD_ON_MODE_SWITCH | >> > > >CMAP_PALETTED_TRUECOLOR)) >> > > >> > > I already forgot what this does prior to your randr fix. IIRC bumping >> > > the 8 alone would cause the thing to segfault, but I guess bumping both >> > > was fine? >> > > We always need this fix for X-Screen depth 30, even on older servers. With the current maxColors=256, bitsPerPixel=8 setting and color depth 30, the server does some out of bounds reads in its gamma handling code for maxColors=256, and we get crash at server startup. It's a bit a matter of luck to reproduce. I had it running for months without problems, then after some Ubuntu system upgrade i got crashes at server startup under sna and at server shutdown under uxa. 
Without raising the bitsPerPixel to 10, we get a bottleneck in the way the server mushes together the old XF86VidMode gamma ramps (per-x-screen) and the new RandR per-crtc gamma ramps, so there are artifacts in the gamma table finally uploaded to the hw. For DefaultDepth=24 this patch doesn't change anything. On X-Server < 1.20 however, without my fix, at color depth 30 this will get us stuck on an identity gamma ramp, as the update code in the server effectively no-ops. Or so i think, because i tested so many permutations of so many things on intel,amd,nouveau with different mesa,server,ddx branches lately that i may misremember something. Ilia reported some odd behavior on 1.19 with the correspond
[Intel-gfx] [PATCH] sna/uxa: Fix colormap handling at screen depth 30.
The various clut handling functions like a setup
consistent with the x-screen color depth. Otherwise
we observe improper sampling in the gamma tables
at depth 30.

Therefore replace hard-coded bitsPerRGB = 8 by actual
bits per channel scrn->rgbBits. Also use this for call
to xf86HandleColormaps().

Tested for uxa and sna at depths 8, 16, 24 and 30 on
IvyBridge, and tested at depth 24 and 30 that xgamma
and gamma table animations work, and with measurement
equipment to make sure identity gamma ramps actually
are identity mappings at the output.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 src/sna/sna_driver.c   | 5 +++--
 src/uxa/intel_driver.c | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/sna/sna_driver.c b/src/sna/sna_driver.c
index 2643e6c..9c4bcd4 100644
--- a/src/sna/sna_driver.c
+++ b/src/sna/sna_driver.c
@@ -1145,7 +1145,7 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
 	if (!miInitVisuals(, , , ,
 			   , ,
 			   ((unsigned long)1 << (scrn->bitsPerPixel - 1)),
-			   8, -1))
+			   scrn->rgbBits, -1))
 		return FALSE;
 
 	if (!miScreenInit(screen, NULL,
@@ -1217,7 +1217,8 @@ sna_screen_init(SCREEN_INIT_ARGS_DECL)
 		return FALSE;
 
 	if (sna->mode.num_real_crtc &&
-	    !xf86HandleColormaps(screen, 256, 8, sna_load_palette, NULL,
+	    !xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+				 sna_load_palette, NULL,
 				 CMAP_RELOAD_ON_MODE_SWITCH |
 				 CMAP_PALETTED_TRUECOLOR))
 		return FALSE;
diff --git a/src/uxa/intel_driver.c b/src/uxa/intel_driver.c
index 3703c41..88c749e 100644
--- a/src/uxa/intel_driver.c
+++ b/src/uxa/intel_driver.c
@@ -991,7 +991,8 @@ I830ScreenInit(SCREEN_INIT_ARGS_DECL)
 	if (!miCreateDefColormap(screen))
 		return FALSE;
 
-	if (!xf86HandleColormaps(screen, 256, 8, I830LoadPalette, NULL,
+	if (!xf86HandleColormaps(screen, 1 << scrn->rgbBits, scrn->rgbBits,
+				 I830LoadPalette, NULL,
 				 CMAP_RELOAD_ON_MODE_SWITCH |
 				 CMAP_PALETTED_TRUECOLOR)) {
 		return FALSE;
-- 
2.7.4
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Add module parameter to en-/disable hw color correction.
On 09/26/2017 07:05 AM, Daniel Vetter wrote: On Fri, Sep 15, 2017 at 05:48:25PM +0200, Mario Kleiner wrote: The new module parameter enable_hw_color_correction defaults to true, to retain the current behaviour. If set to false, it will disable all hardware color correction, like gamma/degamma and csc. This is useful for debugging gamma table / csc precision problems, and to ensure unmodified pixel passthrough from framebuffer to outputs, e.g., for scientific applications which critically depend on perfect pixel passthrough. While i hope this switch generally won't be needed, it provides extra peace-of-mind - an "airbag" for color correction trouble. Tested on Ironlake, IvyBridge, Haswell, Skylake. One unexpected result during testing was that while this works on all tested gpu's with a 8 bpc XR24 framebuffer as primary plane, if a 10 bpc XR30 fb is active, then hw gamma tables seem to get automatically bypassed on at least the tested IvyBridge and later (but not on the tested Ironlake), regardless of hw programming, at least for the legacy 256->8 bit luts and the 1024->10 bit precision luts. However, the type of selected - but bypassed - hw gamma table still determines output precision, ie. even an auto-bypassed legacy 256 slot 8 bit lut in XR30 fb mode still restricts the effective output precision to 8 bit, while an auto-bypassed precision lut doesn't restrict precision. Instead of a modparam I think the right thing to fix here is the driver setup. Enabling the legacy gamma table is indeed documented to restrict the pipe to 8bpc (the 2 additional bits for 10bpc are just padded). Having driver options for "pls give me non-broken behaviour" doesn't make any sense to me. -Daniel Hi Daniel, this isn't meant as a permanent solution, but as a debugging aid, and as the equivalent of an air-bag in a car. You hope you won't need it, but it is good to have. 
In the past it would have been very handy for me to have a master-switch for this, debugging problems on users machines related to "pixels don't appear on the outputs as specified in the OpenGL rendering code". When looking over the docs for color correction i just realized the hardware has an easy way to disable this part of the pipeline, so i thought this could make debugging so much easier - at least for me. I had the impression that many current i915 module parameters are of this nature. The debug switch also provides a temporary workaround on production systems if a problem is related to color correction, not meant as a permanent solution. Many of my users are challenged already by the fact that Linux is not macOS, and editing a config file or installing a prebuilt kernel from a .dpkg is already borderline rocket science for them, that's why those module parameters would be nice to have. My actual plan is to implement true 10 bit -> 12 bit gamma table support, hopefully still for the 4.15 kernel. I have experimental patches for using the precision luts with 1024 slots and 10 bit output width, ie. 10 bit in -> 10 bit out on Ironlake and later. I'll send those out in their hacky state just for reference. However the better plan i have in mind is to extend the code so that if (we are in single-lut mode (DEGAMMA_LUT == 0)) AND (the userspace provided input lut is monotonically increasing) we switch from the dual 512 slot 10 bit luts to the 512 slot 12 bit lut. This would also be applicable to the 256 slot legacy gamma tables, which are always single-lut and can be upsampled from 256 slots to 512 slots. The reason is that the dual-512 slot luts are not good enough to handle a 10 bit framebuffer. As far as i read the PRMs, a 10 bit fb value would simply get truncated to 9 bits to select one of the 512 slots, so we would lose 1 bit of precision, which makes 10 bit framebuffers mostly pointless, at least for scientific/medical/HDR applications. 
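To make the truncation concern concrete, here is a tiny C sketch. It assumes the hardware behaves exactly as the mail reads the PRMs, i.e. the lowest bit of a 10 bit framebuffer value is simply dropped to select one of 512 slots; `lut512_index_truncated()` is a made-up helper for illustration, not i915 code.

```c
#include <stdint.h>

/* Slot selected in a 512 slot lut for a 10 bit input value, assuming
 * the lowest input bit is simply dropped (10 bit -> 9 bit index). */
static unsigned int lut512_index_truncated(uint16_t fb_value_10bit)
{
	return (fb_value_10bit & 0x3ff) >> 1;
}
```

Under that assumption each pair of neighboring 10 bit values, such as 0 and 1 or 1022 and 1023, lands in the same slot and produces the same output, which is the lost bit of precision described above.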
The 512 slot 12 bit lut is perfect for such applications, as the PRMs say the hw will linearly interpolate between the nearest neighbor slots of the 512 slot lut for the given fb input value -> works with 10 bit fb's. Would also work with those 16 bit half-float fb format that is supported by current hw but currently unused - but could be handy for future HDR applications. Also 12 bit output precision is nice for better gamma correction on true 10-12 bit displays over DP/HDMI deep color. I will try to work on this within the next 1-2 weeks. Now here's a catch i found while testing with the 1024 slot 10 bit luts, which i found very surprising: - If i have a standard XR24 framebuffer on the primary plane, the 1024 slot/10 bit lut's work exactly as expected, as verified on a XR24 fb via photometer measurements and tweaking the values in the gamma tables -- and the "force dithering on" module parameter patch, as i don't have a true 10 bit panel around atm. - As soon as a XR30 fb is active (X11 Def
[Intel-gfx] [PATCH 1/2] drm/i915: Add module parameter to force en-/disable dithering.
i915.enable_dithering allows forcing dithering on (=1) or off (=0)
for all outputs. The default is -1 for the current automatic
per-pipe selection.

This is useful for debugging and for special case scenarios,
e.g., providing simulated 10 bpc output on 8 bpc digital
sinks if a 10 bpc framebuffer + rendering is in use.

A more flexible solution would be connector properties,
like other drivers (radeon, amdgpu, nouveau) already provide.
A global override via module parameter is useful even with
such connector properties, e.g., for scientific applications
which require strict control over dithering, to have an
override for DE's which may not expose such properties via
some standard protocol in a user-controllable way, e.g.,
afaik all currently existing Wayland compositors.

Tested on Ironlake, IvyBridge, Haswell, Skylake.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/i915/i915_params.c   | 5 +++++
 drivers/gpu/drm/i915/i915_params.h   | 1 +
 drivers/gpu/drm/i915/intel_display.c | 5 +++++
 3 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 8ab003dca113..07ec3a96457c 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -65,6 +65,7 @@ struct i915_params i915 __read_mostly = {
 	.inject_load_failure = 0,
 	.enable_dpcd_backlight = false,
 	.enable_gvt = false,
+	.enable_dithering = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -257,3 +258,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
 module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
 MODULE_PARM_DESC(enable_gvt,
 	"Enable support for Intel GVT-g graphics virtualization host support(default:false)");
+
+module_param_named(enable_dithering, i915.enable_dithering, int, 0644);
+MODULE_PARM_DESC(enable_dithering,
+	"Enable dithering (-1=auto [default], 0=force off on all outputs, 1=force on on all outputs)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index ac844709c97e..7e365cd4fc91 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -54,6 +54,7 @@
 	func(int, edp_vswing); \
 	func(int, reset); \
 	func(unsigned int, inject_load_failure); \
+	func(int, enable_dithering); \
 	/* leave bools at the end to not create holes */ \
 	func(bool, alpha_support); \
 	func(bool, enable_cmd_parser); \
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0e93ec201fe3..bea471a96820 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10978,6 +10978,11 @@ intel_modeset_pipe_config(struct drm_crtc *crtc,
 	 */
 	pipe_config->dither = (pipe_config->pipe_bpp == 6*3) &&
 		!pipe_config->dither_force_disable;
+
+	/* Override of auto-selected dither mode via module parameter? */
+	if (i915.enable_dithering != -1)
+		pipe_config->dither = i915.enable_dithering > 0 ? true : false;
+
 	DRM_DEBUG_KMS("hw max bpp: %i, pipe bpp: %i, dithering: %i\n",
 		      base_bpp, pipe_config->pipe_bpp, pipe_config->dither);
 
-- 
2.13.0.rc1.294.g07d810a77f
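The tri-state override in the intel_display.c hunk can be read in isolation as the following C sketch. `resolve_dither()` is a hypothetical helper written for this illustration; in the patch the same logic sits inline in intel_modeset_pipe_config() and reads i915.enable_dithering directly.

```c
#include <stdbool.h>

/* Tri-state as described in the patch:
 * -1 = automatic per-pipe selection, 0 = force off, 1 = force on. */
static bool resolve_dither(int enable_dithering, int pipe_bpp,
			   bool dither_force_disable)
{
	/* Automatic policy from the quoted hunk: dither only on 6 bpc
	 * pipes, unless dithering was force-disabled for the pipe. */
	bool dither = (pipe_bpp == 6 * 3) && !dither_force_disable;

	/* Any value other than -1 overrides the automatic choice. */
	if (enable_dithering != -1)
		dither = enable_dithering > 0;

	return dither;
}
```

Because the parameter is declared with mode 0644, a kernel carrying this patch would also expose it as a writable file under /sys/module/i915/parameters/, in addition to accepting i915.enable_dithering=<value> at load time.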
[Intel-gfx] [PATCH 2/2] drm/i915: Add module parameter to en-/disable hw color correction.
The new module parameter enable_hw_color_correction defaults to true, to retain the current behaviour. If set to false, it will disable all hardware color correction, like gamma/degamma and csc. This is useful for debugging gamma table / csc precision problems, and to ensure unmodified pixel passthrough from framebuffer to outputs, e.g., for scientific applications which critically depend on perfect pixel passthrough. While i hope this switch generally won't be needed, it provides extra peace-of-mind - an "airbag" for color correction trouble. Tested on Ironlake, IvyBridge, Haswell, Skylake. One unexpected result during testing was that while this works on all tested gpu's with a 8 bpc XR24 framebuffer as primary plane, if a 10 bpc XR30 fb is active, then hw gamma tables seem to get automatically bypassed on at least the tested IvyBridge and later (but not on the tested Ironlake), regardless of hw programming, at least for the legacy 256->8 bit luts and the 1024->10 bit precision luts. However, the type of selected - but bypassed - hw gamma table still determines output precision, ie. even an auto-bypassed legacy 256 slot 8 bit lut in XR30 fb mode still restricts the effective output precision to 8 bit, while an auto-bypassed precision lut doesn't restrict precision. Iow. this patch is needed even with XR30 fb's for actual 10 bit precision output, even though the hw seems to sort of ignore the tested gamma tables for XR30 fb's. 
Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com> --- drivers/gpu/drm/i915/i915_params.c | 5 + drivers/gpu/drm/i915/i915_params.h | 3 ++- drivers/gpu/drm/i915/intel_display.c | 26 +- drivers/gpu/drm/i915/intel_sprite.c | 21 - 4 files changed, 40 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 07ec3a96457c..8f6a176a97e1 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -66,6 +66,7 @@ struct i915_params i915 __read_mostly = { .enable_dpcd_backlight = false, .enable_gvt = false, .enable_dithering = -1, + .enable_hw_color_correction = true, }; module_param_named(modeset, i915.modeset, int, 0400); @@ -262,3 +263,7 @@ MODULE_PARM_DESC(enable_gvt, module_param_named(enable_dithering, i915.enable_dithering, int, 0644); MODULE_PARM_DESC(enable_dithering, "Enable dithering (-1=auto [default], 0=force off on all outputs, 1=force on on all outputs)"); + +module_param_named(enable_hw_color_correction, i915.enable_hw_color_correction, bool, 0644); +MODULE_PARM_DESC(enable_hw_color_correction, + "Enable hardware color correction like gamma luts and csc (default: true)"); diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 7e365cd4fc91..f5c9163d2675 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -69,7 +69,8 @@ func(bool, nuclear_pageflip); \ func(bool, enable_dp_mst); \ func(bool, enable_dpcd_backlight); \ - func(bool, enable_gvt) + func(bool, enable_gvt); \ + func(bool, enable_hw_color_correction) #define MEMBER(T, member) T member struct i915_params { diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index bea471a96820..1e1b157353a9 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -3184,13 +3184,17 @@ static u32 i9xx_plane_ctl(const struct intel_crtc_state *crtc_state, unsigned int rotation 
= plane_state->base.rotation; u32 dspcntr; - dspcntr = DISPLAY_PLANE_ENABLE | DISPPLANE_GAMMA_ENABLE; + dspcntr = DISPLAY_PLANE_ENABLE; + + if (i915.enable_hw_color_correction) + dspcntr |= DISPPLANE_GAMMA_ENABLE; if (IS_G4X(dev_priv) || IS_GEN5(dev_priv) || IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv)) dspcntr |= DISPPLANE_TRICKLE_FEED_DISABLE; - if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) + if ((IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) && + i915.enable_hw_color_correction) dspcntr |= DISPPLANE_PIPE_CSC_ENABLE; if (INTEL_GEN(dev_priv) < 4) @@ -3514,7 +3518,8 @@ u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state, plane_ctl = PLANE_CTL_ENABLE; - if (!IS_GEMINILAKE(dev_priv) && !IS_CANNONLAKE(dev_priv)) { + if (!IS_GEMINILAKE(dev_priv) && !IS_CANNONLAKE(dev_priv) && + i915.enable_hw_color_correction) { plane_ctl |= PLANE_CTL_PIPE_GAMMA_ENABLE | PLANE_CTL_PIPE_CSC_ENABLE | @@ -3571,7 +3576,8 @@ static void skylake_update_primary_plane(struct intel_plane *plane, spin_lock_irqsave(_priv->unco
[Intel-gfx] Module parameters to override color management/dithering.
Hi,

these two patches add i915 module parameters to globally override how the driver handles dithering and gamma/csc conversion. They serve two purposes:

First, as a debug aid and "airbag" for working around potential precision problems in getting pixels from rendering to the display outputs. This is mostly for applications that critically depend on getting pixels untampered from the fb to the outputs, e.g., scientific neuroscience/vision research/medical applications. Having the ability to bypass parts of the pipeline can help a lot in debugging such problems on remote user machines, and allows such users to work around the problems until proper fixes are made. I expect this to become especially useful when dealing with all the Wayland compositor implementations, which so far don't have a standardized application/user controllable equivalent to the RandR protocol / xrandr tools.

The second, short-term purpose is to enable true 10 bit output from rendering, so people with urgent 10 bit precision needs can benefit from the Mesa patches i started working on for i965 (rev 1 on the mailing-list, rev 2 to come soon).

I realize the merge window for Linux 4.14 is almost over, but wanted to ask if it would be possible to slip these patches into 4.14 if they aren't considered too intrusive? These are tested on Ironlake, Ivybridge, Haswell and Skylake, also with a photometer to see what actually comes out of the display for different settings.

The bigger plan is to enhance the gamma table support, so we could also use > 8 bit precision gamma tables on Ironlake and later, both for the legacy gamma ioctl and the new color mgmt. method. I do have proof of concept patches for using the 1024->10 precision luts on Ironlake and later. Tweaking the gamma tables i upload via RandR and measuring with a photometer showed my poc patches work.

However, as described in the 2nd patch, at least the tested legacy luts and 1024->10 precision luts seem to get mostly ignored/bypassed in the hw when a XR30 fb is attached to the primary plane. Not sure if some setup is missing, or if this is some hardware quirk? Couldn't find anything in the PRMs so far.

What i'd actually like to implement for Ironlake+ instead of the 1024->10 bit luts is this:

If in dual-gamma lut mode, or if the input gamma table is not monotonically increasing, do what is done now (legacy luts for legacy gamma ioctl, split 512->10 bit luts for the new path).

If only a single gamma lut is requested (DEGAMMA_LUT == 0), and the provided input lut is monotonically increasing, switch to the linearly interpolated 512->12 bit lut instead, which exists on Ironlake+. Also for the legacy gamma ioctl, so existing apps can benefit from the higher precision. This would enable > 8 bit framebuffers to be output properly and with high quality gamma correction.

Thanks,
-mario
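The selection rule proposed above could look roughly like the following C sketch. Everything in it is hypothetical and for illustration only: `enum lut_mode`, `lut_is_monotonic()` and `choose_lut_mode()` are invented names, a lut is modelled as a plain array of 16 bit values for one channel, and "monotonically increasing" is interpreted as non-decreasing, which is an assumption the mail does not settle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode names for this sketch, not i915 identifiers. */
enum lut_mode {
	LUT_MODE_CURRENT,       /* keep what the driver does today */
	LUT_MODE_INTERP_512_12, /* linearly interpolated 512 slot, 12 bit lut */
};

/* Assumption: "monotonically increasing" means non-decreasing here.
 * Whether the hardware needs strictly increasing entries is open. */
static bool lut_is_monotonic(const uint16_t *lut, size_t len)
{
	for (size_t i = 1; i < len; i++)
		if (lut[i] < lut[i - 1])
			return false;
	return true;
}

static enum lut_mode choose_lut_mode(size_t degamma_len,
				     const uint16_t *gamma, size_t gamma_len)
{
	/* Dual-lut mode (a degamma lut is set): keep current behaviour. */
	if (degamma_len != 0)
		return LUT_MODE_CURRENT;

	/* Non-monotonic input tables also keep the current behaviour. */
	if (!lut_is_monotonic(gamma, gamma_len))
		return LUT_MODE_CURRENT;

	return LUT_MODE_INTERP_512_12;
}
```

A real implementation would have to run this check per color channel and decide how to resample a 256 slot legacy table to 512 slots; both are left out of the sketch.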
Re: [Intel-gfx] [PATCH] drm: Make the decision to keep vblank irq enabled earlier
On 03/23/2017 02:26 PM, Ville Syrjälä wrote: On Thu, Mar 23, 2017 at 07:51:06AM +, Chris Wilson wrote: We want to provide the vblank irq shadow for pageflip events as well as vblank queries. Such events are completed within the vblank interrupt handler, and so the current check for disabling the irq will disable it from with the same interrupt as the last pageflip event. If we move the decision on whether to disable the irq (based on there no being no remaining vblank events, i.e. vblank->refcount == 0) to before we signal the events, we will only disable the irq on the interrupt after the last event was signaled. In the normal course of events, this will keep the vblank irq enabled for the entire flip sequence whereas before it would flip-flop around every interrupt. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com> Cc: Daniel Vetter <dan...@ffwll.ch> Cc: Michel Dänzer <mic...@daenzer.net> Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com> Cc: Dave Airlie <airl...@redhat.com>, Cc: Mario Kleiner <mario.kleiner...@gmail.com> --- drivers/gpu/drm/drm_irq.c | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 5b77057e91ca..1d6bcee3708f 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1741,6 +1741,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = >vblank[pipe]; unsigned long irqflags; + bool disable_irq; if (WARN_ON_ONCE(!dev->num_crtcs)) return false; @@ -1768,16 +1769,19 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) spin_unlock(>vblank_time_lock); wake_up(>queue); - drm_handle_vblank_events(dev, pipe); /* With instant-off, we defer disabling the interrupt until after -* we finish processing the following vblank. The disable has to -* be last (after drm_handle_vblank_events) so that the timestamp -* is always accurate. 
+* we finish processing the following vblank after all events have +* been signaled. The disable has to be last (after +* drm_handle_vblank_events) so that the timestamp is always accurate. We wouldn't actually do the disable as long there's a reference still held, so the timestamp should be fine in that case. And if there aren't any references the timestamp shouldn't matter... I think. But it's probably more clear to keep to the order you propose here anyway. Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> Looks good to me. As a further optimization, i think we could move the vblank_disable_fn() call outside/below the spin_unlock_irqrestore for event_lock, as vblank_disable_fn() doesn't need any locks held at call time, so slightly reduce event_lock hold time. Don't know if it is worth it. In any case Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com> thanks, -mario Oh, and now that I think about this stuff again, I start to wonder why I made the disable actually update the seq/ts. If the interrupt is currently enabled the seq/ts should be reasonably uptodate already when we do disable the interrupt. Perhaps I was only thinking about drm_vblank_off() when I made that change, or I decided that I didn't want two different disable codepaths. Anyways, just an idea that we might be able to make the vblank irq disable a little cheaper. */ - if (dev->vblank_disable_immediate && - drm_vblank_offdelay > 0 && - !atomic_read(>refcount)) + disable_irq = (dev->vblank_disable_immediate && + drm_vblank_offdelay > 0 && + !atomic_read(>refcount)); + + drm_handle_vblank_events(dev, pipe); + + if (disable_irq) vblank_disable_fn((unsigned long)vblank); spin_unlock_irqrestore(>event_lock, irqflags); -- 2.11.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
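The effect of the reordering can be shown with a small userspace C model. This is a simplified sketch, not kernel code: `struct vblank_sketch` and the helper functions are invented for illustration, the refcount is a plain int, and the dev->vblank_disable_immediate and drm_vblank_offdelay > 0 parts of the real condition are left out.

```c
#include <stdbool.h>

/* Invented toy model of one crtc's vblank state. */
struct vblank_sketch {
	int refcount;       /* outstanding vblank references */
	int pending_events; /* events to signal in this interrupt */
	bool irq_enabled;
};

/* Signalling an event drops the reference that event was holding. */
static void signal_events(struct vblank_sketch *v)
{
	v->refcount -= v->pending_events;
	v->pending_events = 0;
}

/* Old order: signal first, then look at the refcount. */
static void handle_vblank_old(struct vblank_sketch *v)
{
	signal_events(v);
	if (v->refcount == 0)
		v->irq_enabled = false;
}

/* Patched order: take the decision before signalling, act on it after. */
static void handle_vblank_new(struct vblank_sketch *v)
{
	bool disable_irq = (v->refcount == 0);

	signal_events(v);
	if (disable_irq)
		v->irq_enabled = false;
}
```

With one pending event holding the last reference, the old order turns the interrupt off in the same interrupt that delivers the event, while the patched order keeps it on for one more vblank and only disables it in the following interrupt if nothing new was queued.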
Re: [Intel-gfx] [PATCH 1/3] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)
On 03/15/2017 10:00 PM, Ville Syrjälä wrote: On Wed, Mar 15, 2017 at 08:40:25PM +, Chris Wilson wrote: On vblank instant-off systems, we can get into a situation where the cost of enabling and disabling the vblank IRQ around a drmWaitVblank query dominates. And with the advent of even deeper hardware sleep state, touching registers becomes ever more expensive. However, we know that if the user wants the current vblank counter, they are also very likely to immediately queue a vblank wait and so we can keep the interrupt around and only turn it off if we have no further vblank requests queued within the interrupt interval. After vblank event delivery, this patch adds a shadow of one vblank where the interrupt is kept alive for the user to query and queue another vblank event. Similarly, if the user is using blocking drmWaitVblanks, the interrupt will be disabled on the IRQ following the wait completion. However, if the user is simply querying the current vblank counter and timestamp, the interrupt will be disabled after every IRQ and the user will enabled it again on the first query following the IRQ. v2: Mario Kleiner - After testing this, one more thing that would make sense is to move the disable block at the end of drm_handle_vblank() instead of at the top. Turns out that if high precision timestaming is disabled or doesn't work for some reason (as can be simulated by echo 0 > /sys/module/drm/parameters/timestamp_precision_usec), then with your delayed disable code at its current place, the vblank counter won't increment anymore at all for instant queries, ie. with your other "instant query" patches. Clients which repeatedly query the counter and wait for it to progress will simply hang, spinning in an endless query loop. There's that comment in vblank_disable_and_save: "* Skip this step if there isn't any high precision timestamp * available. In that case we can't account for this and just * hope for the best. 
*/ With the disable happening after leading edge of vblank (== hw counter increment already happened) but before the vblank counter/timestamp handling in drm_handle_vblank, that step is needed to keep the counter progressing, so skipping it is bad. Now without high precision timestamping support, a kms driver must not set dev->vblank_disable_immediate = true, as this would cause problems for clients, so this shouldn't matter, but it would be good to still make this robust against a future kms driver which might have unreliable high precision timestamping, e.g., high precision timestamping that intermittently doesn't work. v3: Patch before coffee needs extra coffee. Testcase: igt/kms_vblank Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com> Cc: Daniel Vetter <dan...@ffwll.ch> Cc: Michel Dänzer <mic...@daenzer.net> Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com> Cc: Dave Airlie <airl...@redhat.com>, Cc: Mario Kleiner <mario.kleiner...@gmail.com> Yep. This seems like a good idea to me. I just neglected to review it last time around (and maybe even before that?) for some reason. Locks seem to be taken in the right order, so it at least looks safe to me. Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> Hi, as a followup to this one, maybe we should move the drm_handle_vblank_events(dev, pipe); down, immediately after Chris new delayed disable code? The idea was to avoid lots of redundant enable->disable->enable... calls by having some 1 frame delay before disable. This works for pure vblank count/ts queries. But both DRI2 and DRI3/Present use vblank events to trigger a pageflip-ioctl at the right target vblank. 
With the current ordering we may dispatch the vblank swap trigger event to the X-Server and drop the vblank refcount to zero due to the vblank_put inside drm_handle_vblank_events for the dispatched event, then detect in this patch that refcount == 0 and disable vblanks, but a few microseconds later the server will queue a pageflip ioctl which bumps the refcount and reenables vblank irqs, so we have a redundant disable->enable. Also many kms drivers now use drm_crtc_arm_vblank_event() for pageflip completion handling at vblank, the pageflip completion events are also dispatched via drm_handle_vblank_events(). After a pageflip completes, it makes sense to have this "swap shadow" of 1 full frame, as animations would likely queue a new vblank query/event immediately for the next animation frame. -mario --- drivers/gpu/drm/drm_irq.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 9bdca69f754c..e64b05ea95ea 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1198,9 +1198,9 @@ static void drm_vblank_put(struct drm_device *dev, unsigned int pipe)
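The ordering argument in this followup can be checked with a toy model. The sketch below is not kernel code: the struct and all `model_*` names are hypothetical, and it assumes a single client that queues exactly one vblank event and re-queues one shortly after each delivery. It only counts how often the interrupt would be switched off under the two orderings (event dispatch before the delayed-disable check, or after it).

```c
#include <stdbool.h>

/* Hypothetical model of one CRTC's vblank bookkeeping (not kernel code). */
struct vblank_model {
	int refcount;     /* outstanding vblank references */
	bool irq_enabled; /* is the vblank interrupt currently on? */
	int disables;     /* number of on -> off transitions */
};

/* Take a reference; switch the interrupt on if it was off. */
static void model_get(struct vblank_model *v)
{
	v->refcount++;
	if (!v->irq_enabled)
		v->irq_enabled = true;
}

/* Drop the reference held by a dispatched event. */
static void model_put(struct vblank_model *v)
{
	if (v->refcount > 0)
		v->refcount--;
}

/* The delayed-disable check: turn the irq off if nobody holds a reference. */
static void model_delayed_disable(struct vblank_model *v)
{
	if (v->refcount == 0 && v->irq_enabled) {
		v->irq_enabled = false;
		v->disables++;
	}
}

/* One vblank interrupt, with the two possible orderings. */
static void model_handle_vblank(struct vblank_model *v, bool events_first)
{
	if (events_first) {
		model_put(v);             /* dispatch event, refcount drops to 0 */
		model_delayed_disable(v); /* sees refcount == 0 and disables */
	} else {
		model_delayed_disable(v); /* event still pending, irq stays on */
		model_put(v);             /* dispatch event afterwards */
	}
}

/* Run `frames` vblanks; return how many times the irq was disabled. */
static int model_run(int frames, bool events_first)
{
	struct vblank_model v = { 0, false, 0 };

	model_get(&v); /* the client queues its first event */
	for (int i = 0; i < frames; i++) {
		model_handle_vblank(&v, events_first);
		model_get(&v); /* the client queues the next event */
	}
	return v.disables;
}
```

In this model, `model_run(3, true)` disables the interrupt once per frame, while `model_run(3, false)` never disables it as long as the client keeps re-queuing. That is the redundant disable->enable cycle the mail describes, under the stated single-client assumption.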
Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6
Ok, so legacy gamma table updates are completely broken for Intel on Linux-4.7-rc7, the final release candidate. The good news is that applying Lionel's patch "drm/i915: add missing condition for committing planes on crtc" from https://patchwork.freedesktop.org/patch/89111/ fixes it nicely. The patch currently applies cleanly to drm-fixes and drm-next and is Reviewed-and-tested-by: Mario Kleiner <mario.kleiner...@gmail.com> While we are at it, could somebody please look at that updated series of my DisplayPort color depth fixes ("EDID/DP fixes for proper bpc detection of displays.") i sent out a week ago? Especially pulling patch 2/5 "[PATCH 2/5] drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"" would be important, as that bug introduced a regression for Intel + DP + legacy DP converters into stable kernels which is very serious for users of scientific/medical display equipment, especially as the failures can easily go unnoticed during normal equipment tests, but would introduce the equivalent of "silent data corruption" into their measured scientific data, which is not a great experience given that collecting such data can easily take half a year of work time and tens of thousands of euros of wasted research funding. Patches 3 and 4 contain changes Daniel asked me to do; patch 5 would be good to safeguard against similar issues in the future. thanks, -mario On 07/12/2016 12:50 PM, Lionel Landwerlin wrote: Hi Mario, There were a couple of patches to fix this issue: https://patchwork.freedesktop.org/series/5467/ https://patchwork.freedesktop.org/series/5466/ I tested this late last week on drm-intel-nightly, it seems a series of reverts fixed most of the issues. Cheers, - Lionel On 12/07/16 11:33, Mario Kleiner wrote: Updating legacy gamma tables, e.g., via RandR doesn't work at all as of Linux 4.7-rc6. 
Reason seems to be that the required call to drm_atomic_helper_commit_planes_on_crtc is skipped in intel_atomic_commit after userspace set new gamma tables, because neither crtc->state->planes_changed nor update_pipe (= pipe_config->update_pipe) are true. Removing the check for planes_changed || update_pipe fixes gamma table updates. The code for Linux 4.8 drm-next has changed a lot in that area wrt. 4.7, but the new code for 4.8 also removed those checks and calls drm_atomic_helper_commit_planes_on_crtc unconditionally, and legacy gamma lut updates work on drm-next, so this seems to be the right solution. Tested also shutdown/reboot, suspend/resume, (un-)plugging displays, mode switches for resolution/refresh rate, display rotation, and page-flipping/pageflip timing on Intel HD Ironlake to confirm the fix apparently doesn't break anything under X11. Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com> Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com> Cc: Daniel Vetter <daniel.vet...@ffwll.ch> --- drivers/gpu/drm/i915/intel_display.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 04452cf..eb8fb36 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct drm_device *dev, bool modeset = needs_modeset(crtc->state); struct intel_crtc_state *pipe_config = to_intel_crtc_state(crtc->state); -bool update_pipe = !modeset && pipe_config->update_pipe; if (modeset && crtc->state->active) { update_scanline_offset(to_intel_crtc(crtc)); @@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct drm_device *dev, drm_atomic_get_existing_plane_state(state, crtc->primary)) intel_fbc_enable(intel_crtc); -if (crtc->state->active && -(crtc->state->planes_changed || update_pipe)) +if (crtc->state->active) 
drm_atomic_helper_commit_planes_on_crtc(old_crtc_state); if (pipe_config->base.active && needs_vblank_wait(pipe_config)) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6
On 07/12/2016 05:02 PM, Lionel Landwerlin wrote: On 12/07/16 13:11, Mario Kleiner wrote: On 07/12/2016 12:50 PM, Lionel Landwerlin wrote: Hi Mario, Hi Lionel, There were a couple of patches to fix this issue: https://patchwork.freedesktop.org/series/5467/ https://patchwork.freedesktop.org/series/5466/ Looking at them they should fix the issue, but they seem to be stuck in review? I tested this late last week on drm-intel-nightly, it seems a series of reverts fixed most of the issues. You mean something else has fixed legacy gamma updates, as i can't find above patches applied on drm-intel-nightly? This revert on drm-intel-nightly seems to have fixed the problem: https://cgit.freedesktop.org/drm-intel/commit/drivers/gpu/drm/i915?id=e42aeef1237b7c969a77b7f726c50f6cb832185f Ok, with that intel-nightly looks like drm-next for 4.8 and that indeed has working lut updates in my testing. My own patch was motivated by the way the implementation is done in intel_atomic_commit_tail() from drm-next. Are those fixes supposed to be already part of 4.7-rc7, the final rc afaik? I haven't seen it on 4.7-rc7. I just checked Linus' tree for 4.7-rc7 and there the code in intel_display.c hasn't received any updates in 13 days and looks like the broken code from rc6, which according to my testing doesn't work. So i'd assume legacy gamma table updates are broken in Linux 4.7 final rc atm. Couldn't test, because for some weird reason 4.7-rc7 doesn't even boot on my laptop :( - However i got that via a quick install from Ubuntu's mainline ppa so it could be some unrelated problem with their ppa builds. I think either my patch would fix it, but is untested wrt. nuclear pageflip, or those two patches you referenced, which apparently didn't move forward. What now? -mario thanks, -mario Cheers, - Lionel On 12/07/16 11:33, Mario Kleiner wrote: Updating legacy gamma tables, e.g., via RandR doesn't work at all as of Linux 4.7-rc6. 
Reason seems to be that the required call to drm_atomic_helper_commit_planes_on_crtc is skipped in intel_atomic_commit after userspace set new gamma tables, because neither crtc->state->planes_changed nor update_pipe (= pipe_config->update_pipe) are true. Removing the check for planes_changed || update_pipe fixes gamma table updates. The code for Linux 4.8 drm-next has changed a lot in that area wrt. 4.7, but the new code for 4.8 also removed those checks and calls drm_atomic_helper_commit_planes_on_crtc unconditionally, and legacy gamma lut updates work on drm-next, so this seems to be the right solution. Tested also shutdown/reboot, suspend/resume, (un-)plugging displays, mode switches for resolution/refresh rate, display rotation, and page-flipping/pageflip timing on Intel HD Ironlake to confirm the fix apparently doesn't break anything under X11. Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com> Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com> Cc: Daniel Vetter <daniel.vet...@ffwll.ch> --- drivers/gpu/drm/i915/intel_display.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 04452cf..eb8fb36 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct drm_device *dev, bool modeset = needs_modeset(crtc->state); struct intel_crtc_state *pipe_config = to_intel_crtc_state(crtc->state); -bool update_pipe = !modeset && pipe_config->update_pipe; if (modeset && crtc->state->active) { update_scanline_offset(to_intel_crtc(crtc)); @@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct drm_device *dev, drm_atomic_get_existing_plane_state(state, crtc->primary)) intel_fbc_enable(intel_crtc); -if (crtc->state->active && -(crtc->state->planes_changed || update_pipe)) +if (crtc->state->active) 
drm_atomic_helper_commit_planes_on_crtc(old_crtc_state); if (pipe_config->base.active && needs_vblank_wait(pipe_config)) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6
On 07/12/2016 12:50 PM, Lionel Landwerlin wrote: Hi Mario, Hi Lionel, There were a couple of patches to fix this issue: https://patchwork.freedesktop.org/series/5467/ https://patchwork.freedesktop.org/series/5466/ Looking at them they should fix the issue, but they seem to be stuck in review? I tested this late last week on drm-intel-nightly, it seems a series of reverts fixed most of the issues. You mean something else has fixed legacy gamma updates, as i can't find above patches applied on drm-intel-nightly? Are those fixes supposed to be already part of 4.7-rc7, the final rc afaik? thanks, -mario Cheers, - Lionel On 12/07/16 11:33, Mario Kleiner wrote: Updating legacy gamma tables, e.g., via RandR doesn't work at all as of Linux 4.7-rc6. Reason seems to be that the required call to drm_atomic_helper_commit_planes_on_crtc is skipped in intel_atomic_commit after userspace set new gamma tables, because neither crtc->state->planes_changed nor update_pipe (= pipe_config->update_pipe) are true. Removing the check for planes_changed || update_pipe fixes gamma table updates. The code for Linux 4.8 drm-next has changed a lot in that area wrt. 4.7, but the new code for 4.8 also removed those checks and calls drm_atomic_helper_commit_planes_on_crtc unconditionally, and legacy gamma lut updates work on drm-next, so this seems to be the right solution. Tested also shutdown/reboot, suspend/resume, (un-)plugging displays, mode switches for resolution/refresh rate, display rotation, and page-flipping/pageflip timing on Intel HD Ironlake to confirm the fix apparently doesn't break anything under X11. 
Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com> Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com> Cc: Daniel Vetter <daniel.vet...@ffwll.ch> --- drivers/gpu/drm/i915/intel_display.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 04452cf..eb8fb36 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct drm_device *dev, bool modeset = needs_modeset(crtc->state); struct intel_crtc_state *pipe_config = to_intel_crtc_state(crtc->state); -bool update_pipe = !modeset && pipe_config->update_pipe; if (modeset && crtc->state->active) { update_scanline_offset(to_intel_crtc(crtc)); @@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct drm_device *dev, drm_atomic_get_existing_plane_state(state, crtc->primary)) intel_fbc_enable(intel_crtc); -if (crtc->state->active && -(crtc->state->planes_changed || update_pipe)) +if (crtc->state->active) drm_atomic_helper_commit_planes_on_crtc(old_crtc_state); if (pipe_config->base.active && needs_vblank_wait(pipe_config)) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Fix legacy gamma lut updates in Linux 4.7-rc6
Updating legacy gamma tables, e.g., via RandR doesn't work at all as of Linux 4.7-rc6. Reason seems to be that the required call to drm_atomic_helper_commit_planes_on_crtc is skipped in intel_atomic_commit after userspace sets new gamma tables, because neither crtc->state->planes_changed nor update_pipe (= pipe_config->update_pipe) is true. Removing the check for planes_changed || update_pipe fixes gamma table updates.

The code for Linux 4.8 drm-next has changed a lot in that area wrt. 4.7, but the new code for 4.8 also removed those checks and calls drm_atomic_helper_commit_planes_on_crtc unconditionally, and legacy gamma lut updates work on drm-next, so this seems to be the right solution.

Tested also shutdown/reboot, suspend/resume, (un-)plugging displays, mode switches for resolution/refresh rate, display rotation, and page-flipping/pageflip timing on Intel HD Ironlake to confirm the fix apparently doesn't break anything under X11.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwer...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 04452cf..eb8fb36 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13685,7 +13685,6 @@ static int intel_atomic_commit(struct drm_device *dev,
 		bool modeset = needs_modeset(crtc->state);
 		struct intel_crtc_state *pipe_config =
 			to_intel_crtc_state(crtc->state);
-		bool update_pipe = !modeset && pipe_config->update_pipe;
 
 		if (modeset && crtc->state->active) {
 			update_scanline_offset(to_intel_crtc(crtc));
@@ -13699,8 +13698,7 @@ static int intel_atomic_commit(struct drm_device *dev,
 		    drm_atomic_get_existing_plane_state(state, crtc->primary))
 			intel_fbc_enable(intel_crtc);
 
-		if (crtc->state->active &&
-		    (crtc->state->planes_changed || update_pipe))
+		if (crtc->state->active)
 			drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
 
 		if (pipe_config->base.active && needs_vblank_wait(pipe_config))
-- 
2.7.0
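The effect of the changed condition can be illustrated in isolation. This is a minimal sketch, not driver code: `struct crtc_flags` and both function names are hypothetical stand-ins for the three state bits the check reads, and `update_pipe` is taken as already computed (the removed line derived it as `!modeset && pipe_config->update_pipe`).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state bits the condition looks at. */
struct crtc_flags {
	bool active;         /* crtc->state->active */
	bool planes_changed; /* crtc->state->planes_changed */
	bool update_pipe;    /* !modeset && pipe_config->update_pipe */
};

/* Condition as in 4.7-rc6, before the patch. */
static bool commits_planes_before(const struct crtc_flags *s)
{
	return s->active && (s->planes_changed || s->update_pipe);
}

/* Condition after the patch: commit whenever the crtc is active. */
static bool commits_planes_after(const struct crtc_flags *s)
{
	return s->active;
}
```

For a gamma-only update on an active crtc (`active` set, the other two flags clear), `commits_planes_before()` is false and `commits_planes_after()` is true, which matches the description of why the plane commit was skipped. An inactive crtc is skipped by both versions.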
Re: [Intel-gfx] Pageflipping bugs in drm-next on at least Ironlake and Ivybridge.
On 07/06/2016 03:05 PM, Chris Wilson wrote: On Wed, Jul 06, 2016 at 12:17:55PM +0200, Mario Kleiner wrote:

Since i pulled the current drm-next tree i see strong flicker and visual corruption during pageflipping, both in my own app, but also in KDE4 and KDE5 Plasma with desktop composition enabled. This happens on both Intel HD Ironlake mobile (Apple MBP 2010) and HD-4000 Ivybridge mobile (Apple macMini 2012). It looks like page flips are not waiting properly for rendering to complete, showing partially rendered frames at flip time. If i revert Daniel's commit that switches legacy pageflips from the old code path to the atomic code, all problems disappear, so apparently the atomic code for Intel is not quite ready at least on those parts?

Exactly right, we've reverted the enabling patch for the time being. Daniel Stone has spotted the likely problem, but we also want to review the handling of state/old_state to see if the same problem has cropped up elsewhere. -Chris

Ah ok, now i see it in drm-intel-next-queued. I'm probably not adding anything new here, but wrt. your crc based tests not catching it: while it happens all the time for me under KDE, in my own fullscreen app it only obviously happens for some tests, the more graphics-heavy ones and not the others, so it is probably (gfx-)load dependent? Maybe the tests don't put enough work onto the gpu to still keep it rendering at flip time. -mario
[Intel-gfx] Legacy gamma table updates broken in 4.7-rc4
A strange one. In Linux 4.7-rc4, at least as built by the Ubuntu mainline ppa, gamma table updates via RandR don't work. No errors are reported and the X-Server thinks everything went well, but on Intel Ironlake and Ivybridge the updates don't have any visual effect. The same problem doesn't happen with current drm-next, so something was fixed. Looking at the new code in intel_color.c i can't see anything obvious that would break it on 4.7-rc but make it work on drm-next? Are there some gamma fixes in drm-next that didn't make it into 4.7-rc yet? Thanks, -mario
[Intel-gfx] Pageflipping bugs in drm-next on at least Ironlake and Ivybridge.
Since i pulled the current drm-next tree i see strong flicker and visual corruption during pageflipping, both in my own app, but also in KDE4 and KDE5 Plasma with desktop composition enabled. This happens on both Intel HD Ironlake mobile (Apple MBP 2010) and HD-4000 Ivybridge mobile (Apple macMini 2012). It looks like page flips are not waiting properly for rendering to complete, showing partially rendered frames at flip time. If i revert Daniel's commit that switches legacy pageflips from the old code path to the atomic code, all problems disappear, so apparently the atomic code for Intel is not quite ready at least on those parts? In case this helps: As i was also testing DRI3/Present + PRIME on the hybrid graphics MBP, if i use the Intel HD as display gpu and the NVidia/nouveau as render offload gpu i don't get any corruption/flicker even with the atomic pageflip code for legacy pageflips. Iow. the path using dmabuf fence wait in intel_prepare_plane_fb works fine. thanks, -mario
Re: [Intel-gfx] [PATCH] drm: use seqlocks for vblank time/count
On 05/18/2016 05:10 PM, Matthew Auld wrote: There's an updated version of this patch already on the ml [1], which I Cc'd you in on. I take it that your @tuebingen.mpg.de is in fact an old email address? [1] https://patchwork.freedesktop.org/patch/86354/ Your patch looks good to me. I'd only keep that one dropped comment line in drmP.h about the vblank counter and ts also needing to be protected by the vblank_time_lock in addition to the seqlock, as this is still needed, especially to get the _irqsave part of spin_lock_irqsave, as the write seqlocks don't do the local irq disable. I'll give it a test later this week. Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com> Indeed the old inactive @tuebingen.mpg.de is only a forward to the gmail address, probably with some botched mail filter rules, so they can go unnoticed quite a while. thanks, -mario
Re: [Intel-gfx] [PATCH] drm: use seqlocks for vblank time/count
On 05/09/2016 08:11 PM, Daniel Vetter wrote: On Mon, May 09, 2016 at 08:16:07PM +0300, Ville Syrjälä wrote: On Mon, May 09, 2016 at 05:08:43PM +0100, Matthew Auld wrote:

This patch aims to replace the roll-your-own seqlock implementation with full-blown seqlocks. We also remove the timestamp ring-buffer in favour of single timestamp/count pair protected by a seqlock. In turn this means we can now increment the vblank freely without the need for clamping. This will also change the behaviour to block new readers while the writer has the lock, whereas the old code would allow readers to proceed in parallel.

We do the whole hw counter + scanout position query while holding the lock so it's not exactly zero amount of work, but I'm not sure that's a real problem. I guess we could reduce the scope of the seqlock, but then maybe we'd need to keep the vblank_time_lock spinlock as well. The details escape me now, so I'd have to re-read the code again. Ccing Mario too.

Yeah, my idea was to keep the spinlock, and only replace the stuff in store_vblank and the few do {} while (cur_vblank != get_vblank_counter) loops. Extending the seqlock stuff to everything seems indeed counter to Mario's locking scheme. So goal would be to really just replace the half-baked seqlock that we have already, and leave all other locking unchanged. -Daniel

+1 to that, for simplicity. I thought Ville already had a patch lying around somewhere which essentially does this? -mario

Cc: Daniel Vetter
Cc: Ville Syrjälä
Signed-off-by: Matthew Auld
---
 drivers/gpu/drm/drm_irq.c | 111 +-
 include/drm/drmP.h        | 14 ++
 2 files changed, 25 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f1..bfc6a8d 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -42,10 +42,6 @@
 #include
 #include
 
-/* Access macro for slots in vblank timestamp ringbuffer. */
-#define vblanktimestamp(dev, pipe, count) \
-	((dev)->vblank[pipe].time[(count) % DRM_VBLANKTIME_RBSIZE])
-
 /* Retry timestamp calculation up to 3 times to satisfy
  * drm_timestamp_precision before giving up.
  */
@@ -82,29 +78,13 @@ static void store_vblank(struct drm_device *dev, unsigned int pipe,
 			 struct timeval *t_vblank, u32 last)
 {
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
-	u32 tslot;
 
-	assert_spin_locked(&dev->vblank_time_lock);
+	assert_spin_locked(&dev->vblank_seqlock.lock);
 
 	vblank->last = last;
 
-	/* All writers hold the spinlock, but readers are serialized by
-	 * the latching of vblank->count below.
-	 */
-	tslot = vblank->count + vblank_count_inc;
-	vblanktimestamp(dev, pipe, tslot) = *t_vblank;
-
-	/*
-	 * vblank timestamp updates are protected on the write side with
-	 * vblank_time_lock, but on the read side done locklessly using a
-	 * sequence-lock on the vblank counter. Ensure correct ordering using
-	 * memory barrriers. We need the barrier both before and also after the
-	 * counter update to synchronize with the next timestamp write.
-	 * The read-side barriers for this are in drm_vblank_count_and_time.
-	 */
-	smp_wmb();
+	vblank->time = *t_vblank;
 	vblank->count += vblank_count_inc;
-	smp_wmb();
 }
 
 /**
@@ -127,7 +107,7 @@ static void drm_reset_vblank_timestamp(struct drm_device *dev, unsigned int pipe
 	struct timeval t_vblank;
 	int count = DRM_TIMESTAMP_MAXRETRIES;
 
-	spin_lock(&dev->vblank_time_lock);
+	write_seqlock(&dev->vblank_seqlock);
 
 	/*
 	 * sample the current counter to avoid random jumps
@@ -152,7 +132,7 @@ static void drm_reset_vblank_timestamp(struct drm_device *dev, unsigned int pipe
 	 */
 	store_vblank(dev, pipe, 1, &t_vblank, cur_vblank);
 
-	spin_unlock(&dev->vblank_time_lock);
+	write_sequnlock(&dev->vblank_seqlock);
 }
 
 /**
@@ -205,7 +185,7 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 		const struct timeval *t_old;
 		u64 diff_ns;
 
-		t_old = &vblanktimestamp(dev, pipe, vblank->count);
+		t_old = &vblank->time;
 		diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old);
 
 		/*
@@ -239,49 +219,6 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 		diff = 1;
 	}
 
-	/*
-	 * FIMXE: Need to replace this hack with proper seqlocks.
-	 *
-	 * Restrict the bump of the software vblank counter to a safe maximum
-	 * value of +1 whenever there is the possibility that concurrent readers
-	 * of vblank timestamps could be active at the moment, as the current
-	 * implementation of the timestamp caching and updating is not safe
-
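For readers unfamiliar with the pattern under discussion, here is a minimal userspace sketch of a sequence lock guarding a single count/timestamp pair. It uses C11 atomics rather than the kernel's `seqlock_t`, and every name in it (`struct vblank_sample`, `sample_write`, `sample_read`) is hypothetical. It assumes writers are already serialized by some other lock, which is the role the spinlock plays in the thread above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical count/timestamp pair guarded by a sequence counter. */
struct vblank_sample {
	atomic_uint seq;         /* even: stable, odd: write in progress */
	_Atomic uint32_t count;  /* software vblank counter */
	_Atomic int64_t time_ns; /* timestamp of the last vblank */
};

/* Writer side. Callers must serialize writers themselves (assumption). */
static void sample_write(struct vblank_sample *s, uint32_t inc, int64_t time_ns)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint32_t count = atomic_load_explicit(&s->count, memory_order_relaxed);

	/* Mark the pair as being updated (odd sequence number). */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&s->time_ns, time_ns, memory_order_relaxed);
	atomic_store_explicit(&s->count, count + inc, memory_order_relaxed);

	/* Publish the update (even sequence number again). */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until count and timestamp come from one update. */
static uint32_t sample_read(struct vblank_sample *s, int64_t *time_ns)
{
	unsigned int start;
	uint32_t count;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		count = atomic_load_explicit(&s->count, memory_order_relaxed);
		*time_ns = atomic_load_explicit(&s->time_ns, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 start != atomic_load_explicit(&s->seq, memory_order_relaxed));

	return count;
}
```

Because a reader retries whenever the sequence number was odd or changed, the writer can bump the counter by any amount in one step, which is the "increment freely without clamping" point from the patch description. Note that the sketch does nothing about interrupts; in the kernel that concern is what the `_irqsave` variant of the spinlock addresses.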
Re: [Intel-gfx] [PATCH v2 01/21] drm/core: Add drm_accurate_vblank_count, v5.
I'm fine with it. I assume the function will only be used by kms drivers, whose writers probably know when it is safe to call the function, ie. what kind of potential quirks the kms drivers timestamping implementation has. Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com> On 05/17/2016 03:07 PM, Maarten Lankhorst wrote: This function is useful for gen2 intel devices which have no frame counter, but need a way to determine the current vblank count without racing with the vblank interrupt handler. intel_pipe_update_start checks if no vblank interrupt will occur during vblank evasion, but cannot check whether the vblank handler has run to completion. This function uses the timestamps to determine when the last vblank has happened, and interpolates from there. Changes since v1: - Take vblank_time_lock and don't use drm_vblank_count_and_time. Changes since v2: - Don't return time of last vblank. Changes since v3: - Change pipe to unsigned int. (Ville) - Remove unused documentation for tv_ret. (kbuild) Changes since v4: - Add warning to docs when the function is useful. - Add a WARN_ON when get_vblank_timestamp is unavailable. - Use drm_vblank_count. 
Cc: Mario Kleiner <mario.kleiner...@gmail.com>
Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4
Acked-by: David Airlie <airl...@linux.ie> #irc, v4
---
 drivers/gpu/drm/drm_irq.c | 31 +++
 include/drm/drmP.h        | 1 +
 2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..d3124b67f4a5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 	store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
 }
 
+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ *
+ * This is mostly useful for hardware that can obtain the scanout
+ * position, but doesn't have a frame counter.
+ */
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	unsigned int pipe = drm_crtc_index(crtc);
+	u32 vblank;
+	unsigned long flags;
+
+	WARN(!dev->driver->get_vblank_timestamp,
+	     "This function requires support for accurate vblank timestamps.");
+
+	spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+	drm_update_vblank_count(dev, pipe, 0);
+	vblank = drm_vblank_count(dev, pipe);
+
+	spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+	return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
 /*
  * Disable vblank irq's on crtc, make sure that last vblank count
  * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 360b2a74e1ef..ed890384b938 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1002,6 +1002,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
 extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
 extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
 extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
 extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);
 
 extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,
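The commit message says the function "uses the timestamps to determine when the last vblank has happened, and interpolates from there". The arithmetic behind that idea can be sketched as follows. This is an illustration under assumptions, not the kernel implementation: both helper names are hypothetical, timestamps are plain nanosecond integers, and rounding to the nearest whole frame is an assumed policy for absorbing timestamp jitter.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of whole frames between the stored vblank
 * timestamp and the most recent one, rounded to the nearest frame.
 */
static uint32_t vblanks_elapsed(int64_t stored_ns, int64_t latest_ns,
				int64_t frame_ns)
{
	if (frame_ns <= 0 || latest_ns <= stored_ns)
		return 0;

	return (uint32_t)((latest_ns - stored_ns + frame_ns / 2) / frame_ns);
}

/* Hypothetical helper: stored software counter plus the missed vblanks. */
static uint32_t interpolated_count(uint32_t stored_count, int64_t stored_ns,
				   int64_t latest_ns, int64_t frame_ns)
{
	return stored_count + vblanks_elapsed(stored_ns, latest_ns, frame_ns);
}
```

With a 60 Hz frame duration of 16666667 ns, a stored count of 100 at time 0 and a latest vblank timestamp one frame later give 101, even if the interrupt handler has not yet run to bump the stored count. If the latest timestamp equals the stored one, the count is returned unchanged.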
Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v5.
Anyway, although i would have liked the stricter check and warning docs, the v4 patch is ok with me: Reviewed-by: Mario Kleiner <mario.kleiner...@gmail.com> -mario On 04/25/2016 08:32 AM, Maarten Lankhorst wrote: This function is useful for gen2 intel devices which have no frame counter, but need a way to determine the current vblank count without racing with the vblank interrupt handler. intel_pipe_update_start checks if no vblank interrupt will occur during vblank evasion, but cannot check whether the vblank handler has run to completion. This function uses the timestamps to determine when the last vblank has happened, and interpolates from there. Changes since v1: - Take vblank_time_lock and don't use drm_vblank_count_and_time. Changes since v2: - Don't return time of last vblank. Changes since v3: - Change pipe to unsigned int. (Ville) - Remove unused documentation for tv_ret. (kbuild) Changes since v4: - Add warning to docs when the function is useful. - Add a WARN_ON when get_vblank_timestamp is unavailable. - Use drm_vblank_count. Cc: Mario Kleiner <mario.kleiner...@gmail.com> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4 Acked-by: David Airlie <airl...@linux.ie> #irc, v4 --- Unfortunately WARN_ON(!dev->disable_vblank_immediate) doesn't work on gen2, which is the reason this function is created. So I used WARN_ON(!get_vblank_timestamp) instead. 
 drivers/gpu/drm/drm_irq.c | 31 +++
 include/drm/drmP.h        | 1 +
 2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3c1a6f18e71c..d3124b67f4a5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 	store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
 }
 
+/**
+ * drm_accurate_vblank_count - retrieve the master vblank counter
+ * @crtc: which counter to retrieve
+ *
+ * This function is similar to @drm_crtc_vblank_count but this
+ * function interpolates to handle a race with vblank irq's.
+ *
+ * This is mostly useful for hardware that can obtain the scanout
+ * position, but doesn't have a frame counter.
+ */
+u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	unsigned int pipe = drm_crtc_index(crtc);
+	u32 vblank;
+	unsigned long flags;
+
+	WARN(!dev->driver->get_vblank_timestamp,
+	     "This function requires support for accurate vblank timestamps.");
+
+	spin_lock_irqsave(&dev->vblank_time_lock, flags);
+
+	drm_update_vblank_count(dev, pipe, 0);
+	vblank = drm_vblank_count(dev, pipe);
+
+	spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
+
+	return vblank;
+}
+EXPORT_SYMBOL(drm_accurate_vblank_count);
+
 /*
  * Disable vblank irq's on crtc, make sure that last vblank count
  * of hardware and corresponding consistent software vblank counter
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 005202ea5900..90527c41cd5a 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
 extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
 extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
 extern void drm_vblank_cleanup(struct drm_device *dev);
+extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
 extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);
 
 extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,
Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v5.
On 04/25/2016 08:32 AM, Maarten Lankhorst wrote:
> This function is useful for gen2 intel devices which have no frame
> counter, but need a way to determine the current vblank count without
> racing with the vblank interrupt handler.
>
> intel_pipe_update_start checks if no vblank interrupt will occur
> during vblank evasion, but cannot check whether the vblank handler has
> run to completion. This function uses the timestamps to determine
> when the last vblank has happened, and interpolates from there.
>
> Changes since v1:
> - Take vblank_time_lock and don't use drm_vblank_count_and_time.
> Changes since v2:
> - Don't return time of last vblank.
> Changes since v3:
> - Change pipe to unsigned int. (Ville)
> - Remove unused documentation for tv_ret. (kbuild)
> Changes since v4:
> - Add warning to docs when the function is useful.
> - Add a WARN_ON when get_vblank_timestamp is unavailable.
> - Use drm_vblank_count.
>
> Cc: Mario Kleiner <mario.kleiner...@gmail.com>
> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
> Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com> #v4
> Acked-by: David Airlie <airl...@linux.ie> #irc, v4
> ---
> Unfortunately WARN_ON(!dev->vblank_disable_immediate) doesn't work on
> gen2, which is the reason this function is created. So I used
> WARN_ON(!get_vblank_timestamp) instead.

That's a weaker warning. I'd like to have the WARN_ON and the doc text
to be more frightening/restrictive to discourage abuse.

But can't you simply remove that !IS_GEN2 check now and always set
dev->vblank_disable_immediate = true?

The reason for that exception was that GEN2 doesn't have a hw vblank
counter. But it has scanout pos based vblank timestamping, which I'd
assume is well behaved. With the new scanout based vblank counter
emulation in drm_update_vblank_count() since around Linux 4.4 you
therefore essentially have a proper emulated vblank counter, so this
should be safe. Ville will probably know.

Otherwise you couldn't trust drm_accurate_vblank_count() here, because
it depends on the same logic, no?

-mario

>  drivers/gpu/drm/drm_irq.c | 31 +++
>  include/drm/drmP.h        |  1 +
>  2 files changed, 32 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index 3c1a6f18e71c..d3124b67f4a5 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -303,6 +303,37 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
>  	store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
>  }
>  
> +/**
> + * drm_accurate_vblank_count - retrieve the master vblank counter
> + * @crtc: which counter to retrieve
> + *
> + * This function is similar to @drm_crtc_vblank_count but this
> + * function interpolates to handle a race with vblank irq's.
> + *
> + * This is mostly useful for hardware that can obtain the scanout
> + * position, but doesn't have a frame counter.
> + */
> +u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
> +{
> +	struct drm_device *dev = crtc->dev;
> +	unsigned int pipe = drm_crtc_index(crtc);
> +	u32 vblank;
> +	unsigned long flags;
> +
> +	WARN(!dev->driver->get_vblank_timestamp,
> +	     "This function requires support for accurate vblank timestamps.");
> +
> +	spin_lock_irqsave(&dev->vblank_time_lock, flags);
> +
> +	drm_update_vblank_count(dev, pipe, 0);
> +	vblank = drm_vblank_count(dev, pipe);
> +
> +	spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
> +
> +	return vblank;
> +}
> +EXPORT_SYMBOL(drm_accurate_vblank_count);
> +
>  /*
>   * Disable vblank irq's on crtc, make sure that last vblank count
>   * of hardware and corresponding consistent software vblank counter
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index 005202ea5900..90527c41cd5a 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
>  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
>  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
>  extern void drm_vblank_cleanup(struct drm_device *dev);
> +extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
>  extern u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe);
>  extern int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
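The commit message above describes deriving the current vblank count from timestamps instead of relying on the interrupt handler having already run. A minimal userspace sketch of that idea follows. It is not the kernel implementation: the struct, the function name and the round-to-nearest step are assumptions made for illustration, and the locking that the real function does under vblank_time_lock is left out.

```c
#include <stdint.h>

/* Hypothetical, simplified per-CRTC bookkeeping. These names are made up
 * for illustration and are not kernel structures. */
struct toy_vblank_state {
	uint32_t count;     /* software vblank counter */
	uint64_t last_ns;   /* timestamp of the vblank that 'count' belongs to */
	uint64_t frame_ns;  /* duration of one refresh cycle */
};

/*
 * Bring the counter up to date from a timestamp alone.
 * 'vblank_ns' stands for the start time of the most recent vblank, as a
 * driver with scanout-position based timestamping could compute it.
 */
uint32_t toy_accurate_count(struct toy_vblank_state *s, uint64_t vblank_ns)
{
	uint64_t elapsed, frames;

	if (s->frame_ns == 0 || vblank_ns <= s->last_ns)
		return s->count;        /* nothing new to account for */

	elapsed = vblank_ns - s->last_ns;
	/* Round to the nearest whole frame to absorb timestamp jitter. */
	frames = (elapsed + s->frame_ns / 2) / s->frame_ns;

	if (frames > 0) {
		s->count += (uint32_t)frames;
		s->last_ns = vblank_ns;
	}
	return s->count;
}
```

Calling the function twice with the same timestamp returns the same count, which is the property the caller needs: the answer does not depend on whether the vblank handler has already advanced the counter.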
Re: [Intel-gfx] [PATCH 01/19] drm/core: Add drm_accurate_vblank_count, v4.
Sorry for the late review, but see below...

On 04/19/2016 09:52 AM, Maarten Lankhorst wrote:
> This function is useful for gen2 intel devices which have no frame
> counter, but need a way to determine the current vblank count without
> racing with the vblank interrupt handler.
>
> intel_pipe_update_start checks if no vblank interrupt will occur
> during vblank evasion, but cannot check whether the vblank handler has
> run to completion. This function uses the timestamps to determine
> when the last vblank has happened, and interpolates from there.
>
> Changes since v1:
> - Take vblank_time_lock and don't use drm_vblank_count_and_time.
> Changes since v2:
> - Don't return time of last vblank.
> Changes since v3:
> - Change pipe to unsigned int. (Ville)
> - Remove unused documentation for tv_ret. (kbuild)
>
> Cc: Mario Kleiner <mario.kleiner...@gmail.com>
> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Signed-off-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
> Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Acked-by: David Airlie <airl...@linux.ie> #irc
> ---
>  drivers/gpu/drm/drm_irq.c | 26 ++
>  include/drm/drmP.h        |  1 +
>  2 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index 3c1a6f18e71c..f1bda13562da 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -303,6 +303,32 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
>  	store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
>  }
>  
> +/**
> + * drm_accurate_vblank_count - retrieve the master vblank counter
> + * @crtc: which counter to retrieve
> + *
> + * This function is similar to @drm_crtc_vblank_count but this
> + * function interpolates to handle a race with vblank irq's.
> + */
> +
> +u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
> +{
> +	struct drm_device *dev = crtc->dev;
> +	unsigned int pipe = drm_crtc_index(crtc);
> +	u32 vblank;
> +	unsigned long flags;
> +

This function is rather dangerous to use on any driver that doesn't
have precise vblank timestamping, or doesn't have the guarantee that hw
vblank counters (if there are any) and timestamps update exactly at the
leading edge of vblank, so I think we need some WARN() here and maybe
much less encouraging docs to avoid this being called from incapable
kms drivers or in general code.

- If the driver doesn't have precise scanoutpos based timestamping,
  each call into drm_update_vblank_count() from non-irq context will
  reset the vblank timestamps to zero, so clients will only receive
  invalid timestamps if this is frequently used. Also bogus vblank
  counts.

  Atm. only i915 Intel, AMD, NVidia desktop for >= NV-50, maybe nouveau
  driven Tegra parts, and some modern Adrenos (msm/mdp-5 - I assume
  from the code?) support this reliably.

- If the driver's scanoutpos timestamps and/or vblank counter don't
  increment at the leading edge we will get funny off-by-one problems
  with vblank counters. That's why we normally only call
  drm_update_vblank_count() from vblank irq on such parts - the only
  safe place to avoid off-by-one problems - and limit vblank
  disable/enable to at most once every 5 seconds to reduce the problems
  caused by off-by-one errors.

  Which restricts the list to only the above parts, maybe minus Adreno
  where I don't know if it obeys the "leading edge" rule or not.

So on most SoC's one must not use this function.

WARN(!dev->vblank_disable_immediate, "This function is unsafe on this driver.");

would probably prevent the worst abuse, unless drivers lie about
vblank_disable_immediate. Not sure how much this was checked for msm /
Adreno? At least drm_vblank_init() only allows vblank_disable_immediate
if the driver at least implements proper timestamping.

Not sure how much general use this function will have outside Intel
gen-2 with the restrictions on safe use?

> +	spin_lock_irqsave(&dev->vblank_time_lock, flags);
> +
> +	drm_update_vblank_count(dev, pipe, 0);
> +	vblank = dev->vblank[pipe].count;

Could do vblank = drm_vblank_count(dev, pipe); instead, given that we
avoid open coding this in most places.

-mario

> +
> +	spin_unlock_irqrestore(&dev->vblank_time_lock, flags);
> +
> +	return vblank;
> +}
> +EXPORT_SYMBOL(drm_accurate_vblank_count);
> +
>  /*
>   * Disable vblank irq's on crtc, make sure that last vblank count
>   * of hardware and corresponding consistent software vblank counter
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index 005202ea5900..90527c41cd5a 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -995,6 +995,7 @@ extern void drm_crtc_vblank_off(struct drm_crtc *crtc);
>  extern void drm_crtc_vblank_reset(struct drm_crtc *crtc);
>  extern void drm_crtc_vblank_on(struct drm_crtc *crtc);
>  extern void drm_vblank_cleanup(struct drm_device *dev);
> +extern u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
>  extern u32 drm_vblank_no_h
Re: [Intel-gfx] [PATCH] drm/i915: Only dither on 6bpc panels
Thanks for the quick fix! Comments below...

On 08/12/2015 11:43 AM, Daniel Vetter wrote:
> In
>
> commit d328c9d78d64ca11e744fe227096990430a88477
> Author: Daniel Vetter <daniel.vet...@ffwll.ch>
> Date:   Fri Apr 10 16:22:37 2015 +0200
>
>     drm/i915: Select starting pipe bpp irrespective or the primary plane
>
> we started to select the pipe bpp from sink capabilities and not from
> the primary framebuffer - that one might change (and we don't want to
> incur a modeset) and sprites might contain higher bpp content too.
>
> Problem is that now if you have a 10bpc screen and display 24bpp rgb
> primary then we select dithering, and apparently that mangles the high
> 8 bits even (even though you'd expect dithering only to affect how
> 12bpc gets mapped into 10bpc). And that mangling upsets certain users.

Probably doesn't matter, but your explanation of the former problem
here is slightly off. We also selected dithering on an 8 bpc screen
displaying a 24bpp rgb primary, because pipe_bpp is 24 for such a
typical 8 bpc sink, but since the commit mentioned above, base_bpp is
always the absolute maximum supported by the hardware, e.g., 36 bpp on
my Ironlake chip. Iow. the only way to not get dithering would have
been to connect a deep color 12 bpc display, so pipe_bpp == 36 ==
base_bpp.

> Hence only enable dithering on 6bpc screens where we definitely and
> always want it.

Other than that, I tested the patch on both 8 bpc output with my
measurement equipment and on the internal laptop 6 bpc panel, and
everything is fine now - no banding on the 6 bpc panel, no banding or
equipment failure on the external 8 bpc output. Life is good again :)

Reviewed-and-tested-by: Mario Kleiner <mario.kleiner...@gmail.com>

thanks,
-mario

> Cc: Mario Kleiner <mario.kleiner...@gmail.com>
> Reported-by: Mario Kleiner <mario.kleiner...@gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vet...@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 9a2f229a1c3a..128462e0a0b5 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -12186,7 +12186,9 @@ encoder_retry:
>  		goto encoder_retry;
>  	}
>  
> -	pipe_config->dither = pipe_config->pipe_bpp != base_bpp;
> +	/* Dithering seems to not pass-through bits correctly when it should, so
> +	 * only enable it on 6bpc panels. */
> +	pipe_config->dither = pipe_config->pipe_bpp == 6*3;
>  	DRM_DEBUG_KMS("plane bpp: %i, pipe bpp: %i, dithering: %i\n",
>  		      base_bpp, pipe_config->pipe_bpp, pipe_config->dither);
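The effect of the one-line change can be checked with the numbers quoted in this thread (base_bpp of 36 on Ironlake, pipe_bpp of 24 for an 8 bpc sink, 18 for a 6 bpc panel). The sketch below just restates the two conditions from the diff as standalone functions; the function names are made up for illustration and do not exist in i915.

```c
#include <stdbool.h>

/* Rule before the patch: dither whenever pipe bpp and base bpp differ. */
bool toy_dither_old(int pipe_bpp, int base_bpp)
{
	return pipe_bpp != base_bpp;
}

/* Rule after the patch: dither only on 6 bpc panels, i.e. an 18 bpp pipe. */
bool toy_dither_new(int pipe_bpp)
{
	return pipe_bpp == 6 * 3;
}
```

With pipe_bpp = 24 and base_bpp = 36 the old rule enables dithering and the new rule does not, which matches the behaviour reported for the 8 bpc output. A 6 bpc panel (pipe_bpp = 18) is dithered under both rules.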
Re: [Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.
On 08/07/2015 09:14 AM, Daniel Vetter wrote: On Fri, Aug 07, 2015 at 12:45:52AM +0200, Mario Kleiner wrote: On 08/07/2015 12:12 AM, Daniel Vetter wrote: On Thu, Aug 6, 2015 at 11:56 PM, Mario Kleiner mario.kleiner...@gmail.com wrote: Hi Daniel and all, since Linux 4.2 (tested with rc4), i think this commit d328c9d78d64ca11e744fe227096990430a88477 drm/i915: Select starting pipe bpp irrespective or the primary plane causes trouble for me and my users, as tested on Intel HD Ironlake and Ivy Bridge with MiniDP-Singlelink-DVI adapter - Measurement device. Afaics it causes dithering to always be enabled on a regular 8bpc framebuffer, even when outputting to a 8 bpc DVI-D output, and that dithering causes my display measurement equipment and other special display devices used for neuro-science and medical applications to fail. This equipment requires an identity passthrough of 8 bpc framebuffer pixels to the digital outputs, iow. dithering off. Log output on Linux 4.1 (good): Aug 1 06:39:26 twisty kernel: [ 154.175394] [drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for sink bpp constrains Aug 1 06:39:26 twisty kernel: [ 154.175396] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output Aug 1 06:39:26 twisty kernel: [ 154.175397] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI Aug 1 06:39:26 twisty kernel: [ 154.175400] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1 Aug 1 06:39:26 twisty kernel: [ 154.175402] [drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0 Aug 1 06:39:26 twisty kernel: [ 154.175403] [drm:intel_dump_pipe_config] [CRTC:20][modeset] config for pipe A Aug 1 06:39:26 twisty kernel: [ 154.175404] [drm:intel_dump_pipe_config] cpu_transcoder: A Aug 1 06:39:26 twisty kernel: [ 154.175405] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 0 Log output on Linux 4.2-rc4 (bad): Aug 1 06:21:31 twisty kernel: [ 200.924831] [drm:connected_sink_compute_bpp] 
[CONNECTOR:36:HDMI-A-1] checking for sink bpp constrains Aug 1 06:21:31 twisty kernel: [ 200.924832] [drm:connected_sink_compute_bpp] clamping display bpp (was 36) to default limit of 24 Aug 1 06:21:31 twisty kernel: [ 200.924834] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output Aug 1 06:21:31 twisty kernel: [ 200.924835] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI Aug 1 06:21:31 twisty kernel: [ 200.924838] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1 Aug 1 06:21:31 twisty kernel: [ 200.924840] [drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1 Aug 1 06:21:31 twisty kernel: [ 200.924841] [drm:intel_dump_pipe_config] [CRTC:21][modeset] config 880131a5c800 for pipe A Aug 1 06:21:31 twisty kernel: [ 200.924842] [drm:intel_dump_pipe_config] cpu_transcoder: A Aug 1 06:21:31 twisty kernel: [ 200.924843] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 1 Ideas what to do about this? Well I somehow assumed the dither bit would be sane and not wreak havoc with the lower bits when they would fit into the final bpc pipe mode ... Can you confirm with your equipment that we seem to be doing 8bpc-6bpc dithering on the 8bpc sink? It will need a bit of work to find this out when i'm back in the lab. So far i just know something bad is happening to the signal and i assume it's the dithering, because the visual error pattern of messiness looks like that caused by dithering. E.g., on a static framebuffer i see some repeating pattern over the screen, but the pattern changes with every OpenGL bufferswap, even if i swap to the same fb content, as if the swap triggers some change of the spatial dither pattern (assuming PIPECONF_DITHER_TYPE_SP = spatial dithering?) If that's the case we simply limit to only ever dither when the sink is 6bpc, and not in any other case. -Daniel That would be an improvement for my immediate problem if that works. 
>> But assuming we have 10 bpc framebuffers at some point, dithering
>> 10 bpc -> 8 bpc would also have some practical use. Probably some
>> dynamic check would be good, a la if there is a mismatch between the
>> max(bpc) over all active planes and the supported depth of the sink
>> then dither?
>>
>> It's not clear to me where the dithering happens on intel hw. I'd
>> expected that with a 24 bpp framebuffer feeding into a 24 bpp pipe,
>> dithering simply wouldn't do anything even if enabled.
>
> Yeah my assumption was that if you run the pipe at a given bpc it will
> just pass through anything that fits and only dither the additional
> bits. But obviously that's not how the hardware works ...
>
> The problem with adaptive schemes is that we have multiple planes
> nowadays and they might all run at different depths. And dither seems
> to be happening at the pipe/overall level (at least there's only one
> bit). Of course this wouldn't be a problem if the thing wouldn't
> mangle bits which should pass!
>
> Anyway if we can confirm this I think
[Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.
Hi Daniel and all,

since Linux 4.2 (tested with rc4), I think this commit

d328c9d78d64ca11e744fe227096990430a88477
"drm/i915: Select starting pipe bpp irrespective or the primary plane"

causes trouble for me and my users, as tested on Intel HD Ironlake and
Ivy Bridge with MiniDP-Singlelink-DVI adapter -> Measurement device.

Afaics it causes dithering to always be enabled on a regular 8bpc
framebuffer, even when outputting to an 8 bpc DVI-D output, and that
dithering causes my display measurement equipment and other special
display devices used for neuro-science and medical applications to
fail. This equipment requires an identity passthrough of 8 bpc
framebuffer pixels to the digital outputs, iow. dithering off.

Log output on Linux 4.1 (good):

Aug 1 06:39:26 twisty kernel: [ 154.175394] [drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for sink bpp constrains
Aug 1 06:39:26 twisty kernel: [ 154.175396] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug 1 06:39:26 twisty kernel: [ 154.175397] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug 1 06:39:26 twisty kernel: [ 154.175400] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1
Aug 1 06:39:26 twisty kernel: [ 154.175402] [drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0
Aug 1 06:39:26 twisty kernel: [ 154.175403] [drm:intel_dump_pipe_config] [CRTC:20][modeset] config for pipe A
Aug 1 06:39:26 twisty kernel: [ 154.175404] [drm:intel_dump_pipe_config] cpu_transcoder: A
Aug 1 06:39:26 twisty kernel: [ 154.175405] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 0

Log output on Linux 4.2-rc4 (bad):

Aug 1 06:21:31 twisty kernel: [ 200.924831] [drm:connected_sink_compute_bpp] [CONNECTOR:36:HDMI-A-1] checking for sink bpp constrains
Aug 1 06:21:31 twisty kernel: [ 200.924832] [drm:connected_sink_compute_bpp] clamping display bpp (was 36) to default limit of 24
Aug 1 06:21:31 twisty kernel: [ 200.924834] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output
Aug 1 06:21:31 twisty kernel: [ 200.924835] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI
Aug 1 06:21:31 twisty kernel: [ 200.924838] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1
Aug 1 06:21:31 twisty kernel: [ 200.924840] [drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1
Aug 1 06:21:31 twisty kernel: [ 200.924841] [drm:intel_dump_pipe_config] [CRTC:21][modeset] config 880131a5c800 for pipe A
Aug 1 06:21:31 twisty kernel: [ 200.924842] [drm:intel_dump_pipe_config] cpu_transcoder: A
Aug 1 06:21:31 twisty kernel: [ 200.924843] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 1

Ideas what to do about this?

thanks,
-mario
Re: [Intel-gfx] Intel-kms in Linux-4.2rc causes regression due to dithering always on.
On 08/07/2015 12:12 AM, Daniel Vetter wrote: On Thu, Aug 6, 2015 at 11:56 PM, Mario Kleiner mario.kleiner...@gmail.com wrote: Hi Daniel and all, since Linux 4.2 (tested with rc4), i think this commit d328c9d78d64ca11e744fe227096990430a88477 drm/i915: Select starting pipe bpp irrespective or the primary plane causes trouble for me and my users, as tested on Intel HD Ironlake and Ivy Bridge with MiniDP-Singlelink-DVI adapter - Measurement device. Afaics it causes dithering to always be enabled on a regular 8bpc framebuffer, even when outputting to a 8 bpc DVI-D output, and that dithering causes my display measurement equipment and other special display devices used for neuro-science and medical applications to fail. This equipment requires an identity passthrough of 8 bpc framebuffer pixels to the digital outputs, iow. dithering off. Log output on Linux 4.1 (good): Aug 1 06:39:26 twisty kernel: [ 154.175394] [drm:connected_sink_compute_bpp] [CONNECTOR:35:HDMI-A-1] checking for sink bpp constrains Aug 1 06:39:26 twisty kernel: [ 154.175396] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output Aug 1 06:39:26 twisty kernel: [ 154.175397] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI Aug 1 06:39:26 twisty kernel: [ 154.175400] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1 Aug 1 06:39:26 twisty kernel: [ 154.175402] [drm:intel_modeset_pipe_config] plane bpp: 24, pipe bpp: 24, dithering: 0 Aug 1 06:39:26 twisty kernel: [ 154.175403] [drm:intel_dump_pipe_config] [CRTC:20][modeset] config for pipe A Aug 1 06:39:26 twisty kernel: [ 154.175404] [drm:intel_dump_pipe_config] cpu_transcoder: A Aug 1 06:39:26 twisty kernel: [ 154.175405] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 0 Log output on Linux 4.2-rc4 (bad): Aug 1 06:21:31 twisty kernel: [ 200.924831] [drm:connected_sink_compute_bpp] [CONNECTOR:36:HDMI-A-1] checking for sink bpp constrains Aug 1 06:21:31 twisty kernel: [ 200.924832] 
[drm:connected_sink_compute_bpp] clamping display bpp (was 36) to default limit of 24 Aug 1 06:21:31 twisty kernel: [ 200.924834] [drm:intel_hdmi_compute_config] picking bpc to 8 for HDMI output Aug 1 06:21:31 twisty kernel: [ 200.924835] [drm:intel_hdmi_compute_config] forcing pipe bpc to 24 for HDMI Aug 1 06:21:31 twisty kernel: [ 200.924838] [drm:ironlake_check_fdi_lanes] checking fdi config on pipe A, lanes 1 Aug 1 06:21:31 twisty kernel: [ 200.924840] [drm:intel_modeset_pipe_config] plane bpp: 36, pipe bpp: 24, dithering: 1 Aug 1 06:21:31 twisty kernel: [ 200.924841] [drm:intel_dump_pipe_config] [CRTC:21][modeset] config 880131a5c800 for pipe A Aug 1 06:21:31 twisty kernel: [ 200.924842] [drm:intel_dump_pipe_config] cpu_transcoder: A Aug 1 06:21:31 twisty kernel: [ 200.924843] [drm:intel_dump_pipe_config] pipe bpp: 24, dithering: 1 Ideas what to do about this? Well I somehow assumed the dither bit would be sane and not wreak havoc with the lower bits when they would fit into the final bpc pipe mode ... Can you confirm with your equipment that we seem to be doing 8bpc-6bpc dithering on the 8bpc sink? It will need a bit of work to find this out when i'm back in the lab. So far i just know something bad is happening to the signal and i assume it's the dithering, because the visual error pattern of messiness looks like that caused by dithering. E.g., on a static framebuffer i see some repeating pattern over the screen, but the pattern changes with every OpenGL bufferswap, even if i swap to the same fb content, as if the swap triggers some change of the spatial dither pattern (assuming PIPECONF_DITHER_TYPE_SP = spatial dithering?) If that's the case we simply limit to only ever dither when the sink is 6bpc, and not in any other case. -Daniel That would be an improvement for my immediate problem if that works. But assuming we have 10 bpc framebuffers at some point, dithering 10 bpc - 8 bpc would also have some practical use. 
Probably some dynamic check would be good, along the lines of: if there
is a mismatch between the max(bpc) over all active planes and the
supported depth of the sink, then dither?

It's not clear to me where the dithering happens on Intel hw. I'd
expected that with a 24 bpp framebuffer feeding into a 24 bpp pipe,
dithering simply wouldn't do anything even if enabled.

-mario
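The dynamic check proposed above could look roughly like the following sketch. Everything here is hypothetical: the struct and function names are not i915 code, and "mismatch" is read as "some active plane carries more bits per channel than the sink accepts", which is an assumption about what was meant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical plane description; not an i915 structure. */
struct toy_plane_info {
	bool active;
	int bpc;    /* bits per colour channel of the plane's framebuffer */
};

/* Highest bpc among the active planes, or 0 if none is active. */
int toy_max_active_bpc(const struct toy_plane_info *planes, size_t n)
{
	int max_bpc = 0;

	for (size_t i = 0; i < n; i++) {
		if (planes[i].active && planes[i].bpc > max_bpc)
			max_bpc = planes[i].bpc;
	}
	return max_bpc;
}

/* Dither only when some active plane has more depth than the sink. */
bool toy_want_dither(const struct toy_plane_info *planes, size_t n,
		     int sink_bpc)
{
	return toy_max_active_bpc(planes, n) > sink_bpc;
}
```

Under this rule an 8 bpc framebuffer on an 8 bpc sink is passed through without dithering, while a 10 bpc plane on the same sink, or an 8 bpc plane on a 6 bpc panel, turns dithering on. The sketch says nothing about whether the hardware leaves bits alone when dithering is enabled.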
Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently
On 05/29/2015 07:35 PM, Daniel Vetter wrote: On Fri, May 29, 2015 at 07:23:35PM +0200, Mario Kleiner wrote: On 05/29/2015 07:19 PM, Daniel Vetter wrote: On Fri, May 29, 2015 at 06:50:06PM +0200, Mario Kleiner wrote: On 05/27/2015 11:04 AM, Daniel Vetter wrote: In commit 9cba5efab5a8145ae6c52ea273553f069c294482 Author: Mario Kleiner mario.kleiner...@gmail.com Date: Tue Jul 29 02:36:44 2014 +0200 drm/nouveau: Dis/Enable vblank irqs during suspend/resume drm_vblank_on/off calls where added around suspend/resume to make sure vblank stay doesn't go boom over that transition. But nouveau already used drm_vblank_pre/post_modeset over modesets. Instead use drm_vblank_on/off everyhwere. The slight change here is that after _off drm_vblank_get will refuse to work right away, but nouveau doesn't seem to depend upon that anywhere outside of the pageflip paths. The longer-term plan here is to switch all kms drivers to drm_vblank_on/off so that common code like pending event cleanup can be done there, while drm_vblank_pre/post_modeset will be purely drm internal for the old UMS ioctl. Note that the drm_vblank_off still seems required in the suspend path since nouveau doesn't explicitly disable crtcs. But on the resume side drm_helper_resume_force_mode should end up calling drm_vblank_on through the nouveau crtc hooks already. Hence remove the call in the resume code. 
Cc: Mario Kleiner mario.kleiner...@gmail.com Cc: Ben Skeggs bske...@redhat.com Signed-off-by: Daniel Vetter daniel.vet...@intel.com --- drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++-- drivers/gpu/drm/nouveau/nouveau_display.c | 4 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c index 3d96b49fe662..dab24066fa21 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c @@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc) if (nv_two_heads(dev)) NVSetOwner(dev, nv_crtc-index); - drm_vblank_pre_modeset(dev, nv_crtc-index); + drm_vblank_off(dev, nv_crtc-index); funcs-dpms(crtc, DRM_MODE_DPMS_OFF); NVBlankScreen(dev, nv_crtc-index, true); @@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc) #endif funcs-dpms(crtc, DRM_MODE_DPMS_ON); - drm_vblank_post_modeset(dev, nv_crtc-index); + drm_vblank_on(dev, nv_crtc-index); } The above hunk is probably correct, but i couldn't test it without sufficiently old pre-nv 50 hardware. static void nv_crtc_destroy(struct drm_crtc *crtc) diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 8670d90cdc11..d824023f9fc6 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool runtime) nv_crtc-lut.depth = 0; } - /* Make sure that drm and hw vblank irqs get resumed if needed. */ - for (head = 0; head dev-mode_config.num_crtc; head++) - drm_vblank_on(dev, head); - /* This should ensure we don't hit a locking problem when someone * wakes us up via a connector. We should never go into suspend * while the display is on anyways. Tested this one and this hunk breaks suspend/resume. After a suspend/resume cycle, all OpenGL apps and composited desktop are dead, as the core can't get any vblank irq's enabled anymore. 
>>>> So the drm_vblank_on() is still needed here.
>>>
>>> Hm that's very surprising. As mentioned above the force_mode_restore
>>> should be calling nv_crtc_prepare already and fix this all up for us.
>>> I guess I need to dig out my nv card and trace what's really going on
>>> here.
>>>
>>> Enabling interrupts when the crtc is off isn't a good idea.
>>> -Daniel
>>
>> I think the nv_crtc_prepare() path modified in your first hunk is only
>> for the original nv04 display engine for very old cards. nv50+
>> (GeForce-8 and later) take different paths.
>
> Oh right totally missed the nv50+ code. I only grepped for
> pre/post_modeset ... Below untested diff should help. I also realized
> that the pre-nv50 code lacks drm_vblank_on/off in the dpms callback, so
> there's more work to do anyway for this one here.
>
> Thanks, Daniel

The diff on top of your patch is now tested and helps. suspend-resume
is now fine on nv50.

In your patch, nouveau_display_resume() would also need to get a now
unused int head removed to make the compiler happy.

-mario

> diff --git a/drivers/gpu/drm/nouveau/nv50_display.c b/drivers/gpu/drm/nouveau/nv50_display.c
> index 7da7958556a3..a16c37d8f7e1 100644
> --- a/drivers/gpu/drm/nouveau/nv50_display.c
> +++ b/drivers/gpu/drm/nouveau/nv50_display.c
> @@ -997,6 +997,10 @@ nv50_crtc_cursor_show_hide(struct nouveau_crtc *nv_crtc, bool show, bool update)
>  static void nv50_crtc_dpms(struct drm_crtc *crtc, int mode
Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently
On 05/27/2015 11:04 AM, Daniel Vetter wrote: In commit 9cba5efab5a8145ae6c52ea273553f069c294482 Author: Mario Kleiner mario.kleiner...@gmail.com Date: Tue Jul 29 02:36:44 2014 +0200 drm/nouveau: Dis/Enable vblank irqs during suspend/resume drm_vblank_on/off calls where added around suspend/resume to make sure vblank stay doesn't go boom over that transition. But nouveau already used drm_vblank_pre/post_modeset over modesets. Instead use drm_vblank_on/off everyhwere. The slight change here is that after _off drm_vblank_get will refuse to work right away, but nouveau doesn't seem to depend upon that anywhere outside of the pageflip paths. The longer-term plan here is to switch all kms drivers to drm_vblank_on/off so that common code like pending event cleanup can be done there, while drm_vblank_pre/post_modeset will be purely drm internal for the old UMS ioctl. Note that the drm_vblank_off still seems required in the suspend path since nouveau doesn't explicitly disable crtcs. But on the resume side drm_helper_resume_force_mode should end up calling drm_vblank_on through the nouveau crtc hooks already. Hence remove the call in the resume code. 
Cc: Mario Kleiner mario.kleiner...@gmail.com Cc: Ben Skeggs bske...@redhat.com Signed-off-by: Daniel Vetter daniel.vet...@intel.com --- drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++-- drivers/gpu/drm/nouveau/nouveau_display.c | 4 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c index 3d96b49fe662..dab24066fa21 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c @@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc) if (nv_two_heads(dev)) NVSetOwner(dev, nv_crtc-index); - drm_vblank_pre_modeset(dev, nv_crtc-index); + drm_vblank_off(dev, nv_crtc-index); funcs-dpms(crtc, DRM_MODE_DPMS_OFF); NVBlankScreen(dev, nv_crtc-index, true); @@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc) #endif funcs-dpms(crtc, DRM_MODE_DPMS_ON); - drm_vblank_post_modeset(dev, nv_crtc-index); + drm_vblank_on(dev, nv_crtc-index); } The above hunk is probably correct, but i couldn't test it without sufficiently old pre-nv 50 hardware. static void nv_crtc_destroy(struct drm_crtc *crtc) diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 8670d90cdc11..d824023f9fc6 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool runtime) nv_crtc-lut.depth = 0; } - /* Make sure that drm and hw vblank irqs get resumed if needed. */ - for (head = 0; head dev-mode_config.num_crtc; head++) - drm_vblank_on(dev, head); - /* This should ensure we don't hit a locking problem when someone * wakes us up via a connector. We should never go into suspend * while the display is on anyways. Tested this one and this hunk breaks suspend/resume. After a suspend/resume cycle, all OpenGL apps and composited desktop are dead, as the core can't get any vblank irq's enabled anymore. 
So the drm_vblank_on() is still needed here.

thanks,
-mario
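The failure mode reported here follows from the behaviour stated in the commit message: once vblank handling is switched off for a CRTC, attempts to take a vblank reference are refused until it is switched on again. The toy model below illustrates only that state machine. The struct and all toy_* functions are hypothetical and much simpler than the DRM core; returning -EINVAL is an assumption chosen for the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy state; not the kernel's per-CRTC vblank structure. */
struct toy_vblank {
	bool on;        /* false between toy_vblank_off() and toy_vblank_on() */
	int refcount;   /* outstanding toy_vblank_get() references */
};

void toy_vblank_off(struct toy_vblank *v) { v->on = false; }
void toy_vblank_on(struct toy_vblank *v)  { v->on = true; }

/* Refuse new references while vblank handling is switched off. */
int toy_vblank_get(struct toy_vblank *v)
{
	if (!v->on)
		return -EINVAL;
	v->refcount++;
	return 0;
}

void toy_vblank_put(struct toy_vblank *v)
{
	if (v->refcount > 0)
		v->refcount--;
}
```

In this model, a suspend path that calls toy_vblank_off() with no matching toy_vblank_on() on resume leaves every later toy_vblank_get() failing, which corresponds to clients no longer being able to get vblank irqs enabled after the suspend/resume cycle.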
Re: [Intel-gfx] [PATCH 1/3] drm/nouveau: Use drm_vblank_on/off consistently
On 05/29/2015 07:19 PM, Daniel Vetter wrote:
> On Fri, May 29, 2015 at 06:50:06PM +0200, Mario Kleiner wrote:
>> On 05/27/2015 11:04 AM, Daniel Vetter wrote:
>>> In
>>>
>>> commit 9cba5efab5a8145ae6c52ea273553f069c294482
>>> Author: Mario Kleiner mario.kleiner...@gmail.com
>>> Date:   Tue Jul 29 02:36:44 2014 +0200
>>>
>>>     drm/nouveau: Dis/Enable vblank irqs during suspend/resume
>>>
>>> drm_vblank_on/off calls were added around suspend/resume to make sure
>>> vblank state doesn't go boom over that transition. But nouveau already
>>> used drm_vblank_pre/post_modeset over modesets. Instead use
>>> drm_vblank_on/off everywhere. The slight change here is that after _off
>>> drm_vblank_get will refuse to work right away, but nouveau doesn't seem
>>> to depend upon that anywhere outside of the pageflip paths.
>>>
>>> The longer-term plan here is to switch all kms drivers to
>>> drm_vblank_on/off so that common code like pending event cleanup can be
>>> done there, while drm_vblank_pre/post_modeset will be purely drm
>>> internal for the old UMS ioctl.
>>>
>>> Note that the drm_vblank_off still seems required in the suspend path
>>> since nouveau doesn't explicitly disable crtcs. But on the resume side
>>> drm_helper_resume_force_mode should end up calling drm_vblank_on through
>>> the nouveau crtc hooks already. Hence remove the call in the resume code.
>>>
>>> Cc: Mario Kleiner mario.kleiner...@gmail.com
>>> Cc: Ben Skeggs bske...@redhat.com
>>> Signed-off-by: Daniel Vetter daniel.vet...@intel.com
>>> ---
>>>  drivers/gpu/drm/nouveau/dispnv04/crtc.c   | 4 ++--
>>>  drivers/gpu/drm/nouveau/nouveau_display.c | 4 ----
>>>  2 files changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
>>> index 3d96b49fe662..dab24066fa21 100644
>>> --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
>>> +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
>>> @@ -708,7 +708,7 @@ static void nv_crtc_prepare(struct drm_crtc *crtc)
>>>  	if (nv_two_heads(dev))
>>>  		NVSetOwner(dev, nv_crtc->index);
>>>
>>> -	drm_vblank_pre_modeset(dev, nv_crtc->index);
>>> +	drm_vblank_off(dev, nv_crtc->index);
>>>  	funcs->dpms(crtc, DRM_MODE_DPMS_OFF);
>>>
>>>  	NVBlankScreen(dev, nv_crtc->index, true);
>>> @@ -740,7 +740,7 @@ static void nv_crtc_commit(struct drm_crtc *crtc)
>>>  #endif
>>>
>>>  	funcs->dpms(crtc, DRM_MODE_DPMS_ON);
>>> -	drm_vblank_post_modeset(dev, nv_crtc->index);
>>> +	drm_vblank_on(dev, nv_crtc->index);
>>>  }
>>
>> The above hunk is probably correct, but I couldn't test it without
>> sufficiently old pre-nv50 hardware.
>>
>>>
>>>  static void nv_crtc_destroy(struct drm_crtc *crtc)
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
>>> index 8670d90cdc11..d824023f9fc6 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_display.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_display.c
>>> @@ -620,10 +620,6 @@ nouveau_display_resume(struct drm_device *dev, bool runtime)
>>>  		nv_crtc->lut.depth = 0;
>>>  	}
>>>
>>> -	/* Make sure that drm and hw vblank irqs get resumed if needed. */
>>> -	for (head = 0; head < dev->mode_config.num_crtc; head++)
>>> -		drm_vblank_on(dev, head);
>>> -
>>>  	/* This should ensure we don't hit a locking problem when someone
>>>  	 * wakes us up via a connector. We should never go into suspend
>>>  	 * while the display is on anyways.
>>
>> Tested this one, and this hunk breaks suspend/resume. After a
>> suspend/resume cycle, all OpenGL apps and the composited desktop are
>> dead, as the core can't get any vblank irqs enabled anymore.
>>
>> So the drm_vblank_on() is still needed here.
>
> Hm that's very surprising. As mentioned above the force_mode_restore
> should be calling nv_crtc_prepare already and fix this all up for us. I
> guess I need to dig out my nv card and trace what's really going on here.
>
> Enabling interrupts when the crtc is off isn't a good idea.
> -Daniel

I think the nv_crtc_prepare() path modified in your first hunk is only for
the original nv04 display engine for very old cards. nv50+ (GeForce-8 and
later) take different paths.

-mario
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/plane-helper: Adapt cursor hack to transitional helpers
On 05/20/2015 10:36 AM, Daniel Vetter wrote:
> In
>
> commit f02ad907cd9e7fe3a6405d2d005840912f1ed258
> Author: Daniel Vetter daniel.vet...@ffwll.ch
> Date:   Thu Jan 22 16:36:23 2015 +0100
>
>     drm/atomic-helpers: Recover full cursor plane behaviour
>
> we've added a hack to atomic helpers to never do vblank waits for cursor
> updates through the legacy apis since that's what X expects.
> Unfortunately we've (again) forgotten to adjust the transitional
> helpers. Do this now.
>
> This fixes regressions for drivers only partially converted over to
> atomic (like i915).
>
> Reported-by: Pekka Paalanen ppaala...@gmail.com
> Cc: Pekka Paalanen ppaala...@gmail.com
> Cc: sta...@vger.kernel.org
> Signed-off-by: Daniel Vetter daniel.vet...@intel.com
> ---
>  drivers/gpu/drm/drm_plane_helper.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_plane_helper.c b/drivers/gpu/drm/drm_plane_helper.c
> index 40c1db9ad7c3..2f0ed11024eb 100644
> --- a/drivers/gpu/drm/drm_plane_helper.c
> +++ b/drivers/gpu/drm/drm_plane_helper.c
> @@ -465,6 +465,9 @@ int drm_plane_helper_commit(struct drm_plane *plane,
>  		if (!crtc[i])
>  			continue;
>
> +		if (crtc[i]->cursor == plane)
> +			continue;
> +
>  		/* There's no other way to figure out whether the crtc is running. */
>  		ret = drm_crtc_vblank_get(crtc[i]);
>  		if (ret == 0) {

This one is

Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com

I was looking into Weston performance and the cursor problem, so I had the
necessary tracing in place to test this. I can confirm that the
cursor-related blocking in Weston's drm-backend execution is gone with this
patch applied, whereas it is still present when using hardware overlays on
Intel, as expected. So hardware cursors should be fine again once the patch
also lands in stable kernels.

thanks,
-mario
Re: [Intel-gfx] Breakage for Ironlake due to some watermarks changes in Linux 4.0+?
On 05/15/2015 11:00 AM, Jani Nikula wrote:
> On Fri, 15 May 2015, Mario Kleiner mario.kleiner...@gmail.com wrote:
>> Hi all,
>>
>> since Linux 4.0 I experience a massive display flicker problem on my
>> Intel HD Ironlake mobile (2010 MacBookPro6,2) under Wayland's reference
>> compositor Weston.
>>
>> - Only happens on Linux >= 4.0 on intel-kms with the Intel HD, not
>>   under nouveau-kms with the discrete NVidia gpu. Strangely, on Linux
>>   4.1-rc it happens all the time, whereas on Linux 4.0 it can work
>>   normally for quite a while, but once the problem starts only a reboot
>>   can cure it.
>>
>> - Happens almost exclusively under Weston, and only very rarely under
>>   the XServer. VT switching from Weston to XOrg makes the problem
>>   disappear; switching back to Weston makes it start again immediately.
>>
>> - Only happens if a hardware cursor is displayed: hiding the cursor
>>   stops the flicker immediately, showing the cursor starts the flicker.
>>
>> - The drm and desktop are completely idle during this. drm.debug=15
>>   shows no activity while this happens.
>>
>> Symptom: Up to the scanline where the cursor is located, the desktop
>> image is displayed, but jumps horizontally left and right at high
>> frequency by some random number of pixels, maybe in the range 0 - 200
>> pixels, making the content unreadable. Starting with the scanline where
>> scanout of the cursor starts, the display goes blank, as if some display
>> controller fifo underflows and the controller blanks the display in
>> response. It seems that having to scan out the cursor plane in addition
>> to the primary plane is just enough to push it over some limit? I also
>> see cpu and pch pipe A fifo underruns reported by the underflow irq
>> handlers.
>>
>> I saw there were many changes around Linux 4.0 in the kms driver wrt.
>> watermark calculations, so this might be related?
>
> Please try http://patchwork.freedesktop.org/patch/49314 and report back.
>
> BR,
> Jani.

The patch fixes my flicker problem nicely. Thanks!

If you want, you can add a

Tested-by: Mario Kleiner mario.kleiner...@gmail.com

best,
-mario
[Intel-gfx] Breakage for Ironlake due to some watermarks changes in Linux 4.0+?
Hi all,

since Linux 4.0 I experience a massive display flicker problem on my Intel
HD Ironlake mobile (2010 MacBookPro6,2) under Wayland's reference
compositor Weston.

- Only happens on Linux >= 4.0 on intel-kms with the Intel HD, not under
  nouveau-kms with the discrete NVidia gpu. Strangely, on Linux 4.1-rc it
  happens all the time, whereas on Linux 4.0 it can work normally for
  quite a while, but once the problem starts only a reboot can cure it.

- Happens almost exclusively under Weston, and only very rarely under the
  XServer. VT switching from Weston to XOrg makes the problem disappear;
  switching back to Weston makes it start again immediately.

- Only happens if a hardware cursor is displayed: hiding the cursor stops
  the flicker immediately, showing the cursor starts the flicker.

- The drm and desktop are completely idle during this. drm.debug=15 shows
  no activity while this happens.

Symptom: Up to the scanline where the cursor is located, the desktop image
is displayed, but jumps horizontally left and right at high frequency by
some random number of pixels, maybe in the range 0 - 200 pixels, making the
content unreadable. Starting with the scanline where scanout of the cursor
starts, the display goes blank, as if some display controller fifo
underflows and the controller blanks the display in response. It seems that
having to scan out the cursor plane in addition to the primary plane is
just enough to push it over some limit? I also see cpu and pch pipe A fifo
underruns reported by the underflow irq handlers.

I saw there were many changes around Linux 4.0 in the kms driver wrt.
watermark calculations, so this might be related?

thanks,
-mario
Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 05/07/2015 01:56 PM, Peter Hurley wrote: On 05/06/2015 04:56 AM, Daniel Vetter wrote: On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote: On 05/05/2015 11:42 AM, Daniel Vetter wrote: On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote: On 05/04/2015 12:52 AM, Mario Kleiner wrote: On 04/16/2015 03:03 PM, Daniel Vetter wrote: On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote: On 04/15/2015 01:31 PM, Daniel Vetter wrote: On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote: Hi Daniel, On 04/15/2015 03:17 AM, Daniel Vetter wrote: This was a bit too much cargo-culted, so lets make it solid: - vblank-count doesn't need to be an atomic, writes are always done under the protection of dev-vblank_time_lock. Switch to an unsigned long instead and update comments. Note that atomic_read is just a normal read of a volatile variable, so no need to audit all the read-side access specifically. - The barriers for the vblank counter seqlock weren't complete: The read-side was missing the first barrier between the counter read and the timestamp read, it only had a barrier between the ts and the counter read. We need both. - Barriers weren't properly documented. Since barriers only work if you have them on boths sides of the transaction it's prudent to reference where the other side is. To avoid duplicating the write-side comment 3 times extract a little store_vblank() helper. In that helper also assert that we do indeed hold dev-vblank_time_lock, since in some cases the lock is acquired a few functions up in the callchain. Spotted while reviewing a patch from Chris Wilson to add a fastpath to the vblank_wait ioctl. 
Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Mario Kleiner mario.kleiner...@gmail.com Cc: Ville Syrjälä ville.syrj...@linux.intel.com Cc: Michel Dänzer mic...@daenzer.net Signed-off-by: Daniel Vetter daniel.vet...@intel.com --- drivers/gpu/drm/drm_irq.c | 92 --- include/drm/drmP.h| 8 +++-- 2 files changed, 54 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index c8a34476570a..23bfbc61a494 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); +static void store_vblank(struct drm_device *dev, int crtc, + unsigned vblank_count_inc, + struct timeval *t_vblank) +{ +struct drm_vblank_crtc *vblank = dev-vblank[crtc]; +u32 tslot; + +assert_spin_locked(dev-vblank_time_lock); + +if (t_vblank) { +tslot = vblank-count + vblank_count_inc; +vblanktimestamp(dev, crtc, tslot) = *t_vblank; +} + +/* + * vblank timestamp updates are protected on the write side with + * vblank_time_lock, but on the read side done locklessly using a + * sequence-lock on the vblank counter. Ensure correct ordering using + * memory barrriers. We need the barrier both before and also after the + * counter update to synchronize with the next timestamp write. + * The read-side barriers for this are in drm_vblank_count_and_time. + */ +smp_wmb(); +vblank-count += vblank_count_inc; +smp_wmb(); The comment and the code are each self-contradictory. If vblank-count writes are always protected by vblank_time_lock (something I did not verify but that the comment above asserts), then the trailing write barrier is not required (and the assertion that it is in the comment is incorrect). A spin unlock operation is always a write barrier. Hm yeah. Otoh to me that's bordering on code too clever for my own good. 
That the spinlock is held I can assure. That no one goes around and does multiple vblank updates (because somehow that code raced with the hw itself) I can't easily assure with a simple assert or something similar. It's not the case right now, but that can changes. The algorithm would be broken if multiple updates for the same vblank count were allowed; that's why it checks to see if the vblank count has not advanced before storing a new timestamp. Otherwise, the read side would not be able to determine that the timestamp is valid by double-checking that the vblank count has not changed. And besides, even if the code looped without dropping the spinlock, the correct write order would still be observed because it would still be executing on the same cpu. My objection to the write memory barrier is not about optimization; it's about correct code. Well diff=0 is not allowed, I guess I could enforce this with some WARN_ON. And I still think my point of non-local correctness is solid. With the smp_wmb() removed the following still works correctly: spin_lock
Re: [Intel-gfx] [PATCH] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)
On 04/15/2015 03:03 AM, Mario Kleiner wrote: On 04/02/2015 01:34 PM, Chris Wilson wrote: On vblank instant-off systems, we can get into a situation where the cost of enabling and disabling the vblank IRQ around a drmWaitVblank query dominates. However, we know that if the user wants the current vblank counter, they are also very likely to immediately queue a vblank wait and so we can keep the interrupt around and only turn it off if we have no further vblank requests in the interrupt interval. After vblank event delivery there is a shadow of one vblank where the interrupt is kept alive for the user to query and queue another vblank event. Similarly, if the user is using blocking drmWaitVblanks, the interrupt will be disabled on the IRQ following the wait completion. However, if the user is simply querying the current vblank counter and timestamp, the interrupt will be disabled after every IRQ and the user will enabled it again on the first query following the IRQ. Testcase: igt/kms_vblank Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ville Syrjälä ville.syrj...@linux.intel.com Cc: Daniel Vetter dan...@ffwll.ch Cc: Michel Dänzer mic...@daenzer.net Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com Cc: Dave Airlie airl...@redhat.com, Cc: Mario Kleiner mario.kleiner...@gmail.com --- drivers/gpu/drm/drm_irq.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index c8a34476570a..6f5dc18779e2 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1091,9 +1091,9 @@ void drm_vblank_put(struct drm_device *dev, int crtc) if (atomic_dec_and_test(vblank-refcount)) { if (drm_vblank_offdelay == 0) return; -else if (dev-vblank_disable_immediate || drm_vblank_offdelay 0) +else if (drm_vblank_offdelay 0) vblank_disable_fn((unsigned long)vblank); -else +else if (!dev-vblank_disable_immediate) mod_timer(vblank-disable_timer, jiffies + ((drm_vblank_offdelay * HZ)/1000)); } @@ 
-1697,6 +1697,17 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) spin_lock_irqsave(dev-event_lock, irqflags); You could move the code before the spin_lock_irqsave(dev-event_lock, irqflags); i think it doesn't need that lock? +if (dev-vblank_disable_immediate !atomic_read(vblank-refcount)) { Also check for (drm_vblank_offdelay 0) to make sure we have a way out of instant disable here, and the same meaning of of drm_vblank_offdelay like we have in the current implementation. This hunk ... +unsigned long vbl_lock_irqflags; + +spin_lock_irqsave(dev-vbl_lock, vbl_lock_irqflags); +if (atomic_read(vblank-refcount) == 0 vblank-enabled) { +DRM_DEBUG(disabling vblank on crtc %d\n, crtc); +vblank_disable_and_save(dev, crtc); +} +spin_unlock_irqrestore(dev-vbl_lock, vbl_lock_irqflags); ... is the same as a call to vblank_disable_fn((unsigned long) vblank); Maybe replace by that call? You could also return here already, as the code below will just take a lock, realize vblanks are now disabled and then release the locks and exit. +} + /* Need timestamp lock to prevent concurrent execution with * vblank enable/disable, as this would cause inconsistent * or corrupted timestamps and vblank counts. I think the logic itself is fine and at least basic testing of the patch on a Intel HD Ironlake didn't show problems, so with the above taken into account it would have my slightly uneasy reviewed-by. One thing that worries me a little bit about the disable inside vblank irq are the potential races between the disable code and the display engine which could cause really bad off-by-one errors for clients on a imperfect driver. These races can only happen if vblank enable or disable happens close to or inside the vblank. This approach lets the instant disable happen exactly inside vblank when there is the highest chance of triggering that condition. 
This doesn't seem to be a problem for intel kms, but other drivers don't have instant disable yet, so we don't know how well we could do it there. Additionally things like dynamic power management tend to operate inside vblank, sometimes with funny side effects to other stuff, e.g., dpm on AMD, as i remember from some long debug session with Michel and Alex last summer where dpm played a role. Therefore it seems more safe to me to avoid actions inside vblank that could be done outside. E.g., instead of doing the disable inside the vblank irq one could maybe just schedule an exact timer to do the disable a few milliseconds later in the middle of active scanout to avoid these potential issues? -mario After testing this, one more thing that would make sense is to move the disable block at the end of drm_handle_vblank() instead of at the top. Turns out
Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/16/2015 03:03 PM, Daniel Vetter wrote:
> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:
>> On 04/15/2015 01:31 PM, Daniel Vetter wrote:
>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
>>>> Hi Daniel,
>>>>
>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
>>>>> This was a bit too much cargo-culted, so let's make it solid:
>>>>> - vblank->count doesn't need to be an atomic, writes are always done
>>>>>   under the protection of dev->vblank_time_lock. Switch to an unsigned
>>>>>   long instead and update comments. Note that atomic_read is just a
>>>>>   normal read of a volatile variable, so no need to audit all the
>>>>>   read-side access specifically.
>>>>>
>>>>> - The barriers for the vblank counter seqlock weren't complete: The
>>>>>   read-side was missing the first barrier between the counter read and
>>>>>   the timestamp read, it only had a barrier between the ts and the
>>>>>   counter read. We need both.
>>>>>
>>>>> - Barriers weren't properly documented. Since barriers only work if
>>>>>   you have them on both sides of the transaction it's prudent to
>>>>>   reference where the other side is. To avoid duplicating the
>>>>>   write-side comment 3 times extract a little store_vblank() helper.
>>>>>   In that helper also assert that we do indeed hold
>>>>>   dev->vblank_time_lock, since in some cases the lock is acquired a
>>>>>   few functions up in the callchain.
>>>>>
>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to
>>>>> the vblank_wait ioctl.
>>>>>
>>>>> Cc: Chris Wilson ch...@chris-wilson.co.uk
>>>>> Cc: Mario Kleiner mario.kleiner...@gmail.com
>>>>> Cc: Ville Syrjälä ville.syrj...@linux.intel.com
>>>>> Cc: Michel Dänzer mic...@daenzer.net
>>>>> Signed-off-by: Daniel Vetter daniel.vet...@intel.com
>>>>> ---
>>>>>  drivers/gpu/drm/drm_irq.c | 92 ---
>>>>>  include/drm/drmP.h        |  8 +++--
>>>>>  2 files changed, 54 insertions(+), 46 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>> index c8a34476570a..23bfbc61a494 100644
>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
>>>>>  module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
>>>>>  module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
>>>>>
>>>>> +static void store_vblank(struct drm_device *dev, int crtc,
>>>>> +			 unsigned vblank_count_inc,
>>>>> +			 struct timeval *t_vblank)
>>>>> +{
>>>>> +	struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
>>>>> +	u32 tslot;
>>>>> +
>>>>> +	assert_spin_locked(&dev->vblank_time_lock);
>>>>> +
>>>>> +	if (t_vblank) {
>>>>> +		tslot = vblank->count + vblank_count_inc;
>>>>> +		vblanktimestamp(dev, crtc, tslot) = *t_vblank;
>>>>> +	}
>>>>> +
>>>>> +	/*
>>>>> +	 * vblank timestamp updates are protected on the write side with
>>>>> +	 * vblank_time_lock, but on the read side done locklessly using a
>>>>> +	 * sequence-lock on the vblank counter. Ensure correct ordering using
>>>>> +	 * memory barriers. We need the barrier both before and also after the
>>>>> +	 * counter update to synchronize with the next timestamp write.
>>>>> +	 * The read-side barriers for this are in drm_vblank_count_and_time.
>>>>> +	 */
>>>>> +	smp_wmb();
>>>>> +	vblank->count += vblank_count_inc;
>>>>> +	smp_wmb();
>>>>
>>>> The comment and the code are each self-contradictory.
>>>>
>>>> If vblank->count writes are always protected by vblank_time_lock
>>>> (something I did not verify but that the comment above asserts), then
>>>> the trailing write barrier is not required (and the assertion that it
>>>> is in the comment is incorrect).
>>>>
>>>> A spin unlock operation is always a write barrier.
>>>
>>> Hm yeah. Otoh to me that's bordering on code too clever for my own good.
>>> That the spinlock is held I can assure. That no one goes around and does
>>> multiple vblank updates (because somehow that code raced with the hw
>>> itself) I can't easily assure with a simple assert or something similar.
>>> It's not the case right now, but that can change.
>>
>> The algorithm would be broken if multiple updates for the same vblank
>> count were allowed; that's why it checks to see if the vblank count has
>> not advanced before storing a new timestamp.
>>
>> Otherwise, the read side would not be able to determine that the
>> timestamp is valid by double-checking that the vblank count has not
>> changed.
>>
>> And besides, even if the code looped without dropping the spinlock, the
>> correct write order would still be observed because it would still be
>> executing on the same cpu.
>>
>> My objection to the write memory barrier is not about optimization; it's
>> about correct code.
>
> Well diff=0 is not allowed, I guess I could enforce this with some
> WARN_ON. And I still think my point of non-local correctness is solid.
> With the smp_wmb() removed the following still works correctly:
>
> spin_lock(vblank_time_lock);
> store_vblank(dev, crtc, 1, ts1);
> spin_unlock(vblank_time_lock);
>
> spin_lock(vblank_time_lock);
> store_vblank(dev, crtc, 1, ts2);
> spin_unlock(vblank_time_lock);
>
> But with the smp_wmb(); removed the following would be broken
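
The ordering argued about in this thread can be sketched outside the kernel. What follows is only an illustration under stated assumptions, not the drm_irq.c code: the names `vblank_sketch`, `sketch_store_vblank`, `sketch_read_vblank` and `TS_RING_SIZE` are hypothetical, C11 fences stand in for smp_wmb()/smp_rmb(), the timestamp is reduced to a single microsecond value, and the writer lock is left to the caller. It keeps both write-side fences, as in the patch, and rejects an increment of zero, as suggested for a WARN_ON. It does not try to handle every case (for example, increments that are a multiple of the ring size).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TS_RING_SIZE 2u /* hypothetical size of the timestamp ring */

struct vblank_sketch {
	atomic_uint count;                     /* counter, doubles as sequence number */
	_Atomic int64_t ts_usec[TS_RING_SIZE]; /* timestamps, indexed by count % size */
};

/* Write side. The caller is assumed to hold the (not shown) writer lock. */
static void sketch_store_vblank(struct vblank_sketch *v, unsigned inc,
				int64_t ts_usec)
{
	unsigned cur = atomic_load_explicit(&v->count, memory_order_relaxed);

	assert(inc != 0); /* an update that does not advance the counter is a bug */

	/* Store the timestamp into the slot of the count we are about to publish. */
	atomic_store_explicit(&v->ts_usec[(cur + inc) % TS_RING_SIZE], ts_usec,
			      memory_order_relaxed);

	/* Timestamp store is ordered before the counter store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&v->count, cur + inc, memory_order_relaxed);
	/* Counter store is ordered before any later timestamp store. */
	atomic_thread_fence(memory_order_release);
}

/* Lockless read side: retry until the counter is unchanged around the read. */
static unsigned sketch_read_vblank(struct vblank_sketch *v, int64_t *ts_usec)
{
	unsigned cur;

	do {
		cur = atomic_load_explicit(&v->count, memory_order_relaxed);
		/* Counter load is ordered before the timestamp load. */
		atomic_thread_fence(memory_order_acquire);
		*ts_usec = atomic_load_explicit(&v->ts_usec[cur % TS_RING_SIZE],
						memory_order_relaxed);
		/* Timestamp load is ordered before the counter re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (cur != atomic_load_explicit(&v->count, memory_order_relaxed));

	return cur;
}
```

The reader needs both fences for the same reason the patch description gives: one keeps the timestamp read after the first counter read, the other keeps it before the re-check.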
Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/16/2015 03:29 AM, Peter Hurley wrote: On 04/15/2015 05:26 PM, Mario Kleiner wrote: A couple of questions to educate me and one review comment. On 04/15/2015 07:34 PM, Daniel Vetter wrote: This was a bit too much cargo-culted, so lets make it solid: - vblank-count doesn't need to be an atomic, writes are always done under the protection of dev-vblank_time_lock. Switch to an unsigned long instead and update comments. Note that atomic_read is just a normal read of a volatile variable, so no need to audit all the read-side access specifically. - The barriers for the vblank counter seqlock weren't complete: The read-side was missing the first barrier between the counter read and the timestamp read, it only had a barrier between the ts and the counter read. We need both. - Barriers weren't properly documented. Since barriers only work if you have them on boths sides of the transaction it's prudent to reference where the other side is. To avoid duplicating the write-side comment 3 times extract a little store_vblank() helper. In that helper also assert that we do indeed hold dev-vblank_time_lock, since in some cases the lock is acquired a few functions up in the callchain. Spotted while reviewing a patch from Chris Wilson to add a fastpath to the vblank_wait ioctl. v2: Add comment to better explain how store_vblank works, suggested by Chris. v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the implicit barrier in the spin_unlock. But that can only be proven by auditing all callers and my point in extracting this little helper was to localize all the locking into just one place. Hence I think that additional optimization is too risky. 
Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Mario Kleiner mario.kleiner...@gmail.com Cc: Ville Syrjälä ville.syrj...@linux.intel.com Cc: Michel Dänzer mic...@daenzer.net Cc: Peter Hurley pe...@hurleysoftware.com Signed-off-by: Daniel Vetter daniel.vet...@intel.com --- drivers/gpu/drm/drm_irq.c | 95 +-- include/drm/drmP.h| 8 +++- 2 files changed, 57 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index c8a34476570a..8694b77d0002 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); +static void store_vblank(struct drm_device *dev, int crtc, + unsigned vblank_count_inc, + struct timeval *t_vblank) +{ +struct drm_vblank_crtc *vblank = dev-vblank[crtc]; +u32 tslot; + +assert_spin_locked(dev-vblank_time_lock); + +if (t_vblank) { +/* All writers hold the spinlock, but readers are serialized by + * the latching of vblank-count below. + */ +tslot = vblank-count + vblank_count_inc; +vblanktimestamp(dev, crtc, tslot) = *t_vblank; +} + +/* + * vblank timestamp updates are protected on the write side with + * vblank_time_lock, but on the read side done locklessly using a + * sequence-lock on the vblank counter. Ensure correct ordering using + * memory barrriers. We need the barrier both before and also after the + * counter update to synchronize with the next timestamp write. + * The read-side barriers for this are in drm_vblank_count_and_time. 
+ */ +smp_wmb(); +vblank-count += vblank_count_inc; +smp_wmb(); +} + /** * drm_update_vblank_count - update the master vblank counter * @dev: DRM device @@ -93,7 +123,7 @@ module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); static void drm_update_vblank_count(struct drm_device *dev, int crtc) { struct drm_vblank_crtc *vblank = dev-vblank[crtc]; -u32 cur_vblank, diff, tslot; +u32 cur_vblank, diff; bool rc; struct timeval t_vblank; @@ -129,18 +159,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return; -/* Reinitialize corresponding vblank timestamp if high-precision query - * available. Skip this step if query unsupported or failed. Will - * reinitialize delayed at next vblank interrupt in that case. +/* + * Only reinitialize corresponding vblank timestamp if high-precision query + * available and didn't fail. Will reinitialize delayed at next vblank + * interrupt in that case. */ -if (rc) { -tslot = atomic_read(vblank-count) + diff; -vblanktimestamp(dev, crtc, tslot) = t_vblank; -} - -smp_mb__before_atomic(); -atomic_add(diff, vblank-count); -smp_mb__after_atomic(); +store_vblank(dev, crtc, diff, rc ? t_vblank : NULL); } /* @@ -218,7 +242,7 @@ static
Re: [Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
A couple of questions to educate me and one review comment. On 04/15/2015 07:34 PM, Daniel Vetter wrote: This was a bit too much cargo-culted, so lets make it solid: - vblank-count doesn't need to be an atomic, writes are always done under the protection of dev-vblank_time_lock. Switch to an unsigned long instead and update comments. Note that atomic_read is just a normal read of a volatile variable, so no need to audit all the read-side access specifically. - The barriers for the vblank counter seqlock weren't complete: The read-side was missing the first barrier between the counter read and the timestamp read, it only had a barrier between the ts and the counter read. We need both. - Barriers weren't properly documented. Since barriers only work if you have them on boths sides of the transaction it's prudent to reference where the other side is. To avoid duplicating the write-side comment 3 times extract a little store_vblank() helper. In that helper also assert that we do indeed hold dev-vblank_time_lock, since in some cases the lock is acquired a few functions up in the callchain. Spotted while reviewing a patch from Chris Wilson to add a fastpath to the vblank_wait ioctl. v2: Add comment to better explain how store_vblank works, suggested by Chris. v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the implicit barrier in the spin_unlock. But that can only be proven by auditing all callers and my point in extracting this little helper was to localize all the locking into just one place. Hence I think that additional optimization is too risky. 
Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Mario Kleiner mario.kleiner...@gmail.com Cc: Ville Syrjälä ville.syrj...@linux.intel.com Cc: Michel Dänzer mic...@daenzer.net Cc: Peter Hurley pe...@hurleysoftware.com Signed-off-by: Daniel Vetter daniel.vet...@intel.com --- drivers/gpu/drm/drm_irq.c | 95 +-- include/drm/drmP.h| 8 +++- 2 files changed, 57 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index c8a34476570a..8694b77d0002 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); +static void store_vblank(struct drm_device *dev, int crtc, +unsigned vblank_count_inc, +struct timeval *t_vblank) +{ + struct drm_vblank_crtc *vblank = dev-vblank[crtc]; + u32 tslot; + + assert_spin_locked(dev-vblank_time_lock); + + if (t_vblank) { + /* All writers hold the spinlock, but readers are serialized by +* the latching of vblank-count below. +*/ + tslot = vblank-count + vblank_count_inc; + vblanktimestamp(dev, crtc, tslot) = *t_vblank; + } + + /* +* vblank timestamp updates are protected on the write side with +* vblank_time_lock, but on the read side done locklessly using a +* sequence-lock on the vblank counter. Ensure correct ordering using +* memory barrriers. We need the barrier both before and also after the +* counter update to synchronize with the next timestamp write. +* The read-side barriers for this are in drm_vblank_count_and_time. 
+*/ + smp_wmb(); + vblank-count += vblank_count_inc; + smp_wmb(); +} + /** * drm_update_vblank_count - update the master vblank counter * @dev: DRM device @@ -93,7 +123,7 @@ module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); static void drm_update_vblank_count(struct drm_device *dev, int crtc) { struct drm_vblank_crtc *vblank = dev-vblank[crtc]; - u32 cur_vblank, diff, tslot; + u32 cur_vblank, diff; bool rc; struct timeval t_vblank; @@ -129,18 +159,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return; - /* Reinitialize corresponding vblank timestamp if high-precision query -* available. Skip this step if query unsupported or failed. Will -* reinitialize delayed at next vblank interrupt in that case. + /* +* Only reinitialize corresponding vblank timestamp if high-precision query +* available and didn't fail. Will reinitialize delayed at next vblank +* interrupt in that case. */ - if (rc) { - tslot = atomic_read(vblank-count) + diff; - vblanktimestamp(dev, crtc, tslot) = t_vblank; - } - - smp_mb__before_atomic(); - atomic_add(diff, vblank-count); - smp_mb__after_atomic(); + store_vblank(dev, crtc, diff, rc
Re: [Intel-gfx] [PATCH] drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)
On 04/02/2015 01:34 PM, Chris Wilson wrote:
> On vblank instant-off systems, we can get into a situation where the cost
> of enabling and disabling the vblank IRQ around a drmWaitVblank query
> dominates. However, we know that if the user wants the current vblank
> counter, they are also very likely to immediately queue a vblank wait
> and so we can keep the interrupt around and only turn it off if we have
> no further vblank requests in the interrupt interval.
>
> After vblank event delivery there is a shadow of one vblank where the
> interrupt is kept alive for the user to query and queue another vblank
> event. Similarly, if the user is using blocking drmWaitVblanks, the
> interrupt will be disabled on the IRQ following the wait completion.
> However, if the user is simply querying the current vblank counter and
> timestamp, the interrupt will be disabled after every IRQ and the user
> will enable it again on the first query following the IRQ.
>
> Testcase: igt/kms_vblank
> Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
> Cc: Ville Syrjälä ville.syrj...@linux.intel.com
> Cc: Daniel Vetter dan...@ffwll.ch
> Cc: Michel Dänzer mic...@daenzer.net
> Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
> Cc: Dave Airlie airl...@redhat.com,
> Cc: Mario Kleiner mario.kleiner...@gmail.com
> ---
>  drivers/gpu/drm/drm_irq.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index c8a34476570a..6f5dc18779e2 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -1091,9 +1091,9 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
>  	if (atomic_dec_and_test(&vblank->refcount)) {
>  		if (drm_vblank_offdelay == 0)
>  			return;
> -		else if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
> +		else if (drm_vblank_offdelay < 0)
>  			vblank_disable_fn((unsigned long)vblank);
> -		else
> +		else if (!dev->vblank_disable_immediate)
>  			mod_timer(&vblank->disable_timer,
>  				  jiffies + ((drm_vblank_offdelay * HZ)/1000));
>  	}
> @@ -1697,6 +1697,17 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc)
>
>  	spin_lock_irqsave(&dev->event_lock, irqflags);
>

You could move the code before the spin_lock_irqsave(&dev->event_lock,
irqflags); I think it doesn't need that lock?

> +	if (dev->vblank_disable_immediate && !atomic_read(&vblank->refcount)) {

Also check for (drm_vblank_offdelay > 0) to make sure we have a way out of
instant disable here, and the same meaning of drm_vblank_offdelay as in the
current implementation.

This hunk ...

> +		unsigned long vbl_lock_irqflags;
> +
> +		spin_lock_irqsave(&dev->vbl_lock, vbl_lock_irqflags);
> +		if (atomic_read(&vblank->refcount) == 0 && vblank->enabled) {
> +			DRM_DEBUG("disabling vblank on crtc %d\n", crtc);
> +			vblank_disable_and_save(dev, crtc);
> +		}
> +		spin_unlock_irqrestore(&dev->vbl_lock, vbl_lock_irqflags);

... is the same as a call to vblank_disable_fn((unsigned long) vblank);
Maybe replace it by that call?

You could also return here already, as the code below will just take a
lock, realize vblanks are now disabled and then release the locks and exit.

> +	}
> +
>  	/* Need timestamp lock to prevent concurrent execution with
>  	 * vblank enable/disable, as this would cause inconsistent
>  	 * or corrupted timestamps and vblank counts.

I think the logic itself is fine, and at least basic testing of the patch
on an Intel HD Ironlake didn't show problems, so with the above taken into
account it would have my slightly uneasy reviewed-by.

One thing that worries me a little bit about the disable inside the vblank
irq are the potential races between the disable code and the display
engine, which could cause really bad off-by-one errors for clients on an
imperfect driver. These races can only happen if vblank enable or disable
happens close to or inside the vblank. This approach lets the instant
disable happen exactly inside vblank, when there is the highest chance of
triggering that condition. This doesn't seem to be a problem for intel kms,
but other drivers don't have instant disable yet, so we don't know how well
we could do it there.
Additionally, things like dynamic power management tend to operate inside vblank, sometimes with funny side effects on other stuff, e.g., dpm on AMD, as i remember from a long debug session with Michel and Alex last summer where dpm played a role. Therefore it seems safer to me to avoid actions inside vblank that could be done outside. E.g., instead of doing the disable inside the vblank irq, one could maybe just schedule an exact timer to do the disable a few milliseconds later, in the middle of active scanout, to avoid these potential issues?

-mario
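The disable policy discussed in this review can be summarised in a small, self-contained sketch. It is only an illustration of the decision logic as described in the patch and the review comments, not kernel code: the enum, the function names and the parameters are hypothetical stand-ins for drm_vblank_offdelay, dev->vblank_disable_immediate and the vblank refcount.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; they are not kernel identifiers. */
enum vblank_put_action {
	VBL_KEEP_ENABLED,     /* offdelay == 0: user opted out of disabling */
	VBL_DISABLE_NOW,      /* offdelay < 0: disable right away */
	VBL_DISABLE_BY_TIMER, /* offdelay > 0, no instant-off: arm the timer */
	VBL_DISABLE_IN_IRQ    /* offdelay > 0, instant-off: defer to next irq */
};

/* Decision taken when the last vblank reference is dropped. */
static enum vblank_put_action
vblank_put_policy(int offdelay_ms, bool disable_immediate)
{
	if (offdelay_ms == 0)
		return VBL_KEEP_ENABLED;
	if (offdelay_ms < 0)
		return VBL_DISABLE_NOW;
	if (!disable_immediate)
		return VBL_DISABLE_BY_TIMER;
	return VBL_DISABLE_IN_IRQ;
}

/* Check in the vblank irq handler, with the suggested offdelay > 0 test. */
static bool
vblank_irq_should_disable(int offdelay_ms, bool disable_immediate, int refcount)
{
	return disable_immediate && offdelay_ms > 0 && refcount == 0;
}
```

With these assumptions, vblank_put_policy(5000, true) defers the disable to the next interrupt, while an offdelay of 0 always keeps the interrupt enabled, which preserves the user's way out of instant disable.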
Re: [Intel-gfx] [PATCH] drm: Return current vblank value for drmWaitVBlank queries
On 03/19/2015 04:04 PM, Ville Syrjälä wrote: On Thu, Mar 19, 2015 at 03:33:11PM +0100, Daniel Vetter wrote: On Wed, Mar 18, 2015 at 03:52:56PM +0100, Mario Kleiner wrote: On 03/18/2015 10:30 AM, Chris Wilson wrote: On Wed, Mar 18, 2015 at 11:53:16AM +0900, Michel Dänzer wrote: drm_vblank_count_and_time() doesn't return the correct sequence number while the vblank interrupt is disabled, does it? It returns the sequence number from the last time vblank_disable_and_save() was called (when the vblank interrupt was disabled). That's why drm_vblank_get() is needed here. Ville enlightened me as well. I thought the value was cooked so that time did not pass whilst the IRQ was disabled. Hopefully, I can impress upon the Intel folks, at least, that enabling/disabling the interrupts just to read the current hw counter is interesting to say the least and sits at the top of the profiles when benchmarking Present. -Chris drm_wait_vblank() not only gets the counter but also the corresponding vblank timestamp. Counters are recalculated in vblank_disable_and_save() for irq off, then in the vblank irq on path, and every refresh in drm_handle_vblank at vblank irq time. The timestamps can be recalculated at any time iff the driver supports high precision timestamping, which currently intel kms, radeon kms, and nouveau kms do. But for other parts, like most SoC's, afaik you only get a valid timestamp by sampling system time in the vblank irq handler, so there you'd have a problem. There are also some races around the enable/disable path which require a lot of care and exact knowledge of when each hardware fires its vblanks, updates its hardware counters etc. to get rid of them. Ville did that - successfully as far as my tests go - for Intel kms, but other drivers would be less forgiving. Our current method is to: a) Only disable vblank irqs after a default idle period of 5 seconds, so we don't get races frequent/likely enough to cause problems for clients. 
And we save the overhead for all the vblank irq on/off. b) On drivers which have high precision timestamping and have been carefully checked to be race free (== intel kms only atm.) we have instant disable, so things like blinking cursors don't keep vblank irq on forever. If b) causes so much overhead, maybe we could change the instant disable into a disable after a very short time, e.g., lowering the timeout from 5000 msecs to 2-3 video refresh durations ~ 50 msecs? That would still disable vblank irqs for power saving if the desktop is really idle, but avoid on/off storms for the various drm_wait_vblank's that happen when preparing a swap. Yeah I think we could add code which only gets run for drivers which support instant disable (i915 doesn't do that on gen2 because the hw is lacking). There we should be able to update the vblank counter/timestamp correctly without enabling interrupts temporarily. Ofc we need to make sure we have enough nasty igt testcase to ensure there's not going to be jumps and missed frame numbers in that case. I'd rather go for the very simple fast disable with short timeout method. That would only be a tiny almost one-liner patch that reuses the existing timer for the default slow case, and we'd know already that it will work reliably on instant off capable drivers - no extra tests required. Those drm_vblank_get/put calls usually come in short bursts which should be covered by a timeout of maybe 1 to max. 3 refresh durations. When we query the hw timestamps, we always have a little bit of unavoidable noise, even if it's often only +/- 1 usec on modern hw, so clients querying the timestamp for the same vblank would get slightly different results on repeated queries. On hw which only allows scanline granularity for queries, we can get variability up to 1 scanline duration. 
If the caller does things like delta calculations on those results (dT = currentts - lastts) it can get confusing results like time going backwards by a few microseconds. That's why the current code caches the last vblank ts, to save overhead and to make sure that repeated queries of the same vblank give identical results. Is enabling the interrupts the expensive part, or is it the actual double timestamp read + scanout pos read? Or is it due to the several spinlocks we have in this code? The timestamp/scanout pos read itself is not that expensive iirc, usually 1-3 usecs depending on hw, from some testing i did a year ago. The machinery for irq on/off + all the reinitializing of vblank counts and matching timestamps etc. is probably not that cheap. Also why is userspace reading the vblank counter in the first place? Due to the crazy OML_whatever stuff perhaps? In the simple swap interval case you shouldn't really need to read it. And if we actually made the page flip/atomic ioctl take a target vblank count and let the kernel deal with it we wouldn't need to call the vblank ioctl at all. I object to the crazy, extensions have
Re: [Intel-gfx] [PATCH] drm: Return current vblank value for drmWaitVBlank queries
On 03/18/2015 10:30 AM, Chris Wilson wrote: On Wed, Mar 18, 2015 at 11:53:16AM +0900, Michel Dänzer wrote: drm_vblank_count_and_time() doesn't return the correct sequence number while the vblank interrupt is disabled, does it? It returns the sequence number from the last time vblank_disable_and_save() was called (when the vblank interrupt was disabled). That's why drm_vblank_get() is needed here. Ville enlightened me as well. I thought the value was cooked so that time did not pass whilst the IRQ was disabled. Hopefully, I can impress upon the Intel folks, at least, that enabling/disabling the interrupts just to read the current hw counter is interesting to say the least and sits at the top of the profiles when benchmarking Present. -Chris drm_wait_vblank() not only gets the counter but also the corresponding vblank timestamp. Counters are recalculated in vblank_disable_and_save() for irq off, then in the vblank irq on path, and every refresh in drm_handle_vblank at vblank irq time. The timestamps can be recalculated at any time iff the driver supports high precision timestamping, which currently intel kms, radeon kms, and nouveau kms do. But for other parts, like most SoC's, afaik you only get a valid timestamp by sampling system time in the vblank irq handler, so there you'd have a problem. There are also some races around the enable/disable path which require a lot of care and exact knowledge of when each hardware fires its vblanks, updates its hardware counters etc. to get rid of them. Ville did that - successfully as far as my tests go - for Intel kms, but other drivers would be less forgiving. Our current method is to: a) Only disable vblank irqs after a default idle period of 5 seconds, so we don't get races frequent/likely enough to cause problems for clients. And we save the overhead for all the vblank irq on/off. b) On drivers which have high precision timestamping and have been carefully checked to be race free (== intel kms only atm.) 
we have instant disable, so things like blinking cursors don't keep vblank irq on forever.

If b) causes so much overhead, maybe we could change the instant disable into a disable after a very short time, e.g., lowering the timeout from 5000 msecs to 2-3 video refresh durations ~ 50 msecs? That would still disable vblank irqs for power saving if the desktop is really idle, but avoid on/off storms for the various drm_wait_vblank's that happen when preparing a swap.

-mario

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
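The "2-3 video refresh durations ~ 50 msecs" figure can be checked with a little arithmetic. The sketch below assumes a 60 Hz display and uses two hypothetical helpers; the second one mirrors the (offdelay * HZ) / 1000 conversion used when the disable timer is armed. The caller is assumed to pass a nonzero refresh rate.

```c
/* Timeout in ms covering `refreshes` frames at `refresh_hz`, rounded up.
 * Hypothetical helper; refresh_hz must be nonzero. */
static unsigned int
offdelay_ms_for_refreshes(unsigned int refreshes, unsigned int refresh_hz)
{
	return (refreshes * 1000u + refresh_hz - 1u) / refresh_hz;
}

/* Milliseconds to timer ticks, same form as (offdelay * HZ) / 1000. */
static unsigned long
offdelay_to_jiffies(unsigned int offdelay_ms, unsigned int hz)
{
	return ((unsigned long)offdelay_ms * hz) / 1000u;
}
```

At 60 Hz, two refreshes come to 34 ms and three to 50 ms, so a 50 ms timeout would cover the short bursts of drm_wait_vblank calls around a swap while still being 100 times shorter than the 5000 ms default.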
[Intel-gfx] [PATCH] sna: Also fix ZaphodHeads on Linux kernels older than 3.19
Only 3.19 will have O_NONBLOCK for drm_read(), so the current ddx will still stutter in ZaphodHead mode on current kernels. Fix the problem by adding a poll() on the drm fd before potentially blocking on read().

The logic is directly transplanted from the uxa backend intel_mode_read_drm_events() function.

Fixes fdo bug #84744 on older kernels.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/sna/sna_display.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index 163..a7ad6cc 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -7390,7 +7390,16 @@ fixup_flip:
 void sna_mode_wakeup(struct sna *sna)
 {
 	char buffer[1024];
-	int len, i;
+	int len, i, r;
+	struct pollfd p = { .fd = sna->kgem.fd, .events = POLLIN };
+
+	/* DRM read is blocking on old kernels, so poll first to avoid it. */
+	do {
+		r = poll(&p, 1, 0);
+	} while (r == -1 && (errno == EINTR || errno == EAGAIN));
+
+	if (r <= 0)
+		return;
 
 	/* The DRM read semantics guarantees that we always get only
 	 * complete events.
-- 
1.9.1
Re: [Intel-gfx] [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)
On 12/05/2014 12:56 AM, Eric Anholt wrote: Mario Kleiner mario.kleiner...@gmail.com writes: Pageflips for Pixmap presents were not synchronized to vblank on drivers with support for PresentCapabilityAsync, due to some missing init for vblank-sync_flips. The PresentOptionAsync flag was completely ignored for pageflipped presents. Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't have PresentCapabilityAsync support. On nouveau-ddx, which supports PresentCapabilityAsync, this always caused non-vsynced pageflips with pretty ugly tearing. This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel. Please also apply to XOrg 1.17 and XOrg 1.16.2 stable. Applying on top of XOrg 1.16.2 may require cherry-picking commit 2051514652481a83bd7cf22e57cb0fcd40333f33 which trivially fixes lack of support for protocol option PresentOptionCopy - get two bug fixes for the price of one! Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index e5d3fd5..be1c9f1 100644 --- a/present/present.c +++ b/present/present.c @@ -834,7 +834,7 @@ present_pixmap(WindowPtr window, vblank-notifies = notifies; vblank-num_notifies = num_notifies; -if (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync)) +if (!(options PresentOptionAsync)) vblank-sync_flip = TRUE; I think I'd like to see a hunk like this in with this patch, so that each driver doesn't need to have the cap check: diff --git a/present/present.c b/present/present.c index a9f2214..ed0d734 100644 --- a/present/present.c +++ b/present/present.c @@ -838,6 +838,9 @@ present_pixmap(WindowPtr window, vblank-sync_flip = TRUE; if (!(options PresentOptionCopy) +!((options PresentOptionAsync) + (!screen_priv-info || + !(screen_priv-info-capabilities PresentCapabilityAsync))) pixmap != NULL present_check_flip (target_crtc, window, pixmap, 
vblank-sync_flip, valid, x_off, y_off)) { Seem reasonable? If you wanted to squash this in, then this is: I'm not sure if drivers will really avoid the cap check, as i assume the definition of the check_flip() function requires them to implement it anyway? Does some spec somewhere require them to do it? Do driver writers check all server implementations to see if they can get away with less? But then having this hunk in doesn't hurt either, and it would keep the current intel-ddx uxa backends working, so i'll integrate it - after some urgently needed sleep. Thanks for the review. These server patches are actually the critical ones for me. Without them in XOrg 1.16+, all the mesa fixes would be utterly useless for my kind of applications. -mario Reviewed-by: Eric Anholt e...@anholt.net (So's patch 1/2, regardless). ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
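To make the combined effect of the patch and the hunk proposed in the review easier to follow, here is a small sketch of the two decisions as plain functions. The flag values and function names are stand-ins chosen for this illustration only; they are not taken from the protocol headers or from present.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for this sketch only, not the protocol headers. */
#define OPT_ASYNC (1u << 0)
#define OPT_COPY  (1u << 1)
#define CAP_ASYNC (1u << 0)

/* A flip is vsynced unless the client explicitly asked for async. */
static bool want_sync_flip(uint32_t options)
{
	return !(options & OPT_ASYNC);
}

/* Gate from the review hunk: do not attempt a flip when a copy was
 * requested, or for an async request without driver async capability. */
static bool may_try_flip(uint32_t options, bool have_info, uint32_t caps)
{
	if (options & OPT_COPY)
		return false;
	if ((options & OPT_ASYNC) && (!have_info || !(caps & CAP_ASYNC)))
		return false;
	return true;
}
```

In this model a plain present is always a vsynced flip candidate, and an async request on a driver without the async capability falls back to the non-flip path before the driver's check_flip hook is even consulted.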
[Intel-gfx] DRI3/Present fixes for XServers 1.16 and 1.17rc - (v2)
Hi,

an updated set of patches to fix the bugs i found in the xserver dri3/present implementation and one bug in the intel-ddx uxa/dri3/present implementation.

Axel Davy's comments made me rethink my original xserver patch. The new solution is simpler and better and, afaics, how this was actually intended to work in the server: the server properly using the present_check_flip ddx driver function.

Patch 1/2 fixes and slightly improves the DebugPresent() macros for the server, to avoid crashes at logout, compositor en/disable or closing windows while flips are pending when the server is compiled with debug macros on.

Patch 2/2 fixes the use of PresentOptionAsync for page-flipped presents, and makes Present work on nouveau without horrible tearing.

These patches apply to master, 1.17rc and 1.16.2. They were tested on top of 1.16.2 with the dri3/present backends of nouveau master (glamor and exa) and intel master (sna and fixed uxa) on single-display and dual-display, and were also run through my hardware timing test equipment.

Patch uxa/present is a required fix for the intel-ddx uxa backend, so intel_present_check_flip no longer lies to the server about its capabilities.

Can the x-server patches please also be included into the 1.16 series?

thanks,
-mario
[Intel-gfx] [PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.
DebugPresent() crashed the server when a dri3 drawable was closed while a pageflipped present was still pending, due to vblank->window-> Null-Ptr deref, so debug builds caused new problems to debug.

E.g., glXSwapBuffers(...); glXDestroyWindow(...);
-> Pageflip for non-existent window completes -> boom.

Also often happens when switching desktop compositor on/off due to Present unflips, or when logging out of session.

Also add info if a Present is queued for copyswap or pageflip, if the present is vsynced, and the serial no of the Present request, to aid debugging of pageflip and vsync issues. The serial number is useful as Mesa's dri3/present backend encodes its sendSBC in the serial number, so one can easily correlate server debug output with Mesa and with the SBC values returned to actual OpenGL client applications via OML_sync_control and INTEL_swap_events extension, makes debugging quite a bit more easy.

Please also cherry-pick this for a 1.16.x stable update.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/present/present.c b/present/present.c
index ac9047e..e5d3fd5 100644
--- a/present/present.c
+++ b/present/present.c
@@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc)
     DebugPresent(("\tn %lld %p %8lld: %08lx -> %08lx\n",
                   vblank->event_id, vblank, vblank->target_msc,
                   vblank->pixmap ? vblank->pixmap->drawable.id : 0,
-                  vblank->window->drawable.id));
+                  vblank->window ? vblank->window->drawable.id : 0));
 
     assert (vblank == screen_priv->flip_pending);
 
@@ -859,10 +859,10 @@ present_pixmap(WindowPtr window,
     }
 
     if (pixmap)
-        DebugPresent(("q %lld %p %8lld: %08lx -> %08lx (crtc %p)\n",
+        DebugPresent(("q %lld %p %8lld: %08lx -> %08lx (crtc %p) flip %d vsync %d serial %d\n",
                       vblank->event_id, vblank, target_msc,
                       vblank->pixmap->drawable.id, vblank->window->drawable.id,
-                      target_crtc));
+                      target_crtc, vblank->flip, vblank->sync_flip, vblank->serial));
 
     xorg_list_add(&vblank->event_queue, &present_exec_queue);
     vblank->queued = TRUE;
@@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank)
     DebugPresent(("\td %lld %p %8lld: %08lx -> %08lx\n",
                   vblank->event_id, vblank, vblank->target_msc,
                   vblank->pixmap ? vblank->pixmap->drawable.id : 0,
-                  vblank->window->drawable.id));
+                  vblank->window ? vblank->window->drawable.id : 0));
 
     /* Drop pixmap reference */
     if (vblank->pixmap)
-- 
1.9.1
[Intel-gfx] [PATCH] uxa/present: Handle sync_flip flag in intel_present_check_flip()
Make sure we reject async flips if we don't support async flips.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/uxa/intel_present.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/uxa/intel_present.c b/src/uxa/intel_present.c
index d20043f..d2aa9ee 100644
--- a/src/uxa/intel_present.c
+++ b/src/uxa/intel_present.c
@@ -58,6 +58,8 @@ struct intel_present_vblank_event {
 	uint64_t	event_id;
 };
 
+static Bool intel_present_has_async_flip(ScreenPtr screen);
+
 static uint32_t pipe_select(int pipe)
 {
 	if (pipe > 1)
@@ -266,6 +268,9 @@ intel_present_check_flip(RRCrtcPtr crtc,
 	if (!bo)
 		return FALSE;
 
+	if (!sync_flip && !intel_present_has_async_flip(screen))
+		return FALSE;
+
 	return TRUE;
 }
-- 
1.9.1
[Intel-gfx] [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)
Pageflips for Pixmap presents were not synchronized to vblank on drivers with support for PresentCapabilityAsync, due to some missing init for vblank->sync_flips. The PresentOptionAsync flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking commit 2051514652481a83bd7cf22e57cb0fcd40333f33 which trivially fixes lack of support for protocol option PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
     vblank->notifies = notifies;
     vblank->num_notifies = num_notifies;
 
-    if (!screen_priv->info || !(screen_priv->info->capabilities & PresentCapabilityAsync))
+    if (!(options & PresentOptionAsync))
         vblank->sync_flip = TRUE;
 
     if (!(options & PresentOptionCopy) &&
-- 
1.9.1
Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change
On 23/09/14 15:51, Daniel Vetter wrote:
> On Tue, Sep 23, 2014 at 03:48:25PM +0300, Jani Nikula wrote:
>> On Mon, 15 Sep 2014, Daniel Vetter dan...@ffwll.ch wrote:
>>> On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:
>>>> The current drm-next misses Ville's original Patch 14/19, the one i
>>>> first objected, then objected to my objection. It is needed to avoid
>>>> actual regressions.
>>>>
>>>> Attached a trivially rebased (v2) of Ville's patch to go on top of
>>>> drm-next, also as tgz in case my e-mail client mangles the patch
>>>> again, because it's one of those email hates me weeks.
>>>
>>> Oh dear, I've made a decent mess of all of this really. Picked up to
>>> make sure it doesn't get lost again.
>>
>> After all this nice ping pong our QA has reported a bisected regression
>> on this commit:
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=84161
>
> Looks like a minuscule timing change which resulted in us detecting a
> fifo underrun. Or at least I don't see any other related information that
> would indicate otherwise ...
> -Daniel

There's nothing in that code path which could cause this - except for altered execution timing. I've seen that warning as well on my Intel HD Ironlake Mobile (MBP 2010), but only spuriously when plugging/unplugging an external display into the laptop iirc, so i thought it would be unrelated.

-mario
Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those email hates me weeks. -mario On 08/06/2014 01:49 PM, ville.syrj...@linux.intel.com wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice. This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable-enable transition during the same frame which would all attempt to update the timestamp with the latest estimate. Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG(enabling vblank interrupts on crtc %d, missed %d\n, crtc, diff); + if (diff == 0) + return; + /* Reinitialize corresponding vblank timestamp if high-precision query * available. Skip this step if query unsupported or failed. Will * reinitialize delayed at next vblank interrupt in that case. 
From c0a5228a7fc43d4c3615a471c340b68bcb2caa16 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= ville.syrj...@linux.intel.com Date: Wed, 6 Aug 2014 14:49:57 +0300 Subject: [PATCH] drm: Don't update vblank timestamp when the counter didn't change (v2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice. This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable-enable transition during the same frame which would all attempt to update the timestamp with the latest estimate. Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com v2:Mario: Trivial rebase on top of current drm-next (13-Sep-2014) Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 80ff94a..e73cbda 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -126,6 +126,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG(updating vblank count on crtc %d, missed %d\n, crtc, diff); + if (diff == 0) + return; + /* Reinitialize corresponding vblank timestamp if high-precision query * available. Skip this step if query unsupported or failed. Will * reinitialize delayed at next vblank interrupt in that case. 
-- 
1.9.1

[Attachment: 0001-drm-Don-t-update-vblank-timestamp-when-the-counter-d.patch.tar.gz, application/gzip]
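The effect of the early return on diff == 0 can be shown with a reduced model of the counter update. This is only a sketch: struct vbl_state and vbl_update() are hypothetical and leave out locking and the high-precision timestamp query that the real drm_update_vblank_count() performs.

```c
#include <stdint.h>

/* Hypothetical miniature of the per-crtc vblank state. */
struct vbl_state {
	uint32_t last_hw;  /* hardware counter at the last update */
	uint32_t count;    /* software vblank count */
	int64_t  ts_usec;  /* timestamp stored for `count` */
};

static void vbl_update(struct vbl_state *s, uint32_t cur_hw, int64_t now_usec)
{
	uint32_t diff = cur_hw - s->last_hw; /* unsigned math handles wrap */

	if (diff == 0)
		return; /* same vblank: keep the cached timestamp */

	s->last_hw = cur_hw;
	s->count += diff;
	s->ts_usec = now_usec;
}
```

Calling vbl_update() twice within the same frame, with slightly different now_usec values, leaves ts_usec untouched, so userspace sees identical timestamps for repeated queries of the same vblank count.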
Re: [Intel-gfx] [PULL] topic/vblank-rework
Hmm, not quite an ack from my side for the pull in its current form. I said if the two remaining issues i mentioned are addressed, then i'm happy with it and can have my reviewed/acked-by. Looking at the code they haven't been adressed. However, this is easily fixable on top of the current patches: 1. A vblank_disable_timeout module parameter of zero should always leave vblank irq's enabled and also override the drivers choice, otherwise a user can't override the driver on a broken driver/gpu combo, which is the only use case for having that module parameter. Currenty the disable_immediately flag overrides the users override - Ouch. So in drm_vblank_put(): ... /* Last user schedules interrupt disable */ if (atomic_dec_and_test(vblank-refcount)) { Insert zero - opt-out check if (drm_vblank_offdelay == 0) return; Remaining code continues if (dev-vblank_disable_immediate || drm_vblank_offdelay 0) vblank_disable_fn((unsigned long)vblank); else if (drm_vblank_offdelay 0) mod_timer(vblank-disable_timer, jiffies + ((drm_vblank_offdelay * HZ)/1000)); ... 2. For the drm: Have the vblank counter account for the time ... patch, we must opt-out of that last timestamp/counter update/bump if the driver doesn't support high-precision vblank timestamping, otherwise the vblank count and timestamp will be inconsistent with each other - or outright wrong in case of the timestamp. Rather deliver a slightly outdated, but correct count+timestamp pair to userspace, which is still useable for practical purposes, than a pair that's outright wrong and will definitely confuse clients. A simple fix in static void vblank_disable_and_save() would be to replace the new... if (!vblank-enabled) { ... check by ... if (!vblank-enabled ) { On Wed, Sep 10, 2014 at 2:05 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote: Hi Dave, So here's the final bits of Ville's vblank rework with a bit of cleanup from Mario on top. 
The neat thing this finally allows is to immediately disable the vblank interrupt on the last drm_vblank_put if the hardware has perfectly accurate vblank counter and timestamp readout support. On i915 that required piles of small adjustements from Ville since depending upon the platform and port the vblank happens at different scanout lines. Of course this is fully opt-in and per-device (we need that since gen2 doesn't have a hw vblank counter). Mario reviewed the entire pile too and after some initial hesitation (about drivers without accurate timestampt support) acked it. Cheers, Daniel The following changes since commit 21d70354bba9965a098382fc4d7fb17e138111f3: drm: move drm_stub.c to drm_drv.c (2014-08-06 19:10:44 +1000) are available in the git repository at: git://anongit.freedesktop.org/drm-intel tags/topic/vblank-rework-2014-09-10 for you to fetch changes up to 2368ffb18b1d2b04eb80478d225676caa7a3c4c8: drm: Use vblank_disable_and_save in drm_vblank_cleanup() (2014-09-10 09:41:29 +0200) Mario Kleiner (2): drm: Remove drm_vblank_cleanup from drm_vblank_init error path. 
drm: Use vblank_disable_and_save in drm_vblank_cleanup() Ville Syrjälä (16): drm: Always reject drm_vblank_get() after drm_vblank_off() drm/i915: Warn if drm_vblank_get() still works after drm_vblank_off() drm: Don't clear vblank timestamps when vblank interrupt is disabled drm: Move drm_update_vblank_count() drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off() drm: Avoid random vblank counter jumps if the hardware counter has been reset drm: Reduce the amount of dev-vblank[crtc] in the code drm: Fix deadlock between event_lock and vbl_lock/vblank_time_lock drm: Fix race between drm_vblank_off() and drm_queue_vblank_event() drm: Disable vblank interrupt immediately when drm_vblank_offdelay0 drm: Add dev-vblank_disable_immediate flag drm/i915: Opt out of vblank disable timer on gen2 drm: Kick start vblank interrupts at drm_vblank_on() drm/i915: Update scanline_offset only for active crtcs drm: Fix confusing debug message in drm_update_vblank_count() drm: Store the vblank timestamp when adjusting the counter during disable Documentation/DocBook/drm.tmpl | 7 + drivers/gpu/drm/drm_drv.c| 4 +- drivers/gpu/drm/drm_irq.c| 345 ++- drivers/gpu/drm/i915/i915_irq.c | 8 + drivers/gpu/drm/i915/intel_display.c | 17 +- include/drm/drmP.h | 12 +- 6 files changed, 256 insertions(+), 137 deletions(-) -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org
Re: [Intel-gfx] [PULL] topic/vblank-rework
e-mail snafu, sent it too early by accident, and from a gmail web interface which i'm apparently incapable of using properly...

The second fix should look like this:

A simple fix in static void vblank_disable_and_save() would be to replace the new...

    if (!vblank->enabled) {

... check by ...

    if (!vblank->enabled && drm_get_last_vbltimestamp(dev, crtc, &tvblank, 0)) {

... We need to make sure timestamp queries work and are actually locked to the vblank, otherwise we can't do that last update there in vblank_disable_and_save().

With these two fixes or similar applied i'd be happy, otherwise it will inflict pain and real bugs on real users.

thanks,
-mario

On Wed, Sep 10, 2014 at 4:19 PM, Mario Kleiner mario.kleiner...@gmail.com wrote:
> Hmm, not quite an ack from my side for the pull in its current form. I
> said if the two remaining issues i mentioned are addressed, then i'm happy
> with it and can have my reviewed/acked-by. Looking at the code they
> haven't been addressed.
>
> However, this is easily fixable on top of the current patches:
>
> 1. A vblank_disable_timeout module parameter of zero should always leave
> vblank irq's enabled and also override the driver's choice, otherwise a
> user can't override the driver on a broken driver/gpu combo, which is the
> only use case for having that module parameter. Currently the
> disable_immediately flag overrides the user's override - Ouch.
>
> So in drm_vblank_put():
>
>     ...
>     /* Last user schedules interrupt disable */
>     if (atomic_dec_and_test(&vblank->refcount)) {
>         /* Insert zero -> opt-out check */
>         if (drm_vblank_offdelay == 0)
>             return;
>
>         /* Remaining code continues */
>         if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
>             vblank_disable_fn((unsigned long)vblank);
>         else if (drm_vblank_offdelay > 0)
>             mod_timer(&vblank->disable_timer,
>                       jiffies + ((drm_vblank_offdelay * HZ)/1000));
>     ...
>
> 2. For the "drm: Have the vblank counter account for the time ..." patch,
> we must opt-out of that last timestamp/counter update/bump if the driver
> doesn't support high-precision vblank timestamping, otherwise the vblank
> count and timestamp will be inconsistent with each other - or outright
> wrong in case of the timestamp. Rather deliver a slightly outdated, but
> correct count+timestamp pair to userspace, which is still usable for
> practical purposes, than a pair that's outright wrong and will definitely
> confuse clients.
>
> A simple fix in static void vblank_disable_and_save() would be to replace
> the new...
>
>     if (!vblank->enabled) {
>
> ... check by ...
>
>     if (!vblank->enabled ) {
>
> On Wed, Sep 10, 2014 at 2:05 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
>> Hi Dave,
>>
>> So here's the final bits of Ville's vblank rework with a bit of cleanup
>> from Mario on top. The neat thing this finally allows is to immediately
>> disable the vblank interrupt on the last drm_vblank_put if the hardware
>> has perfectly accurate vblank counter and timestamp readout support. On
>> i915 that required piles of small adjustments from Ville since depending
>> upon the platform and port the vblank happens at different scanout lines.
>> Of course this is fully opt-in and per-device (we need that since gen2
>> doesn't have a hw vblank counter).
>>
>> Mario reviewed the entire pile too and after some initial hesitation
>> (about drivers without accurate timestamp support) acked it.
>>
>> Cheers, Daniel
>>
>> The following changes since commit 21d70354bba9965a098382fc4d7fb17e138111f3:
>>
>>   drm: move drm_stub.c to drm_drv.c (2014-08-06 19:10:44 +1000)
>>
>> are available in the git repository at:
>>
>>   git://anongit.freedesktop.org/drm-intel tags/topic/vblank-rework-2014-09-10
>>
>> for you to fetch changes up to 2368ffb18b1d2b04eb80478d225676caa7a3c4c8:
>>
>>   drm: Use vblank_disable_and_save in drm_vblank_cleanup() (2014-09-10 09:41:29 +0200)
>>
>> Mario Kleiner (2):
>>   drm: Remove drm_vblank_cleanup from drm_vblank_init error path.
>>   drm: Use vblank_disable_and_save in drm_vblank_cleanup()
>>
>> Ville Syrjälä (16):
>>   drm: Always reject drm_vblank_get() after drm_vblank_off()
>>   drm/i915: Warn if drm_vblank_get() still works after drm_vblank_off()
>>   drm: Don't clear vblank timestamps when vblank interrupt is disabled
>>   drm: Move drm_update_vblank_count()
>>   drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()
>>   drm: Avoid random vblank counter jumps if the hardware counter has been reset
>>   drm: Reduce the amount of dev->vblank[crtc] in the code
>>   drm: Fix deadlock between event_lock and vbl_lock/vblank_time_lock
>>   drm: Fix race between drm_vblank_off() and drm_queue_vblank_event()
>>   drm: Disable vblank interrupt immediately when drm_vblank_offdelay<0
>>   drm: Add dev->vblank_disable_immediate flag
>>   drm/i915: Opt out of vblank disable timer on gen2
>>   drm: Kick start vblank interrupts at drm_vblank_on()
>>   drm/i915: Update scanline_offset
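The ordering of checks proposed for drm_vblank_put() in the message above can be written out as a small standalone decision function. This is only a user-space sketch of that ordering: the names `decide_vblank_put`, `enum vblank_put_action` and `vblank_disable_delay_jiffies` are made up for illustration and are not kernel API, and the real function acts on the result (calling vblank_disable_fn() or mod_timer()) instead of returning it.

```c
#include <stdbool.h>

/* Hypothetical result type, for illustration only; not kernel API. */
enum vblank_put_action {
	VBLANK_KEEP_ENABLED,    /* offdelay == 0: user opted out of disabling */
	VBLANK_DISABLE_NOW,     /* driver flag set, or offdelay < 0 */
	VBLANK_DISABLE_DELAYED  /* offdelay > 0: arm the disable timer */
};

/*
 * Decide what the last drm_vblank_put() should do.
 * The zero check comes first so that a module parameter of 0 overrides
 * the driver's vblank_disable_immediate choice, as requested in the mail.
 */
enum vblank_put_action decide_vblank_put(int offdelay_ms, bool disable_immediate)
{
	if (offdelay_ms == 0)
		return VBLANK_KEEP_ENABLED;

	if (disable_immediate || offdelay_ms < 0)
		return VBLANK_DISABLE_NOW;

	return VBLANK_DISABLE_DELAYED;
}

/* Timer delay in jiffies, mirroring (drm_vblank_offdelay * HZ) / 1000. */
long vblank_disable_delay_jiffies(int offdelay_ms, long hz)
{
	return ((long)offdelay_ms * hz) / 1000;
}
```

With the zero check placed after the driver flag instead, a driver that sets the immediate-disable flag would ignore the user's opt-out, which is the behaviour the mail objects to.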
Re: [Intel-gfx] [PULL] topic/vblank-rework
On Wed, Sep 10, 2014 at 5:29 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
> On Wed, Sep 10, 2014 at 4:19 PM, Mario Kleiner mario.kleiner...@gmail.com wrote:
>> Hmm, not quite an ack from my side for the pull in its current form. I
>> said if the two remaining issues i mentioned are addressed, then i'm happy
>> with it and can have my reviewed/acked-by. Looking at the code they
>> haven't been addressed.
>
> Sorry about the confusion, I've somehow thought that you've retracted
> those comments in
>
> Message-ID: caesyxygk4foqhky1wcerak_hybex2ogpftjyhu_zfhlbx46...@mail.gmail.com
>
> But I've missed that that was about just one of the issues.

Thought so. That one patch turns out to be crucial. My own software immediately complained loudly about broken vblank irqs and switched to lower performance fallbacks when that patch was missing. I'll test the patches on a few more cards in the next days - but so far things look good at least as far as my special test cases go.

>> However, this is easily fixable on top of the current patches:
>>
>> 1. A vblank_disable_timeout module parameter of zero should always leave
>> vblank irq's enabled and also override the driver's choice, otherwise a
>> user can't override the driver on a broken driver/gpu combo, which is the
>> only use case for having that module parameter. Currently the
>> disable_immediately flag overrides the user's override - Ouch.
>>
>> So in drm_vblank_put():
>>
>>     ...
>>     /* Last user schedules interrupt disable */
>>     if (atomic_dec_and_test(&vblank->refcount)) {
>>         /* Insert zero -> opt-out check */
>>         if (drm_vblank_offdelay == 0)
>>             return;
>>
>>         /* Remaining code continues */
>>         if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0)
>>             vblank_disable_fn((unsigned long)vblank);
>>         else if (drm_vblank_offdelay > 0)
>>             mod_timer(&vblank->disable_timer,
>>                       jiffies + ((drm_vblank_offdelay * HZ)/1000));
>
> Yeah, I guess that makes sense. I'm not really a fan of giving users too
> powerful module options to hack around driver bugs since often that means
> they'll never report the bug :( But we have the support now to mark
> certain module options as debug-only and they'll taint the kernel if set,
> so this is fixable.
>
> I'll follow up with the patch you've suggested.

Thanks. I think the module parameters i usually care about will get proper testing and reporting, because while my software and users are good at detecting such problems, they wouldn't know how to fix them themselves, and at the same time they crucially depend on this stuff working, so this gets reported to me quickly and i can give them the module param workaround in private e-mail and take it from there with proper bug reports or patches.

>>     ...
>>
>> 2. For the "drm: Have the vblank counter account for the time ..." patch,
>> we must opt-out of that last timestamp/counter update/bump if the driver
>> doesn't support high-precision vblank timestamping, otherwise the vblank
>> count and timestamp will be inconsistent with each other - or outright
>> wrong in case of the timestamp. Rather deliver a slightly outdated, but
>> correct count+timestamp pair to userspace, which is still usable for
>> practical purposes, than a pair that's outright wrong and will definitely
>> confuse clients.
>>
>> A simple fix in static void vblank_disable_and_save() would be to replace
>> the new...
>>
>>     if (!vblank->enabled) {
>>
>> ... check by ...
>>
>>     if (!vblank->enabled ) {
>
> Yeah, makes sense (well the follow-up one ofc). I'll do a patch which adds
> this and adds a comment. Aside I think it would be useful to add a #define
> for the 0 return value, since the magic checks all over are imo fairly
> hard to understand. I'll also float a patch for rfc about that.

Good!

thanks,
-mario

> Thanks for your comments and again my apologies for missing that there's
> still outstanding work left to do on this.
>
> Cheers, Daniel
>
>> On Wed, Sep 10, 2014 at 2:05 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
>>> Hi Dave,
>>>
>>> So here's the final bits of Ville's vblank rework with a bit of cleanup
>>> from Mario on top. The neat thing this finally allows is to immediately
>>> disable the vblank interrupt on the last drm_vblank_put if the hardware
>>> has perfectly accurate vblank counter and timestamp readout support. On
>>> i915 that required piles of small adjustments from Ville since depending
>>> upon the platform and port the vblank happens at different scanout lines.
>>> Of course this is fully opt-in and per-device (we need that since gen2
>>> doesn't have a hw vblank counter).
>>>
>>> Mario reviewed the entire pile too and after some initial hesitation
>>> (about drivers without accurate timestamp support) acked it.
>>>
>>> Cheers, Daniel
>>>
>>> The following changes since commit 21d70354bba9965a098382fc4d7fb17e138111f3:
>>>
>>>   drm: move drm_stub.c to drm_drv.c (2014-08-06 19:10:44 +1000)
>>>
>>> are available in the git repository at:
>>>
>>>   git://anongit.freedesktop.org/drm-intel tags/topic/vblank-rework-2014-09-10
>>>
>>> for you to fetch changes up to 2368ffb18b1d2b04eb80478d225676caa7a3c4c8:
>>>
>>>   drm: Use vblank_disable_and_save
Re: [Intel-gfx] [PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change
I thought about this one again and, contrary to my previous comment, now think it's fine, also for drivers without hw vblank counter queries.

-mario

On Wed, Aug 6, 2014 at 1:49 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> If we already have a timestamp for the current vblank counter, don't
> update it with a new timestamp. Small errors can creep in between two
> timestamp queries for the same vblank count, which could be confusing to
> userspace when it queries the timestamp for the same vblank sequence
> number twice.
>
> This problem gets exposed when the vblank disable timer is not used (or
> is set to expire quickly) and thus we can get multiple vblank
> disable->enable transitions during the same frame which would all attempt
> to update the timestamp with the latest estimate.
>
> Testcase: igt/kms_flip/flip-vs-expired-vblank
> Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
> ---
>  drivers/gpu/drm/drm_irq.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af33df1..0523f5b 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc)
>  	DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n",
>  		  crtc, diff);
>
> +	if (diff == 0)
> +		return;
> +
>  	/* Reinitialize corresponding vblank timestamp if high-precision query
>  	 * available. Skip this step if query unsupported or failed. Will
>  	 * reinitialize delayed at next vblank interrupt in that case.
> --
> 1.8.5.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
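The effect of the `diff == 0` early return in the patch above can be shown with a small user-space model. This is a sketch under simplifying assumptions: `struct vblank_state` and `update_vblank_count` are hypothetical names, the hardware counter is treated as a full 32-bit counter (the kernel code handles a driver-specific maximum), and the timestamp is passed in rather than queried from hardware.

```c
#include <stdint.h>

/* Hypothetical model of per-crtc vblank bookkeeping; not the kernel struct. */
struct vblank_state {
	uint32_t count;        /* software vblank counter seen by userspace */
	uint32_t last_hw;      /* hardware counter value at the last update */
	int64_t  timestamp_ns; /* timestamp stored for 'count' */
};

/*
 * Fold the hardware counter into the software counter.
 * Returns 1 if count and timestamp were updated, 0 if nothing changed.
 * When the counter did not advance, the stored timestamp is kept, so two
 * queries for the same vblank sequence number return the same timestamp.
 */
int update_vblank_count(struct vblank_state *s, uint32_t cur_hw, int64_t now_ns)
{
	/* Unsigned subtraction stays correct across a 32-bit wraparound. */
	uint32_t diff = cur_hw - s->last_hw;

	if (diff == 0)
		return 0;

	s->last_hw = cur_hw;
	s->count += diff;
	s->timestamp_ns = now_ns;
	return 1;
}
```

Without the early return, a second disable->enable transition within the same frame would overwrite the timestamp with a slightly different estimate for an unchanged count.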
Re: [Intel-gfx] [PATCH 05/19] drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()
-by: Mario Kleiner mario.kleiner...@gmail.com

for the whole series, if you want.

thanks,
-mario

On 08/06/2014 01:49 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> If the vblank irq has already been disabled (via the disable timer) when
> we call drm_vblank_off(), sample the counter and timestamp one last time.
> This will make sure that the user space visible counter will account for
> time between vblank irq disable and drm_vblank_off().
>
> Reviewed-by: Matt Roper matthew.d.ro...@intel.com
> Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch
> Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
> ---
>  drivers/gpu/drm/drm_irq.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af96517..1f86f6c 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc)
>  	 */
>  	spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
>
> +	/*
> +	 * If the vblank interrupt was already disabled update the count
> +	 * and timestamp to maintain the appearance that the counter
> +	 * has been ticking all along until this time. This makes the
> +	 * count account for the entire time between drm_vblank_on() and
> +	 * drm_vblank_off().
> +	 */
> +	if (!dev->vblank[crtc].enabled) {
> +		drm_update_vblank_count(dev, crtc);
> +		spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
> +		return;
> +	}
> +
>  	dev->driver->disable_vblank(dev, crtc);
>  	dev->vblank[crtc].enabled = false;
Re: [Intel-gfx] [PATCH 00/14] drm: Some more vblank timestampi changes
On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:
> So I took another look at the vblank timestamping code, and got a bit
> excited. The result is this patchset.
>
> Summary of changes:
> - kill crtc->hwmode dependency
> - eliminate a bunch of 64bit math
> - fix timestamps for stereo and interlaced modes (on i915 at least)
> - move the early vbl irq hack into radeon code
> - add a similar hack to i915, but make it as finely targeted as possible
>   to minimize the chance of accidentally applying it in the wrong place
>
> The s/clock/crtc_clock change could use some radeon people to verify
> whether changing radeon_atom_get_tv_timings() is enough to make crtc_clock
> always populated.
>
> This series applies on top of Mario's "Vblank timestamping
> improvements/fixes for Linux drm." series.
>
> Ville Syrjälä (14):
>   drm: Pass the display mode to drm_calc_timestamping_constants()
>   drm: Pass the display mode to drm_calc_vbltimestamp_from_scanoutpos()
>   drm/i915: Kill hwmode save/restore
>   drm/i915: Call drm_calc_timestamping_constants() earlier
>   drm: Improve drm_calc_timestamping_constants() documentation
>   drm: Simplify the math in drm_calc_timestamping_constants()
>   drm/radeon: Populate crtc_clock in radeon_atom_get_tv_timings()
>   drm: Use crtc_clock in drm_calc_timestamping_constants()
>   drm: Change {pixel,line,frame}dur_ns from s64 to int
>   drm/i915: Fix scanoutpos calculations for interlaced modes
>   drm: Fix vblank timestamping constants for interlaced modes
>   drm: Pass 'flags' from the caller to .get_scanout_position()
>   drm/radeon: Move the early vblank IRQ fixup to radeon_get_crtc_scanoutpos()
>   drm/i915: Add a kludge for DSL incrementing too late and ISR not working

Hi Ville,

sorry this took way longer than expected. I've reviewed all of your patches. Nice cleanups, nice improvements! You can add a ...

Reviewed-by: mario.kleiner...@gmail.com

... to all of them.

Patches 0 - 11 and 14 are fine as they are. Only tiny formatting/comment fixes needed so they apply cleanly against the current drm-next.

Patch 12 and 13 need some small fixes, after applying those i'm fine with them. I'll send separate e-mails for those.

As far as testing goes, i had more encounters with Murphy's law in the last weeks than ever before, hence the long delay. You can add

Tested-by: mario.kleiner...@gmail.com

to the drm core and intel patches with the following restrictions:

I was able to sort of test the patchset on Intel GMA-950 (Gen-3 hw).

- I didn't test if your interlaced scanout patches 10 and 11 work as expected, because i was testing the patches first, then reviewing them, so i didn't realize at that point testing interlaced mode would be necessary. The patches look correct to me though. I no longer have easy access to that machine.

- My photodiode test equipment, which i need for Intel testing, malfunctioned. Not sure if my testing hardware is dying, or if it is a bug in the kernel's usb or serial/tty stack, or some kernel misconfiguration wrt. low-latency, but there was so much timing noise in my equipment that i couldn't test with it.

- As a workaround I ran the kms-timestamping for regular non-interlaced mode against the original userspace implementation of the same code in my own toolkit Psychtoolbox, which itself was verified with testing equipment to do the right thing on that GMA-950 netbook earlier this year. Difference was less than 40 microseconds and more likely caused by userspace noisiness and off-by-one errors in Psychtoolbox than by your code, so i assume that your code is essentially correct at least for non-interlaced scanout, and that the DRM core changes are therefore also correct. If you or somebody would want to try this test yourself i can guide you through the steps. Psychtoolbox is easily apt-get'able for Debian and at least Ubuntu.

- The next limitation of my testing is wrt. your early vbl irq handling improvements (patch 14). I currently only have Gen3 hardware which doesn't exercise those code paths at all, so while the patch looks correct, it's not really tested by me.

As far as Radeon testing goes, i can't test it at all atm. After already not working very stable at all for the last half year, my last machine with an AMD card died during bootup for this test, but not without trying to corrupt the filesystem on my development drive as a little post-christmas gift to me. If somebody has an AMD card and wants to test this, it could be tested against the Psychtoolbox userspace reference implementation, which was verified with very precise external hardware last time a couple of months ago. However, patch 13 needs some fixes or it would crash.

The now dead PC wasn't mine, but i still have the AMD card. I will try to hunt for a new PC soon, and hopefully will get your patches better tested during the -rc phase if they get merged into 3.14.

Apart from a
Re: [Intel-gfx] [PATCH 12/14] drm: Pass 'flags' from the caller to .get_scanout_position()
On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> Preparation for moving the early vblank IRQ logic into
> radeon_get_crtc_scanoutpos().
>
> Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com

Tiny compile fix needed for this one. The function prototype for radeon_get_crtc_scanoutpos() is also defined in radeon_drv.c, so it needs the same update as the one in radeon_mode.h.

Other than that

Reviewed-by: mario.kleiner...@gmail.com

-mario

> ---
>  drivers/gpu/drm/drm_irq.c               | 2 +-
>  drivers/gpu/drm/i915/i915_irq.c         | 3 ++-
>  drivers/gpu/drm/radeon/radeon_display.c | 7 ---
>  drivers/gpu/drm/radeon/radeon_mode.h    | 1 +
>  drivers/gpu/drm/radeon/radeon_pm.c      | 2 +-
>  include/drm/drmP.h                      | 2 ++
>  6 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index b5c4d42..b39255f 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -585,7 +585,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  		/* Get vertical and horizontal scanout position vpos, hpos,
>  		 * and bounding timestamps stime, etime, pre/post query.
>  		 */
> -		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
> +		vbl_status = dev->driver->get_scanout_position(dev, crtc, flags, &vpos,
>  							       &hpos, &stime, &etime);
>
>  		/* Get correction for CLOCK_MONOTONIC -> CLOCK_REALTIME if
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index f6b3206..70daf3c 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -657,7 +657,8 @@ static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
>  }
>
>  static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
> -			     int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
> +				    unsigned int flags, int *vpos, int *hpos,
> +				    ktime_t *stime, ktime_t *etime)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
> index ccd8751..3581570 100644
> --- a/drivers/gpu/drm/radeon/radeon_display.c
> +++ b/drivers/gpu/drm/radeon/radeon_display.c
> @@ -305,7 +305,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
>  	 * to complete in this vblank?
>  	 */
>  	if (update_pending &&
> -	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
> +	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id, 0,
>  							       &vpos, &hpos, NULL, NULL)) &&
>  	    ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
>  	     (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
> @@ -1544,6 +1544,7 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
>   *
>   * \param dev Device to query.
>   * \param crtc Crtc to query.
> + * \param flags Flags from caller (DRM_CALLED_FROM_VBLIRQ or 0).
>   * \param *vpos Location where vertical scanout position should be stored.
>   * \param *hpos Location where horizontal scanout position should go.
>   * \param *stime Target location for timestamp taken immediately before
> @@ -1565,8 +1566,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
>   *               unknown small number of scanlines wrt. real scanout position.
>   *
>   */
> -int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
> -			       ktime_t *stime, ktime_t *etime)
> +int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, unsigned int flags,
> +			       int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
>  {
>  	u32 stat_crtc = 0, vbl = 0, position = 0;
>  	int vbl_start, vbl_end, vtotal, ret = 0;
> diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
> index 3bfa910..c4016dc 100644
> --- a/drivers/gpu/drm/radeon/radeon_mode.h
> +++ b/drivers/gpu/drm/radeon/radeon_mode.h
> @@ -758,6 +758,7 @@ extern int radeon_crtc_cursor_move(struct drm_crtc *crtc,
>  				   int x, int y);
>
>  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
> +				      unsigned int flags,
>  				      int *vpos, int *hpos, ktime_t *stime,
>  				      ktime_t *etime);
>
> diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c
> index 98bf63b..a394049 100644
> --- a/drivers/gpu/drm/radeon/radeon_pm.c
> +++ b/drivers/gpu/drm/radeon/radeon_pm.c
> @@ -1468,7 +1468,7 @@ static bool
Re: [Intel-gfx] [PATCH 13/14] drm/radeon: Move the early vblank IRQ fixup to radeon_get_crtc_scanoutpos()
On 29/10/13 19:06, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> i915 doesn't need this kludge for most platforms. Although we do appear
> to need something similar on certain platforms, we can be more accurate
> when we apply the adjustment since we know exactly why the scanline
> counter doesn't always quite match the vblank status.
>
> Also the current code doesn't handle interlaced modes correctly, and we
> already deal with interlaced modes in i915 code.
>
> So let's just move the current code to radeon_get_crtc_scanoutpos() since
> that's why it was added. For i915 we'll add a more finely targeted
> variant.

The logic itself looks correct and should work, although i couldn't test it because of the dying PC. But see below for some bugfix and some little nit-pick. Other than that

Reviewed-by: mario.kleiner...@gmail.com

> Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
> ---
>  drivers/gpu/drm/drm_irq.c               | 25 ++---
>  drivers/gpu/drm/radeon/radeon_display.c | 22 ++
>  2 files changed, 24 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index b39255f..a1cc1a3 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -542,7 +542,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  {
>  	ktime_t stime, etime, mono_time_offset;
>  	struct timeval tv_etime;
> -	int vbl_status, vtotal, vdisplay;
> +	int vbl_status;
>  	int vpos, hpos, i;
>  	int framedur_ns, linedur_ns, pixeldur_ns, delta_ns, duration_ns;
>  	bool invbl;
> @@ -558,9 +558,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  		return -EIO;
>  	}
>
> -	vtotal = mode->crtc_vtotal;
> -	vdisplay = mode->crtc_vdisplay;
> -
>  	/* Durations of frames, lines, pixels in nanoseconds. */
>  	framedur_ns = refcrtc->framedur_ns;
>  	linedur_ns = refcrtc->linedur_ns;
> @@ -569,7 +566,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  	/* If mode timing undefined, just return as no-op:
>  	 * Happens during initial modesetting of a crtc.
>  	 */
> -	if (vtotal <= 0 || vdisplay <= 0 || framedur_ns == 0) {
> +	if (framedur_ns == 0) {
>  		DRM_DEBUG("crtc %d: Noop due to uninitialized mode.\n", crtc);
>  		return -EAGAIN;
>  	}
> @@ -631,24 +628,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
>  	 */
>  	delta_ns = vpos * linedur_ns + hpos * pixeldur_ns;
>
> -	/* Is vpos outside nominal vblank area, but less than
> -	 * 1/100 of a frame height away from start of vblank?
> -	 * If so, assume this isn't a massively delayed vblank
> -	 * interrupt, but a vblank interrupt that fired a few
> -	 * microseconds before true start of vblank. Compensate
> -	 * by adding a full frame duration to the final timestamp.
> -	 * Happens, e.g., on ATI R500, R600.
> -	 *
> -	 * We only do this if DRM_CALLED_FROM_VBLIRQ.
> -	 */
> -	if ((flags & DRM_CALLED_FROM_VBLIRQ) && !invbl &&
> -	    ((vdisplay - vpos) < vtotal / 100)) {
> -		delta_ns = delta_ns - framedur_ns;
> -
> -		/* Signal this correction as applied. */
> -		vbl_status |= 0x8;
> -	}
> -
>  	if (!drm_timestamp_monotonic)
>  		etime = ktime_sub(etime, mono_time_offset);
>
> diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
> index 3581570..9d02fa7 100644
> --- a/drivers/gpu/drm/radeon/radeon_display.c
> +++ b/drivers/gpu/drm/radeon/radeon_display.c
> @@ -1709,5 +1709,27 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, unsigned int fl
>  	if (in_vbl)
>  		ret |= DRM_SCANOUTPOS_INVBL;
>
> +	/* Is vpos outside nominal vblank area, but less than
> +	 * 1/100 of a frame height away from start of vblank?
> +	 * If so, assume this isn't a massively delayed vblank
> +	 * interrupt, but a vblank interrupt that fired a few
> +	 * microseconds before true start of vblank. Compensate
> +	 * by adding a full frame duration to the final timestamp.
> +	 * Happens, e.g., on ATI R500, R600.
> +	 *
> +	 * We only do this if DRM_CALLED_FROM_VBLIRQ.
> +	 */
> +	if ((flags & DRM_CALLED_FROM_VBLIRQ) && !in_vbl) {
> +		vbl_start = rdev->mode_info.crtcs[crtc]->base.hwmode.crtc_vdisplay;

vbl_start gets already initialized by the code above, so the vbl_start assignment here shouldn't be necessary. Only the vtotal assignment below is really needed.

> +		vtotal = rdev->mode_info.crtcs[crtc]->base.hwmode.crtc_vtotal;
> +
> +		if (vbl_start - *vpos < vtotal / 100) {
> +			vpos -= vtotal;

Here vpos is an int*, so the following line will corrupt kernel memory and die. Obviously then this +
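The pointer problem raised in the review comment above, and the 1/100-of-a-frame heuristic it sits in, can be sketched in standalone C. This is an illustration only: `fixup_early_vblank_irq` is a hypothetical helper, the two boolean parameters stand in for the `DRM_CALLED_FROM_VBLIRQ` flag test and the `in_vbl` state, and the sketch assumes the intended fix is to adjust the value through the pointer.

```c
#include <stdbool.h>

/*
 * Hypothetical helper modelling the early-vblank-irq fixup.
 *
 * If the query comes from the vblank irq handler, scanout is not yet
 * inside vblank, and the position is less than 1/100 of a frame height
 * before the start of vblank, treat the irq as having fired slightly
 * early and shift the reported position back by one full frame.
 * Returns true if the correction was applied.
 */
bool fixup_early_vblank_irq(int *vpos, int vbl_start, int vtotal,
			    bool in_vbl, bool from_vblirq)
{
	if (!from_vblirq || in_vbl)
		return false;

	if (vbl_start - *vpos < vtotal / 100) {
		/*
		 * Dereference: change the stored scanline value.
		 * Writing "vpos -= vtotal" would move the pointer itself,
		 * and any later access through it would hit unrelated memory.
		 */
		*vpos -= vtotal;
		return true;
	}

	return false;
}
```

For a 1125-line frame with vblank starting at line 1080, a position of 1075 falls within the 11-line window and becomes -50, while a position of 500 is left unchanged.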
Re: [Intel-gfx] [PATCH 00/14] drm: Some more vblank timestampi changes
On 29/11/13 14:36, Ville Syrjälä wrote:
> On Wed, Nov 06, 2013 at 01:46:41PM +1000, Dave Airlie wrote:
>> On Wed, Oct 30, 2013 at 4:06 AM, ville.syrj...@linux.intel.com wrote:
>>> So I took another look at the vblank timestamping code, and got a bit
>>> excited. The result is this patchset.
>>
>> I'd like to merge this, I was hoping Mario could ack it at least as it
>> seems mostly sane to my eyes.
>
> So we missed that boat, but maybe we'll get the next one...
>
> Pinging Mario. Any chance you can take a look at this stuff at some point?

I will, including testing. Hopefully within the coming week, but definitely safely before christmas.

> Hmm. Do I have the wrong email address for Mario? Adding the other one too
> just to make sure...

Both work, but the tuebingen.mpg.de one will probably soon turn into a pure forward to the gmail one.

-mario
[Intel-gfx] [PATCH 2/4] drm: Push latency sensitive bits of vblank scanoutpos timestamping into kms drivers.
A change in locking of some kms drivers (currently intel-kms) makes the old approach too inaccurate and also incompatible with the PREEMPT_RT realtime kernel patchset.

The driver->get_scanout_position() method of intel-kms now needs to acquire a spinlock, which clashes badly with the former preempt_disable() calls in the drm, and it also introduces larger delays and timing uncertainty on a contended lock than acceptable.

This patch changes the prototype of driver->get_scanout_position() to require/allow kms drivers to perform the ktime_get() system time queries which go along with actual scanout position readout in a way that provides maximum precision and to return those timestamps to the drm. kms driver implementations of get_scanout_position() are asked to implement timestamping and scanoutpos readout in a way that is as precise as possible and compatible with preempt_disable() on a PREEMPT_RT kernel. A driver should follow this pattern in get_scanout_position() for precision and compatibility:

    spin_lock...(...);
    preempt_disable_rt(); // On a PREEMPT_RT kernel, otherwise omit.
    if (stime) *stime = ktime_get();
    ... Minimum amount of MMIO register reads to get scanout position ...
    ... no taking of locks allowed here! ...
    if (etime) *etime = ktime_get();
    preempt_enable_rt(); // On PREEMPT_RT kernel, otherwise omit.
    spin_unlock...(...);

v2: Fix formatting of new multi-line code comments.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/drm_irq.c | 20
 include/drm/drmP.h        | 10 --
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 33ee515..d80d952 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -219,7 +219,7 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
 	for (i = 0; i < num_crtcs; i++)
 		init_waitqueue_head(&dev->vblank[i].queue);
 
-	DRM_INFO("Supports vblank timestamp caching Rev 1 (10.10.2010).\n");
+	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
 
 	/* Driver specific high-precision vblank timestamping supported? */
 	if (dev->driver->get_vblank_timestamp)
@@ -586,14 +586,17 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 	 * code gets preempted or delayed for some reason.
 	 */
 	for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-		/* Get system timestamp before query. */
-		stime = ktime_get();
-
-		/* Get vertical and horizontal scanout pos. vpos, hpos. */
-		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos, &hpos);
+		/*
+		 * Get vertical and horizontal scanout position vpos, hpos,
+		 * and bounding timestamps stime, etime, pre/post query.
+		 */
+		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
+							       &hpos, &stime, &etime);
 
-		/* Get system timestamp after query. */
-		etime = ktime_get();
+		/*
+		 * Get correction for CLOCK_MONOTONIC -> CLOCK_REALTIME if
+		 * CLOCK_REALTIME is requested.
+		 */
 		if (!drm_timestamp_monotonic)
 			mono_time_offset = ktime_get_monotonic_offset();
 
@@ -604,6 +607,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 			return -EIO;
 		}
 
+		/* Compute uncertainty in timestamp of scanout position query. */
 		duration_ns = ktime_to_ns(etime) - ktime_to_ns(stime);
 
 		/* Accept result with max_error nsecs timing uncertainty. */
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2b954ad..48d15f0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -835,12 +835,17 @@ struct drm_driver {
 	/**
 	 * Called by vblank timestamping code.
 	 *
-	 * Return the current display scanout position from a crtc.
+	 * Return the current display scanout position from a crtc, and an
+	 * optional accurate ktime_get timestamp of when position was measured.
 	 *
 	 * \param dev  DRM device.
 	 * \param crtc Id of the crtc to query.
 	 * \param *vpos Target location for current vertical scanout position.
 	 * \param *hpos Target location for current horizontal scanout position.
+	 * \param *stime Target location for timestamp taken immediately before
+	 *               scanout position query. Can be NULL to skip timestamp.
+	 * \param *etime Target location for timestamp taken immediately after
+	 *               scanout position query. Can be NULL to skip timestamp
[Intel-gfx] Vblank timestamping improvements/fixes for Linux drm. [v2]
Hi Dave,

this is v2 of the patch set for improving/restoring accuracy and robustness of vblank timestamping and for fixing incompatibilities with the PREEMPT_RT patches. Could you please merge this for the next kernel? Would be good to have the old accuracy restored as soon as possible. Thanks.

v2: Added the reviewed-by's of Ville and Alex, thanks for the review! Fixed multi-line code formatting as suggested by Ville. Successfully tested on Intel and AMD Radeon hardware.

thanks,
-mario
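The bracketing scheme described in patch 2/4 of this series (one timestamp immediately before the position readout, one immediately after, and a retry if the gap is too large) can be modelled in user space. The sketch below is not driver code: `sample_position`, `struct pos_sample` and the `fake_hw` test double are invented for illustration, the clock and the readout are passed in as callbacks so the behaviour is deterministic, and locking and preemption control are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

typedef int64_t (*clock_fn)(void *ctx);  /* stands in for ktime_get() */
typedef int (*read_pos_fn)(void *ctx);   /* stands in for the MMIO readout */

struct pos_sample {
	int     pos;       /* position returned by the readout */
	int64_t stime_ns;  /* timestamp taken just before the readout */
	int64_t etime_ns;  /* timestamp taken just after the readout */
};

/*
 * Read the position bracketed by two timestamps. Accept the sample once
 * etime - stime is within max_error_ns, otherwise retry.
 * Returns the number of attempts used, or 0 if every attempt was too slow
 * (in which case *out still holds the last, noisy sample).
 */
int sample_position(clock_fn now, read_pos_fn read_pos, void *ctx,
		    int64_t max_error_ns, int max_retries,
		    struct pos_sample *out)
{
	for (int i = 0; i < max_retries; i++) {
		out->stime_ns = now(ctx);
		out->pos = read_pos(ctx);
		out->etime_ns = now(ctx);

		if (out->etime_ns - out->stime_ns <= max_error_ns)
			return i + 1;
	}
	return 0;
}

/* Deterministic test double: each readout costs a scripted amount of time. */
struct fake_hw {
	int64_t now_ns;            /* current fake time */
	const int64_t *read_cost;  /* cost in ns of each successive readout */
	size_t n_reads;            /* readouts performed so far */
	int scanline;              /* value every readout returns */
};

int64_t fake_now(void *ctx)
{
	return ((struct fake_hw *)ctx)->now_ns;
}

int fake_read_pos(void *ctx)
{
	struct fake_hw *hw = (struct fake_hw *)ctx;

	hw->now_ns += hw->read_cost[hw->n_reads++];
	return hw->scanline;
}
```

Keeping the two clock reads as close as possible to the readout is what bounds the uncertainty; anything that can block between them, such as taking a contended lock, widens the bracket and forces retries.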
[Intel-gfx] [PATCH 4/4] drm/intel: Push get_scanout_position() timestamping into kms driver.
Move the ktime_get() clock readouts and potential preempt_disable() calls from drm core into kms driver to make it compatible with the api changes in the drm core. The intel-kms driver needs to take the uncore.lock inside i915_get_crtc_scanoutpos() and intel_pipe_in_vblank(). This is incompatible with the preempt_disable() on a PREEMPT_RT patched kernel, as regular spin locks must not be taken within a preempt_disable'd section. Lock contention on the uncore.lock also introduced too much uncertainty in vblank timestamps. Push the ktime_get() timestamping for scanoutpos queries and potential preempt_disable_rt() into i915_get_crtc_scanoutpos(), so these problems can be avoided: 1. First lock the uncore.lock (might sleep on a PREEMPT_RT kernel). 2. preempt_disable_rt() (will be added by the rt-linux folks). 3. ktime_get() a timestamp before scanout pos query. 4. Do all mmio reads as fast as possible without grabbing any new locks! 5. ktime_get() a post-query timestamp. 6. preempt_enable_rt() 7. Unlock the uncore.lock. This reduces timestamp uncertainty on a low-end HP Atom Mini netbook with Intel GMA-950 nicely: Before: 3-8 usecs with spikes 20 usecs, triggering query retries. After : Typically 1 usec (98% of all samples), occassionally 2 usecs (2% of all samples), with maximum of 3 usecs (a handful). v2: Fix formatting of new multi-line code comments. 
Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 54 +++
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 156a1a4..7cafe64 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -599,35 +599,40 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
 	return I915_READ(reg);
 }
 
-static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
+/* raw reads, only for fast reads of display block, no need for forcewake etc. */
+#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
+#define __raw_i915_read16(dev_priv__, reg__) readw((dev_priv__)->regs + (reg__))
+
+static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t status;
+	int reg;
 
 	if (IS_VALLEYVIEW(dev)) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ(VLV_ISR) & status;
+		reg = VLV_ISR;
 	} else if (IS_GEN2(dev)) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ16(ISR) & status;
+		reg = ISR;
 	} else if (INTEL_INFO(dev)->gen < 5) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ(ISR) & status;
+		reg = ISR;
 	} else if (INTEL_INFO(dev)->gen < 7) {
 		status = pipe == PIPE_A ?
 			DE_PIPEA_VBLANK :
 			DE_PIPEB_VBLANK;
 
-		return I915_READ(DEISR) & status;
+		reg = DEISR;
 	} else {
 		switch (pipe) {
 		default:
@@ -642,12 +647,17 @@ static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
 			break;
 		}
 
-		return I915_READ(DEISR) & status;
+		reg = DEISR;
 	}
+
+	if (IS_GEN2(dev))
+		return __raw_i915_read16(dev_priv, reg) & status;
+	else
+		return __raw_i915_read32(dev_priv, reg) & status;
 }
 
 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
-			     int *vpos, int *hpos)
+			     int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
@@ -657,6 +667,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 	int vbl_start, vbl_end, htotal, vtotal;
 	bool in_vbl = true;
 	int ret = 0;
+	unsigned long irqflags;
 
 	if (!intel_crtc->active) {
 		DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled "
@@ -671,14 +682,27 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 	ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
 
+	/*
+	 * Lock uncore.lock, as we will do multiple
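The seven-step sequence from the commit message above can be tried out in userspace. The following is only a sketch under stated assumptions, not the i915 code: `struct fake_dev`, `now_ns()` and `fake_get_scanoutpos()` are hypothetical names, a C11 `atomic_flag` spin loop stands in for uncore.lock, `clock_gettime(CLOCK_MONOTONIC)` stands in for ktime_get(), and the two preempt_*_rt() calls appear only as comments because they have no userspace equivalent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the GPU: a lock plus one fake "register". */
struct fake_dev {
	atomic_flag lock;       /* plays the role of uncore.lock */
	uint32_t scanline_reg;  /* plays the role of an MMIO register */
};

/* Monotonic clock in nanoseconds; userspace stand-in for ktime_get(). */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Read the fake scanline and report the timestamps that bracket the read.
 * stime and etime may be NULL to skip the timestamps.
 */
static uint32_t fake_get_scanoutpos(struct fake_dev *d,
				    int64_t *stime, int64_t *etime)
{
	uint32_t pos;

	assert(d != NULL);

	/* 1. Take the lock first, so any waiting happens before stime. */
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;
	/* 2. preempt_disable_rt() would go here on a PREEMPT_RT kernel. */
	if (stime)
		*stime = now_ns();          /* 3. pre-query timestamp */
	pos = d->scanline_reg & 0x1fff;     /* 4. minimal read, no new locks */
	if (etime)
		*etime = now_ns();          /* 5. post-query timestamp */
	/* 6. preempt_enable_rt() would go here. */
	atomic_flag_clear_explicit(&d->lock, memory_order_release); /* 7. */

	return pos;
}
```

The point of the ordering is that lock contention is paid for before the first timestamp is taken, so `*etime - *stime` only covers the register read itself.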
[Intel-gfx] [PATCH 3/4] drm/radeon: Push get_scanout_position() timestamping into kms driver.
Move the ktime_get() clock readouts and potential preempt_disable() calls from drm core into kms driver to make it compatible with the api changes in the drm core.

This should not introduce any change in functionality or behaviour in radeon-kms, just a reshuffling of code.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
---
 drivers/gpu/drm/radeon/radeon_display.c | 24 +---
 drivers/gpu/drm/radeon/radeon_drv.c     |  3 ++-
 drivers/gpu/drm/radeon/radeon_mode.h    |  3 ++-
 drivers/gpu/drm/radeon/radeon_pm.c      |  2 +-
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 0d1aa05..ccd8751 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -306,7 +306,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
 	 */
 	if (update_pending &&
 	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
-							       &vpos, &hpos)) &&
+							       &vpos, &hpos, NULL, NULL)) &&
 	    ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
 	     (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
 		/* crtc didn't flip in this target vblank interval,
@@ -1539,12 +1539,17 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
 }
 
 /*
- * Retrieve current video scanout position of crtc on a given gpu.
+ * Retrieve current video scanout position of crtc on a given gpu, and
+ * an optional accurate timestamp of when query happened.
  *
  * \param dev Device to query.
  * \param crtc Crtc to query.
  * \param *vpos Location where vertical scanout position should be stored.
  * \param *hpos Location where horizontal scanout position should go.
+ * \param *stime Target location for timestamp taken immediately before
+ *               scanout position query. Can be NULL to skip timestamp.
+ * \param *etime Target location for timestamp taken immediately after
+ *               scanout position query. Can be NULL to skip timestamp.
  *
  * Returns vpos as a positive number while in active scanout area.
  * Returns vpos as a negative number inside vblank, counting the number
@@ -1560,7 +1565,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
  * unknown small number of scanlines wrt. real scanout position.
  *
  */
-int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos)
+int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
+			       ktime_t *stime, ktime_t *etime)
 {
 	u32 stat_crtc = 0, vbl = 0, position = 0;
 	int vbl_start, vbl_end, vtotal, ret = 0;
@@ -1568,6 +1574,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int
 
 	struct radeon_device *rdev = dev->dev_private;
 
+	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
+
+	/* Get optional system timestamp before query. */
+	if (stime)
+		*stime = ktime_get();
+
 	if (ASIC_IS_DCE4(rdev)) {
 		if (crtc == 0) {
 			vbl = RREG32(EVERGREEN_CRTC_V_BLANK_START_END +
@@ -1650,6 +1662,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int
 		}
 	}
 
+	/* Get optional system timestamp after query. */
+	if (etime)
+		*etime = ktime_get();
+
+	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
+
 	/* Decode into vertical and horizontal scanout position. */
 	*vpos = position & 0x1fff;
 	*hpos = (position >> 16) & 0x1fff;
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 22f6858..101e7c0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -106,7 +106,8 @@ int radeon_gem_object_open(struct drm_gem_object *obj,
 void radeon_gem_object_close(struct drm_gem_object *obj,
 				struct drm_file *file_priv);
 extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
-				      int *vpos, int *hpos);
+				      int *vpos, int *hpos, ktime_t *stime,
+				      ktime_t *etime);
 extern const struct drm_ioctl_desc radeon_ioctls_kms[];
 extern int radeon_max_kms_ioctl;
 int radeon_mmap(struct file *filp, struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
index ef63d3f..3bfa910 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -758,7 +758,8 @@ extern
[Intel-gfx] [PATCH 1/4] drm: Remove preempt_disable() from vblank timestamping code.
Preemption handling will get pushed into the kms drivers in followup patches, to make timestamping more robust and PREEMPT_RT friendly.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
---
 drivers/gpu/drm/drm_irq.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index f9af048..33ee515 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -586,11 +586,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 	 * code gets preempted or delayed for some reason.
 	 */
 	for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-		/* Disable preemption to make it very likely to
-		 * succeed in the first iteration even on PREEMPT_RT kernel.
-		 */
-		preempt_disable();
-
 		/* Get system timestamp before query. */
 		stime = ktime_get();
 
@@ -602,8 +597,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 		if (!drm_timestamp_monotonic)
 			mono_time_offset = ktime_get_monotonic_offset();
 
-		preempt_enable();
-
 		/* Return as no-op if scanout query unsupported or failed. */
 		if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
 			DRM_DEBUG("crtc %d : scanoutpos query failed [%d].\n",
--
1.7.10.4
[Intel-gfx] Vblank timestamping improvements/fixes for Linux drm.
Hi all,

this patch set for the kernel pushes the latency sensitive bits of vblank scanoutpos timestamping from the drm core into the kms drivers.

A change in the locking of the intel-kms driver for Linux 3.11 made the old approach too inaccurate and also incompatible with the PREEMPT_RT realtime kernel patch set. These patches fix that problem and restore the old level of precision and reliability.

The patch set changes the prototype of driver->get_scanout_position() to require/allow kms drivers to perform the ktime_get() system time queries which go along with actual scanout position readout in a way that provides maximum precision and to return those timestamps to the drm.

It also converts the only two kms drivers which use this api so far (intel-kms and radeon-kms) to the new api and improves precision and reliability of the intel-kms a lot.

Patches have been tested on Intel and AMD Radeon hardware and the Intel bits have received some review and feedback by Ville Syrjälä.

Please review and apply if possible.

Thanks,
-mario
[Intel-gfx] [PATCH 1/4] drm: Remove preempt_disable() from vblank timestamping code.
Preemption handling will get pushed into the kms drivers in followup patches, to make timestamping more robust and PREEMPT_RT friendly.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/drm_irq.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index f9af048..33ee515 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -586,11 +586,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 	 * code gets preempted or delayed for some reason.
 	 */
 	for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-		/* Disable preemption to make it very likely to
-		 * succeed in the first iteration even on PREEMPT_RT kernel.
-		 */
-		preempt_disable();
-
 		/* Get system timestamp before query. */
 		stime = ktime_get();
 
@@ -602,8 +597,6 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 		if (!drm_timestamp_monotonic)
 			mono_time_offset = ktime_get_monotonic_offset();
 
-		preempt_enable();
-
 		/* Return as no-op if scanout query unsupported or failed. */
 		if (!(vbl_status & DRM_SCANOUTPOS_VALID)) {
 			DRM_DEBUG("crtc %d : scanoutpos query failed [%d].\n",
--
1.7.10.4
[Intel-gfx] [PATCH 2/4] drm: Push latency sensitive bits of vblank scanoutpos timestamping into kms drivers.
A change in locking of some kms drivers (currently intel-kms) makes the old approach too inaccurate and also incompatible with the PREEMPT_RT realtime kernel patchset.

The driver->get_scanout_position() method of intel-kms now needs to acquire a spinlock, which clashes badly with the former preempt_disable() calls in the drm, and it also introduces larger delays and timing uncertainty on a contended lock than acceptable.

This patch changes the prototype of driver->get_scanout_position() to require/allow kms drivers to perform the ktime_get() system time queries which go along with actual scanout position readout in a way that provides maximum precision and to return those timestamps to the drm.

kms driver implementations of get_scanout_position() are asked to implement timestamping and scanoutpos readout in a way that is as precise as possible and compatible with preempt_disable() on a PREEMPT_RT kernel. A driver should follow this pattern in get_scanout_position() for precision and compatibility:

    spin_lock...(...);
    preempt_disable_rt(); // On a PREEMPT_RT kernel, otherwise omit.

    if (stime) *stime = ktime_get();

    ... Minimum amount of MMIO register reads to get scanout position ...
    ... no taking of locks allowed here! ...

    if (etime) *etime = ktime_get();

    preempt_enable_rt(); // On PREEMPT_RT kernel, otherwise omit.
    spin_unlock...(...);

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/drm_irq.c | 18 ++
 include/drm/drmP.h        | 10 --
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 33ee515..2250724 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -219,7 +219,7 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs)
 	for (i = 0; i < num_crtcs; i++)
 		init_waitqueue_head(&dev->vblank[i].queue);
 
-	DRM_INFO("Supports vblank timestamp caching Rev 1 (10.10.2010).\n");
+	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
 
 	/* Driver specific high-precision vblank timestamping supported? */
 	if (dev->driver->get_vblank_timestamp)
@@ -586,14 +586,15 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 	 * code gets preempted or delayed for some reason.
 	 */
 	for (i = 0; i < DRM_TIMESTAMP_MAXRETRIES; i++) {
-		/* Get system timestamp before query. */
-		stime = ktime_get();
-
-		/* Get vertical and horizontal scanout pos. vpos, hpos. */
-		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos, &hpos);
+		/* Get vertical and horizontal scanout position vpos, hpos,
+		 * and bounding timestamps stime, etime, pre/post query.
+		 */
+		vbl_status = dev->driver->get_scanout_position(dev, crtc, &vpos,
+							       &hpos, &stime, &etime);
 
-		/* Get system timestamp after query. */
-		etime = ktime_get();
+		/* Get correction for CLOCK_MONOTONIC -> CLOCK_REALTIME if
+		 * CLOCK_REALTIME is requested.
+		 */
 		if (!drm_timestamp_monotonic)
 			mono_time_offset = ktime_get_monotonic_offset();
 
@@ -604,6 +605,7 @@ int drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev, int crtc,
 			return -EIO;
 		}
 
+		/* Compute uncertainty in timestamp of scanout position query. */
 		duration_ns = ktime_to_ns(etime) - ktime_to_ns(stime);
 
 		/* Accept result with max_error nsecs timing uncertainty. */
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2b954ad..48d15f0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -835,12 +835,17 @@ struct drm_driver {
 	/**
 	 * Called by vblank timestamping code.
 	 *
-	 * Return the current display scanout position from a crtc.
+	 * Return the current display scanout position from a crtc, and an
+	 * optional accurate ktime_get timestamp of when position was measured.
 	 *
 	 * \param dev  DRM device.
 	 * \param crtc Id of the crtc to query.
 	 * \param *vpos Target location for current vertical scanout position.
 	 * \param *hpos Target location for current horizontal scanout position.
+	 * \param *stime Target location for timestamp taken immediately before
+	 *               scanout position query. Can be NULL to skip timestamp.
+	 * \param *etime Target location for timestamp taken immediately after
+	 *               scanout position query. Can be NULL to skip timestamp.
 	 *
 	 * Returns vpos as a positive number while in active scanout area.
 	 * Returns vpos as a negative number inside vblank, counting the number
@@ -857,7 +862,8 @@ struct drm_driver
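How the core consumes the two bounding timestamps can be sketched in plain C. This is an illustration under assumptions, not drm_irq.c itself: `sample_scanoutpos()`, `scanout_query_fn`, `MAX_RETRIES`, `POS_VALID` and the scripted demo query are hypothetical names and values, and returning 0 when no attempt meets the bound is a choice made for this sketch, since the diff above does not show what the kernel does in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 3        /* assumed; stands in for DRM_TIMESTAMP_MAXRETRIES */
#define POS_VALID   (1 << 0) /* assumed; stands in for DRM_SCANOUTPOS_VALID */

/* Hypothetical driver hook: fills vpos and the bracketing timestamps. */
typedef int (*scanout_query_fn)(void *ctx, int *vpos,
				int64_t *stime_ns, int64_t *etime_ns);

/*
 * Returns the number of attempts used once a sample meets the bound,
 * 0 if every attempt exceeded max_error_ns, and -1 if the query
 * reported an invalid position.
 */
static int sample_scanoutpos(scanout_query_fn query, void *ctx,
			     int64_t max_error_ns, int *vpos,
			     int64_t *duration_ns)
{
	int64_t stime, etime;
	int i;

	assert(query != NULL);

	for (i = 0; i < MAX_RETRIES; i++) {
		int status = query(ctx, vpos, &stime, &etime);

		if (!(status & POS_VALID))
			return -1;

		/* Uncertainty of this sample is the width of the bracket. */
		*duration_ns = etime - stime;
		if (*duration_ns <= max_error_ns)
			return i + 1;
	}
	return 0;
}

/* Demo query that replays a scripted list of bracket widths. */
struct scripted_query {
	int calls;
	int64_t durations_ns[MAX_RETRIES];
};

static int scripted_query_fn(void *ctx, int *vpos,
			     int64_t *stime_ns, int64_t *etime_ns)
{
	struct scripted_query *q = ctx;

	*vpos = 100 + q->calls;
	*stime_ns = 1000;
	*etime_ns = 1000 + q->durations_ns[q->calls % MAX_RETRIES];
	q->calls++;
	return POS_VALID;
}
```

With bracket widths of 20000, 9000 and 1000 ns and a 5000 ns bound, the first two samples are rejected and the third is accepted, which is the retry behaviour the "spikes triggering query retries" remark in this series refers to.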
[Intel-gfx] [PATCH 3/4] drm/radeon: Push get_scanout_position() timestamping into kms driver.
Move the ktime_get() clock readouts and potential preempt_disable() calls from drm core into kms driver to make it compatible with the api changes in the drm core.

This should not introduce any change in functionality or behaviour in radeon-kms, just a reshuffling of code.

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/radeon/radeon_display.c | 24 +---
 drivers/gpu/drm/radeon/radeon_drv.c     |  3 ++-
 drivers/gpu/drm/radeon/radeon_mode.h    |  3 ++-
 drivers/gpu/drm/radeon/radeon_pm.c      |  2 +-
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 0d1aa05..ccd8751 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -306,7 +306,7 @@ void radeon_crtc_handle_flip(struct radeon_device *rdev, int crtc_id)
 	 */
 	if (update_pending &&
 	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
-							       &vpos, &hpos)) &&
+							       &vpos, &hpos, NULL, NULL)) &&
 	    ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
 	     (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
 		/* crtc didn't flip in this target vblank interval,
@@ -1539,12 +1539,17 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
 }
 
 /*
- * Retrieve current video scanout position of crtc on a given gpu.
+ * Retrieve current video scanout position of crtc on a given gpu, and
+ * an optional accurate timestamp of when query happened.
  *
  * \param dev Device to query.
  * \param crtc Crtc to query.
  * \param *vpos Location where vertical scanout position should be stored.
  * \param *hpos Location where horizontal scanout position should go.
+ * \param *stime Target location for timestamp taken immediately before
+ *               scanout position query. Can be NULL to skip timestamp.
+ * \param *etime Target location for timestamp taken immediately after
+ *               scanout position query. Can be NULL to skip timestamp.
  *
  * Returns vpos as a positive number while in active scanout area.
  * Returns vpos as a negative number inside vblank, counting the number
@@ -1560,7 +1565,8 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
  * unknown small number of scanlines wrt. real scanout position.
  *
  */
-int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos)
+int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int *hpos,
+			       ktime_t *stime, ktime_t *etime)
 {
 	u32 stat_crtc = 0, vbl = 0, position = 0;
 	int vbl_start, vbl_end, vtotal, ret = 0;
@@ -1568,6 +1574,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int
 
 	struct radeon_device *rdev = dev->dev_private;
 
+	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
+
+	/* Get optional system timestamp before query. */
+	if (stime)
+		*stime = ktime_get();
+
 	if (ASIC_IS_DCE4(rdev)) {
 		if (crtc == 0) {
 			vbl = RREG32(EVERGREEN_CRTC_V_BLANK_START_END +
@@ -1650,6 +1662,12 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc, int *vpos, int
 		}
 	}
 
+	/* Get optional system timestamp after query. */
+	if (etime)
+		*etime = ktime_get();
+
+	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
+
 	/* Decode into vertical and horizontal scanout position. */
 	*vpos = position & 0x1fff;
 	*hpos = (position >> 16) & 0x1fff;
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 22f6858..101e7c0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -106,7 +106,8 @@ int radeon_gem_object_open(struct drm_gem_object *obj,
 void radeon_gem_object_close(struct drm_gem_object *obj,
 				struct drm_file *file_priv);
 extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, int crtc,
-				      int *vpos, int *hpos);
+				      int *vpos, int *hpos, ktime_t *stime,
+				      ktime_t *etime);
 extern const struct drm_ioctl_desc radeon_ioctls_kms[];
 extern int radeon_max_kms_ioctl;
 int radeon_mmap(struct file *filp, struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
index ef63d3f..3bfa910 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -758,7 +758,8 @@ extern int radeon_crtc_cursor_move(struct drm_crtc *crtc, int x, int y
[Intel-gfx] [PATCH 4/4] drm/intel: Push get_scanout_position() timestamping into kms driver.
Move the ktime_get() clock readouts and potential preempt_disable() calls from drm core into kms driver to make it compatible with the api changes in the drm core.

The intel-kms driver needs to take the uncore.lock inside i915_get_crtc_scanoutpos() and intel_pipe_in_vblank(). This is incompatible with the preempt_disable() on a PREEMPT_RT patched kernel, as regular spin locks must not be taken within a preempt_disable'd section. Lock contention on the uncore.lock also introduced too much uncertainty in vblank timestamps.

Push the ktime_get() timestamping for scanoutpos queries and potential preempt_disable_rt() into i915_get_crtc_scanoutpos(), so these problems can be avoided:

1. First lock the uncore.lock (might sleep on a PREEMPT_RT kernel).
2. preempt_disable_rt() (will be added by the rt-linux folks).
3. ktime_get() a timestamp before scanout pos query.
4. Do all mmio reads as fast as possible without grabbing any new locks!
5. ktime_get() a post-query timestamp.
6. preempt_enable_rt()
7. Unlock the uncore.lock.

This reduces timestamp uncertainty on a low-end HP Atom Mini netbook with Intel GMA-950 nicely:

Before: 3-8 usecs with spikes > 20 usecs, triggering query retries.
After : Typically 1 usec (98% of all samples), occasionally 2 usecs (2% of all samples), with maximum of 3 usecs (a handful).

Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 53 +++
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 156a1a4..a3e41d3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -599,35 +599,40 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
 	return I915_READ(reg);
 }
 
-static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
+/* raw reads, only for fast reads of display block, no need for forcewake etc. */
+#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
+#define __raw_i915_read16(dev_priv__, reg__) readw((dev_priv__)->regs + (reg__))
+
+static bool intel_pipe_in_vblank_locked(struct drm_device *dev, enum pipe pipe)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t status;
+	int reg;
 
 	if (IS_VALLEYVIEW(dev)) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ(VLV_ISR) & status;
+		reg = VLV_ISR;
 	} else if (IS_GEN2(dev)) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ16(ISR) & status;
+		reg = ISR;
 	} else if (INTEL_INFO(dev)->gen < 5) {
 		status = pipe == PIPE_A ?
 			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
 			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
 
-		return I915_READ(ISR) & status;
+		reg = ISR;
 	} else if (INTEL_INFO(dev)->gen < 7) {
 		status = pipe == PIPE_A ?
 			DE_PIPEA_VBLANK :
 			DE_PIPEB_VBLANK;
 
-		return I915_READ(DEISR) & status;
+		reg = DEISR;
 	} else {
 		switch (pipe) {
 		default:
@@ -642,12 +647,17 @@ static bool intel_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
 			break;
 		}
 
-		return I915_READ(DEISR) & status;
+		reg = DEISR;
 	}
+
+	if (IS_GEN2(dev))
+		return __raw_i915_read16(dev_priv, reg) & status;
+	else
+		return __raw_i915_read32(dev_priv, reg) & status;
 }
 
 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
-			     int *vpos, int *hpos)
+			     int *vpos, int *hpos, ktime_t *stime, ktime_t *etime)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
@@ -657,6 +667,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 	int vbl_start, vbl_end, htotal, vtotal;
 	bool in_vbl = true;
 	int ret = 0;
+	unsigned long irqflags;
 
 	if (!intel_crtc->active) {
 		DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled "
@@ -671,14 +682,26 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 	ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
 
+	/* Lock uncore.lock, as we will do multiple timing critical raw
+	 * register reads, potentially with preemption disabled, so the
+	 * following code must not block on uncore.lock
Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
On 10/11/2013 03:30 PM, Sebastian Andrzej Siewior wrote:
> On 10/11/2013 02:37 PM, Steven Rostedt wrote:
>> On Fri, 11 Oct 2013 12:18:00 +0200 Sebastian Andrzej Siewior <bige...@linutronix.de> wrote:
>>> * Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
>>>> Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
>>>
>>> Are there any suggestions for now? preempt_disable_nort() like Luis suggested?
>>
>> The preempt_disable_nort() is rather pointless, because the preempt_disable() was added specifically for -rt. When PREEMPT_RT is not enabled, preemption is disabled there already by the previous calls to spin_lock().
>
> Either way. Then I remove the preempt_enable/disable call. Any objections?

Good with me. I'm currently working on a replacement.

-mario

>> -- Steve
>
> Sebastian
Re: [Intel-gfx] [PATCH 2/3] drm/i915: Fix scanoutpos calculations
Daniel, Ville,

I tested Ville's patch series for the scanoutpos improvements on a GMA-950, on top of airlied's current drm-next branch.

There's one issue: The variable position in i915_get_crtc_scanoutpos() must be turned from a u32 into an int, otherwise funny sign errors happen and we end up with *vpos being off by multiple million scanlines and timestamps being off by over 60 seconds.

Other than that looks good. Execution time is now better:

Before uncore.lock addition: 3-4 usecs execution time for the scanoutpos query on my machine.
After uncore.lock addition (3.12.0-rc3): 9-20 usecs, sometimes repetition of the timing loop triggered.
After Ville's patches: down to typically 3-8 usecs, occasionally spiking to almost 20 usecs.

I'll make my patches for the realtime kernel + increased accuracy on top of drm-next + Ville's patches.

thanks,
-mario

On 09/23/2013 12:02 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrj...@linux.intel.com>
>
> The reported scanout position must be relative to the end of vblank. Currently we manage to fumble that in a few ways.
>
> First we don't consider the case when vtotal != vbl_end. While that isn't very common (happens maybe only w/ old panel fitting hardware), we can fix it easily enough.
>
> The second issue is that on pre-CTG hardware we convert the pixel count to horizontal/vertical components at the very beginning, and then forget to adjust the horizontal component to be relative to vbl_end. So instead we should keep our numbers in the pixel count domain while we're adjusting the position to be relative to vbl_end. Then when we do the conversion in the end, both vertical _and_ horizontal components will come out correct.
>
> Cc: Mario Kleiner <mario.klei...@tuebingen.mpg.de>
> Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 37 -
>  1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 697d62c..4f74f0c 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -615,13 +615,7 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>  		/* No obvious pixelcount register. Only query vertical
>  		 * scanout position from Display scan line register.
>  		 */
> -		position = I915_READ(PIPEDSL(pipe));
> -
> -		/* Decode into vertical scanout position. Don't have
> -		 * horizontal scanout position.
> -		 */
> -		*vpos = position & 0x1fff;
> -		*hpos = 0;
> +		position = I915_READ(PIPEDSL(pipe)) & 0x1fff;
>  	} else {
>  		/* Have access to pixelcount since start of frame.
>  		 * We can split this into vertical and horizontal
> @@ -629,15 +623,32 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>  		 */
>  		position = (I915_READ(PIPEFRAMEPIXEL(pipe)) & PIPE_PIXEL_MASK) >> PIPE_PIXEL_SHIFT;
>  
> -		*vpos = position / htotal;
> -		*hpos = position - (*vpos * htotal);
> +		/* convert to pixel counts */
> +		vbl_start *= htotal;
> +		vbl_end *= htotal;
> +		vtotal *= htotal;
>  	}
>  
> -	in_vbl = *vpos >= vbl_start && *vpos < vbl_end;
> +	in_vbl = position >= vbl_start && position < vbl_end;
>  
> -	/* Inside upper part of vblank area? Apply corrective offset: */
> -	if (in_vbl && (*vpos >= vbl_start))
> -		*vpos = *vpos - vtotal;
> +	/*
> +	 * While in vblank, position will be negative
> +	 * counting up towards 0 at vbl_end. And outside
> +	 * vblank, position will be positive counting
> +	 * up since vbl_end.
> +	 */
> +	if (position >= vbl_start)
> +		position -= vbl_end;
> +	else
> +		position += vtotal - vbl_end;
> +
> +	if (IS_G4X(dev) || INTEL_INFO(dev)->gen >= 5) {
> +		*vpos = position;
> +		*hpos = 0;
> +	} else {
> +		*vpos = position / htotal;
> +		*hpos = position - (*vpos * htotal);
> +	}
>  
>  	/* In vblank? */
>  	if (in_vbl)
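The sign problem reported at the top of this mail can be reproduced outside the kernel. The sketch below applies the same "relative to vbl_end" arithmetic as the quoted patch, once with a signed and once with an unsigned position. It is illustrative only: `struct timing`, `vpos_signed()` and `vpos_unsigned()` are hypothetical names, and the timings (2200 pixels per line, 1125 lines, vblank from line 1080 to 1125) are made-up example values, not numbers from the GMA-950 test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode timings: htotal in pixels per line, the rest in lines. */
struct timing {
	int htotal, vtotal, vbl_start, vbl_end;
};

/* Signed arithmetic: result is relative to vbl_end, negative inside vblank. */
static int vpos_signed(int pixel_pos, const struct timing *t)
{
	int vbl_start = t->vbl_start * t->htotal; /* convert to pixel counts */
	int vbl_end = t->vbl_end * t->htotal;
	int vtotal = t->vtotal * t->htotal;
	int position = pixel_pos;

	assert(t->htotal > 0);

	if (position >= vbl_start)
		position -= vbl_end;
	else
		position += vtotal - vbl_end;

	return position / t->htotal;
}

/* Same steps with a u32 position: the subtraction wraps around 2^32. */
static int vpos_unsigned(uint32_t pixel_pos, const struct timing *t)
{
	uint32_t vbl_start = (uint32_t)(t->vbl_start * t->htotal);
	uint32_t vbl_end = (uint32_t)(t->vbl_end * t->htotal);
	uint32_t vtotal = (uint32_t)(t->vtotal * t->htotal);
	uint32_t position = pixel_pos;

	assert(t->htotal > 0);

	if (position >= vbl_start)
		position -= vbl_end; /* wraps to a huge value in vblank */
	else
		position += vtotal - vbl_end;

	/* The division happens on the wrapped value, so vpos is garbage. */
	return (int)(position / (uint32_t)t->htotal);
}
```

For a position 20 lines into vblank (line 1100), the signed version yields -25, while the unsigned version yields a value in the millions, which matches the "off by multiple million scanlines" symptom. Outside vblank both versions agree.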
Re: [Intel-gfx] [PATCH 2/3] drm/i915: Fix scanoutpos calculations
Yes.

On Oct 12, 2013 1:18 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
> On Fri, Oct 11, 2013 at 04:31:38PM +0200, Mario Kleiner wrote:
> > Daniel, Ville,
> >
> > I tested Ville's patch series for the scanoutpos improvements on a GMA-950, on top of airlied's current drm-next branch.
> >
> > There's one issue: The variable position in i915_get_crtc_scanoutpos() must be turned from a u32 into an int, otherwise funny sign errors happen and we end up with *vpos being off by multiple million scanlines and timestamps being off by over 60 seconds.
> >
> > Other than that looks good. Execution time is now better:
> >
> > Before uncore.lock addition: 3-4 usecs execution time for the scanoutpos query on my machine.
> > After uncore.lock addition (3.12.0-rc3): 9-20 usecs, sometimes repetition of the timing loop triggered.
> > After Ville's patches: down to typically 3-8 usecs, occasionally spiking to almost 20 usecs.
> >
> > I'll make my patches for the realtime kernel + increased accuracy on top of drm-next + Ville's patches.
>
> So official reviewed-by/tested-by from you on Ville's latest patches in this thread?

Yes.
-mario

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Skip register reads in i915_get_crtc_scanoutpos()
On 25.09.13 10:14, Ville Syrjälä wrote:
On Wed, Sep 25, 2013 at 04:35:56AM +0200, Mario Kleiner wrote:
On 23.09.13 13:48, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

We have all the information we need in the mode structure, so going and reading it from the hardware is pointless, and slower.

We never populated ->get_vblank_timestamp() in the UMS case, and as that is the only way we'd ever call ->get_scanout_position(), we can completely ignore UMS in i915_get_crtc_scanoutpos().

Also reorganize intel_irq_init() a bit to clarify the KMS vs. UMS situation.

v2: Drop UMS code

Cc: Mario Kleiner mario.klei...@tuebingen.mpg.de
Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_irq.c | 43 +++++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b356dc1..058f099 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -570,24 +570,29 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
 static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 			     int *vpos, int *hpos)
 {
-	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 vbl = 0, position = 0;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	const struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
+	u32 position;
 	int vbl_start, vbl_end, htotal, vtotal;
 	bool in_vbl = true;
 	int ret = 0;
-	enum transcoder cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv,
-								      pipe);

-	if (!i915_pipe_enabled(dev, pipe)) {
+	if (!intel_crtc->active) {
 		DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled pipe %c\n",
 				 pipe_name(pipe));
 		return 0;
 	}

-	/* Get vtotal. */
-	vtotal = 1 + ((I915_READ(VTOTAL(cpu_transcoder)) >> 16) & 0x1fff);
+	htotal = mode->crtc_htotal;
+	vtotal = mode->crtc_vtotal;
+	vbl_start = mode->crtc_vblank_start;
+	vbl_end = mode->crtc_vblank_end;

-	if (INTEL_INFO(dev)->gen >= 4) {
+	ret |= DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE;
+
+	if (IS_G4X(dev) || INTEL_INFO(dev)->gen >= 5) {
 		/* No obvious pixelcount register. Only query vertical
 		 * scanout position from Display scan line register.
 		 */
@@ -605,29 +610,16 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 		 */
 		position = (I915_READ(PIPEFRAMEPIXEL(pipe)) & PIPE_PIXEL_MASK) >> PIPE_PIXEL_SHIFT;

-		htotal = 1 + ((I915_READ(HTOTAL(cpu_transcoder)) >> 16) & 0x1fff);
 		*vpos = position / htotal;
 		*hpos = position - (*vpos * htotal);
 	}

-	/* Query vblank area. */
-	vbl = I915_READ(VBLANK(cpu_transcoder));
-
-	/* Test position against vblank region. */
-	vbl_start = vbl & 0x1fff;
-	vbl_end = (vbl >> 16) & 0x1fff;
-
-	if ((*vpos < vbl_start) || (*vpos > vbl_end))
-		in_vbl = false;
+	in_vbl = *vpos >= vbl_start && *vpos < vbl_end;

I think this should be a <= instead of < in *vpos < vbl_end, if it is meant to be equal to the line it replaces (not > is <=), unless the original comparison was off-by-one?

Yeah, I think the original was wrong, in more ways than one. It forgot to add +1 to vbl_start/end, and then it did the comparison wrong as well.

Ah ok, that's possible. Then you have my blessing :).

On the Intel side i only had and have sporadic access to an old Intel GMA-950 (Gen-3?) when writing that function, so i could only really test one half of the code-path in that function. Also that card only has a VGA output, which limits my actual measurements to use of a photo-diode attached to a CRT monitor. That means i can only verify the accuracy of timestamping down to about 0.2 msecs variability and 0.5 msecs bias due to the limitations/noise of the measurement setup (depending how close i get the photo-diode to the corner of the monitor, how dark it is, etc.).

So i know that the jitter in the timestamps is very low, less than 1 usec standard deviation iirc, and that the absolute error wrt. reality is lower than 0.2 msecs, but i wouldn't be able to detect absolute errors of a few scanlines.

-mario

+in_vbl = *vpos >= vbl_start && *vpos <= vbl_end;

Other than that, it looks good.

Reviewed-by: mario.kleiner...@gmail.com

 	/* Inside upper part of vblank area? Apply corrective offset: */
 	if (in_vbl && (*vpos >= vbl_start))
 		*vpos = *vpos
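The position decoding and the vblank test discussed in this patch can be tried out in isolation. The following is a minimal userspace sketch, not driver code: struct scanout and decode_position() are hypothetical names, and the timing values in the usage note are made up for illustration. It splits a frame pixel counter into (vpos, hpos) and applies the half-open test vbl_start <= vpos < vbl_end from the patch.

```c
#include <stdbool.h>

/* Hypothetical container for the decoded scanout position. */
struct scanout {
	int vpos;     /* scanline, counted from the top of the frame */
	int hpos;     /* pixel within that scanline */
	bool in_vbl;  /* true if vpos lies inside [vbl_start, vbl_end) */
};

/*
 * Split a linear pixel counter into line and column, then test the
 * line against the vblank interval taken from the mode timings.
 */
static struct scanout decode_position(int position, int htotal,
				      int vbl_start, int vbl_end)
{
	struct scanout s;

	s.vpos = position / htotal;
	s.hpos = position - (s.vpos * htotal);
	s.in_vbl = s.vpos >= vbl_start && s.vpos < vbl_end;
	return s;
}
```

For example, with htotal = 2200, vbl_start = 1080 and vbl_end = 1125, a counter value of 2376005 decodes to line 1080, pixel 5, inside vblank; 2375999 decodes to line 1079, pixel 2199, still in the active area. The `<` versus `<=` question raised in the review only matters for vpos == vbl_end.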
Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
On 25.09.13 16:13, Steven Rostedt wrote:
On Wed, 25 Sep 2013 06:32:10 +0200 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:

But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.

E.g., for intel-kms:

i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime)
{
	...
	spin_lock_irqsave(...uncore.lock);
	preempt_disable();
	*stime = ktime_get();
	position = __raw_i915_read32(dev_priv, PIPEDSL(pipe));
	*etime = ktime_get();
	preempt_enable();
	spin_unlock_irqrestore(...uncore.lock);
	...
}

With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.

Unless ktime_get is also a bad thing to do in a preempt disabled section?

ktime_get() works fine in preempt_disable sections, although it may add some latencies, but you shouldn't need to worry about it.

I like this solution the best too, but if it does go in, I would ask to send us the patch for adding the preempt_disable() and we can add the preempt_disable_rt() to it. Why make mainline have a little more overhead?

-- Steve

Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.

-mario
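The idea of bracketing the register read with two clock readouts can be sketched outside the kernel. Everything below is an assumption made for illustration: the function pointer types stand in for ktime_get() and the raw PIPEDSL read, the retry-on-long-interval policy is modelled on the "repetition of the timing loop" mentioned earlier in the thread, and the fake clock and fake register exist only so the sketch can run. Locking and preempt_disable() have no userspace equivalent here and are only marked by a comment.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t (*clock_fn)(void);      /* stand-in for ktime_get() */
typedef uint32_t (*reg_read_fn)(void);  /* stand-in for the raw register read */

struct bracketed_sample {
	uint32_t value;    /* register value that was read */
	int64_t stime_ns;  /* clock right before the read */
	int64_t etime_ns;  /* clock right after the read */
	bool ok;           /* true if the read fit into max_ns */
};

/*
 * Read the register between two clock readouts. If the bracket is
 * wider than max_ns (e.g. because something delayed us), try again,
 * up to max_tries times.
 */
static struct bracketed_sample sample_bracketed(reg_read_fn read_reg,
						clock_fn now,
						int64_t max_ns, int max_tries)
{
	struct bracketed_sample s = { 0 };

	for (int i = 0; i < max_tries; i++) {
		/* kernel code would hold the lock / disable preemption here */
		s.stime_ns = now();
		s.value = read_reg();
		s.etime_ns = now();

		if (s.etime_ns - s.stime_ns <= max_ns) {
			s.ok = true;
			break;
		}
	}
	return s;
}

/* Fake clock for the sketch: advances 5000 ns on every call. */
static int64_t fake_now_ns;
static int64_t fake_clock(void) { fake_now_ns += 5000; return fake_now_ns; }

/* Fake register for the sketch: always reports scanline 42. */
static uint32_t fake_scanline(void) { return 42; }
```

Calling sample_bracketed(fake_scanline, fake_clock, 20000, 3) accepts the first attempt, since the fake bracket is 5000 ns wide; with max_ns = 1000 all three attempts are rejected and ok stays false.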
Re: [Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7
On 25.09.13 09:49, Ville Syrjälä wrote:
On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:
On 23.09.13 10:38, Ville Syrjälä wrote:
On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley pe...@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:

[+cc dri-devel]

On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley pe...@hurleysoftware.com wrote:

The funny part is, there's a comment there that shows that this was done even for PREEMPT_RT. Unfortunately, the call to get_scanout_position() can call functions that use the rt-mutex sleeping spin locks and it breaks there.

I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?

The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.

The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).

For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.

For -RT, maybe the i915 register spinlock (uncore.lock) should be raw?

No, it should not. Note, any other lock that can be held when it is held would also need to be raw.

By that, you mean any other lock that might be claimed would also need to be raw? Hopefully not any other lock already held?
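How a timestamp can be derived from the scan position and the pixel clock, as described above, can be illustrated with a few lines of arithmetic. This is a rough sketch under stated assumptions, not the drm helper itself: the function name is hypothetical, vpos/hpos are taken to be counted relative to the event being timestamped (negative vpos meaning the event is still ahead), and the pixel clock is given in kHz.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the time at which the scan position was
 * sampled, estimate when the reference event (vpos == 0, hpos == 0)
 * happened. One pixel lasts 1e6 / pixel_clock_khz nanoseconds; the
 * multiplication is done first to keep integer precision.
 */
static int64_t event_time_ns(int64_t sample_time_ns, int vpos, int hpos,
			     int htotal, int pixel_clock_khz)
{
	int64_t pixels = (int64_t)vpos * htotal + hpos;
	int64_t delta_ns = pixels * 1000000 / pixel_clock_khz;

	return sample_time_ns - delta_ns;
}
```

With a 148500 kHz pixel clock and htotal = 2200 (made-up but typical values), a sample taken at t = 1000000 ns on line 10, pixel 0 places the event at 851852 ns, about 148 usecs earlier. This is also why any delay between reading the position and reading the clock shows up directly as timestamp error.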
And by taking a quick audit of the code, I see this:

	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);

	/* Reset the chip */

	/* GEN6_GDRST is not in the gt power well, no need to check
	 * for fifo space for the write or forcewake the chip for
	 * the read
	 */
	__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);

	/* Spin waiting for the device to ack the reset request */
	ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);

That spin is unacceptable in RT with preemption and interrupts disabled.

Yep. That would be bad.

AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).

Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.

Hopefully an i915 expert can weigh in here?

Daniel,

Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?

The dependency here in the locking is a recent addition:

commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
Author: Chris Wilson ch...@chris-wilson.co.uk
Date: Fri Jul 19 20:36:51 2013 +0100

    drm/i915: Serialize almost all register access

It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(

We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.

Oops, the magic code in wait_for, which is just there to make the imo totally misguided kgdb support work, papered over the awfully long wait in atomic context ever since we've added this in

commit b6e45f866465f42b53d803b0c574da0fc508a0e9
Author: Keith Packard kei...@keithp.com
Date: Fri Jan 6 11:34:04 2012 -0800

    drm/i915: Move reset forcewake processing to gen6_do_reset

Reverting this change should be enough (code moved obviously a bit).

Cheers, Daniel

Regards, Peter Hurley

What's the real issue here?

That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too. (edit: which hopefully Mario Kleiner clarified in his reply)

My point earlier was three-fold:
1. Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
2. -RT still needs to prevent scheduling there.
3. the problem is i915-specific. [update: the radeon driver should also BUG like the i915 driver but probably