Canceled event: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (amd-gfx@lists.freedesktop.org)

2023-04-17 Thread mario . kleiner . de
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:CANCEL
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170848Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=et
 na...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.freedesktop
 .org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=xo
 rg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@lists.freed
 esktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=am
 d-gfx list;X-NUM-GUESTS=0:mailto:amd-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=in
 tel-gfx;X-NUM-GUESTS=0:mailto:intel-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=No
 uveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;CN=mario.
 kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=bo
 a...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=li
 bre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-dev@lists.l
 ibre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=ML
  mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=me
 mb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=fr
 eedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lists.freedes
 ktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=dr
 oidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=wa
 yland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-devel@lists
 .freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=dr
 i-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=si
 gles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN=ev
 e...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;X-NUM
 -GUESTS=0:mailto:bibby.hs...@mediatek.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;CN="G
 arg, Rohan";X-NUM-GUESTS=0:mailto:rohan.g...@intel.com
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for X
 DC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc202
 3.x.org\n \nAs usual\, the conference is free of charge and open to the gen
 eral\npublic. If you plan on attending\, please make sure to register as ea
 rly\nas possible!\n \nIn order to register as attendee\, you will therefore
  need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/
 event/4/registrations/\n \nIn addition to registration\, the CfP is now ope
 n for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:
 ~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\n
 Join with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more a
 bout Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease d
 o not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170847Z
LOCATION:
SEQUENCE:1
STATUS:CANCELLED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics


Invitation: XDC 2023 A Corunha Spain @ Tue Oct 17 - Thu Oct 19, 2023 (amd-gfx@lists.freedesktop.org)

2023-04-17 Thread mario . kleiner . de
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231017
DTEND;VALUE=DATE:20231020
DTSTAMP:20230417T170311Z
ORGANIZER;CN=mario.kleiner...@gmail.com:mailto:mario.kleiner...@gmail.com
UID:65qeuuc9e0gll25tq5r7e61...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=etna...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:etnaviv@lists.f
 reedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=xorg-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:xorg-devel@l
 ists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=amd-gfx list;X-NUM-GUESTS=0:mailto:amd-gfx@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=intel-gfx;X-NUM-GUESTS=0:mailto:intel-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=Nouveau Dev;X-NUM-GUESTS=0:mailto:nouv...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;RSVP=TRUE
 ;CN=mario.kleiner...@gmail.com;X-NUM-GUESTS=0:mailto:mario.kleiner.de@gmail
 .com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=bo...@foundation.x.org;X-NUM-GUESTS=0:mailto:bo...@foundation.x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=libre-soc-...@lists.libre-soc.org;X-NUM-GUESTS=0:mailto:libre-soc-d
 e...@lists.libre-soc.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=ML mesa-dev;X-NUM-GUESTS=0:mailto:mesa-...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=memb...@x.org;X-NUM-GUESTS=0:mailto:memb...@x.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=freedr...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:freedreno@lis
 ts.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=droidbit...@gmail.com;X-NUM-GUESTS=0:mailto:droidbit...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=wayland-de...@lists.freedesktop.org;X-NUM-GUESTS=0:mailto:wayland-d
 e...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=dri-devel;X-NUM-GUESTS=0:mailto:dri-de...@lists.freedesktop.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=sigles...@igalia.com;X-NUM-GUESTS=0:mailto:sigles...@igalia.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=eve...@lists.x.org;X-NUM-GUESTS=0:mailto:eve...@lists.x.org
X-GOOGLE-CONFERENCE:https://meet.google.com/azn-uwfp-pgw
X-MICROSOFT-CDO-OWNERAPPTID:-2084153633
CREATED:20230417T170310Z
DESCRIPTION:Hello!\n \nRegistration & Call for Proposals are now open for X
 DC 2023\, which will\ntake place on October 17-19\, 2023.\n\nhttps://xdc202
 3.x.org\n \nAs usual\, the conference is free of charge and open to the gen
 eral\npublic. If you plan on attending\, please make sure to register as ea
 rly\nas possible!\n \nIn order to register as attendee\, you will therefore
  need to register\nvia the XDC website.\n \nhttps://indico.freedesktop.org/
 event/4/registrations/\n \nIn addition to registration\, the CfP is now ope
 n for talks\, workshops\nand demos at XDC 2023. While ...\n\n-::~:~::~:~:~:
 ~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\n
 Join with Google Meet: https://meet.google.com/azn-uwfp-pgw\n\nLearn more a
 bout Meet at: https://support.google.com/a/users/answer/9282720\n\nPlease d
 o not edit this section.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20230417T170310Z
LOCATION:
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:XDC 2023 A Corunha Spain
TRANSP:TRANSPARENT
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:amd-gfx@lists.freedesktop.org
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:This is an event reminder
TRIGGER:-P0DT0H30M0S
END:VALARM
BEGIN:VALARM
ACTION:EMAIL
DESCRIPTION:This is an event reminder
SUMMARY:Alarm notification
ATTENDEE:mailto:amd-gfx@lists.freedesktop.org
TRIGGER:-P0DT7H30M0S
END:VALARM
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics


[PATCH] drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines.

2022-07-11 Thread Mario Kleiner
Various DCE versions had trouble with 36 bpp lb depth, requiring fixes,
last time in commit 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display
on CIK GPUs") for DCE-8. So far >= DCE-11.2 was considered ok, but now I
found out that on DCE-11.2 it causes dithering when there shouldn't be
any, so identity pixel passthrough with identity gamma LUTs doesn't work
when it should. This breaks various important neuroscience applications,
as reported to me by scientific users of Polaris cards under Ubuntu 22.04
with Linux 5.15, and confirmed by testing it myself on DCE-11.2.

Let's only use depth 36 for DCN engines, where my testing showed that it
is both necessary for high color precision output (e.g., RGBA16 fb's)
and not harmful, as more than a year of real-world use has shown.

DCE engines seem to work fine for high precision output at 30 bpp, so
("famous last words") depth 30 should hopefully fix all known problems
without introducing new ones.

Successfully retested on DCE-11.2 Polaris and DCN-1.0 Raven Ridge on
top of Linux 5.19.0-rc2 + drm-next.

Fixes: 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display on CIK GPUs")
Signed-off-by: Mario Kleiner 
Tested-by: Mario Kleiner 
Cc: sta...@vger.kernel.org # 5.14.0
Cc: Alex Deucher 
Cc: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 6774dd8bb53e..3fe3fbac1e63 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1117,12 +1117,13 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
 * on certain displays, such as the Sharp 4k. 36bpp is needed
 * to support SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 and
 * SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
-* precision on at least DCN display engines. However, at least
-* Carrizo with DCE_VERSION_11_0 does not like 36 bpp lb depth,
-* so use only 30 bpp on DCE_VERSION_11_0. Testing with DCE 11.2 and 8.3
-* did not show such problems, so this seems to be the exception.
+* precision on DCN display engines, but apparently not for DCE, as
+* far as testing on DCE-11.2 and DCE-8 showed. Various DCE parts have
+* problems: Carrizo with DCE_VERSION_11_0 does not like 36 bpp lb depth,
+* neither does DCE-8 at 4k resolution, nor DCE-11.2 (broken identity pixel
+* passthrough). Therefore only use 36 bpp on DCN where it is actually needed.
 */
-   if (plane_state->ctx->dce_version > DCE_VERSION_11_0)
+   if (plane_state->ctx->dce_version > DCE_VERSION_MAX)
pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
else
pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
-- 
2.34.1



Re: [PATCH] drm/amd/display: Fix 10bit 4K display on CIK GPUs

2021-07-15 Thread Mario Kleiner
On Thu, Jul 15, 2021 at 6:10 PM Alex Deucher  wrote:
>
> On Wed, Jul 14, 2021 at 4:15 AM Liviu Dudau  wrote:
> >
> > Commit 72a7cf0aec0c ("drm/amd/display: Keep linebuffer pixel depth at
> > 30bpp for DCE-11.0.") doesn't seem to have fixed 10bit 4K rendering over
> > DisplayPort for CIK GPUs. On my machine with a HAWAII GPU I get a broken
> > image that looks like it has an effective resolution of 1920x1080 but
> > scaled up in an irregular way. Reverting the commit or applying this
> > patch fixes the problem on v5.14-rc1.
> >
> > Fixes: 72a7cf0aec0c ("drm/amd/display: Keep linebuffer pixel depth at 30bpp for DCE-11.0.")
> > Signed-off-by: Liviu Dudau 
>
> Harry or Mario, any ideas?  Maybe we need finer-grained DCE version
> checking?  I don't remember all of the caveats of this stuff.  DCE11
> and older is getting to be pretty old at this point.  I can just apply
> this if you don't have any insights.
>
> Alex
>

Hi Alex

I'd be fine with applying this. As my original commit says, photometer
measurements showed that increasing the line buffer depth was only
needed for my DCN-1 RavenRidge, not for my DCE-11.2 Polaris11 or a
DCE-8.3 cik, so this should probably not cause harm to the increased
precision modes.

Note that given the hardware and USB-C/DP-HDMI adapters I have, I only
tested this on a 2560x1440@144 Hz DP monitor with DCN-1, DCE-11.2, and
a 2560x1440@100 Hz HDMI monitor iirc with DCN-1, DCE-8.3, and I think
on a 2880x1800@60 Hz MBP Retina eDP panel with DCE-11.2. These are the
highest resolution/framerate monitors I have atm. I don't have access
to any 4k monitors, so maybe the problem is somehow specific to such
high resolutions? Maybe somewhere else in the code something would
need to be adapted? Lacking actual hw docs, my coding here is by
pattern matching against existing DC code, guessing and testing on my
limited hw samples.

Acked-by: Mario Kleiner 

-mario

> > ---
> >  drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> > index a6a67244a322e..1596f6b7fed7c 100644
> > --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> > @@ -1062,7 +1062,7 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
> >  * so use only 30 bpp on DCE_VERSION_11_0. Testing with DCE 11.2 and 8.3
> >  * did not show such problems, so this seems to be the exception.
> >  */
> > -   if (plane_state->ctx->dce_version != DCE_VERSION_11_0)
> > +   if (plane_state->ctx->dce_version > DCE_VERSION_11_0)
> > pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
> > else
> > pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
> > --
> > 2.32.0
> >
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v2 2/7] drm/uAPI: Add "active bpc" as feedback channel for "max bpc" drm property

2021-06-14 Thread Mario Kleiner
On Thu, Jun 10, 2021 at 9:55 AM Pekka Paalanen  wrote:
>
> On Tue,  8 Jun 2021 19:43:15 +0200
> Werner Sembach  wrote:
>
> > Add a new general drm property "active bpc" which can be used by graphic 
> > drivers
> > to report the applied bit depth per pixel back to userspace.
> >

Maybe "bit depth per pixel" -> "bit depth per pixel color component"
for slightly more clarity?

> > While "max bpc" can be used to change the color depth, there was no way to 
> > check
> > which one actually got used. While in theory the driver chooses the 
> > best/highest
> > color depth within the max bpc setting a user might not be fully aware what 
> > his
> > hardware is or isn't capable off. This is meant as a quick way to double 
> > check
> > the setup.
> >
> > In the future, automatic color calibration for screens might also depend on 
> > this
> > information being available.
> >
> > Signed-off-by: Werner Sembach 
> > ---
> >  drivers/gpu/drm/drm_atomic_uapi.c |  2 ++
> >  drivers/gpu/drm/drm_connector.c   | 41 +++
> >  include/drm/drm_connector.h   | 15 +++
> >  3 files changed, 58 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> > b/drivers/gpu/drm/drm_atomic_uapi.c
> > index 268bb69c2e2f..7ae4e40936b5 100644
> > --- a/drivers/gpu/drm/drm_atomic_uapi.c
> > +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> > @@ -873,6 +873,8 @@ drm_atomic_connector_get_property(struct drm_connector 
> > *connector,
> >   *val = 0;
> >   } else if (property == connector->max_bpc_property) {
> >   *val = state->max_requested_bpc;
> > + } else if (property == connector->active_bpc_property) {
> > + *val = state->active_bpc;
> >   } else if (connector->funcs->atomic_get_property) {
> >   return connector->funcs->atomic_get_property(connector,
> >   state, property, val);
> > diff --git a/drivers/gpu/drm/drm_connector.c 
> > b/drivers/gpu/drm/drm_connector.c
> > index 7631f76e7f34..c0c3c09bfed0 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -1195,6 +1195,14 @@ static const struct drm_prop_enum_list 
> > dp_colorspaces[] = {
> >   *   drm_connector_attach_max_bpc_property() to create and attach the
> >   *   property to the connector during initialization.
> >   *
> > + * active bpc:
> > + *   This read-only range property tells userspace the pixel color bit 
> > depth
> > + *   actually used by the hardware display engine on "the cable" on a
> > + *   connector. The chosen value depends on hardware capabilities, both
> > + *   display engine and connected monitor, and the "max bpc" property.
> > + *   Drivers shall use drm_connector_attach_active_bpc_property() to 
> > install
> > + *   this property.
> > + *
>
> This description is now clear to me, but I wonder, is it also how
> others understand it wrt. dithering?
>
> Dithering done on monitor is irrelevant, because we are talking about
> "on the cable" pixels. But since we are talking about "on the cable"
> pixels, also dithering done by the display engine must not factor in.
> Should the dithering done by display engine result in higher "active
> bpc" number than what is actually transmitted on the cable?
>
> I cannot guess what userspace would want exactly. I think the
> strict "on the cable" interpretation is a safe bet, because it then
> gives a lower limit on observed bpc. Dithering settings should be
> exposed with other KMS properties, so userspace can factor those in.
> But to be absolutely sure, we'd have to ask some color management
> experts.
>
> Cc'ing Mario in case he has an opinion.
>

Thanks. I like this a lot; in fact, such a connector property was on
my todo / wish list!

I agree with the "active bpc" definition here in this patch and
Pekka's comments. I want what goes out over the cable, not including
any effects of dithering. At least AMD's amdgpu-kms driver already
exposes "active bpc" as a per-connector property in debugfs, and I use
the output reported there a lot to debug problems with respect to HDR
display or high color precision output, and to verify I'm not fooling
myself wrt. what goes out, compared to what dithering may "fake" on
top of it.

Software like mine would greatly benefit from getting this directly
off the connector, ie. as a RandR output property, just like with "max
bpc", as mapping X11 output names to driver output names is a guessing
game, directing regular users to those debugfs files is tedious and
error prone, and many regular users don't have root permissions
anyway.

Sometimes one wants to prioritize "active bpc" over resolution or
refresh rate, especially on the now more common HDR displays, where the
actual bit depth also changes depending on bandwidth requirements vs.
availability, and on how well DP link training went with a flaky or
loose cable, like only getting 10 bpc for HDR-10 when running on less
than maximum 

Re: display regression on Carrizo

2021-06-02 Thread Mario Kleiner
On Tue, Jun 1, 2021 at 6:50 PM StDenis, Tom  wrote:
>
> [AMD Official Use Only]
>
> Hi Mario,
>
> Yes, this diff fixes the display on my Carrizo:
>
> [root@carrizo linux]# git diff
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> index cd864cc83539..ca7739c9f6cb 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> @@ -1044,7 +1044,7 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
>  * SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
>  * precision on at least DCN display engines.
>  */
> -   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
> +   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
> pipe_ctx->plane_res.scl_data.lb_params.alpha_en = plane_state->per_pixel_alpha;
>
> if (pipe_ctx->plane_res.xfm != NULL)
>
> Tom
>

Thanks. Just sent out a proper patch which should hopefully fix it for
you, limited to dce-11.0 for now.
-mario


[PATCH] drm/amd/display: Keep linebuffer pixel depth at 30bpp for DCE-11.0.

2021-06-02 Thread Mario Kleiner
Testing on AMD Carrizo with DCE-11.0 display engine showed that
it doesn't like a 36 bpp linebuffer very much. The display just
showed a solid green.

Testing on RavenRidge DCN-1.0, Polaris11 with DCE-11.2 and Kabini
with DCE-8.3 did not expose any problems, so for now only revert
to 30 bpp linebuffer depth on asics with DCE-11.0 display engine.

Reported-by: Tom StDenis 
Signed-off-by: Mario Kleiner 
Cc: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index b2ee3cd77b4e..a4f1ae8930a4 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1213,9 +1213,16 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
 * on certain displays, such as the Sharp 4k. 36bpp is needed
 * to support SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 and
 * SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
-* precision on at least DCN display engines.
+* precision on at least DCN display engines. However, at least
+* Carrizo with DCE_VERSION_11_0 does not like 36 bpp lb depth,
+* so use only 30 bpp on DCE_VERSION_11_0. Testing with DCE 11.2 and 8.3
+* did not show such problems, so this seems to be the exception.
 */
-   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
+   if (plane_state->ctx->dce_version != DCE_VERSION_11_0)
+   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
+   else
+   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
+
pipe_ctx->plane_res.scl_data.lb_params.alpha_en = plane_state->per_pixel_alpha;
 
pipe_ctx->plane_res.scl_data.recout.x += timing->h_border_left;
-- 
2.25.1



Re: [PATCH] drm/amdgpu: bump driver version

2021-06-01 Thread Mario Kleiner
On Tue, Jun 1, 2021 at 4:00 PM Alex Deucher  wrote:
>
> For 16bpc display support.
>
> Signed-off-by: Alex Deucher 
> Cc: Mario Kleiner 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index c21710d72afc..f576426e24fc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -95,9 +95,10 @@
>   * - 3.39.0 - DMABUF implicit sync does a full pipeline sync
>   * - 3.40.0 - Add AMDGPU_IDS_FLAGS_TMZ
>   * - 3.41.0 - Add video codec query
> + * - 3.42.0 - Add 16bpc fixed point display support
>   */
>  #define KMS_DRIVER_MAJOR   3
> -#define KMS_DRIVER_MINOR   41
> +#define KMS_DRIVER_MINOR   42
>  #define KMS_DRIVER_PATCHLEVEL  0
>
>  int amdgpu_vram_limit;
> --
> 2.31.1
>

Reviewed-by: Mario Kleiner 

Thanks Alex.
-mario


Re: display regression on Carrizo

2021-06-01 Thread Mario Kleiner
On Mon, May 31, 2021 at 4:14 PM StDenis, Tom  wrote:
>
> [AMD Official Use Only]
>
> Hi Mario,
>

Hi Tom,

> The following commit causes a display regression on my Carrizo when booting 
> linux into a console (e.g. no WM).  When the driver inits the display goes 
> green and is unusable.  The commit prior to this one on amd-staging-drm-next 
> results in a clean init.
>

That's sad. What happens if you only revert the change to
drivers/gpu/drm/amd/display/dc/core/dc_resource.c in this commit, i.e.
change the assignment in resource_build_scaling_params() back to:

pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;

As my testing on Polaris / DCE-11.2 showed, for some reason the change
in linebuffer pixel depth was not required for my Polaris11 to get
12bpc output, only for my Raven Ridge / DCN-1. Maybe I could make a
followup patch to make it conditional on the asic? Either only increase
lb depth on DCN-1+ and leave it off for DCE, or just exclude DCE-11.0
from the change, as Carrizo is DCE-11. I seem to remember there were
some other DCE-11-specific restrictions wrt. 64bpp fp16 and the scaler.
Maybe something similar happens here?

-mario

> commit b1114ddd63be03825182d6162ff25fa3492cd6f5
> Author: Mario Kleiner 
> Date:   Fri Mar 19 22:03:15 2021 +0100
>
> drm/amd/display: Increase linebuffer pixel depth to 36bpp.
>
> Testing with the photometer shows that at least Raven Ridge DCN-1.0
> does not achieve more than 10 bpc effective output precision with a
> 16 bpc unorm surface of type SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616,
> unless linebuffer depth is increased from LB_PIXEL_DEPTH_30BPP to
> LB_PIXEL_DEPTH_36BPP. Otherwise precision gets truncated somewhere
> to 10 bpc effective depth.
>
> Strangely this increase was not needed on Polaris11 DCE-11.2 during
> testing to get 12 bpc effective precision. It also is not needed for
> fp16 framebuffers.
>
> Tested on DCN-1.0 and DCE-11.2.
>
> Signed-off-by: Mario Kleiner 
> Signed-off-by: Alex Deucher 
>
>  drivers/gpu/drm/amd/display/dc/core/dc_resource.c  | 7 +--
>  drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 6 --
>  drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c | 3 ++-
>  drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c   | 3 ++-
>  drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 2 +-
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c   | 3 ++-
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 2 +-
>  drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c   | 3 ++-
>  8 files changed, 19 insertions(+), 10 deletions(-)
>
> Tom


Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.

2021-05-13 Thread Mario Kleiner
On Thu, May 6, 2021 at 8:37 AM Ville Syrjälä wrote:
>
> On Sat, Mar 20, 2021 at 04:09:47AM +0200, Ville Syrjälä wrote:
> > On Fri, Mar 19, 2021 at 10:45:10PM +0100, Mario Kleiner wrote:
> > > On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä wrote:
> > > >
> > > > On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > > > > These are 16 bits per color channel unsigned normalized formats.
> > > > > They are supported by at least AMD display hw, and suitable for
> > > > > direct scanout of Vulkan swapchain images in the format
> > > > > VK_FORMAT_R16G16B16A16_UNORM.
> > > > >
> > > > > Signed-off-by: Mario Kleiner 
> > > > > ---
> > > > >  drivers/gpu/drm/drm_fourcc.c  | 4 
> > > > >  include/uapi/drm/drm_fourcc.h | 7 +++
> > > > >  2 files changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > > > > index 03262472059c..ce13d2be5d7b 100644
> > > > > --- a/drivers/gpu/drm/drm_fourcc.c
> > > > > +++ b/drivers/gpu/drm/drm_fourcc.c
> > > > > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
> > > > >   { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >   { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >   { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > > + { .format = DRM_FORMAT_XRGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > > + { .format = DRM_FORMAT_XBGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > > > > + { .format = DRM_FORMAT_ARGB16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > > + { .format = DRM_FORMAT_ABGR16161616,    .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >   { .format = DRM_FORMAT_RGB888_A8,   .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >   { .format = DRM_FORMAT_BGR888_A8,   .depth = 32, .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> > > > >   { .format = DRM_FORMAT_XRGB_A8, .depth = 32, .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > > > > index f76de49c768f..f7156322aba5 100644
> > > > > --- a/include/uapi/drm/drm_fourcc.h
> > > > > +++ b/include/uapi/drm/drm_fourcc.h
> > > > > @@ -168,6 +168,13 @@ extern "C" {
> > > > >  #define DRM_FORMAT_RGBA1010102   fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */
> > > > >  #define DRM_FORMAT_BGRA1010102   fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */
> > > > >
> > > > > +/* 64 bpp RGB */
> > > > > +#define DRM_FORMAT_XRGB16161616  fourcc_code('X', 'R', '4', '8') /* [63:0] x:R:G:B 16:16:16:16 little endian */
> > > > > +#define DRM_FORMAT_XBGR16161616  fourcc_code('X', 'B', '4', '8') /* [63:0] x:B:G:R 16:16:16:16 little endian */
> > > > > +
> > > > > +#define DRM_FORMAT_ARGB16161616  fourcc_code('A', 'R', '4', '8') /* [63:0] A:R:G:B 16:16:16:16 little endian */
> > > > > +#define DRM_FORMAT_ABGR16161616  

Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-05-05 Thread Mario Kleiner
On Wed, Apr 28, 2021 at 11:22 PM Alex Deucher  wrote:
>
> On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
> >
> > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner wrote:
> > >
> > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > Would be great to get this in sooner than later.
> > >
> >
> > No objections from me.
> >
>
> I don't have any objections to merging this.  Are the IGT tests available?
>
> Alex
>.

IGT patches are out now, already r-b by Ville, cc'd to you. As
mentioned in the cover letter for those, the new 16 bpc test cases on
top of IGT master for the kms_plane test now work nicely on my
RavenRidge, but I had to add hacks on top of kms_plane to make it
work at all on RV, i.e. get it to the point where it could execute the
tests for the new formats at all. Unmodified kms_plane from master
doesn't even work on RV with Linux 5.8. Seems IGT is quite a bit out
of date wrt. the kernel?

Things I had to do:

- Skip all tests for modifiers other than linear. --> Test
requirements wrt. tiling not met. Seems all the modifier support for
DCC, DCC_RETILE on Vega+ is missing from IGT so far?

- Skip test for format DRM_FORMAT_RGB565. CRC mismatch. Probably
because a 5/6 bpc container can't represent the net 8 bpc content of
the reference test image? Maybe all tests for < 8 bpc formats should
be skipped?

- Skip tests for yuv planar formats with BT2020 color space: Limited
range unsupported by DC, full range causes CRC mismatch.

- Mismatches between expected and actual CRC vblank counts for planar YUV formats.

- If the tests try to test more than the primary plane,
igt_pipe_crc_start() fails to open the crtc/crc/data file with -EIO.

See the attached patch with all the needed hacks. Not sure which of
these are limitations of the IGT test, and which are amdgpu bugs or hw
limitations, but applying this hack-patch on top of the patches for
the new formats makes kms_plane pass.

-mario

> > Alex
> >
> >
> > > Thanks and have a nice weekend,
> > > -mario
> > >
> > > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > > framebuffers to the core, and then an implementation for AMD gpu's
> > > > with DisplayCore.
> > > >
> > > > This is intended to allow for pageflipping to, and direct scanout of,
> > > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > > Link: 
> > > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > > >
> > > > My main motivation for this is squeezing every bit of precision
> > > > out of the hardware for scientific and medical research applications,
> > > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > > the hardware could do at least 12 bpc.
> > > >
> > > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > > on my hw, both running at 10 bpc DP output depth.
> > > >
> > > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > > > Apple Retina panel), all running at 10 bpc output depth.
> > > >
> > > > No malfunctions, visual artifacts or other oddities were observed
> > > > (apart from an adventurous mess of cables and adapters on my desk),
> > > > suggesting it works.
> > > >
> > > > I used my automatic photometer measurement procedure to verify the
> > > > effective output precision of 10 bpc DP native signal + spatial
> > > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > > for AMD display hw afaik.
> > > >
> > > > So it seems to work in the way i hoped :).
> > > >
> > > > Some open questions wrt. AMD DC, to be addressed in this patch series, 
> > > > or follow up
> > > > patches if necessary:

Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-05-05 Thread Mario Kleiner
On Tue, May 4, 2021 at 9:22 PM Alex Deucher  wrote:
>
> On Wed, Apr 28, 2021 at 5:21 PM Alex Deucher  wrote:
> >
> > On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
> > >
> > > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> > >  wrote:
> > > >
> > > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > > Would be great to get this in sooner than later.
> > > >
> > >
> > > No objections from me.
> > >
> >
> > I don't have any objections to merging this.  Are the IGT tests available?
> >
>
> Any preference on whether I merge this through the AMD tree or drm-misc?
>
> Alex
>

Hi Alex, in case the question is addressed to myself: I prefer
whatever gets it into drm-next asap, so we can sync the drm_fourcc.h
headers from drm-next to the IGT tests, libdrm, amdvlk etc.

Another thing: unless this still makes it into the Linux 5.13 merge
window, we'd also need a KMS_DRIVER_MINOR bump 41 -> 42. That way
amdgpu-pro's Vulkan driver could know about the new 16 bpc pixel
formats for the out-of-tree amdgpu-dkms package when running against
older kernels.

thanks,
-mario

>
> > Alex
> >
> > > Alex
> > >
> > >
> > > > Thanks and have a nice weekend,
> > > > -mario
> > > >
> > > > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > > > framebuffers to the core, and then an implementation for AMD gpu's
> > > > > with DisplayCore.
> > > > >
> > > > > This is intended to allow for pageflipping to, and direct scanout of,
> > > > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > > > Link: 
> > > > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > > > >
> > > > > My main motivation for this is squeezing every bit of precision
> > > > > out of the hardware for scientific and medical research applications,
> > > > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > > > the hardware could do at least 12 bpc.
> > > > >
> > > > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > > > on my hw, both running at 10 bpc DP output depth.
> > > > >
> > > > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 
> > > > > 2880x1800@60Hz
> > > > > Apple Retina panel), all running at 10 bpc output depth.
> > > > >
> > > > > No malfunctions, visual artifacts or other oddities were observed
> > > > > (apart from an adventurous mess of cables and adapters on my desk),
> > > > > suggesting it works.
> > > > >
> > > > > I used my automatic photometer measurement procedure to verify the
> > > > > effective output precision of 10 bpc DP native signal + spatial
> > > > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > > > for AMD display hw afaik.
> > > > >
> > > > > So it seems to work in the way i hoped :).
> > > > >
> > > > > Some open questions wrt. AMD DC, to be addressed in this patch 
> > > > > series, or follow up
> > > > > patches if necessary:
> > > > >
> > > > > - For the atomic check for plane scaling, the current patch will
> > > > > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > > > > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > > > > limits, because this is also a 64 bpp format? Or something new
> > > > > entirely?
> > > > >
> > > > > - 

Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-16 Thread Mario Kleiner
Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
Would be great to get this in sooner than later.

Thanks and have a nice weekend,
-mario

On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
 wrote:
>
> Hi,
>
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
>
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: 
> https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> My main motivation for this is squeezing every bit of precision
> out of the hardware for scientific and medical research applications,
> where fp16 in the unorm range is limited to ~11 bpc effective linear
> precision in the upper half [0.5;1.0] of the unorm range, although
> the hardware could do at least 12 bpc.
>
> It has been successfully tested on AMD RavenRidge (DCN-1), and with
> Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> on my hw, both running at 10 bpc DP output depth.
>
> Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> Apple Retina panel), all running at 10 bpc output depth.
>
> No malfunctions, visual artifacts or other oddities were observed
> (apart from an adventurous mess of cables and adapters on my desk),
> suggesting it works.
>
> I used my automatic photometer measurement procedure to verify the
> effective output precision of 10 bpc DP native signal + spatial
> dithering in the gpu as enabled by the amdgpu driver. Results show
> the expected 12 bpc precision i hoped for -- the current upper limit
> for AMD display hw afaik.
>
> So it seems to work in the way i hoped :).
>
> Some open questions wrt. AMD DC, to be addressed in this patch series, or 
> follow up
> patches if necessary:
>
> - For the atomic check for plane scaling, the current patch will
> apply the same hw limits as for other rgb fixed point fb's, e.g.,
> for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> limits, because this is also a 64 bpp format? Or something new
> entirely?
>
> - I haven't added the new fourcc to the DCC tables yet. Should i?
>
> - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> It looks to me as if that assert was inconsistent with other places
> in the driver where COLOR_DEPTH121212 is supported, and looking at
> the code, the change seems harmless. At least on DCE-11.2 the change
> didn't cause any noticeable (by myself) or measurable (by my equipment)
> problems on any of the 3 connected displays.
>
> - Related to that change, while i needed to increase lb pixelsize to 36bpp
> to get > 10 bpc effective precision on DCN, i didn't need to do that
> on DCE. Also no change of lb pixelsize was needed on either DCN or DCe
> to get > 10 bpc precision for fp16 framebuffers, so something seems to
> behave differently for floating point 16 vs. fixed point 16. This all
> seems to suggest one could leave lb pixelsize at the old 30 bpp value
> on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> to avoid the changes of patch 4/5.
>
> Thanks,
> -mario
>
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-16 Thread Mario Kleiner
On Mon, Mar 22, 2021 at 4:52 PM Ville Syrjälä
 wrote:
>
> On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> > Hi,
> >
> > this patch series adds the fourcc's for 16 bit fixed point unorm
> > framebuffers to the core, and then an implementation for AMD gpu's
> > with DisplayCore.
> >
> > This is intended to allow for pageflipping to, and direct scanout of,
> > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > Link: 
> > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> We should also add support for these formats into igt. Should
> be semi-easy by just adding the suitable float<->uint16
> conversion stuff.
>

Hi Ville,

Could you point me to a specific test case / file that I should look
at for adding this?

thanks,
-mario

> --
> Ville Syrjälä
> Intel


Re: [PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.

2021-03-19 Thread Mario Kleiner
On Fri, Mar 19, 2021 at 10:16 PM Ville Syrjälä
 wrote:
>
> On Fri, Mar 19, 2021 at 10:03:13PM +0100, Mario Kleiner wrote:
> > These are 16 bits per color channel unsigned normalized formats.
> > They are supported by at least AMD display hw, and suitable for
> > direct scanout of Vulkan swapchain images in the format
> > VK_FORMAT_R16G16B16A16_UNORM.
> >
> > Signed-off-by: Mario Kleiner 
> > ---
> >  drivers/gpu/drm/drm_fourcc.c  | 4 
> >  include/uapi/drm/drm_fourcc.h | 7 +++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > index 03262472059c..ce13d2be5d7b 100644
> > --- a/drivers/gpu/drm/drm_fourcc.c
> > +++ b/drivers/gpu/drm/drm_fourcc.c
> > @@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 
> > format)
> >   { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> >   { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> >   { .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, 
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> > + { .format = DRM_FORMAT_XRGB16161616,.depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > + { .format = DRM_FORMAT_XBGR16161616,.depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
> > + { .format = DRM_FORMAT_ARGB16161616,.depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> > + { .format = DRM_FORMAT_ABGR16161616,.depth = 0,  
> > .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> >   { .format = DRM_FORMAT_RGB888_A8,   .depth = 32, 
> > .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> >   { .format = DRM_FORMAT_BGR888_A8,   .depth = 32, 
> > .num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> >   { .format = DRM_FORMAT_XRGB8888_A8, .depth = 32, 
> > .num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = 
> > true },
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > index f76de49c768f..f7156322aba5 100644
> > --- a/include/uapi/drm/drm_fourcc.h
> > +++ b/include/uapi/drm/drm_fourcc.h
> > @@ -168,6 +168,13 @@ extern "C" {
> >  #define DRM_FORMAT_RGBA1010102   fourcc_code('R', 'A', '3', '0') /* 
> > [31:0] R:G:B:A 10:10:10:2 little endian */
> >  #define DRM_FORMAT_BGRA1010102   fourcc_code('B', 'A', '3', '0') /* 
> > [31:0] B:G:R:A 10:10:10:2 little endian */
> >
> > +/* 64 bpp RGB */
> > +#define DRM_FORMAT_XRGB16161616  fourcc_code('X', 'R', '4', '8') /* 
> > [63:0] x:R:G:B 16:16:16:16 little endian */
> > +#define DRM_FORMAT_XBGR16161616  fourcc_code('X', 'B', '4', '8') /* 
> > [63:0] x:B:G:R 16:16:16:16 little endian */
> > +
> > +#define DRM_FORMAT_ARGB16161616  fourcc_code('A', 'R', '4', '8') /* 
> > [63:0] A:R:G:B 16:16:16:16 little endian */
> > +#define DRM_FORMAT_ABGR16161616  fourcc_code('A', 'B', '4', '8') /* 
> > [63:0] A:B:G:R 16:16:16:16 little endian */
>
> These look reasonable enough to me. IIRC we should be able to expose
> them on some recent Intel hw as well.
>
> Reviewed-by: Ville Syrjälä 
>

Thanks Ville!

Indeed I looked over the Intel PRMs, and while fp16 support seems to
be rather recent (Gen8? Gen9? Gen10? Can't remember atm.), iirc I
found references to rgb16 fixed point back to Gen5 / Ironlake. That
would be pretty cool! The precision limit for the encoders on Intel is
also 12 bpc atm., right?

-mario

> --
> Ville Syrjälä
> Intel


[PATCH 5/5] drm/amd/display: Enable support for 16 bpc fixed-point framebuffers.

2021-03-19 Thread Mario Kleiner
This is intended to enable direct high-precision scanout and pageflip
of Vulkan swapchain images in format VK_FORMAT_R16G16B16A16_UNORM.

Expose DRM_FORMAT_XRGB16161616, DRM_FORMAT_ARGB16161616,
DRM_FORMAT_XBGR16161616 and DRM_FORMAT_ABGR16161616 as 16 bpc
unsigned normalized formats. These allow taking full advantage
of the maximum precision of the display hardware, i.e., currently
up to 12 bpc.

Searching through old AMD M56, M76 and RV630 hw programming docs
suggests that these 16 bpc formats are supported by all DCE and
DCN display engines, so we can expose the formats unconditionally.

Successfully tested on AMD Polaris11 DCE-11.2 and RavenRidge DCN-1.0
with a HDR-10 monitor over 10 bpc DP output with spatial dithering
enabled by the driver. Picture looks good, and my photometer
measurement procedure confirms an effective 12 bpc color
reproduction.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 94cd5ddd67ef..1a6e90e20f10 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4563,6 +4563,14 @@ fill_dc_plane_info_and_addr(struct amdgpu_device *adev,
case DRM_FORMAT_ABGR16161616F:
plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F;
break;
+   case DRM_FORMAT_XRGB16161616:
+   case DRM_FORMAT_ARGB16161616:
+   plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616;
+   break;
+   case DRM_FORMAT_XBGR16161616:
+   case DRM_FORMAT_ABGR16161616:
+   plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616;
+   break;
default:
DRM_ERROR(
"Unsupported screen format %s\n",
@@ -6541,6 +6549,10 @@ static const uint32_t rgb_formats[] = {
DRM_FORMAT_XBGR2101010,
DRM_FORMAT_ARGB2101010,
DRM_FORMAT_ABGR2101010,
+   DRM_FORMAT_XRGB16161616,
+   DRM_FORMAT_XBGR16161616,
+   DRM_FORMAT_ARGB16161616,
+   DRM_FORMAT_ABGR16161616,
DRM_FORMAT_XBGR8888,
DRM_FORMAT_ABGR8888,
DRM_FORMAT_RGB565,
-- 
2.25.1



[PATCH 4/5] drm/amd/display: Make assert in DCE's program_bit_depth_reduction more lenient.

2021-03-19 Thread Mario Kleiner
This is needed to avoid warnings with linebuffer depth 36 bpp.
Testing on a Polaris11, DCE-11.2 on a 10 bit HDR-10 monitor
showed no obvious problems, and this 12 bpc limit is consistent
with what other functions in the DCE bit depth reduction path use.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
index 92b53a30d954..d9fd4ec60588 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
@@ -794,7 +794,7 @@ static void program_bit_depth_reduction(
enum dcp_out_trunc_round_mode trunc_mode;
bool spatial_dither_enable;
 
-   ASSERT(depth < COLOR_DEPTH_121212); /* Invalid clamp bit depth */
+   ASSERT(depth <= COLOR_DEPTH_121212); /* Invalid clamp bit depth */
 
spatial_dither_enable = bit_depth_params->flags.SPATIAL_DITHER_ENABLED;
/* Default to 12 bit truncation without rounding */
@@ -854,7 +854,7 @@ static void dce60_program_bit_depth_reduction(
enum dcp_out_trunc_round_mode trunc_mode;
bool spatial_dither_enable;
 
-   ASSERT(depth < COLOR_DEPTH_121212); /* Invalid clamp bit depth */
+   ASSERT(depth <= COLOR_DEPTH_121212); /* Invalid clamp bit depth */
 
spatial_dither_enable = bit_depth_params->flags.SPATIAL_DITHER_ENABLED;
/* Default to 12 bit truncation without rounding */
-- 
2.25.1



[PATCH 3/5] drm/amd/display: Increase linebuffer pixel depth to 36bpp.

2021-03-19 Thread Mario Kleiner
Testing with the photometer shows that at least Raven Ridge DCN-1.0
does not achieve more than 10 bpc effective output precision with a
16 bpc unorm surface of type SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616,
unless linebuffer depth is increased from LB_PIXEL_DEPTH_30BPP to
LB_PIXEL_DEPTH_36BPP. Otherwise precision gets truncated somewhere
to 10 bpc effective depth.

Strangely this increase was not needed on Polaris11 DCE-11.2 during
testing to get 12 bpc effective precision. It also is not needed for
fp16 framebuffers.

Tested on DCN-1.0 and DCE-11.2.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  | 7 +--
 drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 6 --
 drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c   | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c   | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c   | 3 ++-
 8 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index f1aed40b3124..51e91b546d69 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1167,9 +1167,12 @@ bool resource_build_scaling_params(struct pipe_ctx 
*pipe_ctx)
 
/**
 * Setting line buffer pixel depth to 24bpp yields banding
-* on certain displays, such as the Sharp 4k
+* on certain displays, such as the Sharp 4k. 36bpp is needed
+* to support SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 and
+* SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
+* precision on at least DCN display engines.
 */
-   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
+   pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
pipe_ctx->plane_res.scl_data.lb_params.alpha_en = 
plane_state->per_pixel_alpha;
 
pipe_ctx->plane_res.scl_data.recout.x += timing->h_border_left;
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
index 151dc7bf6d23..92b53a30d954 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
@@ -1647,7 +1647,8 @@ void dce_transform_construct(
xfm_dce->lb_pixel_depth_supported =
LB_PIXEL_DEPTH_18BPP |
LB_PIXEL_DEPTH_24BPP |
-   LB_PIXEL_DEPTH_30BPP;
+   LB_PIXEL_DEPTH_30BPP |
+   LB_PIXEL_DEPTH_36BPP;
 
xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
xfm_dce->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x6B0*/
@@ -1675,7 +1676,8 @@ void dce60_transform_construct(
xfm_dce->lb_pixel_depth_supported =
LB_PIXEL_DEPTH_18BPP |
LB_PIXEL_DEPTH_24BPP |
-   LB_PIXEL_DEPTH_30BPP;
+   LB_PIXEL_DEPTH_30BPP |
+   LB_PIXEL_DEPTH_36BPP;
 
xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
xfm_dce->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x6B0*/
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
index 29438c6050db..45bca0db5e5e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_transform_v.c
@@ -708,7 +708,8 @@ bool dce110_transform_v_construct(
xfm_dce->lb_pixel_depth_supported =
LB_PIXEL_DEPTH_18BPP |
LB_PIXEL_DEPTH_24BPP |
-   LB_PIXEL_DEPTH_30BPP;
+   LB_PIXEL_DEPTH_30BPP |
+   LB_PIXEL_DEPTH_36BPP;
 
xfm_dce->prescaler_on = true;
xfm_dce->lb_bits_per_entry = LB_BITS_PER_ENTRY;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
index a77e7bd3b8d5..91fdfcd8a14e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
@@ -568,7 +568,8 @@ void dpp1_construct(
dpp->lb_pixel_depth_supported =
LB_PIXEL_DEPTH_18BPP |
LB_PIXEL_DEPTH_24BPP |
-   LB_PIXEL_DEPTH_30BPP;
+   LB_PIXEL_DEPTH_30BPP |
+   LB_PIXEL_DEPTH_36BPP;
 
dpp->lb_bits_per_entry = LB_BITS_PER_ENTRY;
dpp->lb_memory_size = LB_TOTAL_NUMBER_OF_ENTRIES; /*0x1404*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/driver

[PATCH 2/5] drm/amd/display: Add support for SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616.

2021-03-19 Thread Mario Kleiner
Add the necessary format definition, bandwidth and pixel size mappings,
prescaler setup, and pixelformat selection, following the logic
already present for SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616.

The new SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 is implemented as the
old SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 format, but with swapped
red <-> blue color channels, by use of the hardware xbar.

Please note that on the DCN 1/2/3 display engines, the pixelformat
in hubp and dpp setup for the old SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616
and the new SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 was changed from
format id 22 to id 26. See amd/include/navi10_enum.h for the meaning
of the id's.

For format 22, the display engine read the framebuffer in 16 bpc format,
but truncated to the 12 bpc actually supported by later pipeline stages.
However, the engine took the 12 LSB of each color component for
truncation, which is incompatible with rendering at least under Vulkan,
where content is 16 bit wide, and a 12 MSB alignment would be appropriate,
if any. Format 20 for ARGB16161616_12MSB does work, but even better, we
can choose format 26 for ARGB16161616_UNORM, keeping all 16 bits around
until later stages of the display pipeline.

This allows directly consuming what the rendering hw produces under
Vulkan for swapchain format VK_FORMAT_R16G16B16A16_UNORM, as tested
with a patched version of the current AMD open-source amdvlk driver
which maps swapchain format VK_FORMAT_R16G16B16A16_UNORM onto
DRM_FORMAT_XBGR16161616.

The old id 22 would cause colorful pixeltrash to be displayed instead.

Tested under DCN-1.0 and DCE-11.2.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c| 2 ++
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c| 2 ++
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c   | 2 ++
 drivers/gpu/drm/amd/display/dc/dc_hw_types.h| 2 ++
 drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c  | 2 ++
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 1 +
 drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c  | 1 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c| 6 --
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c | 1 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c   | 4 +++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c| 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubbub.c | 1 +
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c   | 4 +++-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c   | 1 +
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c| 3 ++-
 15 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
index e633f8a51edb..4e3664db7456 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
@@ -2827,6 +2827,7 @@ static void populate_initial_data(
data->bytes_per_pixel[num_displays + 4] = 4;
break;
case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+   case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
data->bytes_per_pixel[num_displays + 4] = 8;
break;
@@ -2930,6 +2931,7 @@ static void populate_initial_data(
data->bytes_per_pixel[num_displays + 4] = 4;
break;
case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+   case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
data->bytes_per_pixel[num_displays + 4] = 8;
break;
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index d4df4da5b81a..0e18df1283b6 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -236,6 +236,7 @@ static enum dcn_bw_defs tl_pixel_format_to_bw_defs(enum 
surface_pixel_format for
case SURFACE_PIXEL_FORMAT_GRPH_ABGR2101010_XR_BIAS:
return dcn_bw_rgb_sub_32;
case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+   case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F:
case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
return dcn_bw_rgb_sub_64;
@@ -375,6 +376,7 @@ static void pipe_ctx_to_e2e_pipe_params (
input->src.viewport_height_c   = input->src.viewport_height / 2;
break;
case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
+   case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616:
case SURFACE_PIXEL_F

[PATCH 1/5] drm/fourcc: Add 16 bpc fixed point framebuffer formats.

2021-03-19 Thread Mario Kleiner
These are 16 bits per color channel unsigned normalized formats.
They are supported by at least AMD display hw, and suitable for
direct scanout of Vulkan swapchain images in the format
VK_FORMAT_R16G16B16A16_UNORM.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/drm_fourcc.c  | 4 
 include/uapi/drm/drm_fourcc.h | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
index 03262472059c..ce13d2be5d7b 100644
--- a/drivers/gpu/drm/drm_fourcc.c
+++ b/drivers/gpu/drm/drm_fourcc.c
@@ -203,6 +203,10 @@ const struct drm_format_info *__drm_format_info(u32 format)
{ .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
{ .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
{ .format = DRM_FORMAT_AXBXGXRX106106106106, .depth = 0, 
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
+   { .format = DRM_FORMAT_XRGB16161616,.depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
+   { .format = DRM_FORMAT_XBGR16161616,.depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
+   { .format = DRM_FORMAT_ARGB16161616,.depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
+   { .format = DRM_FORMAT_ABGR16161616,.depth = 0,  
.num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
{ .format = DRM_FORMAT_RGB888_A8,   .depth = 32, 
.num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
{ .format = DRM_FORMAT_BGR888_A8,   .depth = 32, 
.num_planes = 2, .cpp = { 3, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
{ .format = DRM_FORMAT_XRGB8888_A8, .depth = 32, 
.num_planes = 2, .cpp = { 4, 1, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index f76de49c768f..f7156322aba5 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -168,6 +168,13 @@ extern "C" {
 #define DRM_FORMAT_RGBA1010102 fourcc_code('R', 'A', '3', '0') /* [31:0] 
R:G:B:A 10:10:10:2 little endian */
 #define DRM_FORMAT_BGRA1010102 fourcc_code('B', 'A', '3', '0') /* [31:0] 
B:G:R:A 10:10:10:2 little endian */
 
+/* 64 bpp RGB */
+#define DRM_FORMAT_XRGB16161616	fourcc_code('X', 'R', '4', '8') /* 
[63:0] x:R:G:B 16:16:16:16 little endian */
+#define DRM_FORMAT_XBGR16161616	fourcc_code('X', 'B', '4', '8') /* 
[63:0] x:B:G:R 16:16:16:16 little endian */
+
+#define DRM_FORMAT_ARGB16161616	fourcc_code('A', 'R', '4', '8') /* 
[63:0] A:R:G:B 16:16:16:16 little endian */
+#define DRM_FORMAT_ABGR16161616	fourcc_code('A', 'B', '4', '8') /* 
[63:0] A:B:G:R 16:16:16:16 little endian */
+
 /*
  * Floating point 64bpp RGB
  * IEEE 754-2008 binary16 half-precision float
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-03-19 Thread Mario Kleiner
Hi,

this patch series adds the fourccs for 16 bit fixed point unorm
framebuffers to the core, and then an implementation for AMD GPUs
with DisplayCore.

This is intended to allow for pageflipping to, and direct scanout of,
Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
for swapchains, mapping to DRM_FORMAT_XBGR16161616:
Link: 
https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4

My main motivation for this is squeezing every bit of precision
out of the hardware for scientific and medical research applications,
where fp16 in the unorm range is limited to ~11 bpc effective linear
precision in the upper half [0.5;1.0] of the unorm range, although
the hardware could do at least 12 bpc.

It has been successfully tested on AMD RavenRidge (DCN-1), and with
Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
(DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
on my hw, both running at 10 bpc DP output depth.

Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
Apple Retina panel), all running at 10 bpc output depth.

No malfunctions, visual artifacts or other oddities were observed
(apart from an adventurous mess of cables and adapters on my desk),
suggesting it works.

I used my automatic photometer measurement procedure to verify the
effective output precision of 10 bpc DP native signal + spatial
dithering in the gpu as enabled by the amdgpu driver. Results show
the expected 12 bpc precision I hoped for -- the current upper limit
for AMD display hw afaik.

So it seems to work in the way I hoped :).

Some open questions wrt. AMD DC, to be addressed in this patch series,
or follow-up patches if necessary:

- For the atomic check for plane scaling, the current patch will
apply the same hw limits as for other rgb fixed point fb's, e.g.,
for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
limits, because this is also a 64 bpp format? Or something new
entirely?

- I haven't added the new fourcc to the DCC tables yet. Should I?

- I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
It looks to me as if that assert was inconsistent with other places
in the driver where COLOR_DEPTH121212 is supported, and looking at
the code, the change seems harmless. At least on DCE-11.2 the change
didn't cause any noticeable (by myself) or measurable (by my equipment)
problems on any of the 3 connected displays.

- Related to that change, while I needed to increase lb pixelsize to 36bpp
to get > 10 bpc effective precision on DCN, I didn't need to do that
on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
to get > 10 bpc precision for fp16 framebuffers, so something seems to
behave differently for floating point 16 vs. fixed point 16. This all
seems to suggest one could leave lb pixelsize at the old 30 bpp value
on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
to avoid the changes of patch 4/5.

Thanks,
-mario




Re: [PATCH] drm/amd/display: Allow spatial dither to 10 bpc on all DCE

2021-02-18 Thread Mario Kleiner
Your v2 has my
Reviewed-by: Mario Kleiner 

thanks,
-mario

On Wed, Feb 17, 2021 at 11:51 PM Alex Deucher  wrote:

> From: Mario Kleiner 
>
> Spatial dithering to 10 bpc depth was disabled for all DCE's.
>
> Testing on DCE-8.3 and DCE-11.2 did not show any obvious ill
> effects, but a measureable precision improvement (via colorimeter)
> when displaying a fp16 framebuffer to a 10 bpc DP or HDMI connected
> HDR-10 monitor.
>
> v2: enable it for all DCEs (Alex)
>
> Signed-off-by: Mario Kleiner 
> Cc: Alex Deucher 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/display/dc/dce/dce_opp.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> index 4600231da6cb..895b015b02e8 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> @@ -216,9 +216,7 @@ static void set_spatial_dither(
> REG_UPDATE(FMT_BIT_DEPTH_CONTROL,
> FMT_TEMPORAL_DITHER_EN, 0);
>
> -   /* no 10bpc on DCE11*/
> -   if (params->flags.SPATIAL_DITHER_ENABLED == 0 ||
> -   params->flags.SPATIAL_DITHER_DEPTH == 2)
> +   if (params->flags.SPATIAL_DITHER_ENABLED == 0)
> return;
>
> /* only use FRAME_COUNTER_MAX if frameRandom == 1*/
> --
> 2.29.2
>
>


Re: [PATCH] drm/amd/display: Allow spatial dither to 10 bpc on all != DCE-11.0.

2021-02-15 Thread Mario Kleiner
On Mon, Feb 15, 2021 at 8:09 PM Alex Deucher  wrote:

> On Fri, Feb 12, 2021 at 5:30 PM Mario Kleiner
>  wrote:
> >
> > Spatial dithering to 10 bpc depth was disabled for all DCE's.
> > Restrict this to DCE-11.0, but allow it on other DCE's.
> >
> > Testing on DCE-8.3 and DCE-11.2 did not show any obvious ill
> > effects, but a measureable precision improvement (via colorimeter)
> > when displaying a fp16 framebuffer to a 10 bpc DP or HDMI connected
> > HDR-10 monitor.
> >
> > Alex suggests this may have been a workaround for some DCE-11.0
> > Carrizo and Stoney Asics, so lets try to restrict this to DCE 11.0.
> >
> > Signed-off-by: Mario Kleiner 
> > Cc: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/display/dc/dce/dce_opp.c | 9 ++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> > index 4600231da6cb..4ed886cdb8d8 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
> > @@ -216,9 +216,12 @@ static void set_spatial_dither(
> > REG_UPDATE(FMT_BIT_DEPTH_CONTROL,
> > FMT_TEMPORAL_DITHER_EN, 0);
> >
> > -   /* no 10bpc on DCE11*/
> > -   if (params->flags.SPATIAL_DITHER_ENABLED == 0 ||
> > -   params->flags.SPATIAL_DITHER_DEPTH == 2)
> > +   if (params->flags.SPATIAL_DITHER_ENABLED == 0)
> > +   return;
> > +
> > +   /* No dithering to 10 bpc on DCE-11.0 */
> > +   if (params->flags.SPATIAL_DITHER_DEPTH == 2 &&
> > +   opp110->base.ctx->dce_version == DCE_VERSION_11_0)
> > return;
>
> I'm inclined to just remove this check altogether.  This is just the
> dithering control.  I think the limitations are more around the
> formats (e.g., FP formats) than the dithering.
>
> Alex
>
>
Certainly no objections from myself.
-mario



>
> >
> > /* only use FRAME_COUNTER_MAX if frameRandom == 1*/
> > --
> > 2.25.1
> >
>


Re: Why no spatial dithering to 10 bit depth on DCE?

2021-02-12 Thread Mario Kleiner
On Fri, Feb 12, 2021 at 6:32 AM Alex Deucher  wrote:

> On Thu, Feb 11, 2021 at 4:12 PM Mario Kleiner
>  wrote:
> >
> > On Wed, Feb 10, 2021 at 10:36 PM Alex Deucher 
> wrote:
> >>
> >> On Wed, Feb 10, 2021 at 4:08 PM Mario Kleiner
> >>  wrote:
> >> >
> >> > Resending this one as well, in the hope of some clarification or
> background information.
> >> >
> >>
> >
> > Thanks Alex.
> >
> >> I suspect this may have been a limitation from DCE11.0 (E.g.,
> >> carrizo/stoney APUs).  They had some bandwidth limitations with
> >> respect to high bit depth IIRC.  I suspect it should be fine on the
> >> relevant dGPUs.  The code was probably originally added for the APUs
> >
> >
> > That sounds as if it would make sense for me to try to submit a patch to
> you that restricts this limitation to DCE 11.0 only?
>
> I suspect older DCE 8.x APUs have similar limitations.  Although it
> may only be an issue with multiple monitors or something like that.  I
> don't remember the details.  @Harry Wentland do you remember?
>
> >
>

Fwiw, the tested DCE-8.3 was a mullins APU, at least that one was fine,
although only testable with 10 bpc HDMI + 6 bpc eDP, the only available
outputs.

I just sent out a patch to restrict the dithering restriction to DCE-11.0.
Successfully retested on DCE-11.2 via DP for extra care.

Have a nice weekend,
-mario


> All i can say is during my testing with DCE-8.3 over HDMI and DCE-11.2
> over DP under amdvlk with fp16 mode and ouptut_bpc set to 10 bpc, ie.
> dithering down from 12 bpc to 10 bpc, i didn't notice any problems when
> hacking this out, and photometer measurements showed good improvements of
> luminance reproduction with dithering.
> >
> >> and just never updated or the changes were accidentally lost when we
> >> consolidated the DCE code.  Unfortunately, there are not a lot of apps
> >> that work properly in Linux with >8 bits per channel.
> >>
> >
> > Mine does ;-). As does apparently the Kodi media player. And at least
> Gnome/X11 works now, whereas KDE's Kwin/X11 used to work nicely, but
> regressed. And amdvlk does have fp16 support now since a while ago, so
> that's one way to get high precision without disturbing conventional
> desktop apps. I'll probably look into Mesa's Vulkan/WSI for 10 bpc / fp16
> sometime this year if nobody beats me to it.
> >
>
> Sounds good.
>
> Alex
>
> > -mario
> >
> >
> >> Alex
> >>
> >>
> >> > Thanks,
> >> > -mario
> >> >
> >> > On Mon, Jan 25, 2021 at 3:56 AM Mario Kleiner <
> mario.kleiner...@gmail.com> wrote:
> >> >>
> >> >> Hi Harry and Nicholas,
> >> >>
> >> >> I'm still on an extended quest to squeeze as much HDR out of Linux +
> your hw as possible, although the paid contract with Vesa has officially
> ended by now, and stumbled over this little conundrum:
> >> >>
> >> >> DC's set_spatial_dither() function (see link below) has this
> particular comment:
> >> >> "/* no 10bpc on DCE11*/" followed by code that skips dithering setup
> if the target output depth is 10 bpc:
> >> >>
> >> >>
> https://elixir.bootlin.com/linux/v5.11-rc4/source/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c#L219
> >> >>
> >> >> I couldn't find any hint in the commit messages towards the reason,
> so why is that?
> >> >>
> >> >> This gets in the way if one has a HDR-10 monitor with 10 bit native
> output depth connected and wants to output a fp16 framebuffer and retain
> some of the > 10 bit linear precision by use of spatial dithering down to
> 10 bit. One only gets the same precision as a 10 bpc unorm fb. Also the
> routine is called for all DCE's, not only DCE11, so it affects all of them.
> >> >>
> >> >> The same restrictions don't exist in the old kms code for amdgpu-kms
> and radeon-kms. I added a mmio hack to Psychtoolbox to go behind the
> drivers back and hack the FMT_BIT_DEPTH_CONTROL register to use spatial
> dithering down to 10 bpc anyway to circumvent this limitation. My
> photometer measurements on fp16 framebuffers feeding into 10 bit output
> show that I get a nice looking response consistent with dithering to 10 bpc
> properly working on DCE. Also i don't see any visual artifacts in displayed
> pictures, so the hw seems to be just fine. This on DCE-11.2, and more
> lightly tested on DCE-8.3.
> >> >>
> >> >> So i wonder if this is some leftover from some hw bringup, or if
> there is a good reason for it being there? Maybe it could be removed or
> made more specific to some problematic asic?
> >> >>
> >> >> Thanks for any insights you could provide. Stay safe,
> >> >> -mario
> >> >>
> >> > ___
> >> > amd-gfx mailing list
> >> > amd-gfx@lists.freedesktop.org
> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>


[PATCH] drm/amd/display: Allow spatial dither to 10 bpc on all != DCE-11.0.

2021-02-12 Thread Mario Kleiner
Spatial dithering to 10 bpc depth was disabled for all DCE's.
Restrict this to DCE-11.0, but allow it on other DCE's.

Testing on DCE-8.3 and DCE-11.2 did not show any obvious ill
effects, but a measurable precision improvement (via colorimeter)
when displaying a fp16 framebuffer to a 10 bpc DP or HDMI connected
HDR-10 monitor.

Alex suggests this may have been a workaround for some DCE-11.0
Carrizo and Stoney ASICs, so let's try to restrict this to DCE 11.0.

Signed-off-by: Mario Kleiner 
Cc: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_opp.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
index 4600231da6cb..4ed886cdb8d8 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c
@@ -216,9 +216,12 @@ static void set_spatial_dither(
REG_UPDATE(FMT_BIT_DEPTH_CONTROL,
FMT_TEMPORAL_DITHER_EN, 0);
 
-   /* no 10bpc on DCE11*/
-   if (params->flags.SPATIAL_DITHER_ENABLED == 0 ||
-   params->flags.SPATIAL_DITHER_DEPTH == 2)
+   if (params->flags.SPATIAL_DITHER_ENABLED == 0)
+   return;
+
+   /* No dithering to 10 bpc on DCE-11.0 */
+   if (params->flags.SPATIAL_DITHER_DEPTH == 2 &&
+   opp110->base.ctx->dce_version == DCE_VERSION_11_0)
return;
 
/* only use FRAME_COUNTER_MAX if frameRandom == 1*/
-- 
2.25.1



Re: Why no spatial dithering to 10 bit depth on DCE?

2021-02-11 Thread Mario Kleiner
On Wed, Feb 10, 2021 at 10:36 PM Alex Deucher  wrote:

> On Wed, Feb 10, 2021 at 4:08 PM Mario Kleiner
>  wrote:
> >
> > Resending this one as well, in the hope of some clarification or
> background information.
> >
>
>
Thanks Alex.

I suspect this may have been a limitation from DCE11.0 (E.g.,
> carrizo/stoney APUs).  They had some bandwidth limitations with
> respect to high bit depth IIRC.  I suspect it should be fine on the
> relevant dGPUs.  The code was probably originally added for the APUs
>

That sounds as if it would make sense for me to try to submit a patch to
you that restricts this limitation to DCE 11.0 only?

All I can say is during my testing with DCE-8.3 over HDMI and DCE-11.2 over
DP under amdvlk with fp16 mode and output_bpc set to 10 bpc, i.e. dithering
down from 12 bpc to 10 bpc, I didn't notice any problems when hacking this
out, and photometer measurements showed good improvements of luminance
reproduction with dithering.

and just never updated or the changes were accidentally lost when we
> consolidated the DCE code.  Unfortunately, there are not a lot of apps
> that work properly in Linux with >8 bits per channel.
>
>
Mine does ;-). As does apparently the Kodi media player. And at least
Gnome/X11 works now, whereas KDE's Kwin/X11 used to work nicely, but
regressed. And amdvlk does have fp16 support now since a while ago, so
that's one way to get high precision without disturbing conventional
desktop apps. I'll probably look into Mesa's Vulkan/WSI for 10 bpc / fp16
sometime this year if nobody beats me to it.

-mario


Alex
>
>
> > Thanks,
> > -mario
> >
> > On Mon, Jan 25, 2021 at 3:56 AM Mario Kleiner <
> mario.kleiner...@gmail.com> wrote:
> >>
> >> Hi Harry and Nicholas,
> >>
> >> I'm still on an extended quest to squeeze as much HDR out of Linux +
> your hw as possible, although the paid contract with Vesa has officially
> ended by now, and stumbled over this little conundrum:
> >>
> >> DC's set_spatial_dither() function (see link below) has this particular
> comment:
> >> "/* no 10bpc on DCE11*/" followed by code that skips dithering setup if
> the target output depth is 10 bpc:
> >>
> >>
> https://elixir.bootlin.com/linux/v5.11-rc4/source/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c#L219
> >>
> >> I couldn't find any hint in the commit messages towards the reason, so
> why is that?
> >>
> >> This gets in the way if one has a HDR-10 monitor with 10 bit native
> output depth connected and wants to output a fp16 framebuffer and retain
> some of the > 10 bit linear precision by use of spatial dithering down to
> 10 bit. One only gets the same precision as a 10 bpc unorm fb. Also the
> routine is called for all DCE's, not only DCE11, so it affects all of them.
> >>
> >> The same restrictions don't exist in the old kms code for amdgpu-kms
> and radeon-kms. I added a mmio hack to Psychtoolbox to go behind the
> drivers back and hack the FMT_BIT_DEPTH_CONTROL register to use spatial
> dithering down to 10 bpc anyway to circumvent this limitation. My
> photometer measurements on fp16 framebuffers feeding into 10 bit output
> show that I get a nice looking response consistent with dithering to 10 bpc
> properly working on DCE. Also i don't see any visual artifacts in displayed
> pictures, so the hw seems to be just fine. This on DCE-11.2, and more
> lightly tested on DCE-8.3.
> >>
> >> So i wonder if this is some leftover from some hw bringup, or if there
> is a good reason for it being there? Maybe it could be removed or made more
> specific to some problematic asic?
> >>
> >> Thanks for any insights you could provide. Stay safe,
> >> -mario
> >>
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>


Re: Why no spatial dithering to 10 bit depth on DCE?

2021-02-10 Thread Mario Kleiner
Resending this one as well, in the hope of some clarification or background
information.

Thanks,
-mario

On Mon, Jan 25, 2021 at 3:56 AM Mario Kleiner 
wrote:

> Hi Harry and Nicholas,
>
> I'm still on an extended quest to squeeze as much HDR out of Linux + your
> hw as possible, although the paid contract with Vesa has officially ended
> by now, and stumbled over this little conundrum:
>
> DC's set_spatial_dither() function (see link below) has this particular
> comment:
> "/* no 10bpc on DCE11*/" followed by code that skips dithering setup if
> the target output depth is 10 bpc:
>
>
> https://elixir.bootlin.com/linux/v5.11-rc4/source/drivers/gpu/drm/amd/display/dc/dce/dce_opp.c#L219
>
> I couldn't find any hint in the commit messages towards the reason, so why
> is that?
>
> This gets in the way if one has a HDR-10 monitor with 10 bit native output
> depth connected and wants to output a fp16 framebuffer and retain some of
> the > 10 bit linear precision by use of spatial dithering down to 10 bit.
> One only gets the same precision as a 10 bpc unorm fb. Also the routine is
> called for all DCE's, not only DCE11, so it affects all of them.
>
> The same restrictions don't exist in the old kms code for amdgpu-kms and
> radeon-kms. I added a mmio hack to Psychtoolbox to go behind the drivers
> back and hack the FMT_BIT_DEPTH_CONTROL register to use spatial dithering
> down to 10 bpc anyway to circumvent this limitation. My photometer
> measurements on fp16 framebuffers feeding into 10 bit output show that I
> get a nice looking response consistent with dithering to 10 bpc properly
> working on DCE. Also i don't see any visual artifacts in displayed
> pictures, so the hw seems to be just fine. This on DCE-11.2, and more
> lightly tested on DCE-8.3.
>
> So i wonder if this is some leftover from some hw bringup, or if there is
> a good reason for it being there? Maybe it could be removed or made more
> specific to some problematic asic?
>
> Thanks for any insights you could provide. Stay safe,
> -mario
>
>


Re: [PATCH 2/2] drm/amd/display: Fix HDMI deep color output for DCE 6-11.

2021-02-10 Thread Mario Kleiner
Ping. Any bit of info appreciated wrt. the DCE-11.2+ situation.
-mario

On Mon, Jan 25, 2021 at 8:24 PM Mario Kleiner 
wrote:

> Thanks Alex and Nicholas! Brings quite a bit of extra shiny to those older
> asics :)
>
> Nicholas, any thoughts on my cover-letter wrt. why a similar patch (that I
> wrote and tested to no good or bad effect) not seem to be needed on DCN,
> and probably not DCE-11.2+ either? Is what is left in DC for those asic's
> just dead code? My Atombios disassembly sort of pointed into that
> direction, but reading disassembly is not easy on the brain, and my brain
> was getting quite mushy towards the end of digging through all the code. So
> some official statement would add peace of mind on my side. Is there a
> certain DCE version at which your team starts validating output precision /
> HDR etc. on hw?
>
> Thanks,
> -mario
>
>
> On Mon, Jan 25, 2021 at 8:16 PM Kazlauskas, Nicholas <
> nicholas.kazlaus...@amd.com> wrote:
>
>> On 2021-01-25 12:57 p.m., Alex Deucher wrote:
>> > On Thu, Jan 21, 2021 at 1:17 AM Mario Kleiner
>> >  wrote:
>> >>
>> >> This fixes corrupted display output in HDMI deep color
>> >> 10/12 bpc mode at least as observed on AMD Mullins, DCE-8.3.
>> >>
>> >> It will hopefully also provide fixes for other DCE's up to
>> >> DCE-11, assuming those will need similar fixes, but i could
>> >> not test that for HDMI due to lack of suitable hw, so viewer
>> >> discretion is advised.
>> >>
>> >> dce110_stream_encoder_hdmi_set_stream_attribute() is used for
>> >> HDMI setup on all DCE's and is missing color_depth assignment.
>> >>
>> >> dce110_program_pix_clk() is used for pixel clock setup on HDMI
>> >> for DCE 6-11, and is missing color_depth assignment.
>> >>
>> >> Additionally some of the underlying Atombios specific encoder
>> >> and pixelclock setup functions are missing code which is in
>> >> the classic amdgpu kms modesetting path and the in the radeon
>> >> kms driver for DCE6/DCE8.
>> >>
>> >> encoder_control_digx_v3() - Was missing setup code wrt. amdgpu
>> >> and radeon kms classic drivers. Added here, but untested due to
>> >> lack of suitable test hw.
>> >>
>> >> encoder_control_digx_v4() - Added missing setup code.
>> >> Successfully tested on AMD mullins / DCE-8.3 with HDMI deep color
>> >> output at 10 bpc and 12 bpc.
>> >>
>> >> Note that encoder_control_digx_v5() has proper setup code in place
>> >> and is used, e.g., by DCE-11.2, but this code wasn't used for deep
>> >> color setup due to the missing cntl.color_depth setup in the calling
>> >> function for HDMI.
>> >>
>> >> set_pixel_clock_v5() - Missing setup code wrt. classic amdgpu/radeon
>> >> kms. Added here, but untested due to lack of hw.
>> >>
>> >> set_pixel_clock_v6() - Missing setup code added. Successfully tested
>> >> on AMD mullins DCE-8.3. This fixes corrupted display output at HDMI
>> >> deep color output with 10 bpc or 12 bpc.
>> >>
>> >> Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
>> >>
>> >> Signed-off-by: Mario Kleiner 
>> >> Cc: Harry Wentland 
>> >
>> > These make sense. I've applied the series.  I'll let the display guys
>> > gauge the other points in your cover letter.
>> >
>> > Alex
>>
>> I don't have any concerns with this patch.
>>
>> Even though it's already applied feel free to have my:
>>
>> Reviewed-by: Nicholas Kazlauskas 
>>
>> Regards,
>> Nicholas Kazlauskas
>>
>> >
>> >
>> >> ---
>> >>   .../drm/amd/display/dc/bios/command_table.c   | 61
>> +++
>> >>   .../drm/amd/display/dc/dce/dce_clock_source.c | 14 +
>> >>   .../amd/display/dc/dce/dce_stream_encoder.c   |  1 +
>> >>   3 files changed, 76 insertions(+)
>> >>
>> >> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
>> b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
>> >> index 070459e3e407..afc10b954ffa 100644
>> >> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
>> >> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
>> >> @@ -245,6 +245,23 @@ static enum bp_result encoder_control_digx_v3(
>> >>  cntl->enable

Re: [PATCH 2/2] drm/amd/display: Fix HDMI deep color output for DCE 6-11.

2021-01-25 Thread Mario Kleiner
Thanks Alex and Nicholas! Brings quite a bit of extra shiny to those older
asics :)

Nicholas, any thoughts on my cover-letter wrt. why a similar patch (that I
wrote and tested to no good or bad effect) does not seem to be needed on DCN,
and probably not DCE-11.2+ either? Is what is left in DC for those ASICs
just dead code? My Atombios disassembly sort of pointed into that
direction, but reading disassembly is not easy on the brain, and my brain
was getting quite mushy towards the end of digging through all the code. So
some official statement would add peace of mind on my side. Is there a
certain DCE version at which your team starts validating output precision /
HDR etc. on hw?

Thanks,
-mario


On Mon, Jan 25, 2021 at 8:16 PM Kazlauskas, Nicholas <
nicholas.kazlaus...@amd.com> wrote:

> On 2021-01-25 12:57 p.m., Alex Deucher wrote:
> > On Thu, Jan 21, 2021 at 1:17 AM Mario Kleiner
> >  wrote:
> >>
> >> This fixes corrupted display output in HDMI deep color
> >> 10/12 bpc mode at least as observed on AMD Mullins, DCE-8.3.
> >>
> >> It will hopefully also provide fixes for other DCE's up to
> >> DCE-11, assuming those will need similar fixes, but i could
> >> not test that for HDMI due to lack of suitable hw, so viewer
> >> discretion is advised.
> >>
> >> dce110_stream_encoder_hdmi_set_stream_attribute() is used for
> >> HDMI setup on all DCE's and is missing color_depth assignment.
> >>
> >> dce110_program_pix_clk() is used for pixel clock setup on HDMI
> >> for DCE 6-11, and is missing color_depth assignment.
> >>
> >> Additionally some of the underlying Atombios specific encoder
> >> and pixelclock setup functions are missing code which is in
> >> the classic amdgpu kms modesetting path and the in the radeon
> >> kms driver for DCE6/DCE8.
> >>
> >> encoder_control_digx_v3() - Was missing setup code wrt. amdgpu
> >> and radeon kms classic drivers. Added here, but untested due to
> >> lack of suitable test hw.
> >>
> >> encoder_control_digx_v4() - Added missing setup code.
> >> Successfully tested on AMD mullins / DCE-8.3 with HDMI deep color
> >> output at 10 bpc and 12 bpc.
> >>
> >> Note that encoder_control_digx_v5() has proper setup code in place
> >> and is used, e.g., by DCE-11.2, but this code wasn't used for deep
> >> color setup due to the missing cntl.color_depth setup in the calling
> >> function for HDMI.
> >>
> >> set_pixel_clock_v5() - Missing setup code wrt. classic amdgpu/radeon
> >> kms. Added here, but untested due to lack of hw.
> >>
> >> set_pixel_clock_v6() - Missing setup code added. Successfully tested
> >> on AMD mullins DCE-8.3. This fixes corrupted display output at HDMI
> >> deep color output with 10 bpc or 12 bpc.
> >>
> >> Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
> >>
> >> Signed-off-by: Mario Kleiner 
> >> Cc: Harry Wentland 
> >
> > These make sense. I've applied the series.  I'll let the display guys
> > gauge the other points in your cover letter.
> >
> > Alex
>
> I don't have any concerns with this patch.
>
> Even though it's already applied feel free to have my:
>
> Reviewed-by: Nicholas Kazlauskas 
>
> Regards,
> Nicholas Kazlauskas
>
> >
> >
> >> ---
> >>   .../drm/amd/display/dc/bios/command_table.c   | 61 +++
> >>   .../drm/amd/display/dc/dce/dce_clock_source.c | 14 +
> >>   .../amd/display/dc/dce/dce_stream_encoder.c   |  1 +
> >>   3 files changed, 76 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> >> index 070459e3e407..afc10b954ffa 100644
> >> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> >> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> >> @@ -245,6 +245,23 @@ static enum bp_result encoder_control_digx_v3(
> >>  cntl->enable_dp_audio);
> >>  params.ucLaneNum = (uint8_t)(cntl->lanes_number);
> >>
> >> +   switch (cntl->color_depth) {
> >> +   case COLOR_DEPTH_888:
> >> +   params.ucBitPerColor = PANEL_8BIT_PER_COLOR;
> >> +   break;
> >> +   case COLOR_DEPTH_101010:
> >> +   params.ucBitPerColor = PANEL_10BIT_PER_COLOR;
> >> +   break;
> >> +   case COLOR_DEPTH_1212

Re: Enable fp16 display support for DCE8+, next try.

2021-01-20 Thread Mario Kleiner
On Mon, Jan 4, 2021 at 6:16 PM Alex Deucher  wrote:
>
> On Mon, Dec 28, 2020 at 1:51 PM Mario Kleiner
>  wrote:
> >
> > Hi and happy post-christmas!
> >
> > I wrote a patch 1/1 that now checks plane scaling factors against
> > the pixel-format specific limits in the asic specific dc_plane_cap
> > structures during atomic check and other appropriate places.
> >
> > This should prevent things like asking for scaling on fp16 framebuffers
> > if the hw can't do that. Hopefully this will now allow to safely enable
> > fp16 scanout also on older asic's like DCE-11.0, DCE-10 and DCE-8.
> > Patch 2/2 enables those DCE's now for fp16.
> >
> > I used some quickly hacked up of IGT test kms_plane_scaling, manually
> > hacking the src fb size to make sure the patch correctly accepts or
> > rejects atomic commits based on allowable scaling factors for rgbx/a
> > 8 bit, 10, and fp16.
> >
> > This fp16 support has been successfully tested with a Sea Islands /
> > DCE-8 laptop. I also confirmed that at least basic HDR signalling
> > over HDMI works for that DCE-8 machine with a HDR monitor. For this
> > i used the amdvlk driver which exposes fp16 since a while on supported
> > hw.
>
> Patches look good to me, but I'd like to get some feedback from the
> display folks as well.
>
> >
> > There are other bugs in DC wrt. DCE-8 though, which didn't prevent
> > my testing, but may be worth looking into. My DCE-8 machine scrambles
> > the video output picture somewhat under Vulkan (radv and admvlk) if the
> > output signal precision isn't 8 bpc, ie. on 6 bpc (eDP laptop panel)
> > and 10 bpc, 12 bpc (HDMI deep color on external HDR monitor).
> >
> > Another fun thing is getting a black screen if DC is enabled on at least
> > Linux 5.10+ (but not if i use the classic kms code in amdgpu-kms). If
> > i recompile the driver with a Ubuntu kconfig for Linux 5.9, the 5.10
> > kernel works, and the only obvious DC related difference is that DC's
> > new SI / DCE-6 asic support is disabled at compile time.
>
> Fixed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bdeff12a96c9a5da95c8d11fefd145eb165e32a
> Patch should be in stable for 5.10 as well.

Yes, in recent 5.10 stable these fix the problem I experienced.

Thanks Alex,
-mario


>
> Alex


[PATCH 2/2] drm/amd/display: Fix HDMI deep color output for DCE 6-11.

2021-01-20 Thread Mario Kleiner
This fixes corrupted display output in HDMI deep color
10/12 bpc mode at least as observed on AMD Mullins, DCE-8.3.

It will hopefully also provide fixes for other DCE's up to
DCE-11, assuming those will need similar fixes, but I could
not test that for HDMI due to lack of suitable hw, so viewer
discretion is advised.

dce110_stream_encoder_hdmi_set_stream_attribute() is used for
HDMI setup on all DCE's and is missing color_depth assignment.

dce110_program_pix_clk() is used for pixel clock setup on HDMI
for DCE 6-11, and is missing color_depth assignment.

Additionally some of the underlying Atombios specific encoder
and pixelclock setup functions are missing code which is in
the classic amdgpu kms modesetting path and in the radeon
kms driver for DCE6/DCE8.

encoder_control_digx_v3() - Was missing setup code compared to the
classic amdgpu and radeon kms drivers. Added here, but untested due
to lack of suitable test hw.

encoder_control_digx_v4() - Added missing setup code.
Successfully tested on AMD Mullins / DCE-8.3 with HDMI deep color
output at 10 bpc and 12 bpc.

Note that encoder_control_digx_v5() has proper setup code in place
and is used, e.g., by DCE-11.2, but this code wasn't used for deep
color setup due to the missing cntl.color_depth setup in the calling
function for HDMI.

set_pixel_clock_v5() - Missing setup code wrt. classic amdgpu/radeon
kms. Added here, but untested due to lack of hw.

set_pixel_clock_v6() - Missing setup code added. Successfully tested
on AMD Mullins / DCE-8.3. This fixes corrupted display output for HDMI
deep color output with 10 bpc or 12 bpc.

Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")

Signed-off-by: Mario Kleiner 
Cc: Harry Wentland 
---
 .../drm/amd/display/dc/bios/command_table.c   | 61 +++
 .../drm/amd/display/dc/dce/dce_clock_source.c | 14 +
 .../amd/display/dc/dce/dce_stream_encoder.c   |  1 +
 3 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
index 070459e3e407..afc10b954ffa 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
@@ -245,6 +245,23 @@ static enum bp_result encoder_control_digx_v3(
cntl->enable_dp_audio);
params.ucLaneNum = (uint8_t)(cntl->lanes_number);
 
+   switch (cntl->color_depth) {
+   case COLOR_DEPTH_888:
+   params.ucBitPerColor = PANEL_8BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_101010:
+   params.ucBitPerColor = PANEL_10BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_121212:
+   params.ucBitPerColor = PANEL_12BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_161616:
+   params.ucBitPerColor = PANEL_16BIT_PER_COLOR;
+   break;
+   default:
+   break;
+   }
+
if (EXEC_BIOS_CMD_TABLE(DIGxEncoderControl, params))
result = BP_RESULT_OK;
 
@@ -274,6 +291,23 @@ static enum bp_result encoder_control_digx_v4(
cntl->enable_dp_audio));
params.ucLaneNum = (uint8_t)(cntl->lanes_number);
 
+   switch (cntl->color_depth) {
+   case COLOR_DEPTH_888:
+   params.ucBitPerColor = PANEL_8BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_101010:
+   params.ucBitPerColor = PANEL_10BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_121212:
+   params.ucBitPerColor = PANEL_12BIT_PER_COLOR;
+   break;
+   case COLOR_DEPTH_161616:
+   params.ucBitPerColor = PANEL_16BIT_PER_COLOR;
+   break;
+   default:
+   break;
+   }
+
if (EXEC_BIOS_CMD_TABLE(DIGxEncoderControl, params))
result = BP_RESULT_OK;
 
@@ -1057,6 +1091,19 @@ static enum bp_result set_pixel_clock_v5(
 * driver choose program it itself, i.e. here we program it
 * to 888 by default.
 */
+   if (bp_params->signal_type == SIGNAL_TYPE_HDMI_TYPE_A)
+   switch (bp_params->color_depth) {
+   case TRANSMITTER_COLOR_DEPTH_30:
+   /* yes this is correct, the atom define is wrong */
+   clk.sPCLKInput.ucMiscInfo |= PIXEL_CLOCK_V5_MISC_HDMI_32BPP;
+   break;
+   case TRANSMITTER_COLOR_DEPTH_36:
+   /* yes this is correct, the atom define is wrong */
+   clk.sPCLKInput.ucMiscInfo |= PIXEL_CLOCK_V5_MISC_HDMI_30BPP;
+   break;
+   default:
+   break;
+   }
 
if (EXEC_BIOS_CMD_TABLE(SetPixelClock, cl

[PATCH 1/2] drm/amd/display: Fix 10/12 bpc setup in DCE output bit depth reduction.

2021-01-20 Thread Mario Kleiner
In set_clamp(), the clamp values for the COLOR_DEPTH_101010 and
COLOR_DEPTH_121212 cases directly contradict the code comment which
explains how this should work, whereas the COLOR_DEPTH_888 case is
consistent with it. The comment says the bitmask should align the
top-most 10 or 12 MSBs on a 14-bit bus, but the implementation
contradicts that: the 10-bit case sets a mask for 12 bpc clamping,
whereas the 12-bit case sets a mask for 14 bpc clamping.

Note that during my limited testing on DCE-8.3 (HDMI deep color)
and DCE-11.2 (DP deep color), this didn't have any obvious ill
effects, nor did fixing it change anything obvious for the better,
so this fix may be inconsequential on DCE and mainly reduces the
confusion of innocent bystanders reading the code while
investigating problems with 10 bpc+ output.

Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")

Signed-off-by: Mario Kleiner 
Cc: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
index 130a0a0c8332..68028ec995e7 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c
@@ -601,12 +601,12 @@ static void set_clamp(
clamp_max = 0x3FC0;
break;
case COLOR_DEPTH_101010:
-   /* 10bit MSB aligned on 14 bit bus '11   1100' */
-   clamp_max = 0x3FFC;
+   /* 10bit MSB aligned on 14 bit bus '11   ' */
+   clamp_max = 0x3FF0;
break;
case COLOR_DEPTH_121212:
-   /* 12bit MSB aligned on 14 bit bus '11   ' */
-   clamp_max = 0x3FFF;
+   /* 12bit MSB aligned on 14 bit bus '11   1100' */
+   clamp_max = 0x3FFC;
break;
default:
clamp_max = 0x3FC0;
-- 
2.25.1



Some HDMI deep color output fixes for DC on DCE 6-11

2021-01-20 Thread Mario Kleiner
Hi,

these two patches fix non-working HDMI deep color output on DCE-8.3
(AMD Mullins) when amdgpu-kms is used with Display Core force-enabled,
i.e., with radeon.cik_support=0 amdgpu.cik_support=1 amdgpu.dc=1.
I suspect they might fix similar problems on other older ASICs of
DCE-11.0 and earlier.
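For reference, a hedged sketch of how one might set the force-enable parameters above persistently on a GRUB-based distribution (file path and update command are the usual Debian/Ubuntu defaults; adjust for your distribution):

```shell
# Sketch: hand a CIK (Sea Islands) GPU to amdgpu with Display Core
# force-enabled, as used for the tests described above. Append the
# parameters to the kernel command line in /etc/default/grub, then
# regenerate the grub config and reboot.
GRUB_CMDLINE_LINUX_DEFAULT="radeon.cik_support=0 amdgpu.cik_support=1 amdgpu.dc=1"
# sudo update-grub && sudo reboot
```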

Patch 1/2 is a fix for some oddity I found while hunting for the
HDMI deep color bug. It fixes what looks like an obvious mistake,
but the fix did not improve or degrade anything, so maybe the hw
doesn't care too much about the wrong clamping/truncation mask?
Anyway, it makes the code less confusing.

Patch 2/2 fixes HDMI deep color output at 10 bpc or 12 bpc output
on AMD Mullins, DCE-8.3, where at output_bpc 10 or 12 the display
would be scrambled. With the patch, the display looks correct, and
photometer measurements on an HDR-10 monitor suggest we probably get
the correct output signal. I found the fix by comparing DC against
the classic amdgpu-kms and radeon-kms modesetting paths and re-adding
the missing pieces.

Given that encoder/pixel clock setup functions other than the ones
used on DCE-8.3 showed the same omissions, I added the missing code
there as well, but I couldn't test it due to lack of hw, so I hope it
fixes rather than breaks things on ASICs other than DCE-8.3.

I also created a similar patch for DCE-11.2 and later, not included
here, but during testing on a Raven Ridge DCN-1, the patch neither
helped nor hurt. Output was correct without the patch, and adding the
patch didn't change or break anything on DCN-1. Looking at disassembled
AtomBios tables for DCN-1 and a DCE-11.2, I think AtomBios may not do
much with the info that was missing, which would explain why the
current upstream code seems to work fine without it, at least as
verified on DCN-1. I can't test on DCE-11.2 or DCE-12 due to lack
of hw with actual HDMI output. But it would be interesting for me to
know what changed wrt. AtomBios in later ASIC versions to make some
of this setup apparently redundant in DC.

Do you test DC wrt. HDMI deep color starting at a specific DCE
revision, given that the bug went unnoticed in DCE-8.3, but things
seem to be fine on at least DCN-1?

Thanks,
-mario




Re: [PATCH] drm: Check actual format for legacy pageflip.

2021-01-07 Thread Mario Kleiner
On Thu, Jan 7, 2021 at 7:04 PM Daniel Vetter  wrote:

> On Thu, Jan 7, 2021 at 7:00 PM Mario Kleiner 
> wrote:
> >
> > On Thu, Jan 7, 2021 at 6:57 PM Daniel Vetter  wrote:
> >>
> >> On Sat, Jan 02, 2021 at 04:31:36PM +0100, Mario Kleiner wrote:
> >> > On Sat, Jan 2, 2021 at 3:02 PM Bas Nieuwenhuizen
> >> >  wrote:
> >> > >
> >> > > With modifiers one can actually have different format_info structs
> >> > > for the same format, which now matters for AMDGPU since we convert
> >> > > implicit modifiers to explicit modifiers with multiple planes.
> >> > >
> >> > > I checked other drivers and it doesn't look like they end up
> triggering
> >> > > this case so I think this is safe to relax.
> >> > >
> >> > > Signed-off-by: Bas Nieuwenhuizen 
> >> > > Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted metadata.")
> >> > > ---
> >> > >  drivers/gpu/drm/drm_plane.c | 2 +-
> >> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> >> > > index e6231947f987..f5085990cfac 100644
> >> > > --- a/drivers/gpu/drm/drm_plane.c
> >> > > +++ b/drivers/gpu/drm/drm_plane.c
> >> > > @@ -1163,7 +1163,7 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
> >> > > if (ret)
> >> > > goto out;
> >> > >
> >> > > -   if (old_fb->format != fb->format) {
> >> > > +   if (old_fb->format->format != fb->format->format) {
> >> >
> >> > This was btw. the original way before Ville made it more strict about
> >> > 4 years ago, to catch issues related to tiling, and more complex
> >> > layouts, like the dcc tiling/retiling introduced by your modifier
> >> > patches. That's why I hope my alternative patch is a good solution for
> >> > atomic drivers while keeping the strictness for potential legacy
> >> > drivers.
> >>
> >> Yeah this doesn't work in full generality, because hw might need to do a
> >> full modeset to reallocate resources (like scanout
> >> fifo space) if the format changes.
> >>
> >> But for atomic drivers that should be caught in ->atomic_check, which
> >> should result in -EINVAL, so should do the right thing. So it should be
> >> all good, but imo needs a comment to explain what's going on:
> >>
> >> /*
> >>  * Only check the FOURCC format code, excluding modifiers. This is
> >>  * enough for all legacy drivers. Atomic drivers have their own
> >>  * checks in their ->atomic_check implementation, which will
> >>  * return -EINVAL if any hw or driver constraint is violated due
> >>  * to modifier changes.
> >>  */
> >>
> >> Also can you pls cc: intel-gfx to get this vetted by the intel-gfx ci?
> >>
> >> With that:
> >>
> >> Reviewed-by: Daniel Vetter 
> >>
> >
> > Ah, my "atomic expert", posting simultaneously with myself :). Happy new
> > year. Opinions on my variant, just replied a minute ago?
>
> Full disclosure, Ville wanted to do something similar since forever.
> I'm not a huge fan of removing limitations of legacy ioctls. Worst
> case we break something, best case no gain in features since why don't
> you just use atomic. Since this (amdgpu modifiers) broke something we
> have to fix it, hence I'd go with the more minimal version from Bas
> here.
>
>
Fair point. It means, though, that somebody will have to convert many
user-space clients, e.g., all OSS Vulkan drivers. And XOrg could not do
that, as the kernel uABI even blocks use of atomic
drmSetClientCap(...ATOMIC...) for any process whose task name starts with
'X', as a workaround for a modesetting-ddx with a broken atomic
implementation. So at least for (pun ahead) "X" applications, atomic
modesetting is not an option.

For my use cases, X11/XOrg native is still the only display server capable
enough to fulfill the needs, although I'm mixing in a bit of
Vulkan/WSI/DirectDisplay for direct DRM/KMS access to work around some
limitations, e.g., to get HDR or fp16 support.

But in general your patch should be correct too.
> -Daniel
>
>
Thanks for the feedback. I rest my case.
-mario


> >
> > thanks,
> > -mario
> >
> >> >
> >> > -mario
> >> >
> >> > > DRM_DEBUG_KMS("Page flip is not allowed to change frame buffer format.\n");
> >> > > ret = -EINVAL;
> >> > > goto out;
> >> > > --
> >> > > 2.29.2
> >> > >
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>


Re: [PATCH] drm: Check actual format for legacy pageflip.

2021-01-07 Thread Mario Kleiner
On Thu, Jan 7, 2021 at 6:57 PM Daniel Vetter  wrote:

> On Sat, Jan 02, 2021 at 04:31:36PM +0100, Mario Kleiner wrote:
> > On Sat, Jan 2, 2021 at 3:02 PM Bas Nieuwenhuizen
> >  wrote:
> > >
> > > With modifiers one can actually have different format_info structs
> > > for the same format, which now matters for AMDGPU since we convert
> > > implicit modifiers to explicit modifiers with multiple planes.
> > >
> > > I checked other drivers and it doesn't look like they end up triggering
> > > this case so I think this is safe to relax.
> > >
> > > Signed-off-by: Bas Nieuwenhuizen 
> > > Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted metadata.")
> > > ---
> > >  drivers/gpu/drm/drm_plane.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> > > index e6231947f987..f5085990cfac 100644
> > > --- a/drivers/gpu/drm/drm_plane.c
> > > +++ b/drivers/gpu/drm/drm_plane.c
> > > @@ -1163,7 +1163,7 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
> > > if (ret)
> > > goto out;
> > >
> > > -   if (old_fb->format != fb->format) {
> > > +   if (old_fb->format->format != fb->format->format) {
> >
> > This was btw. the original way before Ville made it more strict about
> > 4 years ago, to catch issues related to tiling, and more complex
> > layouts, like the dcc tiling/retiling introduced by your modifier
> > patches. That's why I hope my alternative patch is a good solution for
> > atomic drivers while keeping the strictness for potential legacy
> > drivers.
>
> Yeah this doesn't work in full generality, because hw might need to do a
> full modeset to reallocate resources (like scanout
> fifo space) if the format changes.
>
> But for atomic drivers that should be caught in ->atomic_check, which
> should result in -EINVAL, so should do the right thing. So it should be
> all good, but imo needs a comment to explain what's going on:
>
> /*
>  * Only check the FOURCC format code, excluding modifiers. This is
>  * enough for all legacy drivers. Atomic drivers have their own
>  * checks in their ->atomic_check implementation, which will
>  * return -EINVAL if any hw or driver constraint is violated due
>  * to modifier changes.
>  */
>
> Also can you pls cc: intel-gfx to get this vetted by the intel-gfx ci?
>
> With that:
>
> Reviewed-by: Daniel Vetter 
>
>
Ah, my "atomic expert", posting simultaneously with myself :). Happy new
year. Opinions on my variant, just replied a minute ago?

thanks,
-mario

>
> > -mario
> >
> > > DRM_DEBUG_KMS("Page flip is not allowed to change frame buffer format.\n");
> > > ret = -EINVAL;
> > > goto out;
> > > --
> > > 2.29.2
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>


Re: [PATCH] drm: Check actual format for legacy pageflip.

2021-01-07 Thread Mario Kleiner
On Thu, Jan 7, 2021 at 6:21 PM Liu, Zhan  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> > -Original Message-
> > From: Liu, Zhan
> > Sent: 2021/January/06, Wednesday 10:04 AM
> > To: Bas Nieuwenhuizen ; Mario Kleiner
> > 
> > Cc: dri-devel ; amd-gfx list  > g...@lists.freedesktop.org>; Deucher, Alexander
> > ; Daniel Vetter ;
> > Kazlauskas, Nicholas ; Ville Syrjälä
> > 
> > Subject: RE: [PATCH] drm: Check actual format for legacy pageflip.
> >
> >
> > > -Original Message-
> > > From: Liu, Zhan 
> > > Sent: 2021/January/04, Monday 3:46 PM
> > > To: Bas Nieuwenhuizen ; Mario Kleiner
> > > 
> > > Cc: dri-devel ; amd-gfx list  > > g...@lists.freedesktop.org>; Deucher, Alexander
> > > ; Daniel Vetter ;
> > > Kazlauskas, Nicholas ; Ville Syrjälä
> > > 
> > > Subject: Re: [PATCH] drm: Check actual format for legacy pageflip.
> > >
> > >
> > >
> > > + Ville
> > >
> > > On Sat, Jan 2, 2021 at 4:31 PM Mario Kleiner
> > > 
> > > wrote:
> > > >
> > > > On Sat, Jan 2, 2021 at 3:02 PM Bas Nieuwenhuizen
> > > >  wrote:
> > > > >
> > > > > With modifiers one can actually have different format_info structs
> > > > > for the same format, which now matters for AMDGPU since we convert
> > > > > implicit modifiers to explicit modifiers with multiple planes.
> > > > >
> > > > > I checked other drivers and it doesn't look like they end up
> > > > > triggering this case so I think this is safe to relax.
> > > > >
> > > > > Signed-off-by: Bas Nieuwenhuizen 
> > > > > Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted metadata.")
> > > > > ---
> > > > >  drivers/gpu/drm/drm_plane.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> > > > > index e6231947f987..f5085990cfac 100644
> > > > > --- a/drivers/gpu/drm/drm_plane.c
> > > > > +++ b/drivers/gpu/drm/drm_plane.c
> > > > > @@ -1163,7 +1163,7 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
> > > > > if (ret)
> > > > > goto out;
> > > > >
> > > > > -   if (old_fb->format != fb->format) {
> > > > > +   if (old_fb->format->format != fb->format->format) {
> > > >
> > >
> > > I agree with this patch, though considering the original way was made
> > > by Ville, I will wait for Ville's input first. Adding my "Acked-by"
> > > here.
> > >
> > > This patch is:
> > > Acked-by: Zhan Liu 
>
> Since there is no objection from the community on this patch over the
> past few days, and this patch totally makes sense to me, this patch is:
>
> Reviewed-by: Zhan Liu 
>

Well, there is my alternative one-line patch, which is equally simple and
solves the problem in a similar way without undoing Ville's stricter
checks, but it doesn't seem to be getting any attention:

https://lists.freedesktop.org/archives/dri-devel/2021-January/292763.html

Mine keeps the check as strict as before for non-atomic drivers, but
removes the check for atomic drivers, given the assumption that they should
be able to do without it.

In the end, both patches solve the problem in the short term, also
satisfying my users' needs, and the future is unknown. But it would be
nice to get an opinion from an atomic expert on which one is the more
future-proof / elegant / final solution to stick with in the face of
potential future atomic kms drivers.

With that said, I will add to Bas' patch a

Reported-by: Mario Kleiner 
Acked-by: Mario Kleiner 

thanks,
-mario

> >
> > Ping...
> >
> > >
> > > > This was btw. the original way before Ville made it more strict
> > > > about
> > > > 4 years ago, to catch issues related to tiling, and more complex
> > > > layouts, like the dcc tiling/retiling introduced by your modifier
> > > > patches. That's why I hope my alternative patch is a good solution
> > > > for atomic drivers while keeping the strictness for potential legacy
> > > > drivers.
> > > >
> > > > -mario
> > > >
> > > > > DRM_DEBUG_KMS("Page flip is not allowed to change frame buffer format.\n");
> > > > > ret = -EINVAL;
> > > > > goto out;
> > > > > --
> > > > > 2.29.2
> > > > >


Re: [PATCH] drm/amd/display: Fix pageflipping for XOrg in Linux 5.11+

2021-01-02 Thread Mario Kleiner
On Sat, Jan 2, 2021 at 7:51 PM Ilia Mirkin  wrote:

> On Sat, Jan 2, 2021 at 1:35 PM Mario Kleiner 
> wrote:
> > I'm less sure about nouveau. It uses modifiers, but has atomic support
> > only on nv50+ and that atomic support is off by default -- needs a
> > nouveau.nouveau_atomic=1 boot parameter to switch it on. It seems to
> > enable modifier support unconditionally regardless if atomic or not,
> > see:
> >
> https://elixir.bootlin.com/linux/v5.11-rc1/source/drivers/gpu/drm/nouveau/nouveau_display.c#L703
> >
> > Atm. nouveau doesn't assign a new format_info though, so wouldn't
> > trigger this issue atm.
>
> Note that pre-nv50, no modifiers exist. Also,
> drm_drv_uses_atomic_modeset() doesn't care whether the client is an
> atomic client or not. It will return true for nv50+ no matter what.
> nouveau_atomic=1 affects whether atomic UAPI is exposed. Not sure if
> this impacts your discussion.
>
>
Thanks Ilia. So nouveau is fine in any case, as nv50+ implies modifiers
and atomic commit even if atomic UAPI is off. Also,
drm_drv_uses_atomic_modeset() is the right choice, as my patch should
check whether the driver uses atomic commit; it doesn't care about
atomic UAPI or the client being atomic.

-mario


Re: [PATCH] drm/amd/display: Fix pageflipping for XOrg in Linux 5.11+

2021-01-02 Thread Mario Kleiner
On Sat, Jan 2, 2021 at 4:49 PM Bas Nieuwenhuizen
 wrote:
>
> On Sat, Jan 2, 2021 at 4:05 PM Mario Kleiner  
> wrote:
> >
> > On Sat, Jan 2, 2021 at 3:05 PM Bas Nieuwenhuizen
> >  wrote:
> > >
> > > I think the problem here is that application A can set the FB and then
> > > application B can use getfb2 (say ffmpeg).
> >
> >
> > Yes. That, and also the check for 'X' won't get us far, because if I
> > use my own software Psychtoolbox under Vulkan in direct display mode
> > (leased RandR outputs), e.g., under radv or amdvlk, then the ->comm
> > name will be "PTB mainthread" and 'P' != 'X'. But the Vulkan drivers
> > all use legacy pageflips as well in their WSI/drm, so if Vulkan gets
> > framebuffers with DCC modifiers, it would just fail the same way.
> >
> > Neither would it work to check for atomic client, as they sometimes
> > use atomic commit ioctl only for certain use cases, e.g., setting HDR
> > metadata, but still use the legacy pageflip ioctl for flipping.
> >
> > So that patch of mine is not the proper solution for anything non-X.
> >
> > >
> > > https://lists.freedesktop.org/archives/dri-devel/2021-January/292761.html
> > > would be my alternative patch.
> > >
> >
> > I also produced and tested a hopefully better alternative to my original
> > one yesterday, but was too tired to send it. I just sent it out to
> > you:
> > https://lists.freedesktop.org/archives/dri-devel/2021-January/292763.html
> >
> > This one keeps the format_info check as is for non-atomic drivers, but
> > no longer rejects pageflip if the underlying kms driver is atomic. I
> > checked, and current atomic drivers use the drm_atomic... helper for
> > implementing legacy pageflips, and that helper just wraps the pageflip
> > into a "set new fb on plane" + atomic check + atomic commit.
> >
> > My understanding is that one can do these format changes safely under
> > atomic commit, so i hope this would be safe and future proof.
>
> So I think the difference between your patch and mine seem to boil
> down to whether we want any uabi extension, since AFAICT none of the
> pre-atomic drivers support modifiers.
>

That's a point, although the uABI extension would only relax rules
instead of tightening them, so current drm clients would be ok, I guess.

Afaict the current non-atomic modesetting drivers are:

gma500, shmobile, radeon, nouveau, amdgpu non-DC.

gma500, shmobile and radeon don't use modifiers, and probably won't
get any in the future?

Also amdgpu without DC? Atm. you only enable explicit modifiers for >=
FAMILY_AI, i.e., Vega and later, and DC is a requirement for Vega and
later, so modifiers ==> DC ==> atomic.

But some of your code was moved from amdgpu_dm to amdgpu_display
specifically to allow compiling it without DC, and any client I tested
apart from Wayland's Weston (and that only for cursor planes) didn't
use the addfb2 ioctl with modifiers at all, so all of X and Vulkan
currently hit the new convert_tiling_flags_to_modifier() fallback
code that converts old style tiling flags into modifiers. That
fallback path is the reason for triggering this issue in the first
place, as it converts some tiling flags to DCC/DCC-retile modifiers
and therefore changes the format_info.

Modifiers are only enabled if DC is on. So as long as nobody decides
to add modifiers in the legacy non-DC kms path, we'd be ok.

I'm less sure about nouveau. It uses modifiers, but has atomic support
only on nv50+ and that atomic support is off by default -- needs a
nouveau.nouveau_atomic=1 boot parameter to switch it on. It seems to
enable modifier support unconditionally regardless if atomic or not,
see:
https://elixir.bootlin.com/linux/v5.11-rc1/source/drivers/gpu/drm/nouveau/nouveau_display.c#L703

Atm. nouveau doesn't assign a new format_info though, so wouldn't
trigger this issue atm.

So I think the decision is about relaxing uABI a bit with my patch
vs. less future-proofing with your patch?

Atm. both patches would solve the immediate problem, which is very
serious for my users' use cases, so I'd be ok with any of them. I just
don't want this issue to repeat in the future. Tracking it down killed
almost two full days for me, although I involuntarily learned more
about the current state of modifiers in kernel and user space than I
ever thought I wanted to know :/.

-mario



> >
> > > (I'm not good at detecting the effects of tearing, apparently, but
> > > tested that this avoids the pageflip failure via debug prints)
> >
> >
> > XOrg log (e.g., ~/.local/share/xorg/XOrg0.log on current Ubuntu's) is
> > a good place on native XOrg, where the amdgpu-ddx was flooding the log
> > with present flip fa

Re: [PATCH] drm: Check actual format for legacy pageflip.

2021-01-02 Thread Mario Kleiner
On Sat, Jan 2, 2021 at 3:02 PM Bas Nieuwenhuizen
 wrote:
>
> With modifiers one can actually have different format_info structs
> for the same format, which now matters for AMDGPU since we convert
> implicit modifiers to explicit modifiers with multiple planes.
>
> I checked other drivers and it doesn't look like they end up triggering
> this case so I think this is safe to relax.
>
> Signed-off-by: Bas Nieuwenhuizen 
> Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted metadata.")
> ---
>  drivers/gpu/drm/drm_plane.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> index e6231947f987..f5085990cfac 100644
> --- a/drivers/gpu/drm/drm_plane.c
> +++ b/drivers/gpu/drm/drm_plane.c
> @@ -1163,7 +1163,7 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
> if (ret)
> goto out;
>
> -   if (old_fb->format != fb->format) {
> +   if (old_fb->format->format != fb->format->format) {

This was btw. the original way before Ville made it more strict about
4 years ago, to catch issues related to tiling, and more complex
layouts, like the dcc tiling/retiling introduced by your modifier
patches. That's why I hope my alternative patch is a good solution for
atomic drivers while keeping the strictness for potential legacy
drivers.

-mario

> DRM_DEBUG_KMS("Page flip is not allowed to change frame buffer format.\n");
> ret = -EINVAL;
> goto out;
> --
> 2.29.2
>


Re: [PATCH] drm/amd/display: Fix pageflipping for XOrg in Linux 5.11+

2021-01-02 Thread Mario Kleiner
On Sat, Jan 2, 2021 at 3:05 PM Bas Nieuwenhuizen
 wrote:
>
> I think the problem here is that application A can set the FB and then
> application B can use getfb2 (say ffmpeg).


Yes. That, and also the check for 'X' won't get us far, because if I
use my own software Psychtoolbox under Vulkan in direct display mode
(leased RandR outputs), e.g., under radv or amdvlk, then the ->comm
name will be "PTB mainthread" and 'P' != 'X'. But the Vulkan drivers
all use legacy pageflips as well in their WSI/drm, so if Vulkan gets
framebuffers with DCC modifiers, it would just fail the same way.

Neither would it work to check for atomic client, as they sometimes
use atomic commit ioctl only for certain use cases, e.g., setting HDR
metadata, but still use the legacy pageflip ioctl for flipping.

So that patch of mine is not the proper solution for anything non-X.

>
> https://lists.freedesktop.org/archives/dri-devel/2021-January/292761.html
> would be my alternative patch.
>

I also produced and tested a hopefully better alternative to my original
one yesterday, but was too tired to send it. I just sent it out to
you:
https://lists.freedesktop.org/archives/dri-devel/2021-January/292763.html

This one keeps the format_info check as is for non-atomic drivers, but
no longer rejects pageflip if the underlying kms driver is atomic. I
checked, and current atomic drivers use the drm_atomic... helper for
implementing legacy pageflips, and that helper just wraps the pageflip
into a "set new fb on plane" + atomic check + atomic commit.

My understanding is that one can do these format changes safely under
atomic commit, so i hope this would be safe and future proof.

> (I'm not good at detecting the effects of tearing, apparently, but
> tested that this avoids the pageflip failure via debug prints)


The XOrg log (e.g., ~/.local/share/xorg/Xorg.0.log on current Ubuntu
releases) is a good place on native XOrg, where the amdgpu-ddx was
flooding the log with present flip failures. Or use drm.debug=4 for
the kernel log.

Piglit has the OML_sync_control tests for timing correctness, although
they are mostly pointless if not run in fullscreen mode, which they
are not by default.

I can also highly recommend Psychtoolbox (sudo apt install
octave-psychtoolbox-3) on Debian/Ubuntu based systems for X11. It is
used for neuroscience and medical research and critically depends on
properly working pageflips and timestamping on native X11/GLX under
OpenGL and recently also under Vulkan/WSI (radv, anv, amdvlk) in
direct display mode. Working FOSS AMD and Intel drivers are especially
critical for this research, so far under X11+Mesa/OpenGL, but lately
also under Vulkan direct display mode. It has many built-in correctness
tests and will shout angrily if something software-detectable is broken
wrt. pageflipping or timing. E.g., run:
octave-cli --eval PerceptualVBLSyncTest
PerceptualVBLSyncTest creates a flicker pattern that will tear like
crazy under Mesa if pageflipping isn't used. It is also good for
testing synchronization on dual-display setups, e.g., for dual-display
stereo presentation.

I was actually surprised that this patch made it through the various
test suites and into drm-next. I thought page-flipping was covered
well enough somewhere.

Happy new year!
-mario

>
> On Thu, Dec 31, 2020 at 9:52 PM Mario Kleiner
>  wrote:
> >
> > Commit 816853f9dc4057b6c7ee3c45ca9bd5905 ("drm/amd/display: Set new
> > format info for converted metadata.") may fix the getfb2 ioctl, but
> > in exchange it completely breaks all pageflipping for classic user
> > space, e.g., XOrg, as tested with both amdgpu-ddx and modesetting-ddx.
> > This leads to massive tearing, broken visual timing/timestamping etc.
> >
> > Reason is that the classic pageflip ioctl doesn't allow a fb format
> > change during flip, and at least X uses classic pageflip ioctl and no
> > atomic modesetting api at all.
> >
> > As one attempted workaround, only set the new format info for converted
> > metadata if the calling client isn't X. Not sure if this is the best
> > way, or if a better check would not be "not all atomic clients" or
> > similar? In any case it works for XOrg X-Server. Checking the ddx
> > code of intel-ddx/modesetting-ddx/amdgpu-ddx as well as grepping over
> > Mesa doesn't show any users of the getfb2 ioctl(), so the need for this
> > format info assignment seems to be more the exception than the rule?
> >
> > Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted 
> > metadata.")
> > Cc: Bas Nieuwenhuizen 
> > Cc: Alex Deucher 
> > Signed-off-by: Mario Kleiner 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> > b/d

[PATCH] drm: Allow format change during legacy page-flip if driver is atomic.

2021-01-02 Thread Mario Kleiner
This is a slight improvement for legacy flipping, but also an attempted
fix for a bug/regression introduced into Linux 5.11-rc.

Commit 816853f9dc4057b6c7ee3c45ca9bd5905 ("drm/amd/display: Set new
format info for converted metadata.") fixes the getfb2 ioctl, but
in exchange it completely breaks all pageflipping for classic user
space, e.g., XOrg, as tested with both amdgpu-ddx and modesetting-ddx.
This leads to massive tearing, broken visual timing/timestamping, and
an xorg log flooded with error messages, as tested on Ubuntu 20.04.1-LTS
with X-Server 1.20.8, Mesa 20.0.8, amdgpu-ddx 19.1.0 and also with the
modesetting-ddx with/without atomic on an AMD Raven Ridge gpu. Changes
to future Mesa Vulkan drivers beyond 20.0.8 may break (or already have
broken?) page flipping on those as well.

The reason is that the classic pageflip ioctl doesn't allow a fb format
change during flip, and at least X uses classic pageflip ioctl and no
atomic modesetting api for flipping, as do all inspected Vulkan
drivers, e.g., anv, radv, amdvlk. The above commit assigns a new fb->format
for use of (retiling) DCC on AMD gpus for some tiling flags, which
is detected (and rejected) by the pageflip ioctl as a format change.

However, current atomic kms drivers hook up the ->page_flip() driver
function to the atomic helper function drm_atomic_helper_page_flip(),
which implements the legacy flip as an atomic commit. My understanding
is that a format change during flip via such an atomic commit is safe.

Therefore only reject the legacy pageflip ioctl if a fb format change
is requested on a kms driver which isn't DRIVER_ATOMIC.

This makes "legacy" flipping work again on Linux 5.11 with amdgpu-kms
and DisplayCore enabled.

Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted 
metadata.")
Cc: Bas Nieuwenhuizen 
Cc: Alex Deucher 
Cc: David Airlie 
Cc: Ville Syrjälä 
Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/drm_plane.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index e6231947f987..4688360a078d 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -1163,7 +1163,11 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
if (ret)
goto out;
 
-   if (old_fb->format != fb->format) {
+   /*
+* Format change during legacy pageflip only works if page flip is done
+* via an atomic commit, e.g., via drm_atomic_helper_page_flip() helper.
+*/
+   if ((old_fb->format != fb->format) && 
!drm_drv_uses_atomic_modeset(dev)) {
DRM_DEBUG_KMS("Page flip is not allowed to change frame buffer 
format.\n");
ret = -EINVAL;
goto out;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/display: Fix pageflipping for XOrg in Linux 5.11+

2020-12-31 Thread Mario Kleiner
Commit 816853f9dc4057b6c7ee3c45ca9bd5905 ("drm/amd/display: Set new
format info for converted metadata.") may fix the getfb2 ioctl, but
in exchange it completely breaks all pageflipping for classic user
space, e.g., XOrg, as tested with both amdgpu-ddx and modesetting-ddx.
This leads to massive tearing, broken visual timing/timestamping etc.

Reason is that the classic pageflip ioctl doesn't allow a fb format
change during flip, and at least X uses classic pageflip ioctl and no
atomic modesetting api at all.

As one attempted workaround, only set the new format info for converted
metadata if the calling client isn't X. Not sure if this is the best
way, or if a better check would not be "not all atomic clients" or
similar? In any case it works for XOrg X-Server. Checking the ddx
code of intel-ddx/modesetting-ddx/amdgpu-ddx as well as grepping over
Mesa doesn't show any users of the getfb2 ioctl(), so the need for this
format info assignment seems to be more the exception than the rule?

Fixes: 816853f9dc40 ("drm/amd/display: Set new format info for converted 
metadata.")
Cc: Bas Nieuwenhuizen 
Cc: Alex Deucher 
Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index f764803c53a4..cb414b3d327a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -828,7 +828,8 @@ static int convert_tiling_flags_to_modifier(struct 
amdgpu_framebuffer *afb)
if (!format_info)
return -EINVAL;
 
-   afb->base.format = format_info;
+   if (afb->base.comm[0] != 'X')
+   afb->base.format = format_info;
}
}
 
-- 
2.25.1



[PATCH 2/2] drm/amd/display: Enable fp16 also on DCE-8/10/11.

2020-12-28 Thread Mario Kleiner
The hw supports fp16. This is not only useful for HDR,
but also for standard dynamic range displays, because
it allows more precise color reproduction, with
about 11 - 12 bpc linear precision in the unorm range
0.0 - 1.0.
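As a quick illustration of that precision claim (an editor's aside, not part of the patch): IEEE half floats have a 10-bit mantissa, so in the binade [0.5, 1.0) the spacing between adjacent representable values is 2^-11, which is where the ~11-12 bpc figure comes from.

```python
import numpy as np

# fp16 (IEEE 754 half) has 10 explicit mantissa bits. In the binade
# [0.5, 1.0) the spacing between adjacent representable values is
# 2**(-1 - 10) == 2**-11, i.e. roughly 11-12 bits of effective linear
# precision across the unorm 0.0 - 1.0 range (finer below 0.5).
ulp = float(np.spacing(np.float16(0.5)))
assert ulp == 2.0 ** -11
print(f"fp16 ulp at 0.5: {ulp} (~{-np.log2(ulp):.0f} bpc)")
```

Below 0.5 the exponent drops and the spacing halves per binade, so precision only improves toward 0.0; the coarsest step across the unorm range is the one at the top binade computed above.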

Working fp16 scanout+display (and HDR over HDMI) was
verified on a DCE-8 asic, so i assume that the more
recent DCE-10/11 will work equally well, now that
format-specific plane scaling constraints are properly
enforced, e.g., the inability of fp16 to scale on older
hw like DCE-8 to DCE-11.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
index 8ab9d6c79808..f20ed05a5050 100644
--- a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
@@ -385,7 +385,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
index 3f63822b8e28..af208f9bd03b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
@@ -410,7 +410,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
index 390a0fa37239..26fe25caa281 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
@@ -402,7 +402,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
-- 
2.25.1



[PATCH 1/2] drm/amd/display: Check plane scaling against format specific hw plane caps.

2020-12-28 Thread Mario Kleiner
This takes hw constraints specific to pixel formats into account,
e.g., the inability of older hw to scale fp16 format framebuffers.

It should now be safe to enable fp16 formats also on DCE-8,
DCE-10 and DCE-11.0.

Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 81 +--
 1 file changed, 73 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 2c4dbdeec46a..a3745cd8a459 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3759,10 +3759,53 @@ static const struct drm_encoder_funcs 
amdgpu_dm_encoder_funcs = {
 };
 
 
+static void get_min_max_dc_plane_scaling(struct drm_device *dev,
+struct drm_framebuffer *fb,
+int *min_downscale, int *max_upscale)
+{
+   struct amdgpu_device *adev = drm_to_adev(dev);
+   struct dc *dc = adev->dm.dc;
+   /* Caps for all supported planes are the same on DCE and DCN 1 - 3 */
+   struct dc_plane_cap *plane_cap = &dc->caps.planes[0];
+
+   switch (fb->format->format) {
+   case DRM_FORMAT_P010:
+   case DRM_FORMAT_NV12:
+   case DRM_FORMAT_NV21:
+   *max_upscale = plane_cap->max_upscale_factor.nv12;
+   *min_downscale = plane_cap->max_downscale_factor.nv12;
+   break;
+
+   case DRM_FORMAT_XRGB16161616F:
+   case DRM_FORMAT_ARGB16161616F:
+   case DRM_FORMAT_XBGR16161616F:
+   case DRM_FORMAT_ABGR16161616F:
+   *max_upscale = plane_cap->max_upscale_factor.fp16;
+   *min_downscale = plane_cap->max_downscale_factor.fp16;
+   break;
+
+   default:
+   *max_upscale = plane_cap->max_upscale_factor.argb;
+   *min_downscale = plane_cap->max_downscale_factor.argb;
+   break;
+   }
+
+   /*
+* A factor of 1 in the plane_cap means to not allow scaling, ie. use a
+* scaling factor of 1.0 == 1000 units.
+*/
+   if (*max_upscale == 1)
+   *max_upscale = 1000;
+
+   if (*min_downscale == 1)
+   *min_downscale = 1000;
+}
+
+
 static int fill_dc_scaling_info(const struct drm_plane_state *state,
struct dc_scaling_info *scaling_info)
 {
-   int scale_w, scale_h;
+   int scale_w, scale_h, min_downscale, max_upscale;
 
memset(scaling_info, 0, sizeof(*scaling_info));
 
@@ -3794,17 +3837,25 @@ static int fill_dc_scaling_info(const struct 
drm_plane_state *state,
/* DRM doesn't specify clipping on destination output. */
scaling_info->clip_rect = scaling_info->dst_rect;
 
-   /* TODO: Validate scaling per-format with DC plane caps */
+   /* Validate scaling per-format with DC plane caps */
+   if (state->plane && state->plane->dev && state->fb) {
+   get_min_max_dc_plane_scaling(state->plane->dev, state->fb,
+    &min_downscale, &max_upscale);
+   } else {
+   min_downscale = 250;
+   max_upscale = 16000;
+   }
+
scale_w = scaling_info->dst_rect.width * 1000 /
  scaling_info->src_rect.width;
 
-   if (scale_w < 250 || scale_w > 16000)
+   if (scale_w < min_downscale || scale_w > max_upscale)
return -EINVAL;
 
scale_h = scaling_info->dst_rect.height * 1000 /
  scaling_info->src_rect.height;
 
-   if (scale_h < 250 || scale_h > 16000)
+   if (scale_h < min_downscale || scale_h > max_upscale)
return -EINVAL;
 
/*
@@ -6424,12 +6475,26 @@ static void dm_plane_helper_cleanup_fb(struct drm_plane 
*plane,
 static int dm_plane_helper_check_state(struct drm_plane_state *state,
   struct drm_crtc_state *new_crtc_state)
 {
-   int max_downscale = 0;
-   int max_upscale = INT_MAX;
+   struct drm_framebuffer *fb = state->fb;
+   int min_downscale, max_upscale;
+   int min_scale = 0;
+   int max_scale = INT_MAX;
+
+   /* Plane enabled? Get min/max allowed scaling factors from plane caps. 
*/
+   if (fb && state->crtc) {
+   get_min_max_dc_plane_scaling(state->crtc->dev, fb,
+    &min_downscale, &max_upscale);
+   /*
+* Convert to drm convention: 16.16 fixed point, instead of dc's
+* 1.0 == 1000. Also drm scaling is src/dst instead of dc's
+* dst/src, so min_scale = 1.0 / max_upscale, etc.
+*/
+   min_scale = (1000 << 16) / max_upscale;
+   max_scale = (1000 << 16) / min_downscale;
+   }
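The unit conversion in the hunk above (DC's 1.0 == 1000 integer scale vs. DRM's 16.16 fixed point, with the src/dst vs. dst/src inversion) can be sanity-checked outside the kernel. This is a hedged sketch of the arithmetic, not driver code:

```python
def to_drm_scale_limits(min_downscale: int, max_upscale: int):
    """Convert DC scaling caps (1.0 == 1000) to DRM 16.16 fixed point.

    DRM plane scaling is expressed as src/dst while the DC factors are
    dst/src, so min_scale = 1.0 / max_upscale and vice versa, exactly
    as in the patch hunk above.
    """
    min_scale = (1000 << 16) // max_upscale
    max_scale = (1000 << 16) // min_downscale
    return min_scale, max_scale

# Default DC limits: 0.25x max downscale (250), 16x max upscale (16000).
min_scale, max_scale = to_drm_scale_limits(250, 16000)
assert min_scale == (1 << 16) // 16   # 1/16 in 16.16 fixed point
assert max_scale == 4 << 16           # 4.0 in 16.16 fixed point
```

A cap factor of 1000 (i.e., "no scaling allowed") maps to exactly 1 << 16 on both ends, so a plane with such caps can only be displayed unscaled.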

Enable fp16 display support for DCE8+, next try.

2020-12-28 Thread Mario Kleiner
Hi and happy post-christmas!

I wrote a patch 1/1 that now checks plane scaling factors against
the pixel-format specific limits in the asic specific dc_plane_cap
structures during atomic check and other appropriate places.

This should prevent things like asking for scaling on fp16 framebuffers
if the hw can't do that. Hopefully this will now allow safely enabling
fp16 scanout also on older asics like DCE-11.0, DCE-10 and DCE-8.
Patch 2/2 now enables fp16 on those DCEs.

I used a quickly hacked-up version of the IGT test kms_plane_scaling,
manually changing the src fb size to make sure the patch correctly
accepts or rejects atomic commits based on allowable scaling factors
for rgbx/a 8 bit, 10 bit, and fp16.

This fp16 support has been successfully tested with a Sea Islands /
DCE-8 laptop. I also confirmed that at least basic HDR signalling
over HDMI works for that DCE-8 machine with an HDR monitor. For this
i used the amdvlk driver, which has exposed fp16 for a while on
supported hw.

There are other bugs in DC wrt. DCE-8 though, which didn't prevent
my testing, but may be worth looking into. My DCE-8 machine scrambles
the video output picture somewhat under Vulkan (radv and amdvlk) if the
output signal precision isn't 8 bpc, i.e., on 6 bpc (eDP laptop panel)
and 10 bpc, 12 bpc (HDMI deep color on external HDR monitor).

Another fun thing is getting a black screen if DC is enabled on at least
Linux 5.10+ (but not if i use the classic kms code in amdgpu-kms). If
i recompile the driver with a Ubuntu kconfig for Linux 5.9, the 5.10
kernel works, and the only obvious DC related difference is that DC's
new SI / DCE-6 asic support is disabled at compile time.

Thanks,
-mario




[PATCH 1/2] drm/amd/display: Enable fp16 also on DCE-10.

2020-06-06 Thread Mario Kleiner
Assuming the hw supports fp16, this would also be
useful for standard dynamic range displays, not
only for HDR use cases, because it would allow
more precise color reproduction with about
~11 bpc linear precision in the unorm range
0.0 - 1.0.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
index a28c4ae0f259..10bb4e6e7bac 100644
--- a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
@@ -385,7 +385,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
-- 
2.24.0



[PATCH 2/2] drm/amd/display: Enable fp16 also on DCE-8.

2020-06-06 Thread Mario Kleiner
Assuming the hw supports fp16, this would also be
useful for standard dynamic range displays, not
only for HDR use cases, because it would allow
more precise color reproduction with about
~11 bpc linear precision in the unorm range
0.0 - 1.0.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
index a19be9de2df7..0dcea3106886 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
@@ -402,7 +402,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
-- 
2.24.0



Enable fp16 scanout on DCE 8 and DCE 10

2020-06-06 Thread Mario Kleiner
So i'm sending these on the off-chance that DCE-8/10 do
support fp16 scanout without hw bugs. Would be nice to
enable this if it is supported, even if the hw can't do
HDR. This is also useful for non-HDR to get effectively
11 bpc precision from the fb to the display outputs for
more precise color reproduction.

thanks,
-mario




Re: [PATCH 2/2] drm/amd/display: Enable fp16 also on DCE-11.0 - DCE-12.

2020-05-20 Thread Mario Kleiner
On Wed, May 20, 2020 at 9:07 PM Kazlauskas, Nicholas <
nicholas.kazlaus...@amd.com> wrote:

> On 2020-05-20 2:44 p.m., Mario Kleiner wrote:
> > On Wed, May 20, 2020 at 8:25 PM Alex Deucher  > <mailto:alexdeuc...@gmail.com>> wrote:
> >
> > On Wed, May 20, 2020 at 12:39 PM Harry Wentland  > <mailto:hwent...@amd.com>> wrote:
> >  >
> >  > On 2020-05-15 1:19 a.m., Mario Kleiner wrote:
> >  > > Testing on a Polaris11 gpu with DCE-11.2 suggests that it
> >  > > seems to work fine there, so optimistically enable it for
> >  > > DCE-11 and later.
> >  > >
> >  > > Signed-off-by: Mario Kleiner  > <mailto:mario.kleiner...@gmail.com>>
> >  > > ---
> >  > >  drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 2 +-
> >  > >  drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c | 2 +-
> >  > >  drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 2 +-
> >  > >  3 files changed, 3 insertions(+), 3 deletions(-)
> >  > >
> >  > > diff --git
> > a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> > b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> >  > > index 9597fc79d7fa..a043ddae5149 100644
> >  > > --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> >  > > +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> >  > > @@ -410,7 +410,7 @@ static const struct dc_plane_cap plane_cap
> = {
> >  > >   .pixel_format_support = {
> >  > >   .argb = true,
> >  > >   .nv12 = false,
> >  > > - .fp16 = false
> >  > > + .fp16 = true
> >  >
> >  > Carrizo (DCE 11.0) has a HW bug where FP16 scaling doesn't work. I
> >  > recommend we leave it off here.
> >
> > I'll drop this hunk for upstream.
> >
> > Alex
> >
> >
> > Ok, no fixup patch needed from myself, thanks Alex. Does the scaling bug
> > refer to scaling the planes (those max_downscale_factor /
> > max_upscale_factor definitions seem to be unused) or the fp16 values
> itself?
> >
> > What about DCE 8 and DCE 10 hw capabilities wrt. fp16? Should i send
> > fp16 enable patches for those as well?
> >
> > -mario
>
> Yeah, the upscale and downscale factors were intended to block FP16
> scaling from being accepted and reject the commit, but I guess nobody
> ever added those checks to atomic check.
>
> I reviewed the patch with the idea in mind that we already blocked this
> on a DC level. We can re-enable it in the caps after this is in I think.
>
> Off the top of my head I don't remember what DCE8/DCE10 supports, but
> I'm also not sure if they even support sending the SDP message for those
> to really be usable.
>

While HDR is the typical user of fp16, even on SDR displays, without any
HDR signalling, fp16 should give additional bits of precision, ~11 bpc
effective, in the standard 0.0 - 1.0 unorm range on a 12 bit pipeline with
a 12 bpc panel, or even on a 10 bpc panel with dithering. Useful for
neuroscience/medical research applications or the color precision obsessed
people. I take every bit i can get ;)

-mario



> Regards,
> Nicholas Kazlauskas
>
> >
> >  >
> >  > Harry
> >  >
> >  > >   },
> >  > >
> >  > >   .max_upscale_factor = {
> >  > > diff --git
> > a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> > b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> >  > > index 4a7796de2ff5..51b3fe502670 100644
> >  > > --- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> >  > > +++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> >  > > @@ -411,7 +411,7 @@ static const struct dc_plane_cap plane_cap
> = {
> >  > >   .pixel_format_support = {
> >  > >   .argb = true,
> >  > >   .nv12 = false,
> >  > > - .fp16 = false
> >  > > + .fp16 = true
> >  > >   },
> >  > >
> >  > >   .max_upscale_factor = {
> >  > > diff --git
> > a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
> > b/drivers/gpu/d

Re: [PATCH 2/2] drm/amd/display: Enable fp16 also on DCE-11.0 - DCE-12.

2020-05-20 Thread Mario Kleiner
On Wed, May 20, 2020 at 8:25 PM Alex Deucher  wrote:

> On Wed, May 20, 2020 at 12:39 PM Harry Wentland  wrote:
> >
> > On 2020-05-15 1:19 a.m., Mario Kleiner wrote:
> > > Testing on a Polaris11 gpu with DCE-11.2 suggests that it
> > > seems to work fine there, so optimistically enable it for
> > > DCE-11 and later.
> > >
> > > Signed-off-by: Mario Kleiner 
> > > ---
> > >  drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 2 +-
> > >  drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c | 2 +-
> > >  drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 2 +-
> > >  3 files changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> > > index 9597fc79d7fa..a043ddae5149 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> > > +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
> > > @@ -410,7 +410,7 @@ static const struct dc_plane_cap plane_cap = {
> > >   .pixel_format_support = {
> > >   .argb = true,
> > >   .nv12 = false,
> > > - .fp16 = false
> > > + .fp16 = true
> >
> > Carrizo (DCE 11.0) has a HW bug where FP16 scaling doesn't work. I
> > recommend we leave it off here.
>
> I'll drop this hunk for upstream.
>
> Alex
>
>
Ok, no fixup patch needed from myself, thanks Alex. Does the scaling bug
refer to scaling the planes (those max_downscale_factor /
max_upscale_factor definitions seem to be unused) or the fp16 values itself?

What about DCE 8 and DCE 10 hw capabilities wrt. fp16? Should i send fp16
enable patches for those as well?

-mario

>
> > Harry
> >
> > >   },
> > >
> > >   .max_upscale_factor = {
> > > diff --git a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> > > index 4a7796de2ff5..51b3fe502670 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> > > +++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
> > > @@ -411,7 +411,7 @@ static const struct dc_plane_cap plane_cap = {
> > >   .pixel_format_support = {
> > >   .argb = true,
> > >   .nv12 = false,
> > > - .fp16 = false
> > > + .fp16 = true
> > >   },
> > >
> > >   .max_upscale_factor = {
> > > diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
> b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
> > > index 9a9764cbd78d..8f362e8c1787 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
> > > +++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
> > > @@ -516,7 +516,7 @@ static const struct dc_plane_cap plane_cap = {
> > >   .pixel_format_support = {
> > >   .argb = true,
> > >   .nv12 = false,
> > > - .fp16 = false
> > > + .fp16 = true
> > >   },
> > >
> > >   .max_upscale_factor = {
> > >
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>


[PATCH 2/2] drm/amd/display: Enable fp16 also on DCE-11.0 - DCE-12.

2020-05-14 Thread Mario Kleiner
Testing on a Polaris11 gpu with DCE-11.2 suggests that it
seems to work fine there, so optimistically enable it for
DCE-11 and later.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
index 9597fc79d7fa..a043ddae5149 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
@@ -410,7 +410,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
diff --git a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
index 4a7796de2ff5..51b3fe502670 100644
--- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
@@ -411,7 +411,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
index 9a9764cbd78d..8f362e8c1787 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
@@ -516,7 +516,7 @@ static const struct dc_plane_cap plane_cap = {
.pixel_format_support = {
.argb = true,
.nv12 = false,
-   .fp16 = false
+   .fp16 = true
},
 
.max_upscale_factor = {
-- 
2.20.1



[PATCH 1/2] drm/amd/display: Expose support for xBGR ordered fp16 formats.

2020-05-14 Thread Mario Kleiner
Expose support for DRM_FORMAT_ABGR16161616F and
DRM_FORMAT_XBGR16161616F to the DRM core, complementing
the already existing xRGB ordered fp16 formats.

These are especially useful for creating presentable
swapchains in Vulkan for VK_FORMAT_R16G16B16A16_SFLOAT.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 48f2b3710e7c..bd0c9eda8f93 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3651,6 +3651,10 @@ fill_dc_plane_info_and_addr(struct amdgpu_device *adev,
case DRM_FORMAT_ARGB16161616F:
plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F;
break;
+   case DRM_FORMAT_XBGR16161616F:
+   case DRM_FORMAT_ABGR16161616F:
+   plane_info->format = SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F;
+   break;
default:
DRM_ERROR(
"Unsupported screen format %s\n",
@@ -5566,6 +5570,8 @@ static int get_plane_formats(const struct drm_plane 
*plane,
if (plane_cap && plane_cap->pixel_format_support.fp16) {
formats[num_formats++] = DRM_FORMAT_XRGB16161616F;
formats[num_formats++] = DRM_FORMAT_ARGB16161616F;
+   formats[num_formats++] = DRM_FORMAT_XBGR16161616F;
+   formats[num_formats++] = DRM_FORMAT_ABGR16161616F;
}
break;
 
-- 
2.20.1



fp16 support in xBGR order, and for DCE-11+

2020-05-14 Thread Mario Kleiner
Hi,

two patches. The first one adds the xBGR ordered variants of fp16
in addition to the xRGB ordered variants that were merged just
very recently.

These variants are required for direct scanout of OpenGL and Vulkan
rendered content, as both OpenGL (GL_RGBA16F) and Vulkan
(VK_FORMAT_R16G16B16A16_SFLOAT) expect RGBA channel order for fp16,
instead of the previously exposed BGRA fp16 ordering.
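For reference, a DRM fourcc code is just four ASCII bytes packed little-endian, mirroring the fourcc_code() macro in include/uapi/drm/drm_fourcc.h. The 'XR4H'/'AB4H' character codes below match my reading of that header for the fp16 formats discussed here; treat them as assumptions rather than authoritative:

```python
def drm_fourcc(a: str, b: str, c: str, d: str) -> int:
    # Pack four ASCII characters little-endian, like the kernel's
    # fourcc_code() macro in include/uapi/drm/drm_fourcc.h.
    return ord(a) | ord(b) << 8 | ord(c) << 16 | ord(d) << 24

# xRGB-ordered fp16 format, merged earlier:
DRM_FORMAT_XRGB16161616F = drm_fourcc('X', 'R', '4', 'H')
# xBGR-ordered fp16 format, matching GL_RGBA16F /
# VK_FORMAT_R16G16B16A16_SFLOAT memory layout, added by patch 1/2:
DRM_FORMAT_ABGR16161616F = drm_fourcc('A', 'B', '4', 'H')

assert DRM_FORMAT_ABGR16161616F == 0x48344241
```

The channel-order distinction matters because DRM format names describe components from most to least significant bit, while GL/Vulkan name components in memory-byte order on little-endian, which is why RGBA in the API maps to an ABGR fourcc.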

I have a proof of concept patch against the amdvlk Vulkan driver
that allows fp16 rendering and presentation with this format, but
not with the xRGB format. Nicholas already has the test patch in
his inbox. Results look visually correct, also when used on a HDR
monitor.

I also tested with Patch 2/2 on top on a Polaris11 gpu with DCE-11.2
display engine, and also got correct results, so maybe it makes sense
to enable fp16 scanout support also for DCE and not only DCN?
Patch 2/2 does not enable fp16 on DCE 10 or DCE 8, because i don't
have hardware to test it. But it would be nice to expose fp16 on all
supported display engines.

Thanks,
-mario




Re: [PATCH] drm/amd/display: Fix vblank and pageflip event handling for FreeSync

2020-05-08 Thread Mario Kleiner
On Thu, May 7, 2020 at 7:35 PM Kazlauskas, Nicholas <
nicholas.kazlaus...@amd.com> wrote:

> It applies on top of Alex's amd-staging-drm-next-branch.
>
> It is essentially just a revert to the old behavior with
> acrtc_state->active_planes == 0 special case you added on top and some
> small refactoring.
>
> The only remaining bits that are kind of questionable is the
> VUPDATE_NO_LOCK vs VUPDATE bit again along with some of the locking around
> where we check if FreeSync is active or not.
>
> I also don't think that VSTARTUP is the correct place to be performing the
> update for the registers since the offset between VSTARTUP and VUPDATE can
> be small enough such that the programming won't finish in time for some
> timings. We should be doing it on line 0 instead.
>
> All these issues existed before this patch series at least though.
>
> Regards,
> Nicholas Kazlauskas
>
>
Ok, thanks. Tested it on a Samsung monitor with 48 Hz - 144 Hz range on a
Raven Ridge gpu. It worked with and without your patch, both with my tests,
and with that VRRTester application from GitHub, so i guess the problem
is highly dependent on small timing differences in game execution, vblank
durations, etc.

So fwiw, here's my

Reviewed-and-Tested-by: Mario Kleiner 

-mario

On 2020-05-07 12:58 p.m., Mario Kleiner wrote:
>
> Looking over it now, will do some testing. Alex amdgpu-drm-next branch
> would be the best to test this?
>
> It looks like a revert of the whole vstartup series, except for that one
> if(acrtc_state->active_planes == 0)  special case, so i expect it should be
> fine?
>
> thanks,
> -mario
>
>
> On Thu, May 7, 2020 at 5:56 PM Leo Li  wrote:
>
>>
>>
>> On 2020-05-06 3:47 p.m., Nicholas Kazlauskas wrote:
>> > [Why]
>> > We're the drm vblank event a frame too early in the case where the
>> ^sending
>>
>> Thanks for catching this!
>>
>> Reviewed-by: Leo Li 
>>
>> > pageflip happens close to VUPDATE and ends up blocking the signal.
>> >
>> > The implementation in DM was previously correct *before* we started
>> > sending vblank events from VSTARTUP unconditionally to handle cases
>> > where HUBP was off, OTG was ON and userspace was still requesting some
>> > DRM planes enabled. As part of that patch series we dropped VUPDATE
>> > since it was deemed close enough to VSTARTUP, but there's a key
>> > difference between VSTARTUP and VUPDATE - the VUPDATE signal can be
>> > blocked if we're holding the pipe lock.
>> >
>> > There was a fix recently to revert the unconditional behavior for the
>> > DCN VSTARTUP vblank event since it was sending the pageflip event on
>> > the wrong frame - once again, due to blocking VUPDATE and having the
>> > address start scanning out two frames later.
>> >
>> > The problem with this fix is it didn't update the logic that calls
>> > drm_crtc_handle_vblank(), so the timestamps are totally bogus now.
>> >
>> > [How]
>> > Essentially reverts most of the original VSTARTUP series but retains
>> > the behavior to send back events when active planes == 0.
>> >
>> > Some refactoring/cleanup was done to not have duplicated code in both
>> > the handlers.
>> >
>> > Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at
>> vsartup for DCN")
>> > Fixes: 3a2ce8d66a4b ("drm/amd/display: Disable VUpdate interrupt for
>> DCN hardware")
>> > Fixes: 2b5aed9ac3f7 ("drm/amd/display: Fix pageflip event race
>> condition for DCN.")
>> >
>> > Cc: Leo Li 
>> > Cc: Mario Kleiner 
>> > Signed-off-by: Nicholas Kazlauskas 
>> > ---
>> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 137 +++---
>> >   1 file changed, 55 insertions(+), 82 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > index 59f1d4a94f12..30ce28f7c444 100644
>> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > @@ -441,7 +441,7 @@ static void dm_vupdate_high_irq(void
>> *interrupt_params)
>> >
>> >   /**
>> >* dm_crtc_high_irq() - Handles CRTC interrupt
>> > - * @interrupt_params: ignored
>> > + * @interrupt_params: used for determining the CRTC instance
>> >*
>> >* Handles the CRTC/VSYNC interrupt by notifying DRM's VBLANK
>> >* event 

Re: [PATCH] drm/amd/display: Fix vblank and pageflip event handling for FreeSync

2020-05-07 Thread Mario Kleiner
Looking over it now, will do some testing. Alex, would the
amdgpu-drm-next branch be the best one to test this against?

It looks like a revert of the whole vstartup series, except for that one
if(acrtc_state->active_planes == 0)  special case, so i expect it should be
fine?

thanks,
-mario


On Thu, May 7, 2020 at 5:56 PM Leo Li  wrote:

>
>
> On 2020-05-06 3:47 p.m., Nicholas Kazlauskas wrote:
> > [Why]
> > We're the drm vblank event a frame too early in the case where the
> ^sending
>
> Thanks for catching this!
>
> Reviewed-by: Leo Li 
>
> > pageflip happens close to VUPDATE and ends up blocking the signal.
> >
> > The implementation in DM was previously correct *before* we started
> > sending vblank events from VSTARTUP unconditionally to handle cases
> > where HUBP was off, OTG was ON and userspace was still requesting some
> > DRM planes enabled. As part of that patch series we dropped VUPDATE
> > since it was deemed close enough to VSTARTUP, but there's a key
> > difference between VSTARTUP and VUPDATE - the VUPDATE signal can be
> > blocked if we're holding the pipe lock.
> >
> > There was a fix recently to revert the unconditional behavior for the
> > DCN VSTARTUP vblank event since it was sending the pageflip event on
> > the wrong frame - once again, due to blocking VUPDATE and having the
> > address start scanning out two frames later.
> >
> > The problem with this fix is it didn't update the logic that calls
> > drm_crtc_handle_vblank(), so the timestamps are totally bogus now.
> >
> > [How]
> > Essentially reverts most of the original VSTARTUP series but retains
> > the behavior to send back events when active planes == 0.
> >
> > Some refactoring/cleanup was done to not have duplicated code in both
> > the handlers.
> >
> > Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at
> vsartup for DCN")
> > Fixes: 3a2ce8d66a4b ("drm/amd/display: Disable VUpdate interrupt for DCN
> hardware")
> > Fixes: 2b5aed9ac3f7 ("drm/amd/display: Fix pageflip event race condition
> for DCN.")
> >
> > Cc: Leo Li 
> > Cc: Mario Kleiner 
> > Signed-off-by: Nicholas Kazlauskas 
> > ---
> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 137 +++---
> >   1 file changed, 55 insertions(+), 82 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 59f1d4a94f12..30ce28f7c444 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -441,7 +441,7 @@ static void dm_vupdate_high_irq(void
> *interrupt_params)
> >
> >   /**
> >* dm_crtc_high_irq() - Handles CRTC interrupt
> > - * @interrupt_params: ignored
> > + * @interrupt_params: used for determining the CRTC instance
> >*
> >* Handles the CRTC/VSYNC interrupt by notifying DRM's VBLANK
> >* event handler.
> > @@ -455,70 +455,6 @@ static void dm_crtc_high_irq(void *interrupt_params)
> >   unsigned long flags;
> >
> >   acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src -
> IRQ_TYPE_VBLANK);
> > -
> > - if (acrtc) {
> > - acrtc_state = to_dm_crtc_state(acrtc->base.state);
> > -
> > - DRM_DEBUG_VBL("crtc:%d, vupdate-vrr:%d\n",
> > -   acrtc->crtc_id,
> > -   amdgpu_dm_vrr_active(acrtc_state));
> > -
> > - /* Core vblank handling at start of front-porch is only
> possible
> > -  * in non-vrr mode, as only there vblank timestamping will
> give
> > -  * valid results while done in front-porch. Otherwise
> defer it
> > -  * to dm_vupdate_high_irq after end of front-porch.
> > -  */
> > - if (!amdgpu_dm_vrr_active(acrtc_state))
> > - drm_crtc_handle_vblank(&acrtc->base);
> > -
> > - /* Following stuff must happen at start of vblank, for crc
> > -  * computation and below-the-range btr support in vrr mode.
> > -  */
> > - amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
> > -
> > - if (acrtc_state->stream && adev->family >=
> AMDGPU_FAMILY_AI &&
> > - acrtc_state->vrr_params.supported &&
> > - acrtc_state->freesync_config.state ==
> VRR_STATE_ACTIVE_VARIABLE) {
> > - spin_lock_irqsave(&adev->

Re: [PATCH] drm/amd/display: Fix pageflip event race condition for DCN. (v2)

2020-03-13 Thread Mario Kleiner
On Fri, Mar 13, 2020 at 5:02 PM Michel Dänzer  wrote:

> On 2020-03-13 1:35 p.m., Kazlauskas, Nicholas wrote:
> > On 2020-03-12 10:32 a.m., Alex Deucher wrote:
> >> On Thu, Mar 5, 2020 at 4:21 PM Mario Kleiner
> >>  wrote:
> >>>
> >>> Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
> >>> events at vsartup for DCN")' introduces a new way of pageflip
> >>> completion handling for DCN, and some trouble.
> >>>
> >>> The current implementation introduces a race condition, which
> >>> can cause pageflip completion events to be sent out one vblank
> >>> too early, thereby confusing userspace and causing flicker:
> >>>
> >>> prepare_flip_isr():
> >>>
> >>> 1. Pageflip programming takes the ddev->event_lock.
> >>> 2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
> >>> 3. Releases ddev->event_lock.
> >>>
> >>> --> Deadline for surface address regs double-buffering passes on
> >>>  target pipe.
> >>>
> >>> 4. dc_commit_updates_for_stream() MMIO programs the new pageflip
> >>> into hw, but too late for current vblank.
> >>>
> >>> => pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
> >>> in current vblank due to missing the double-buffering deadline
> >>> by a tiny bit.
> >>>
> >>> 5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
> >>> dm_dcn_crtc_high_irq() gets called.
> >>>
> >>> 6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
> >>> pageflip has been completed/will complete in this vblank and
> >>> sends out pageflip completion event to userspace and resets
> >>> pflip_status = AMDGPU_FLIP_NONE.
> >>>
> >>> => Flip completion event sent out one vblank too early.
> >>>
> >>> This behaviour has been observed during my testing with measurement
> >>> hardware a couple of times.
> >>>
> >>> The commit message says that the extra flip event code was added to
> >>> dm_dcn_crtc_high_irq() to avoid failing to send out pageflip events
> >>> in case the pflip irq doesn't fire, because the "DCH HUBP" component
> >>> is clock gated and doesn't fire pflip irqs in that state. Also that
> >>> this clock gating may happen if no planes are active. According to
> >>> Nicholas, the clock gating can also happen if psr is active, and the
> >>> gating is controlled independently by the hardware, so it is difficult
> >>> to detect if and when the completion code in the above commit is needed.
> >>>
> >>> This patch tries the following solution: It only executes the extra
> >>> pflip
> >>> completion code in dm_dcn_crtc_high_irq() iff the hardware reports
> >>> that there aren't any surface updates pending in the double-buffered
> >>> surface scanout address registers. Otherwise it leaves pflip completion
> >>> to the pflip irq handler, for a more race-free experience.
> >>>
> >>> This would only guard against the order of events mentioned above.
> >>> If Step 5 (VSTARTUP trigger) happens before step 4 then this won't help
> >>> at all, because 1-3 + 5 might happen even without the hw being
> >>> programmed
> >>> at all, ie. no surface update pending because none yet programmed
> >>> into hw.
> >>>
> >>> Therefore this patch also changes locking in amdgpu_dm_commit_planes(),
> >>> so that prepare_flip_isr() and dc_commit_updates_for_stream() are done
> >>> under event_lock protection within the same critical section.
> >>>
> >>> v2: Take Nicholas comments into account, try a different solution.
> >>>
> >>> Lightly tested on Polaris (locking) and Raven (the whole DCN stuff).
> >>> Seems to work without causing obvious new trouble.
> >>
> >> Nick, any comments on this?  Can we get this committed or do you think
> >> it needs additional rework?
> >>
> >> Thanks,
> >>
> >> Alex
> >
> > Hi Alex, Mario,
> >
> > This might be a little strange, but if we want to get this in as a fix
> > for regressions caused by the original vblank and user events at
> > vstartup patch then I'm actually going to give my reviewed by on the
> > *v1* of this patch (but not this v2):
> >
&g

Re: [PATCH] drm/amd/display: Fix pageflip event race condition for DCN.

2020-03-05 Thread Mario Kleiner
On Mon, Mar 2, 2020 at 2:57 PM Kazlauskas, Nicholas
 wrote:
>
> On 2020-03-02 1:17 a.m., Mario Kleiner wrote:
> > Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
> > events at vsartup for DCN")' introduces a new way of pageflip
> > completion handling for DCN, and some trouble.
> >
> > The current implementation introduces a race condition, which
> > can cause pageflip completion events to be sent out one vblank
> > too early, thereby confusing userspace and causing flicker:
> >
> > prepare_flip_isr():
> >
> > 1. Pageflip programming takes the ddev->event_lock.
> > 2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
> > 3. Releases ddev->event_lock.
> >
> > --> Deadline for surface address regs double-buffering passes on
> >  target pipe.
> >
> > 4. dc_commit_updates_for_stream() MMIO programs the new pageflip
> > into hw, but too late for current vblank.
> >
> > => pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
> > in current vblank due to missing the double-buffering deadline
> > by a tiny bit.
> >
> > 5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
> > dm_dcn_crtc_high_irq() gets called.
> >
> > 6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
> > pageflip has been completed/will complete in this vblank and
> > sends out pageflip completion event to userspace and resets
> > pflip_status = AMDGPU_FLIP_NONE.
> >
> > => Flip completion event sent out one vblank too early.
> >
> > This behaviour has been observed during my testing with measurement
> > hardware a couple of times.
> >
> > The commit message says that the extra flip event code was added to
> > dm_dcn_crtc_high_irq() to avoid failing to send out pageflip events
> > in case the pflip irq doesn't fire, because the "DCH HUBP" component
> > is clock gated and doesn't fire pflip irqs in that state. Also that
> > this clock gating may happen if no planes are active. This suggests
> > that the problem addressed by that commit can't happen if planes
> > are active.
> >
> > The proposed solution is therefore to only execute the extra pflip
> > completion code iff the count of active planes is zero and otherwise
> > leave pflip completion handling to the pflip irq handler, for a
> > more race-free experience.
> >
> > Note that i don't know if this fixes the problem the original commit
> > tried to address, as i don't know what the test scenario was. It
> > does fix the observed too early pageflip events though and points
> > out the problem introduced.
>
> This looks like a valid race condition that should be addressed.

Indeed! And if possible in any way, before Linux 5.6 is released. For
my use cases, neuroscience/medical research, this is a serious problem
which would make DCN gpu's basically unusable for most work. Problems
affecting flip timing / flip completion / timestamping are always
worst case problems for my users.

>
> Unfortunately this also doesn't fix the problem the original commit was
> trying to address.
>
> HUBP interrupts only trigger when it's not clock gated. But there are
> cases (for example, PSR) where the HUBP can be clock gated but the
> active plane count is greater than zero.
>
> The clock gating switch can typically happens outside of x86 control
> flow so we're not really going to understand in advance whether or not
> we'll be able to receive the pflip IRQ.
>

Oh dear! So how can that happen? Could you explain to me in more detail
how this works? What's the job of HUBP, apart from (not) triggering
pflip interrupts reliably? Is the scenario here that the desktop is
detected idle for a while (how?) and PSR kicks in and HUBP gets clock
gated, but somehow vblank interrupts are still active? I thought panel
self refresh only enables on long idle display, so scanout from the
gpu can be basically disabled while the panel refreshes itself? Is a
programmed pageflip then automatically (no host cpu involvement)
putting the panel out of self refresh and turning scanout on and the
pageflip completion enables HUBP again, but HUBP doesn't trigger the
pflip irq because it somehow missed that due to being clock-gated at
time of flip completion?

I'd really like to understand in more detail how this stuff works on
your recent hw, and also which irqs / events / trigger points are
associated with what actions of the hw. I have the feeling there will
be more "fun" lingering in the future. I also wanted to experiment
more with some VRR ideas, so any details about which hw events happen
when and fire which irq's and which double-buffer

[PATCH] drm/amd/display: Fix pageflip event race condition for DCN. (v2)

2020-03-05 Thread Mario Kleiner
Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
events at vsartup for DCN")' introduces a new way of pageflip
completion handling for DCN, and some trouble.

The current implementation introduces a race condition, which
can cause pageflip completion events to be sent out one vblank
too early, thereby confusing userspace and causing flicker:

prepare_flip_isr():

1. Pageflip programming takes the ddev->event_lock.
2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
3. Releases ddev->event_lock.

--> Deadline for surface address regs double-buffering passes on
target pipe.

4. dc_commit_updates_for_stream() MMIO programs the new pageflip
   into hw, but too late for current vblank.

=> pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
   in current vblank due to missing the double-buffering deadline
   by a tiny bit.

5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
   dm_dcn_crtc_high_irq() gets called.

6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
   pageflip has been completed/will complete in this vblank and
   sends out pageflip completion event to userspace and resets
   pflip_status = AMDGPU_FLIP_NONE.

=> Flip completion event sent out one vblank too early.

This behaviour has been observed during my testing with measurement
hardware a couple of times.

The commit message says that the extra flip event code was added to
dm_dcn_crtc_high_irq() to avoid failing to send out pageflip events
in case the pflip irq doesn't fire, because the "DCH HUBP" component
is clock gated and doesn't fire pflip irqs in that state. Also that
this clock gating may happen if no planes are active. According to
Nicholas, the clock gating can also happen if psr is active, and the
gating is controlled independently by the hardware, so it is difficult
to detect if and when the completion code in the above commit is needed.

This patch tries the following solution: It only executes the extra pflip
completion code in dm_dcn_crtc_high_irq() iff the hardware reports
that there aren't any surface updates pending in the double-buffered
surface scanout address registers. Otherwise it leaves pflip completion
to the pflip irq handler, for a more race-free experience.

This would only guard against the order of events mentioned above.
If Step 5 (VSTARTUP trigger) happens before step 4 then this won't help
at all, because 1-3 + 5 might happen even without the hw being programmed
at all, ie. no surface update pending because none yet programmed into hw.

Therefore this patch also changes locking in amdgpu_dm_commit_planes(),
so that prepare_flip_isr() and dc_commit_updates_for_stream() are done
under event_lock protection within the same critical section.
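Condensed, the locking change amounts to the following. This is a hand-written sketch with elided arguments, not the literal diff:

```c
/* Before: the VSTARTUP/VUPDATE handlers, which also take event_lock,
 * can run between these two steps and see AMDGPU_FLIP_SUBMITTED while
 * nothing has been programmed into the hardware yet. */
spin_lock_irqsave(&adev->ddev->event_lock, flags);
prepare_flip_isr(acrtc);           /* pflip_status = AMDGPU_FLIP_SUBMITTED */
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
dc_commit_updates_for_stream(...); /* MMIO-programs the flip */

/* After: status update and hardware programming form one critical
 * section, so the irq handlers observe a consistent state. */
spin_lock_irqsave(&adev->ddev->event_lock, flags);
prepare_flip_isr(acrtc);
dc_commit_updates_for_stream(...);
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);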

v2: Take Nicholas comments into account, try a different solution.

Lightly tested on Polaris (locking) and Raven (the whole DCN stuff).
Seems to work without causing obvious new trouble.

Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at vsartup 
for DCN")
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 80 ---
 1 file changed, 67 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d7df1a85e72f..aa4e941b276f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -287,6 +287,28 @@ static inline bool amdgpu_dm_vrr_active(struct 
dm_crtc_state *dm_state)
   dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
 }
 
+/**
+ * dm_crtc_is_flip_pending() - Is a pageflip pending on this crtc?
+ *
+ * Returns true if any plane on the crtc has a flip pending, false otherwise.
+ */
+static bool dm_crtc_is_flip_pending(struct dm_crtc_state *acrtc_state)
+{
+   struct dc_stream_status *status = 
dc_stream_get_status(acrtc_state->stream);
+   const struct dc_plane_status *plane_status;
+   int i;
+   bool pending = false;
+
+   for (i = 0; i < status->plane_count; i++) {
+   plane_status = dc_plane_get_status(status->plane_states[i]);
+   pending |= plane_status->is_flip_pending;
+   DRM_DEBUG_DRIVER("plane:%d, flip_pending=%d\n",
+i, plane_status->is_flip_pending);
+   }
+
+   return pending;
+}
+
 /**
  * dm_pflip_high_irq() - Handle pageflip interrupt
  * @interrupt_params: ignored
@@ -435,6 +457,11 @@ static void dm_vupdate_high_irq(void *interrupt_params)
	spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
}
}
+
+   if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED) {
+   DRM_DEBUG_DRIVER("%s:crtc:%d, flip_pending=%d\n", 
__func__,
+   acrtc->crtc_id,

[PATCH] drm/amd/display: Fix pageflip event race condition for DCN.

2020-03-01 Thread Mario Kleiner
Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
events at vsartup for DCN")' introduces a new way of pageflip
completion handling for DCN, and some trouble.

The current implementation introduces a race condition, which
can cause pageflip completion events to be sent out one vblank
too early, thereby confusing userspace and causing flicker:

prepare_flip_isr():

1. Pageflip programming takes the ddev->event_lock.
2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
3. Releases ddev->event_lock.

--> Deadline for surface address regs double-buffering passes on
target pipe.

4. dc_commit_updates_for_stream() MMIO programs the new pageflip
   into hw, but too late for current vblank.

=> pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
   in current vblank due to missing the double-buffering deadline
   by a tiny bit.

5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
   dm_dcn_crtc_high_irq() gets called.

6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
   pageflip has been completed/will complete in this vblank and
   sends out pageflip completion event to userspace and resets
   pflip_status = AMDGPU_FLIP_NONE.

=> Flip completion event sent out one vblank too early.

This behaviour has been observed during my testing with measurement
hardware a couple of times.

The commit message says that the extra flip event code was added to
dm_dcn_crtc_high_irq() to avoid failing to send out pageflip events
in case the pflip irq doesn't fire, because the "DCH HUBP" component
is clock gated and doesn't fire pflip irqs in that state. Also that
this clock gating may happen if no planes are active. This suggests
that the problem addressed by that commit can't happen if planes
are active.

The proposed solution is therefore to only execute the extra pflip
completion code iff the count of active planes is zero and otherwise
leave pflip completion handling to the pflip irq handler, for a
more race-free experience.

Note that i don't know if this fixes the problem the original commit
tried to address, as i don't know what the test scenario was. It
does fix the observed too early pageflip events though and points
out the problem introduced.

Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at vsartup 
for DCN")
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 63e8a12a74bc..3502d6d52160 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -522,8 +522,9 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
 
acrtc_state = to_dm_crtc_state(acrtc->base.state);
 
-   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
-   amdgpu_dm_vrr_active(acrtc_state));
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d, planes:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state),
+acrtc_state->active_planes);
 
	amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
	drm_crtc_handle_vblank(&acrtc->base);
@@ -543,7 +544,18 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
			&acrtc_state->vrr_params.adjust);
}
 
-   if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED) {
+   /*
+* If there aren't any active_planes then DCH HUBP may be clock-gated.
+* In that case, pageflip completion interrupts won't fire and pageflip
+* completion events won't get delivered. Prevent this by sending
+* pending pageflip events from here if a flip is still pending.
+*
+* If any planes are enabled, use dm_pflip_high_irq() instead, to
+* avoid race conditions between flip programming and completion,
+* which could cause too early flip completion events.
+*/
+   if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
+   acrtc_state->active_planes == 0) {
if (acrtc->event) {
			drm_crtc_send_vblank_event(&acrtc->base, acrtc->event);
acrtc->event = NULL;
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amd/display: Allow current eDP link settings to override verified ones.

2020-02-28 Thread Mario Kleiner
On Thu, Feb 27, 2020 at 8:11 PM Mario Kleiner
 wrote:
>
> Hi Harry
>
> Ok, back from various other emergencies and deadlines, sorry for the
> late reply. I also fixed my e-mail address - it was mistyped, causing
> all these delivery failures :/
>
> On Thu, Jan 9, 2020 at 10:26 PM Harry Wentland  wrote:
> >
> > On 2020-01-09 4:13 p.m., Mario Kleiner wrote:
> > > On Thu, Jan 9, 2020 at 7:44 PM Harry Wentland  > > <mailto:hwent...@amd.com>> wrote:
> > >
> > > On 2020-01-09 10:20 a.m., Mario Kleiner wrote:
> > > > If the current eDP link settings, as read from hw, provide a higher
> > > > bandwidth than the verified_link_cap ones (= reported_link_cap), 
> > > then
> > > > override verified_link_cap with current settings.
> > > >
> > > > These initial current eDP link settings have been set up by
> > > > firmware during boot, so they should work on the eDP panel.
> > > > Therefore use them if the firmware thinks they are good and
> > > > they provide higher link bandwidth, e.g., to enable higher
> > > > resolutions / color depths.
> > > >
> ... snip ...
> > >
> > >
> > > Tried that already (see other mail), replacing the whole if statement
> > > with a if (true) to force reading DP_SUPPORTED_LINK_RATES. The whole
> > > table reads back as all-zero, and versions are DP 1.1, eDP 1.3, not 1.4+
> > > as what seems to be required. The use the classic link bw stuff, but
> > > with a non-standard link bandwidth multiplier of 0xc, and a reported
> > > DP_MAX_LINK_RATE of 0xa, contradicting the 0xc setting that the firmware
> > > sets at bootup.
> > >
> > > Seems to be a very Apple thing...
> >
> > Indeed. I think it was a funky panel that was "ahead of its time" and
> > ahead of the spec.
> >
> > I would prefer a DPCD quirk for this panel that updates the reported DP
> > caps, rather than picking the "current" ones from the FW lightup.
> >
> > Harry
> >
>
> How would i do this? I see various options:
>
> I could rewrite my current patch, move it down inside
> dc_link_detect_helper() until after the edid was read and we have
> vendor/model id available, then say if(everything that's there now &&
> (vendor=Apple) && (model=Troublesomepanel)) { ... }
>
> Or i could add quirk code to detect_edp_sink_caps() after
> retrieve_link_cap() [or inside retrieve_link_cap] to override the
> reported_link_cap. But at that point we don't have edid yet and
> therefore no vendor/model id. Is there something inside the dpcd one
> can use to uniquely identify this display model?
>
> struct dp_device_vendor_id sink_id; queried inside retrieve_link_cap()
> sounds like it could be a unique id? I don't know about that.
>
> My intention was to actually do nothing on the AMD side here, as my
> photometer measurements suggest that the panel gives better quality
> results for >= 10 bpc output if it is operated at 8 bit and then the
> gpu's spatial dithering convincingly fakes the extra bits. Quality
> seems worse if one actually switches the panel into 10 bpc, as it
> doesn't seem to be a real 10 bit panel, just an 8 bit panel that
> accepts 10 bit input and then badly dithers it down to 8 bit.
>
> The situation has changed for Linux 5.6-rc, because of this recent
> commit from Roman Li, which is already in 5.6-rc:
> 4a8ca46bae8affba063aabac85a0b1401ba810a3 "drm/amd/display: Default max
> bpc to 16 for eDP"
>
> While that commit supposedly fixes some darkness on some other eDP
> panel, it now breaks my eDP panel. It leaves edid reported bpc
> unclamped, so the driver uses 10 bpc as basis for required bandwidth
> calculations and then the required bandwidth for all modes exceeds the
> link bandwidth. I end up with the eDP panel having no valid modes at all
> ==> Panel goes black, game over.
>
> We either need to revert that commit for drm-fixes, or quirk it for
> the specific panels that are troublesome, or need to get some solution
> into 5.6-rc, otherwise there will be a lot of regressions for at least
> Apple MBP users.
>
> thanks,
> -mario
>

Ok, just sent out a patch with a specific dpcd quirk for this as:

[PATCH] drm/amd/display: Add link_rate quirk for Apple 15" MBP 2017

Tested against drm-next for Linux 5.6, works.

-mario


[PATCH] drm/amd/display: Add link_rate quirk for Apple 15" MBP 2017

2020-02-28 Thread Mario Kleiner
This fixes a problem found on the MacBookPro 2017 Retina panel:

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (3.24 Gbps per lane, aka LINK_RATE_RBR2
aka 0xc), but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.

This restricts the panel to 8 bpc, not providing the full
color depth of the panel on Linux <= 5.5. Additionally, commit
'4a8ca46bae8a ("drm/amd/display: Default max bpc to 16 for eDP")'
introduced into Linux 5.6-rc1 will unclamp panel depth to
its full 10 bpc, thereby requiring an eDP bandwidth for all
modes that exceeds the bandwidth available and causes all modes
to fail validation -> No modes for the laptop panel -> failure
to set any mode -> Panel goes dark.

This patch adds a quirk specific to the MBP 2017 15" Retina
panel to override reported max link rate to the correct maximum
of 0xc = LINK_RATE_RBR2 to fix the darkness and reduced display
precision.

Please apply for Linux 5.6+ to avoid regressing Apple MBP panel
support.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index cb731c1d30b1..fd9e69634c50 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -3401,6 +3401,17 @@ static bool retrieve_link_cap(struct dc_link *link)
sink_id.ieee_device_id,
sizeof(sink_id.ieee_device_id));
 
+   /* Quirk Apple MBP 2017 15" Retina panel: Wrong DP_MAX_LINK_RATE */
+   {
+   uint8_t str_mbp_2017[] = { 101, 68, 21, 101, 98, 97 };
+
+   if ((link->dpcd_caps.sink_dev_id == 0x0010fa) &&
+   !memcmp(link->dpcd_caps.sink_dev_id_str, str_mbp_2017,
+   sizeof(str_mbp_2017))) {
+   link->reported_link_cap.link_rate = 0x0c;
+   }
+   }
+
core_link_read_dpcd(
link,
DP_SINK_HW_REVISION_START,
-- 
2.20.1



Re: [PATCH 2/2] drm/amd/display: Allow current eDP link settings to override verified ones.

2020-02-27 Thread Mario Kleiner
Hi Harry

Ok, back from various other emergencies and deadlines, sorry for the
late reply. I also fixed my e-mail address - it was mistyped, causing
all these delivery failures :/

On Thu, Jan 9, 2020 at 10:26 PM Harry Wentland  wrote:
>
> On 2020-01-09 4:13 p.m., Mario Kleiner wrote:
> > On Thu, Jan 9, 2020 at 7:44 PM Harry Wentland  > <mailto:hwent...@amd.com>> wrote:
> >
> > On 2020-01-09 10:20 a.m., Mario Kleiner wrote:
> > > If the current eDP link settings, as read from hw, provide a higher
> > > bandwidth than the verified_link_cap ones (= reported_link_cap), then
> > > override verified_link_cap with current settings.
> > >
> > > These initial current eDP link settings have been set up by
> > > firmware during boot, so they should work on the eDP panel.
> > > Therefore use them if the firmware thinks they are good and
> > > they provide higher link bandwidth, e.g., to enable higher
> > > resolutions / color depths.
> > >
... snip ...
> >
> >
> > Tried that already (see other mail), replacing the whole if statement
> > with a if (true) to force reading DP_SUPPORTED_LINK_RATES. The whole
> > table reads back as all-zero, and versions are DP 1.1, eDP 1.3, not 1.4+
> > as what seems to be required. The use the classic link bw stuff, but
> > with a non-standard link bandwidth multiplier of 0xc, and a reported
> > DP_MAX_LINK_RATE of 0xa, contradicting the 0xc setting that the firmware
> > sets at bootup.
> >
> > Seems to be a very Apple thing...
>
> Indeed. I think it was a funky panel that was "ahead of its time" and
> ahead of the spec.
>
> I would prefer a DPCD quirk for this panel that updates the reported DP
> caps, rather than picking the "current" ones from the FW lightup.
>
> Harry
>

How would i do this? I see various options:

I could rewrite my current patch, move it down inside
dc_link_detect_helper() until after the edid was read and we have
vendor/model id available, then say if(everything that's there now &&
(vendor=Apple) && (model=Troublesomepanel)) { ... }

Or i could add quirk code to detect_edp_sink_caps() after
retrieve_link_cap() [or inside retrieve_link_cap] to override the
reported_link_cap. But at that point we don't have edid yet and
therefore no vendor/model id. Is there something inside the dpcd one
can use to uniquely identify this display model?

struct dp_device_vendor_id sink_id; queried inside retrieve_link_cap()
sounds like it could be a unique id? I don't know about that.

My intention was actually to do nothing on the AMD side here, as my
photometer measurements suggest that the panel gives better quality
results for >= 10 bpc output if it is operated at 8 bit and the
gpu's spatial dithering then convincingly fakes the extra bits. Quality
seems worse if one actually switches the panel into 10 bpc mode, as it
doesn't seem to be a real 10 bit panel, just an 8 bit panel that
accepts 10 bit input and then badly dithers it down to 8 bit.

The situation has changed for Linux 5.6-rc, because of this recent
commit from Roman Li, which is already in 5.6-rc:
4a8ca46bae8affba063aabac85a0b1401ba810a3 "drm/amd/display: Default max
bpc to 16 for eDP"

While that commit supposedly fixes darkness on some other eDP
panel, it now breaks my eDP panel. It leaves the EDID-reported bpc
unclamped, so the driver uses 10 bpc as the basis for required bandwidth
calculations, and then the required bandwidth for all modes exceeds the
link bandwidth. I end up with the eDP panel having no valid modes at all
==> panel goes black, game over.

We either need to revert that commit for drm-fixes, or quirk it for
the specific panels that are troublesome, or need to get some solution
into 5.6-rc, otherwise there will be a lot of regressions for at least
Apple MBP users.

thanks,
-mario

> > -mario
> >
> >
> >
> > Thanks,
> > Harry
> >
> > > This fixes a problem found on the MacBookPro 2017 Retina panel:
> > >
> > > The panel reports 10 bpc color depth in its EDID, and the
> > > firmware chooses link settings at boot which support enough
> > > bandwidth for 10 bpc (324000 kbit/sec aka LINK_RATE_RBR2),
> > > but the DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps
> > > as possible, so verified_link_cap is only good for 2.7 Gbps
> > > and 8 bpc, not providing the full color depth of the panel.
> > >
> > > Signed-off-by: Mario Kleiner <mario.kleiner...@gmail.com>
> > > Cc: Alex Deucher <alexander.deuc...@amd.com>
> > > ---
> > >  drivers/gpu/drm

Re: [PATCH 2/2] drm/amd/display: Allow current eDP link settings to override verified ones.

2020-01-09 Thread Mario Kleiner
On Thu, Jan 9, 2020 at 7:44 PM Harry Wentland  wrote:

> On 2020-01-09 10:20 a.m., Mario Kleiner wrote:
> > If the current eDP link settings, as read from hw, provide a higher
> > bandwidth than the verified_link_cap ones (= reported_link_cap), then
> > override verified_link_cap with current settings.
> >
> > These initial current eDP link settings have been set up by
> > firmware during boot, so they should work on the eDP panel.
> > Therefore use them if the firmware thinks they are good and
> > they provide higher link bandwidth, e.g., to enable higher
> > resolutions / color depths.
> >
>


Hi Harry, happy new year!

> This only works when taking over from UEFI, so on boot or resume from
> hibernate. This wouldn't work on a normal suspend/resume.
>
>
See the other thread I just cc'ed you on. Depends on whether
dc_link_detect_helper() gets skipped / returns early on eDP. Some if
statement suggests it might get skipped on eDP + resume?


> Can you check if setting link->dc->config.optimize_edp_link_rate (see
> first if statement in detect_edp_sink_caps) fixes this? I imagine we
> need to read the reported settings from DP_SUPPORTED_LINK_RATES and fail
> to do so.
>

Tried that already (see other mail), replacing the whole if statement with
an if (true) to force reading DP_SUPPORTED_LINK_RATES. The whole table reads
back as all-zero, and versions are DP 1.1, eDP 1.3, not 1.4+ as what seems
to be required. They use the classic link bw stuff, but with a non-standard
link bandwidth multiplier of 0xc, and a reported DP_MAX_LINK_RATE of 0xa,
contradicting the 0xc setting that the firmware sets at bootup.

Seems to be a very Apple thing...
-mario


>
> Thanks,
> Harry
>
> > This fixes a problem found on the MacBookPro 2017 Retina panel:
> >
> > The panel reports 10 bpc color depth in its EDID, and the
> > firmware chooses link settings at boot which support enough
> > bandwidth for 10 bpc (324000 kbit/sec aka LINK_RATE_RBR2),
> > but the DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps
> > as possible, so verified_link_cap is only good for 2.7 Gbps
> > and 8 bpc, not providing the full color depth of the panel.
> >
> > Signed-off-by: Mario Kleiner 
> > Cc: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 21 +++
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> > index 5ea4a1675259..f3acdb8fead5 100644
> > --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> > @@ -819,6 +819,27 @@ static bool dc_link_detect_helper(struct dc_link *link,
> >  	case SIGNAL_TYPE_EDP: {
> >  		detect_edp_sink_caps(link);
> >  		read_current_link_settings_on_detect(link);
> > +
> > +		/* If cur_link_settings provides higher bandwidth than
> > +		 * verified_link_cap, then use cur_link_settings as new
> > +		 * verified_link_cap, as it obviously works according to
> > +		 * firmware boot setup.
> > +		 *
> > +		 * This has been observed on the Apple MacBookPro 2017
> > +		 * Retina panel, which boots with a link setting higher
> > +		 * than what dpcd[DP_MAX_LINK_RATE] claims as possible.
> > +		 * Overriding allows to run the panel at 10 bpc / 30 bit.
> > +		 */
> > +		if (dc_link_bandwidth_kbps(link, &link->cur_link_settings) >
> > +		    dc_link_bandwidth_kbps(link, &link->verified_link_cap)) {
> > +			DC_LOG_DETECTION_DP_CAPS(
> > +				"eDP current link setting bw %d kbps > verified_link_cap %d kbps. Override.",
> > +				dc_link_bandwidth_kbps(link, &link->cur_link_settings),
> > +				dc_link_bandwidth_kbps(link, &link->verified_link_cap));
> > +
> > +			link->verified_link_cap = link->cur_link_settings;
> > +		}
> > +
> >  		sink_caps.transaction_type = DDC_TRANSACTION_TYPE_I2C_OVER_AUX;
> >  		sink_caps.signal = SIGNAL_TYPE_EDP;
> >  		break;
> >
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] drm/amd/display: Allow current eDP link settings to override verified ones.

2020-01-09 Thread Mario Kleiner
If the current eDP link settings, as read from hw, provide a higher
bandwidth than the verified_link_cap ones (= reported_link_cap), then
override verified_link_cap with current settings.

These initial current eDP link settings have been set up by
firmware during boot, so they should work on the eDP panel.
Therefore use them if the firmware thinks they are good and
they provide higher link bandwidth, e.g., to enable higher
resolutions / color depths.

This fixes a problem found on the MacBookPro 2017 Retina panel:

The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec aka LINK_RATE_RBR2),
but the DP_MAX_LINK_RATE dpcd register only reports 2.7 Gbps
as possible, so verified_link_cap is only good for 2.7 Gbps
and 8 bpc, not providing the full color depth of the panel.

Signed-off-by: Mario Kleiner 
Cc: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 5ea4a1675259..f3acdb8fead5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -819,6 +819,27 @@ static bool dc_link_detect_helper(struct dc_link *link,
 	case SIGNAL_TYPE_EDP: {
 		detect_edp_sink_caps(link);
 		read_current_link_settings_on_detect(link);
+
+		/* If cur_link_settings provides higher bandwidth than
+		 * verified_link_cap, then use cur_link_settings as new
+		 * verified_link_cap, as it obviously works according to
+		 * firmware boot setup.
+		 *
+		 * This has been observed on the Apple MacBookPro 2017
+		 * Retina panel, which boots with a link setting higher
+		 * than what dpcd[DP_MAX_LINK_RATE] claims as possible.
+		 * Overriding allows to run the panel at 10 bpc / 30 bit.
+		 */
+		if (dc_link_bandwidth_kbps(link, &link->cur_link_settings) >
+		    dc_link_bandwidth_kbps(link, &link->verified_link_cap)) {
+			DC_LOG_DETECTION_DP_CAPS(
+				"eDP current link setting bw %d kbps > verified_link_cap %d kbps. Override.",
+				dc_link_bandwidth_kbps(link, &link->cur_link_settings),
+				dc_link_bandwidth_kbps(link, &link->verified_link_cap));
+
+			link->verified_link_cap = link->cur_link_settings;
+		}
+
 		sink_caps.transaction_type = DDC_TRANSACTION_TYPE_I2C_OVER_AUX;
 		sink_caps.signal = SIGNAL_TYPE_EDP;
 		break;
-- 
2.24.0



[PATCH 1/2] drm/amd/display: Reorder detect_edp_sink_caps before link settings read.

2020-01-09 Thread Mario Kleiner
read_current_link_settings_on_detect() on eDP 1.4+ may use the
edp_supported_link_rates table which is set up by
detect_edp_sink_caps(), so that function needs to be called first.

Signed-off-by: Mario Kleiner 
Cc: Martin Leung 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index cef8c1ba9797..5ea4a1675259 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -817,8 +817,8 @@ static bool dc_link_detect_helper(struct dc_link *link,
 	}
 
 	case SIGNAL_TYPE_EDP: {
-		read_current_link_settings_on_detect(link);
 		detect_edp_sink_caps(link);
+		read_current_link_settings_on_detect(link);
 		sink_caps.transaction_type = DDC_TRANSACTION_TYPE_I2C_OVER_AUX;
 		sink_caps.signal = SIGNAL_TYPE_EDP;
 		break;
-- 
2.24.0



Some eDP fixes/improvements.

2020-01-09 Thread Mario Kleiner
Hi and happy new year!

Since I now have an MBP 2017 to play with, with a 10 bit Retina panel
and a Polaris gpu, I'm trying to get it to output 10 bits, and found one
small bug [fixed by patch 1], and a quirk of Apple's Retina eDP sink, for
which I propose patch 2 as a solution. I sent a similar patch to i915 to
make the 10 bit Retina panel work with the Intel Kabylake igp on that machine.

If these make sense, it would be cool to still get them into drm-fixes
for Linux 5.5, so users of spring distro updates like Ubuntu 20.04 LTS
can get a more colorful new year.

thanks,
-mario




Re: [PATCH 1/2] drm/amd/display: Send vblank and user events at vsartup for DCN

2019-11-29 Thread Mario Kleiner
Hi Leo and others,

sorry for the late reply. I just spent some time looking at your patches
and testing them on a Raven DCN-1.

I looked at how the vstartup line is computed in the dc_bandwidth_calcs
etc., and added some DRM_DEBUG statements to the dm_dcn_crtc_high_irq and
dm_pflip_high_irq handlers to print the scanline at which the handlers get
invoked.

From my reading and the results, my understanding is that VSTARTUP always
fires after the end of front-porch in VRR mode, so the dm_dcn_crtc_high_irq
handler will only get invoked in the vsync/back-porch area? This is good
and very important, as otherwise all the vblank and timestamp calculations
would often be wrong (if it ever happened inside front-porch).

Could you give me some overview of which interrupts / hw events happens
when on DCN vs DCE? I intend to spend quite a bit of quality time in
December playing with the freesync code in DC and see if i can hack up some
proof-of-concept for precisely timed pageflips - the approach Harry
suggested in his XDC2019 talk which i finally found time to watch. I think
with the highly precise vblank and pageflip timestamps we should be able to
implement this precisely without the need for (jittery) software timers,
just some extensions to DRR hw programming and some trickery similar to
what below-the-range BTR support does. That would be so cool, especially
for neuro-science/vision-science/medical research applications.

My rough understanding so far for DCN seems to be:

1. Pageflips can execute in front-porch, ie. the register double-buffering
can switch there. Can they also still execute after front-porch? How far
into vsync/back-porch? I assume at some point close to the end of
back-porch they can't anymore, because after a flip the line buffer needs
time to prefetch the new pixeldata from the new scanout buffer [a]?

2. The VSTARTUP interrupt/event in VRR mode happens somewhere programmable
after end of front-porch (suggested by the bandwidth calc code), but before
VUPDATE? Is VSTARTUP the last point at which double-buffering for a
pageflip can happen, ie. after that the line-buffer refill for the next
frame starts, ie. [a]?

3. The VUPDATE interrupt/event marks the end of vblank? And that's where
double-buffering / switch of new values for the DRR registers happens? So
DRR values programmed before VUPDATE will take effect after VUPDATE and
thereby affect the vblank after the current one ie. after the following
video frame?

Is this correct? And how does it differ from Vega/DCE-12 and older <=
Polaris / <= DCE-11 ? I remember from earlier this year that BTR works much
better on DCN (tested) and DCE-12 (presumably, don't have hw to test) than
it does on DCE-11 and earlier. This was due to different behaviour of when
the DRR programing takes effect, allowing for much quicker switching on
DCN. I'd like to understand in detail how the DRR
switching/latching/double-buffering differs, if one of you can enlighten me.

There's one thing about this patch though that i think is not right. The
sending of pageflip completion events from within dm_dcn_crtc_high_irq()
seems to be both not needed and possibly causing potentially wrong results
in pageflip events/timestamps / visual glitches due to races?

Two cases:

a) If a pageflip completes in front porch and the pageflip handler
dm_pflip_high_irq() executes while in front-porch, it will queue the proper
pageflip event for later delivery to user space by drm_crtc_handle_vblank()
which is called by dm_dcn_crtc_high_irq() already.

b) If dm_pflip_high_irq() executes after front-porch (pageflip completes in
back-porch if this is possible), it will deliver the pageflip event itself
after updating the vblank count and timestamps correctly via
drm_crtc_accurate_vblank_count().

There isn't a need for the extra code in your patch (if
(acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED) {...}) and indeed i can just
comment it out and everything works fine.

I think the code could be even harmful if a pageflip gets queued into the
hardware before invocation of dm_dcn_crtc_high_irq() (ie. a bit before
VSTARTUP + irq handling delay), but after missing the deadline for
double-buffering of the hardwares primary surface base address registers.
You could end up with setting acrtc->pflip_status = AMDGPU_FLIP_SUBMITTED,
missing the hw double-buffering deadline, and then dm_dcn_crtc_high_irq()
would decide to send out a pageflip completion event to userspace for a
flip that hasn't actually taken place in the hw in this vblank. Userspace
would then misschedule its presentation due to the wrong pageflip event /
timestamp and you'd end up with the previous rendered image presented one
scanout cycle too long, and the current image silently dropped and never
displayed!

Indeed debug output i added shows that the dm_pflip_high_irq() handler
essentially turns into doing nothing with your patch applied, so pageflip
completion events sent to user space no longer correspond to true hw flips.

I have some 

Re: [PATCH] drm/amd/display: Allow faking displays as VRR capable.

2019-04-30 Thread Mario Kleiner
On Tue, Apr 30, 2019 at 2:22 PM Kazlauskas, Nicholas
 wrote:
>
> On 4/30/19 3:44 AM, Michel Dänzer wrote:
> > [CAUTION: External Email]
> >
> > On 2019-04-30 9:37 a.m., Mario Kleiner wrote:
> >> Allow any connected display to be marked as
> >> VRR capable. This is useful for testing the basics of
> >> VRR mode, e.g., scheduling and timestamping, BTR, and
> >> transition logic, on non-VRR capable displays, e.g.,
> >> to perform IGT test-suite kms_vrr test runs.
> >>
> >> This fake VRR display mode is enabled by setting the
> >> optional module parameter amdgpu.fakevrrdisplay=1.
> >>
> >> It will try to use VRR range info parsed from EDID on
> >> DisplayPort displays which have a compatible EDID,
> >> but not compatible DPCD caps for Adaptive Sync. E.g.,
> >> NVidia G-Sync compatible displays expose a proper EDID,
> >> but not proper DPCD caps.
> >>
> >> It will use a hard-coded VRR range of 30 Hz - 144 Hz on
> >> other displays without suitable EDID, e.g., standard
> >> DisplayPort, HDMI, DVI monitors.
> >>
> >> Signed-off-by: Mario Kleiner 
> >>
> >> [...]
> >>
> >>   struct amdgpu_mgpu_info mgpu_info = {
> >>.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> >> @@ -665,6 +666,16 @@ MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (defau
> >>   MODULE_PARM_DESC(dcfeaturemask, "all stable DC features enabled (default))");
> >>   module_param_named(dcfeaturemask, amdgpu_dc_feature_mask, uint, 0444);
> >>
> >> +/**
> >> + * DOC: fakevrrdisplay (int)
> >> + * Override detection of VRR displays to mark any display as VRR capable, even
> >> + * if it is not. Useful for basic testing of VRR without need to attach such a
> >> + * display, e.g., for igt tests.
> >> + * Setting 1 enables faking VRR. Default value, 0, does normal detection.
> >> + */
> >> +module_param_named(fakevrrdisplay, amdgpu_fake_vrr_display, int, 0644);
> >> +MODULE_PARM_DESC(fakevrrdisplay, "Detect any display as VRR capable (0 = off (default), 1 = on)");
> >
> > amdgpu has too many module parameters already; IMHO this kind of niche
> > use-case doesn't justify adding yet another one. For the vast majority
> > of users, this would just be another knob to break things, resulting in
> > support burden for us.
> >
> > How about e.g. making the vrr_capable property mutable, or adding
> > another property for this?
> >
> >
> > --
> > Earthling Michel Dänzer   |  https://www.amd.com
> > Libre software enthusiast | Mesa and X developer
> >
>
> Since vrr_capable is already an optional property I think making it
> mutable could potentially be an option. It would allow for userspace to
> be able to disable capability as well that way.

Yes, that would have been useful for at least my case. In my own
toolkit i will need to control vrr on/off on a run-by-run basis,
depending on what the users experiment scripts want. So i'll add code
to manipulate my fullscreen windows attached XAtom directly and
override Mesa's choices. A bit of a hack, but should hopefully work.

At least for my special niche, more easily accessible (== RandR output
props) info is always helpful. Another thing that would probably make
my case easier would be an optional property to report the "vrr_active"
state back, so that the toolkit can cheaply and reliably query at any
point in time whether vrr is active or not, because I need to use very
different ways of scheduling swapbuffers and correctness-checking the
results for vrr vs. non-vrr.

Or some vrr_range property, so the toolkit can know if it is operating
the hw in normal vrr range, or in BTR mode, and adapt its swapbuffers
scheduling, e.g., to try to help the current vrr/btr code a bit to
make the "right" decisions for stable timing.

Of course that makes userspace clients more dependent on current hw
implementation details, so i can see why it's probably not a popular
choice for a generic api. The best long-term solution is to have
proper api for the client to just provide a target presentation
timestamp and leave the rest to the magic little elves inside the
machine.

>
> It's a pretty niche usecase though. However, as Michel said, it would
> probably just end up being another setting that allows users to break
> their own setup.
>
> Nicholas Kazlauskas

Ok, fair enough, thank you two for the feedback. I assumed users
wouldn't m

[PATCH] drm/amd/display: Allow faking displays as VRR capable.

2019-04-30 Thread Mario Kleiner
Allow any connected display to be marked as
VRR capable. This is useful for testing the basics of
VRR mode, e.g., scheduling and timestamping, BTR, and
transition logic, on non-VRR capable displays, e.g.,
to perform IGT test-suite kms_vrr test runs.

This fake VRR display mode is enabled by setting the
optional module parameter amdgpu.fakevrrdisplay=1.

It will try to use VRR range info parsed from EDID on
DisplayPort displays which have a compatible EDID,
but not compatible DPCD caps for Adaptive Sync. E.g.,
NVidia G-Sync compatible displays expose a proper EDID,
but not proper DPCD caps.

It will use a hard-coded VRR range of 30 Hz - 144 Hz on
other displays without suitable EDID, e.g., standard
DisplayPort, HDMI, DVI monitors.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 11 +++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 30 +++
 3 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 23c3375623d7..351af38e7fd3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -159,6 +159,7 @@ extern uint amdgpu_dc_feature_mask;
 extern struct amdgpu_mgpu_info mgpu_info;
 extern int amdgpu_ras_enable;
 extern uint amdgpu_ras_mask;
+extern int amdgpu_fake_vrr_display;
 
 #ifdef CONFIG_DRM_AMDGPU_SI
 extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1e2cc9d68a05..3a9fc0fbc76e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -134,6 +134,7 @@ int amdgpu_emu_mode = 0;
 uint amdgpu_smu_memory_pool_size = 0;
 /* FBC (bit 0) disabled by default*/
 uint amdgpu_dc_feature_mask = 0;
+int amdgpu_fake_vrr_display = 0;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -665,6 +666,16 @@ MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (defau
 MODULE_PARM_DESC(dcfeaturemask, "all stable DC features enabled (default))");
 module_param_named(dcfeaturemask, amdgpu_dc_feature_mask, uint, 0444);
 
+/**
+ * DOC: fakevrrdisplay (int)
+ * Override detection of VRR displays to mark any display as VRR capable, even
+ * if it is not. Useful for basic testing of VRR without need to attach such a
+ * display, e.g., for igt tests.
+ * Setting 1 enables faking VRR. Default value, 0, does normal detection.
+ */
+module_param_named(fakevrrdisplay, amdgpu_fake_vrr_display, int, 0644);
+MODULE_PARM_DESC(fakevrrdisplay, "Detect any display as VRR capable (0 = off (default), 1 = on)");
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1854506e3e8f..148ec5bb9fa8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6921,6 +6921,15 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector,
 			edid_check_required = is_dp_capable_without_timing_msa(
 						adev->dm.dc,
 						amdgpu_dm_connector);
+
+			/* Force detection of not-quite adaptive sync capable
+			 * displays as vrr capable if requested by moduleparam.
+			 * Works, e.g., with G-Sync displays.
+			 */
+			if (!edid_check_required && amdgpu_fake_vrr_display) {
+				edid_check_required = true;
+				DRM_INFO("amdgpu.fakevrrdisplay=1 -> Force vrr.\n");
+			}
 		}
 	}
 	if (edid_check_required == true && (edid->version > 1 ||
@@ -6948,6 +6957,12 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector,
 			amdgpu_dm_connector->max_vfreq = range->max_vfreq;
 			amdgpu_dm_connector->pixel_clock_mhz =
 				range->pixel_clock_mhz * 10;
+
+			DRM_DEBUG("edid vrr %d: %d - %d Hz, clock %d Mhz\n",
+				  i, amdgpu_dm_connector->min_vfreq,
+				  amdgpu_dm_connector->max_vfreq,
+				  amdgpu_dm_connector->pixel_clock_mhz);
+
 			break;
 		}
 
@@ -6958,6 +6973,21 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector,
 		}
 	}
 
+	/* Fake a vrr display with hard-coded properties if none was detec

Re: [PATCH 3/3] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations. (v3)

2019-04-29 Thread Mario Kleiner
On Mon, Apr 29, 2019 at 2:51 PM Kazlauskas, Nicholas
 wrote:
>
> On 4/26/19 5:40 PM, Mario Kleiner wrote:
> > Pre-DCE12 needs special treatment for BTR / low framerate
> > compensation for more stable behaviour:
> >
> > According to comments in the code and some testing on DCE-8
> > and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
> > programming with a lag of one frame, so the special BTR hw
> > programming for intermediate fixed duration frames must be
> > done inside the current frame at flip submission in atomic
> > commit tail, ie. one vblank earlier, and the fixed refresh
> > intermediate frame mode must be also terminated one vblank
> > earlier on pre-DCE12 display engines.
> >
> > To achieve proper termination on < DCE-12 shift the point
> > when the switch-back from fixed vblank duration to variable
> > vblank duration happens from the start of VBLANK (vblank irq,
> > as done on DCE-12+) to back-porch or end of VBLANK (handled
> > by vupdate irq handler). We must leave the switch-back code
> > inside VBLANK irq for DCE12+, as before.
> >
> > Doing this, we get much better behaviour of BTR for up-sweeps,
> > ie. going from short to long frame durations (~high to low fps)
> > and for constant framerate flips, as tested on DCE-8 and
> > DCE-11. Behaviour is still not quite as good as on DCN-1
> > though.
> >
> > On down-sweeps, going from long to short frame durations
> > (low fps to high fps) < DCE-12 is a little bit improved,
> > although by far not as much as for up-sweeps and constant
> > fps.
> >
> > v2: Fix some wrong locking, as pointed out by Nicholas.
> > v3: Simplify if-condition in vupdate-irq - nit by Nicholas.
> > Signed-off-by: Mario Kleiner 
>
> Reviewed-by: Nicholas Kazlauskas 
>

Thanks.

> I'd like to push patches 1+3 if you're okay with this.
>

Yes please! I'd love to have these in Linux 5.2, which would then be
the first kernel well suited to my specific VRR use cases.

> I can see the value in the 2nd patch from the graphs and test
> application you've posted but it'll likely take some time to do thorough
> review and testing on this.
>

I understand.

> The question that needs to be examined with the second is what the
> optimal margin is, if any, for typical usecases across common monitor
> ranges. Not all monitors that support BTR have the same range, and many
> have a lower bound of ~48Hz. To me ~53Hz sounds rather early to be
> entering, but I'm not sure how it affects the user experience overall.
>

That's indeed a tricky tuning problem and I didn't spend much thought
on it. I just took the BTR_EXIT_MARGIN, already defined as 2 msecs,
because it was conveniently there, and this way entry was nicely
symmetric to the BTR exit path. 1 msec or less might be just as good,
or something more fancy and adaptive. But most of the value comes
from patch 3/3 anyway.

Atm. I don't have a FreeSync capable display at all, so I don't know
how well this patch works out wrt. flicker etc.; I can only look at my
plots and the output of the kms_vrr igt test. I just hacked the driver
to accept any DP or HDMI display as VRR capable, so I can run the
synthetic timing tests blindly (a G-Sync display mostly goes blank or
tells me a sad "Out of range signal" on its OSD; an HDMI->VGA connected
CRT monitor cycles between "Out of range", black, and some heavily
distorted image). The G-Sync display has a 30 - 144 Hz range and my
faking patch even set a range of 25 - 144 Hz for most tests.

I also thought about sending my "fake VRR display" patch after
cleaning it up. I could add some amdgpu module parameter, e.g., bool
amdgpu.fakevrroutput, that if set to 1 would allow the driver to accept
any DP/HDMI display as VRR capable, parsing the VRR range from the EDID
if present (that works for G-Sync displays) or just faking a range like
30 - 144 Hz or whatever is realistic. A blanked-out display is boring
to look at, but at least it allows one to run timing tests or the IGT
kms_vrr tests without the need for a dedicated FreeSync display.

From reading on the web I understand that "FreeSync2 HDR" displays are
also HDR capable and go through some testing and certification at AMD
wrt. minimal flicker? Would you have some recommendation or
pointers to a display I could buy if I needed a large VRR range,
minimal flicker and also good HDR capabilities - ideally with a locally
dimmable backlight or such? I will probably have the money to buy one
FreeSync display, but it should also be usable for working on HDR
related stuff.

Thanks,
-mario

> Nicholas Kazlauskas
>
> > ---
> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 +--
> >   1 file changed, 44

Re: [PATCH 3/3] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations. (v2)

2019-04-26 Thread Mario Kleiner
On Fri, Apr 26, 2019 at 7:12 PM Kazlauskas, Nicholas
 wrote:
>
> On 4/26/19 6:50 AM, Mario Kleiner wrote:
> > Pre-DCE12 needs special treatment for BTR / low framerate
> > compensation for more stable behaviour:
> >
> > According to comments in the code and some testing on DCE-8
> > and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
> > programming with a lag of one frame, so the special BTR hw
> > programming for intermediate fixed duration frames must be
> > done inside the current frame at flip submission in atomic
> > commit tail, ie. one vblank earlier, and the fixed refresh
> > intermediate frame mode must be also terminated one vblank
> > earlier on pre-DCE12 display engines.
> >
> > To achieve proper termination on < DCE-12 shift the point
> > when the switch-back from fixed vblank duration to variable
> > vblank duration happens from the start of VBLANK (vblank irq,
> > as done on DCE-12+) to back-porch or end of VBLANK (handled
> > by vupdate irq handler). We must leave the switch-back code
> > inside VBLANK irq for DCE12+, as before.
> >
> > Doing this, we get much better behaviour of BTR for up-sweeps,
> > ie. going from short to long frame durations (~high to low fps)
> > and for constant framerate flips, as tested on DCE-8 and
> > DCE-11. Behaviour is still not quite as good as on DCN-1
> > though.
> >
> > On down-sweeps, going from long to short frame durations
> > (low fps to high fps) < DCE-12 is a little bit improved,
> > although by far not as much as for up-sweeps and constant
> > fps.
> >
> > v2: Fix some wrong locking, as pointed out by Nicholas.
> > Signed-off-by: Mario Kleiner 
>
> Still looks good to me, but I just noticed one other minor nitpick (sorry!)
>
> > ---
> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 45 +--
> >   1 file changed, 42 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 76b6e621793f..7241e1f3ebec 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -364,6 +364,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
> >   struct amdgpu_device *adev = irq_params->adev;
> >   struct amdgpu_crtc *acrtc;
> >   struct dm_crtc_state *acrtc_state;
> > + unsigned long flags;
> >
> >   acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VUPDATE);
> >
> > @@ -381,6 +382,22 @@ static void dm_vupdate_high_irq(void *interrupt_params)
> >   	 */
> >   	if (amdgpu_dm_vrr_active(acrtc_state))
> >   		drm_crtc_handle_vblank(&acrtc->base);
>
> Can't this block be merged with the one below?
>
> With the condition also changed to just:
>
> if (acrtc_state->stream && adev->family < AMDGPU_FAMILY_AI)
> ...

Done. New series out, retested.
-mario

>
> I also noticed that the crtc_state itself is always accessed unlocked
> for checking whether VRR is on/off which is probably a bug... but this
> patch shouldn't be the one to fix that. At the very least though it can
> leave things in a bit better shape like you've already done. Thanks!
>
> Nicholas Kazlauskas
>
> > +
> > + if (acrtc_state->stream && adev->family < AMDGPU_FAMILY_AI &&
> > + acrtc_state->vrr_params.supported &&
> > + acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
> > + spin_lock_irqsave(&adev->ddev->event_lock, flags);
> > + mod_freesync_handle_v_update(
> > + adev->dm.freesync_module,
> > + acrtc_state->stream,
> > + &acrtc_state->vrr_params);
> > +
> > + dc_stream_adjust_vmin_vmax(
> > + adev->dm.dc,
> > + acrtc_state->stream,
> > + &acrtc_state->vrr_params.adjust);
> > + spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
> > + }
> >   }
> >   }
> >
> > @@ -390,6 +407,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
> >   struct amdgpu_device *adev = irq_params->adev;
> >   struct amdgpu_crtc *acrtc;
> >   struct dm_crtc_state *acrtc_state;

[PATCH 2/3] drm/amd/display: Enter VRR BTR earlier.

2019-04-26 Thread Mario Kleiner
Use a 2 msecs switching headroom not only for slightly
delayed exiting of BTR mode, but now also for entering
it a bit before the max frame duration is exceeded.

With slowly changing time delay between successive flips
or with a bit of jitter in arrival of flips, this adapts
vblank early and prevents missed vblanks at the border
between non-BTR and BTR.

Testing on DCE-8, DCE-11 and DCN-1.0 shows that this more
often avoids skipped frames when moving across the BTR
boundary, especially on DCE-8 and DCE-11 with the followup
commit for dealing with pre-DCE-12 hw.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index e56543c36eba..a965ab5466dc 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -350,7 +350,7 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
in_out_vrr->btr.frame_counter = 0;
in_out_vrr->btr.btr_active = false;
}
-   } else if (last_render_time_in_us > max_render_time_in_us) {
+   } else if (last_render_time_in_us + BTR_EXIT_MARGIN > max_render_time_in_us) {
/* Enter Below the Range */
in_out_vrr->btr.btr_active = true;
}
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/3] drm/amd/display: Fix and simplify apply_below_the_range()

2019-04-26 Thread Mario Kleiner
The comparison of inserted_frame_duration_in_us against a
duration calculated from max_refresh_in_uhz is both wrong
in its math and not needed, as the min_duration_in_us value
is already cached in in_out_vrr for reuse. No need to
recalculate it wrongly at each invocation.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 71274683da04..e56543c36eba 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -437,10 +437,8 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
inserted_frame_duration_in_us = last_render_time_in_us /
frames_to_insert;
 
-   if (inserted_frame_duration_in_us <
-   (100 / in_out_vrr->max_refresh_in_uhz))
-   inserted_frame_duration_in_us =
-   (100 / in_out_vrr->max_refresh_in_uhz);
+   if (inserted_frame_duration_in_us < in_out_vrr->min_duration_in_us)
+   inserted_frame_duration_in_us = in_out_vrr->min_duration_in_us;
 
/* Cache the calculated variables */
in_out_vrr->btr.inserted_duration_in_us =
-- 
2.20.1


[PATCH 3/3] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations. (v3)

2019-04-26 Thread Mario Kleiner
Pre-DCE12 needs special treatment for BTR / low framerate
compensation for more stable behaviour:

According to comments in the code and some testing on DCE-8
and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
programming with a lag of one frame, so the special BTR hw
programming for intermediate fixed duration frames must be
done inside the current frame at flip submission in atomic
commit tail, ie. one vblank earlier, and the fixed refresh
intermediate frame mode must be also terminated one vblank
earlier on pre-DCE12 display engines.

To achieve proper termination on < DCE-12 shift the point
when the switch-back from fixed vblank duration to variable
vblank duration happens from the start of VBLANK (vblank irq,
as done on DCE-12+) to back-porch or end of VBLANK (handled
by vupdate irq handler). We must leave the switch-back code
inside VBLANK irq for DCE12+, as before.

Doing this, we get much better behaviour of BTR for up-sweeps,
ie. going from short to long frame durations (~high to low fps)
and for constant framerate flips, as tested on DCE-8 and
DCE-11. Behaviour is still not quite as good as on DCN-1
though.

On down-sweeps, going from long to short frame durations
(low fps to high fps) < DCE-12 is a little bit improved,
although by far not as much as for up-sweeps and constant
fps.

v2: Fix some wrong locking, as pointed out by Nicholas.
v3: Simplify if-condition in vupdate-irq - nit by Nicholas.
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 +--
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 76b6e621793f..92b3c2cec7dd 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -364,6 +364,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VUPDATE);
 
@@ -379,8 +380,25 @@ static void dm_vupdate_high_irq(void *interrupt_params)
 * page-flip completion events that have been queued to us
 * if a pageflip happened inside front-porch.
 */
-   if (amdgpu_dm_vrr_active(acrtc_state))
+   if (amdgpu_dm_vrr_active(acrtc_state)) {
drm_crtc_handle_vblank(&acrtc->base);
+
+   /* BTR processing for pre-DCE12 ASICs */
+   if (acrtc_state->stream &&
+   adev->family < AMDGPU_FAMILY_AI) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
+   mod_freesync_handle_v_update(
+   adev->dm.freesync_module,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params);
+
+   dc_stream_adjust_vmin_vmax(
+   adev->dm.dc,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
+   }
+   }
}
 }
 
@@ -390,6 +408,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VBLANK);
 
@@ -412,9 +431,10 @@ static void dm_crtc_high_irq(void *interrupt_params)
 */
amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
 
-   if (acrtc_state->stream &&
+   if (acrtc_state->stream && adev->family >= AMDGPU_FAMILY_AI &&
acrtc_state->vrr_params.supported &&
acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
mod_freesync_handle_v_update(
adev->dm.freesync_module,
acrtc_state->stream,
@@ -424,6 +444,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
adev->dm.dc,
acrtc_state->stream,
&acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
}
}
 }
@@ -4878,8 +4899,10 @@ static void update_freesync_state_on_stream

VRR BTR patches revision 3

2019-04-26 Thread Mario Kleiner
Same as rev 2, except that patch 3/3 v3 takes Nicholas'
latest feedback into account. Retested on DCN1 and DCE8.

thanks,
-mario



[PATCH 3/3] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations. (v2)

2019-04-26 Thread Mario Kleiner
Pre-DCE12 needs special treatment for BTR / low framerate
compensation for more stable behaviour:

According to comments in the code and some testing on DCE-8
and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
programming with a lag of one frame, so the special BTR hw
programming for intermediate fixed duration frames must be
done inside the current frame at flip submission in atomic
commit tail, ie. one vblank earlier, and the fixed refresh
intermediate frame mode must be also terminated one vblank
earlier on pre-DCE12 display engines.

To achieve proper termination on < DCE-12 shift the point
when the switch-back from fixed vblank duration to variable
vblank duration happens from the start of VBLANK (vblank irq,
as done on DCE-12+) to back-porch or end of VBLANK (handled
by vupdate irq handler). We must leave the switch-back code
inside VBLANK irq for DCE12+, as before.

Doing this, we get much better behaviour of BTR for up-sweeps,
ie. going from short to long frame durations (~high to low fps)
and for constant framerate flips, as tested on DCE-8 and
DCE-11. Behaviour is still not quite as good as on DCN-1
though.

On down-sweeps, going from long to short frame durations
(low fps to high fps) < DCE-12 is a little bit improved,
although by far not as much as for up-sweeps and constant
fps.

v2: Fix some wrong locking, as pointed out by Nicholas.
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 45 +--
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 76b6e621793f..7241e1f3ebec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -364,6 +364,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VUPDATE);
 
@@ -381,6 +382,22 @@ static void dm_vupdate_high_irq(void *interrupt_params)
 */
if (amdgpu_dm_vrr_active(acrtc_state))
drm_crtc_handle_vblank(&acrtc->base);
+
+   if (acrtc_state->stream && adev->family < AMDGPU_FAMILY_AI &&
+   acrtc_state->vrr_params.supported &&
+   acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
+   mod_freesync_handle_v_update(
+   adev->dm.freesync_module,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params);
+
+   dc_stream_adjust_vmin_vmax(
+   adev->dm.dc,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
+   }
}
 }
 
@@ -390,6 +407,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VBLANK);
 
@@ -412,9 +430,10 @@ static void dm_crtc_high_irq(void *interrupt_params)
 */
amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
 
-   if (acrtc_state->stream &&
+   if (acrtc_state->stream && adev->family >= AMDGPU_FAMILY_AI &&
acrtc_state->vrr_params.supported &&
acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
mod_freesync_handle_v_update(
adev->dm.freesync_module,
acrtc_state->stream,
@@ -424,6 +443,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
adev->dm.dc,
acrtc_state->stream,
&acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
}
}
 }
@@ -4878,8 +4898,10 @@ static void update_freesync_state_on_stream(
struct dc_plane_state *surface,
u32 flip_timestamp_in_us)
 {
-   struct mod_vrr_params vrr_params = new_crtc_state->vrr_params;
+   struct mod_vrr_params vrr_params;
struct dc_info_packet vrr_infopacket = {0};
+   struct amdgpu_device *adev = dm->adev;
+   unsigned long flags;

[PATCH 2/3] drm/amd/display: Enter VRR BTR earlier.

2019-04-26 Thread Mario Kleiner
Use a 2 msecs switching headroom not only for slightly
delayed exiting of BTR mode, but now also for entering
it a bit before the max frame duration is exceeded.

With slowly changing time delay between successive flips
or with a bit of jitter in arrival of flips, this adapts
vblank early and prevents missed vblanks at the border
between non-BTR and BTR.

Testing on DCE-8, DCE-11 and DCN-1.0 shows that this more
often avoids skipped frames when moving across the BTR
boundary, especially on DCE-8 and DCE-11 with the followup
commit for dealing with pre-DCE-12 hw.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index e56543c36eba..a965ab5466dc 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -350,7 +350,7 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
in_out_vrr->btr.frame_counter = 0;
in_out_vrr->btr.btr_active = false;
}
-   } else if (last_render_time_in_us > max_render_time_in_us) {
+   } else if (last_render_time_in_us + BTR_EXIT_MARGIN > max_render_time_in_us) {
/* Enter Below the Range */
in_out_vrr->btr.btr_active = true;
}
-- 
2.20.1


[PATCH 1/3] drm/amd/display: Fix and simplify apply_below_the_range()

2019-04-26 Thread Mario Kleiner
The comparison of inserted_frame_duration_in_us against a
duration calculated from max_refresh_in_uhz is both wrong
in its math and not needed, as the min_duration_in_us value
is already cached in in_out_vrr for reuse. No need to
recalculate it wrongly at each invocation.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 71274683da04..e56543c36eba 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -437,10 +437,8 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
inserted_frame_duration_in_us = last_render_time_in_us /
frames_to_insert;
 
-   if (inserted_frame_duration_in_us <
-   (100 / in_out_vrr->max_refresh_in_uhz))
-   inserted_frame_duration_in_us =
-   (100 / in_out_vrr->max_refresh_in_uhz);
+   if (inserted_frame_duration_in_us < in_out_vrr->min_duration_in_us)
+   inserted_frame_duration_in_us = in_out_vrr->min_duration_in_us;
 
/* Cache the calculated variables */
in_out_vrr->btr.inserted_duration_in_us =
-- 
2.20.1


VRR BTR patches revision 2

2019-04-26 Thread Mario Kleiner
Updated series. The debug patch is dropped, an r-b by Nicholas is tacked
onto patch 1/3, and patch 3/3 has the locking fix that Nicholas proposed.
In terms of testing, 3/3 didn't change anything for the better or worse;
observed behaviour on retested DCN-1 and DCE-8 is the same.
Patch 2/3 is identical.

For reference i made a little git repo with a cleaned up version
of my test script VRRTest.m and some pdf's with some plots from my
testing with a slightly earlier version of that script:

https://github.com/kleinerm/VRRTestPlots

Easy to use on a Debian/Ubuntu based system as you can get GNU/Octave
+ psychtoolbox-3 from the distro repo. Explanations are in the Readme.md.

The following plots illustrate how patch 2/3 can sometimes help to
make the transition to BTR a bit smoother. Min VRR was 25 Hz (40 msecs)
in this test case on DCE8, and 30 Hz (33 msecs) on DCE11.

Without patch 2/3:

https://github.com/kleinerm/VRRTestPlots/blob/master/VRR_DCE8_2upstepping_VUPDATEbtr+removehysteresispatch.pdf
https://github.com/kleinerm/VRRTestPlots/blob/master/VRR_DCE11_2upstepping_VUPDATEbtr+removehysteresispatch.pdf

With patch 2/3:

https://github.com/kleinerm/VRRTestPlots/blob/master/VRR_DCE11_2upstepping_VUPDATEbtr.pdf
https://github.com/kleinerm/VRRTestPlots/blob/master/VRR_DCE8_2upstepping_VUPDATEbtr.pdf

-mario



Re: [PATCH 4/4] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations.

2019-04-26 Thread Mario Kleiner
On Wed, Apr 24, 2019 at 4:34 PM Kazlauskas, Nicholas
 wrote:
>
> On 4/17/19 11:51 PM, Mario Kleiner wrote:
> > Pre-DCE12 needs special treatment for BTR / low framerate
> > compensation for more stable behaviour:
> >
> > According to comments in the code and some testing on DCE-8
> > and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
> > programming with a lag of one frame, so the special BTR hw
> > programming for intermediate fixed duration frames must be
> > done inside the current frame at flip submission in atomic
> > commit tail, ie. one vblank earlier, and the fixed refresh
> > intermediate frame mode must be also terminated one vblank
> > earlier on pre-DCE12 display engines.
> >
> > To achieve proper termination on < DCE-12 shift the point
> > when the switch-back from fixed vblank duration to variable
> > vblank duration happens from the start of VBLANK (vblank irq,
> > as done on DCE-12+) to back-porch or end of VBLANK (handled
> > by vupdate irq handler). We must leave the switch-back code
> > inside VBLANK irq for DCE12+, as before.
> >
> > Doing this, we get much better behaviour of BTR for up-sweeps,
> > ie. going from short to long frame durations (~high to low fps)
> > and for constant framerate flips, as tested on DCE-8 and
> > DCE-11. Behaviour is still not quite as good as on DCN-1
> > though.
> >
> > On down-sweeps, going from long to short frame durations
> > (low fps to high fps) < DCE-12 is a little bit improved,
> > although by far not as much as for up-sweeps and constant
> > fps.
> >
> > Signed-off-by: Mario Kleiner 
> > ---
>
> I did some debugging/testing with this patch and it certainly does
> improve things quite a bit for pre DCE12 (since this really should have
> been handled in VUPDATE before).
>

Do you know at which exact point in the frame cycle or vblank the new
VTOTAL_MIN/MAX gets latched by the hw?

Btw. for reference i made a little git repo with a cleaned up version
of my test script VRRTest.m and some pdf's with some plots from my
testing with a slightly earlier version of that script:

https://github.com/kleinerm/VRRTestPlots

Easy to use on a Debian/Ubuntu based system as you can get GNU/Octave
+ psychtoolbox-3 from the distro repo. Explanations in the Readme.md

> I have one comment inline below:
>
>
> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 32 ++-
> >   1 file changed, 31 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 76b6e621793f..9c8c94f82b35 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -364,6 +364,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
> >   struct amdgpu_device *adev = irq_params->adev;
> >   struct amdgpu_crtc *acrtc;
> >   struct dm_crtc_state *acrtc_state;
> > + unsigned long flags;
> >
> >   acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
> > IRQ_TYPE_VUPDATE);
> >
> > @@ -381,6 +382,22 @@ static void dm_vupdate_high_irq(void *interrupt_params)
> >*/
> >   if (amdgpu_dm_vrr_active(acrtc_state))
> >   drm_crtc_handle_vblank(&acrtc->base);
> > +
> > + if (acrtc_state->stream && adev->family < AMDGPU_FAMILY_AI &&
> > + acrtc_state->vrr_params.supported &&
> > + acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
> > + spin_lock_irqsave(&adev->ddev->event_lock, flags);
> > + mod_freesync_handle_v_update(
> > + adev->dm.freesync_module,
> > + acrtc_state->stream,
> > + &acrtc_state->vrr_params);
> > +
> > + dc_stream_adjust_vmin_vmax(
> > + adev->dm.dc,
> > + acrtc_state->stream,
> > + &acrtc_state->vrr_params.adjust);
> > + spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
> > + }
> >   }
> >   }
> >
> > @@ -390,6 +407,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
> >   struct amdgpu_device *adev = irq_params->adev;
> >   struct amdgpu_crtc *acrtc;
> >   struct dm_crtc_state *acrtc_state;
> > + unsigned long flags;
> >
>

Improvements to VRR below-the-range/low framerate compensation.

2019-04-17 Thread Mario Kleiner
Hi

This patch-series tries to improve amdgpu's below-the-range
behaviour with Freesync, hopefully not only for my use case,
but also for games etc.

Patch 1/4 adds a bit of debug output i found very useful, so maybe
worth adding?

Patch 2/4 fixes a bug i found when reading over the freesync code.

Patches 3/4 and 4/4 are optimizations to improve stability
in BTR.

My desired application of VRR for neuroscience/vision research
is to control the timing of when frames show up onscreen, e.g.,
to show animations at different "unconventional" framerates,
so i'm mostly interested in how well one can control the timing
between successive OpenGL bufferswaps. This is a bit different
from what a game wants to get out of VRR, probably closer to
what a movie player might want to do.

I spent quite a bit of time testing how FreeSync behaves when
flipping at a rate below the displays VRR minimum refresh rate.

For that, my own application was submitting glXSwapBuffers() flip
requests at different fps rates / time delays between successive
flips. The test script is for GNU/Octave + psychtoolbox-3, but
in principle the C equivalent would be this pseudo-code:

for (i = 0; i < n; i++) {
  // Wait for pending flip to complete, get pageflip timestamp t_last
  // of flip completion:
  glXWaitForSbcOML(..., &t_last[i], ...);

  // Fetch some delay value until next flip tdelay[i]
  tdelay[i] = Some function of varying frame delay over samples.

  // Try to flip tdelay[i] secs after previous flip t_last[i]:
  t_next = t_last[i] + tdelay[i];
  clock_nanosleep(t_next);

  // Flip
  glXSwapBuffers(...);
}

For tdelay[i] i used different test profiles, e.g., on a display with
a VRR range from 30 Hz to 144 Hz ~ 7 msecs - 33 msecs:

tdelay[i] = 0.050 // One flip each 50 msecs, want constant 20 fps.
tdelay[i] = rand() // Some randomly chosen delay for each flip.
tdelay[i] = 0.007 + some sin() sine profile of changing delays/fps.
tdelay[i] = 0.007 + i * 0.001; linear increase in delay by 1 msec/flip,
starting at 7 msecs.
tdelay[i] = ... linear decrease by 1 msec/flip, starting at 120 msecs.

etc. Then i plotted requested flip delays tdelay[] against actual
flip delays (~ t_last[i+1] - t_last[i]) to see how well VRR can follow
requested fps.

Inside the VRR range ~ 7 msecs - 33 msecs, Freesync behaved basically
perfectly, with average errors of less than 0.1 msecs and jitter of less
than 1 msec.

When going for tdelay's > 33 msecs, ie. when low framerate compensation/
BTR kicked in, my DCN-1 Raven Ridge APU behaved almost as well as within
the VRR range (for reasonably smooth changes in fps). When doing the
same on a DCE-8 and DCE-11 gpu, BTR made much bigger errors between what
was requested and what was measured.
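What BTR does in this regime is split each long software frame into several shorter hardware frames that fit inside the VRR range. A simplified sketch of that arithmetic, loosely based on apply_below_the_range() in freesync.c (the real frames_to_insert selection is more involved; names are shortened here), including the min-duration clamp that patch 2/4 fixes:

```c
/* Simplified sketch of BTR's intermediate-frame math. Not the actual
 * kernel code: frames_to_insert selection in freesync.c is more involved. */
static unsigned int btr_inserted_duration_us(unsigned int last_render_time_us,
					     unsigned int max_render_time_us,
					     unsigned int min_duration_us)
{
	/* Divide the long frame into enough equal pieces that each piece
	 * fits below the display's maximum frame duration. */
	unsigned int frames_to_insert =
		(last_render_time_us + max_render_time_us - 1) /
		max_render_time_us;
	unsigned int inserted_us = last_render_time_us / frames_to_insert;

	/* Clamp to the minimum frame duration, using the cached
	 * min_duration value (the fix in patch 2/4). */
	if (inserted_us < min_duration_us)
		inserted_us = min_duration_us;

	return inserted_us;
}
```

E.g. a 50 msecs software frame on a display with a 33 msecs maximum gets shown as two 25 msecs hardware frames, which is why the precise moment VTOTAL_MIN/MAX changes take effect matters so much.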

Patch 3/4 helps avoiding glitches on DCN when transitioning from VRR range
to below min VRR range, and helps even more in avoiding glitches on DCE.

Patch 4/4 tries to improve behaviour on pre-DCE12. It helps quite a lot
when testing on DCE-8 and DCE-11 as described in the commit message. This
makes sense, as the code has some TODO comment about the different hw
behaviour of pre-DCE12, mentioning that pre-DCE12 hw only responds to 
programming different VTOTAL_MIN/MAX values with a lag of 1 frame. The
patch tries to work around this hardware limitation with some success.
DCE8/11 behaviour is still not as good as DCN-1 behaviour, but at least
it is not totally useless for my type of application with this patchset.

I don't have a Vega discrete gpu with DCE-12, but assume it would behave
like DCN-1 if the comments in the code are correct? Therefore the patch
switches BTR handling for < AMDGPU_FAMILY_AI vs. >= AMDGPU_FAMILY_AI.

Thanks,
-mario



[PATCH 1/4] drm/amd/display: Add some debug output for VRR BTR.

2019-04-17 Thread Mario Kleiner
Helps with debugging issues with low framerate compensation.

Signed-off-by: Mario Kleiner 
---
 .../amd/display/modules/freesync/freesync.c| 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 3d867e34f8b3..71274683da04 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -1041,6 +1041,11 @@ void mod_freesync_handle_preflip(struct mod_freesync 
*mod_freesync,
average_render_time_in_us += last_render_time_in_us;
average_render_time_in_us /= DC_PLANE_UPDATE_TIMES_MAX;
 
+   DRM_DEBUG_DRIVER("vrr flip: avg %d us, last %d us, max %d us\n",
+average_render_time_in_us,
+last_render_time_in_us,
+in_out_vrr->max_duration_in_us);
+
if (in_out_vrr->btr.btr_enabled) {
apply_below_the_range(core_freesync,
stream,
@@ -1053,6 +1058,10 @@ void mod_freesync_handle_preflip(struct mod_freesync 
*mod_freesync,
in_out_vrr);
}
 
+   DRM_DEBUG_DRIVER("vrr btr_active:%d - num %d of dur %d us\n",
+in_out_vrr->btr.btr_active,
+in_out_vrr->btr.frames_to_insert,
+in_out_vrr->btr.inserted_duration_in_us);
}
 }
 
@@ -1090,11 +1099,17 @@ void mod_freesync_handle_v_update(struct mod_freesync 
*mod_freesync,
in_out_vrr->btr.inserted_duration_in_us);
in_out_vrr->adjust.v_total_max =
in_out_vrr->adjust.v_total_min;
+   DRM_DEBUG_DRIVER("btr start: c=%d, vtotal=%d\n",
+in_out_vrr->btr.frames_to_insert,
+in_out_vrr->adjust.v_total_min);
}
 
if (in_out_vrr->btr.frame_counter > 0)
in_out_vrr->btr.frame_counter--;
 
+   DRM_DEBUG_DRIVER("btr upd: count %d\n",
+in_out_vrr->btr.frame_counter);
+
/* Restore FreeSync */
if (in_out_vrr->btr.frame_counter == 0) {
in_out_vrr->adjust.v_total_min =
@@ -1103,6 +1118,9 @@ void mod_freesync_handle_v_update(struct mod_freesync 
*mod_freesync,
in_out_vrr->adjust.v_total_max =
calc_v_total_from_refresh(stream,
in_out_vrr->min_refresh_in_uhz);
+   DRM_DEBUG_DRIVER("btr end: vtotal_min=%d/max=%d\n",
+in_out_vrr->adjust.v_total_min,
+in_out_vrr->adjust.v_total_max);
}
}
 
-- 
2.20.1


[PATCH 3/4] drm/amd/display: Enter VRR BTR earlier.

2019-04-17 Thread Mario Kleiner
Use a 2 msecs switching headroom not only for slightly
delayed exiting of BTR mode, but now also for entering
it a bit before the max frame duration is exceeded.

With slowly changing time delay between successive flips
or with a bit of jitter in arrival of flips, this adapts
vblank early and prevents missed vblanks at the border
between non-BTR and BTR.

Testing on DCE-8, DCE-11 and DCN-1.0 shows that this more
often avoids skipped frames when moving across the BTR
boundary, especially on DCE-8 and DCE-11 with the followup
commit for dealing with pre-DCE-12 hw.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index e56543c36eba..a965ab5466dc 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -350,7 +350,7 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
in_out_vrr->btr.frame_counter = 0;
in_out_vrr->btr.btr_active = false;
}
-   } else if (last_render_time_in_us > max_render_time_in_us) {
+   } else if (last_render_time_in_us + BTR_EXIT_MARGIN > max_render_time_in_us) {
/* Enter Below the Range */
in_out_vrr->btr.btr_active = true;
}
-- 
2.20.1


[PATCH 4/4] drm/amd/display: Compensate for pre-DCE12 BTR-VRR hw limitations.

2019-04-17 Thread Mario Kleiner
Pre-DCE12 needs special treatment for BTR / low framerate
compensation for more stable behaviour:

According to comments in the code and some testing on DCE-8
and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
programming with a lag of one frame, so the special BTR hw
programming for intermediate fixed duration frames must be
done inside the current frame at flip submission in atomic
commit tail, ie. one vblank earlier, and the fixed refresh
intermediate frame mode must be also terminated one vblank
earlier on pre-DCE12 display engines.

To achieve proper termination on < DCE-12 shift the point
when the switch-back from fixed vblank duration to variable
vblank duration happens from the start of VBLANK (vblank irq,
as done on DCE-12+) to back-porch or end of VBLANK (handled
by vupdate irq handler). We must leave the switch-back code
inside VBLANK irq for DCE12+, as before.

Doing this, we get much better behaviour of BTR for up-sweeps,
ie. going from short to long frame durations (~high to low fps)
and for constant framerate flips, as tested on DCE-8 and
DCE-11. Behaviour is still not quite as good as on DCN-1
though.

On down-sweeps, going from long to short frame durations
(low fps to high fps) < DCE-12 is a little bit improved,
although by far not as much as for up-sweeps and constant
fps.

Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 32 ++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 76b6e621793f..9c8c94f82b35 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -364,6 +364,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VUPDATE);
 
@@ -381,6 +382,22 @@ static void dm_vupdate_high_irq(void *interrupt_params)
 */
if (amdgpu_dm_vrr_active(acrtc_state))
drm_crtc_handle_vblank(&acrtc->base);
+
+   if (acrtc_state->stream && adev->family < AMDGPU_FAMILY_AI &&
+   acrtc_state->vrr_params.supported &&
+   acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
+   mod_freesync_handle_v_update(
+   adev->dm.freesync_module,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params);
+
+   dc_stream_adjust_vmin_vmax(
+   adev->dm.dc,
+   acrtc_state->stream,
+   &acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
+   }
}
 }
 
@@ -390,6 +407,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
struct amdgpu_device *adev = irq_params->adev;
struct amdgpu_crtc *acrtc;
struct dm_crtc_state *acrtc_state;
+   unsigned long flags;
 
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VBLANK);
 
@@ -412,9 +430,10 @@ static void dm_crtc_high_irq(void *interrupt_params)
 */
amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
 
-   if (acrtc_state->stream &&
+   if (acrtc_state->stream && adev->family >= AMDGPU_FAMILY_AI &&
acrtc_state->vrr_params.supported &&
acrtc_state->freesync_config.state == 
VRR_STATE_ACTIVE_VARIABLE) {
+   spin_lock_irqsave(&adev->ddev->event_lock, flags);
mod_freesync_handle_v_update(
adev->dm.freesync_module,
acrtc_state->stream,
@@ -424,6 +443,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
adev->dm.dc,
acrtc_state->stream,
&acrtc_state->vrr_params.adjust);
+   spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
}
}
 }
@@ -4880,6 +4900,8 @@ static void update_freesync_state_on_stream(
 {
struct mod_vrr_params vrr_params = new_crtc_state->vrr_params;
struct dc_info_packet vrr_infopacket = {0};
+   struct amdgpu_device *adev = dm->adev;
+   unsigned long flags;
 
if (!new_stream)
return;
@@ -4899,6 +4921,14 @@ static void update_freesync_state_on_stream(
new_stream,
 

[PATCH 2/4] drm/amd/display: Fix and simplify apply_below_the_range()

2019-04-17 Thread Mario Kleiner
The comparison of inserted_frame_duration_in_us against a
duration calculated from max_refresh_in_uhz is both wrong
in its math and not needed, as the min_duration_in_us value
is already cached in in_out_vrr for reuse. No need to
recalculate it wrongly at each invocation.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 71274683da04..e56543c36eba 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -437,10 +437,8 @@ static void apply_below_the_range(struct core_freesync 
*core_freesync,
inserted_frame_duration_in_us = last_render_time_in_us /
frames_to_insert;
 
-   if (inserted_frame_duration_in_us <
-   (100 / in_out_vrr->max_refresh_in_uhz))
-   inserted_frame_duration_in_us =
-   (100 / in_out_vrr->max_refresh_in_uhz);
+   if (inserted_frame_duration_in_us < 
in_out_vrr->min_duration_in_us)
+   inserted_frame_duration_in_us = 
in_out_vrr->min_duration_in_us;
 
/* Cache the calculated variables */
in_out_vrr->btr.inserted_duration_in_us =
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 4/5] drm/amd/display: In VRR mode, do DRM core vblank handling at end of vblank. (v2)

2019-03-29 Thread Mario Kleiner
In VRR mode, proper vblank/pageflip timestamps can only be computed
after the display scanout position has left front-porch. Therefore
delay calls to drm_crtc_handle_vblank(), and thereby calls to
drm_update_vblank_count() and pageflip event delivery, to after the
end of front-porch when in VRR mode.

We add a new vupdate irq, which triggers at the end of the vupdate
interval, i.e. at the end of vblank, and calls the core vblank handler
function. The new irq handler is not executed in standard non-VRR
mode, so vblank handling for fixed refresh rate mode is identical
to the past implementation.

v2: Implement feedback by Nicholas and Paul Menzel.

Signed-off-by: Mario Kleiner 
Acked-by: Harry Wentland 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   1 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 128 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   9 ++
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c |  22 +++
 .../dc/irq/dce110/irq_service_dce110.c|   7 +-
 .../dc/irq/dce120/irq_service_dce120.c|   7 +-
 .../display/dc/irq/dce80/irq_service_dce80.c  |   6 +-
 .../display/dc/irq/dcn10/irq_service_dcn10.c  |  40 --
 8 files changed, 204 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6e71749cb3bb..6294316f24c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -827,6 +827,7 @@ struct amdgpu_device {
/* For pre-DCE11. DCE11 and later are in "struct amdgpu_device->dm" */
struct work_struct  hotplug_work;
struct amdgpu_irq_src   crtc_irq;
+   struct amdgpu_irq_src   vupdate_irq;
struct amdgpu_irq_src   pageflip_irq;
struct amdgpu_irq_src   hpd_irq;
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 7366528e5cc2..1a5be73e4172 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -315,6 +315,32 @@ static void dm_pflip_high_irq(void *interrupt_params)
drm_crtc_vblank_put(&amdgpu_crtc->base);
 }
 
+static void dm_vupdate_high_irq(void *interrupt_params)
+{
+   struct common_irq_params *irq_params = interrupt_params;
+   struct amdgpu_device *adev = irq_params->adev;
+   struct amdgpu_crtc *acrtc;
+   struct dm_crtc_state *acrtc_state;
+
+   acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VUPDATE);
+
+   if (acrtc) {
+   acrtc_state = to_dm_crtc_state(acrtc->base.state);
+
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state));
+
+   /* Core vblank handling is done here after end of front-porch in
+* vrr mode, as vblank timestamping will give valid results
+* while now done after front-porch. This will also deliver
+* page-flip completion events that have been queued to us
+* if a pageflip happened inside front-porch.
+*/
+   if (amdgpu_dm_vrr_active(acrtc_state))
+   drm_crtc_handle_vblank(&acrtc->base);
+   }
+}
+
 static void dm_crtc_high_irq(void *interrupt_params)
 {
struct common_irq_params *irq_params = interrupt_params;
@@ -325,11 +351,24 @@ static void dm_crtc_high_irq(void *interrupt_params)
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_VBLANK);
 
if (acrtc) {
-   drm_crtc_handle_vblank(>base);
-   amdgpu_dm_crtc_handle_crc_irq(>base);
-
acrtc_state = to_dm_crtc_state(acrtc->base.state);
 
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state));
+
+   /* Core vblank handling at start of front-porch is only possible
+* in non-vrr mode, as only there vblank timestamping will give
+* valid results while done in front-porch. Otherwise defer it
+* to dm_vupdate_high_irq after end of front-porch.
+*/
+   if (!amdgpu_dm_vrr_active(acrtc_state))
+   drm_crtc_handle_vblank(&acrtc->base);
+
+   /* Following stuff must happen at start of vblank, for crc
+* computation and below-the-range btr support in vrr mode.
+*/
+   amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
+
if (acrtc_state->stream &&
acrtc_state->vrr_params.supported &&
acrtc_state->freesync_config.state == 
VRR_STATE_ACTIVE_VARIABLE) {
@@ -1447,6 +1486,27 @@ static int dce110_register_irq_han

[PATCH 3/5] drm/amd/display: Rework vrr flip throttling for late vblank irq.

2019-03-29 Thread Mario Kleiner
For throttling to work correctly, we always need a baseline vblank
count last_flip_vblank that increments at start of front-porch.

This is the case for drm_crtc_vblank_count() in non-VRR mode, where
the vblank irq fires at start of front-porch and triggers DRM core
vblank handling, but it is no longer the case in VRR mode, where
core vblank handling is done later, after end of front-porch.

Therefore drm_crtc_vblank_count() is no longer useful for this.
We also can't use drm_crtc_accurate_vblank_count(), as that would
screw up vblank timestamps in VRR mode when called in front-porch.

To solve this, use the cooked hardware vblank counter returned by
amdgpu_get_vblank_counter_kms() instead, as that one is cooked to
always increment at start of front-porch, independent of when
vblank related irq's fire.

This patch allows vblank irq handling to happen anywhere within
vblank or even after it, without a negative impact on flip
throttling, so followup patches can shift the vblank core
handling trigger point wherever they need it.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +++
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 889e4437..add238fe4b57 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -406,7 +406,7 @@ struct amdgpu_crtc {
struct amdgpu_flip_work *pflip_works;
enum amdgpu_flip_status pflip_status;
int deferred_flip_completion;
-   u64 last_flip_vblank;
+   u32 last_flip_vblank;
/* pll sharing */
struct amdgpu_atom_ss ss;
bool ss_enabled;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6c413bc012af..7366528e5cc2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -286,7 +286,7 @@ static void dm_pflip_high_irq(void *interrupt_params)
}
 
/* Update to correct count(s) if racing with vblank irq */
-   amdgpu_crtc->last_flip_vblank = 
drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
+   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
 
/* wake up userspace */
if (amdgpu_crtc->event) {
@@ -298,6 +298,14 @@ static void dm_pflip_high_irq(void *interrupt_params)
} else
WARN_ON(1);
 
+   /* Keep track of vblank of this flip for flip throttling. We use the
+* cooked hw counter, as that one incremented at start of this vblank
+* of pageflip completion, so last_flip_vblank is the forbidden count
+* for queueing new pageflips if vsync + VRR is enabled.
+*/
+   amdgpu_crtc->last_flip_vblank = 
amdgpu_get_vblank_counter_kms(adev->ddev,
+   amdgpu_crtc->crtc_id);
+
amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
 
@@ -4789,9 +4797,8 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
unsigned long flags;
struct amdgpu_bo *abo;
uint64_t tiling_flags;
-   uint32_t target, target_vblank;
-   uint64_t last_flip_vblank;
-   bool vrr_active = acrtc_state->freesync_config.state == 
VRR_STATE_ACTIVE_VARIABLE;
+   uint32_t target_vblank, last_flip_vblank;
+   bool vrr_active = amdgpu_dm_vrr_active(acrtc_state);
bool pflip_present = false;
 
struct {
@@ -4935,7 +4942,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
 * clients using the GLX_OML_sync_control extension or
 * DRI3/Present extension with defined target_msc.
 */
-   last_flip_vblank = drm_crtc_vblank_count(pcrtc);
+   last_flip_vblank = 
amdgpu_get_vblank_counter_kms(dm->ddev, acrtc_attach->crtc_id);
}
else {
/* For variable refresh rate mode only:
@@ -4951,11 +4958,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
}
 
-   target = (uint32_t)last_flip_vblank + wait_for_vblank;
-
-   /* Prepare wait for target vblank early - before the 
fence-waits */
-   target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) 
+
-   amdgpu_get_vblank_counter_kms(pcrtc->dev, 
acrtc_attach->crtc_id);
+   target_vblank = last_flip_vblank + wait_for_vblank;
 
/*
 * Wait until we're out of t

[PATCH 2/5] drm/amd/display: Prevent vblank irq disable while VRR is active. (v3)

2019-03-29 Thread Mario Kleiner
During VRR mode we can not allow vblank irq dis-/enable
transitions, as an enable after a disable can happen at
an arbitrary time during the video refresh cycle, e.g.,
with a high likelihood inside vblank front-porch. An
enable during front-porch would cause vblank timestamp
updates/calculations which are completely bogus, given
the code can't know when the vblank will end as long
as we are in front-porch with no page flip completed.

Hold a permanent vblank reference on the crtc while
in active VRR mode to prevent a vblank disable, and
drop the reference again when switching back to fixed
refresh rate non-VRR mode.

v2: Make sure transition is also handled if vrr is
disabled and stream gets disabled in the same
atomic commit by moving the call to the transition
function outside of plane commit.
Suggested by Nicholas.

v3: Trivial rebase onto previous patch.

Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6528258f8975..6c413bc012af 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -251,6 +251,12 @@ get_crtc_by_otg_inst(struct amdgpu_device *adev,
return NULL;
 }
 
+static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state)
+{
+   return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE ||
+  dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
+}
+
 static void dm_pflip_high_irq(void *interrupt_params)
 {
struct amdgpu_crtc *amdgpu_crtc;
@@ -4737,6 +4743,31 @@ static void pre_update_freesync_state_on_stream(
new_crtc_state->vrr_params = vrr_params;
 }
 
+static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
+   struct dm_crtc_state *new_state)
+{
+   bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
+   bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
+
+   if (!old_vrr_active && new_vrr_active) {
+   /* Transition VRR inactive -> active:
+* While VRR is active, we must not disable vblank irq, as a
+* reenable after disable would compute bogus vblank/pflip
+* timestamps if it likely happened inside display front-porch.
+*/
+   drm_crtc_vblank_get(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   } else if (old_vrr_active && !new_vrr_active) {
+   /* Transition VRR active -> inactive:
+* Allow vblank irq disable again for fixed refresh rate.
+*/
+   drm_crtc_vblank_put(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   }
+}
+
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
struct dc_state *dc_state,
struct drm_device *dev,
@@ -5277,6 +5308,11 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
 
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+   /* Handle vrr on->off / off->on transitions */
+   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
+   dm_new_crtc_state);
+
modeset_needed = modeset_required(
new_crtc_state,
dm_new_crtc_state->stream,
-- 
2.20.1


[PATCH 5/5] drm/amd/display: Make pageflip event delivery compatible with VRR.

2019-03-29 Thread Mario Kleiner
We want vblank counts and timestamps of flip completion as sent
in pageflip completion events to be consistent with the vblank
count and timestamp of the vblank of flip completion, like in non
VRR mode.

In VRR mode, drm_update_vblank_count() - and thereby vblank
count and timestamp updates - must be delayed until after the
end of front-porch of each vblank, as it is only safe to
calculate vblank timestamps outside of the front-porch, when
we actually know when the vblank will end or has ended.

The function drm_update_vblank_count() which updates timestamps
and counts gets called by drm_crtc_accurate_vblank_count() or by
drm_crtc_handle_vblank().

Therefore we must make sure that pageflip events for a completed
flip are only sent out after drm_crtc_accurate_vblank_count() or
drm_crtc_handle_vblank() is executed, after end of front-porch
for the vblank of flip completion.

Two cases:

a) Pageflip irq handler executes inside front-porch:
   In this case we must defer sending pageflip events until
   drm_crtc_handle_vblank() executes after end of front-porch,
   and thereby calculates proper vblank count and timestamp.
   IOW, the pflip irq handler must just arm a pageflip event
   to be sent out by drm_crtc_handle_vblank() later on.

b) Pageflip irq handler executes after end of front-porch, e.g.,
   after flip completion in back-porch or due to a massively
   delayed handler invocation into the active scanout of the new
   frame. In this case we can call drm_crtc_accurate_vblank_count()
   to safely force calculation of a proper vblank count and
   timestamp, and must send the pageflip completion event
   ourselves from the pageflip irq handler.

   This is the same behaviour as needed for standard fixed refresh
   rate mode.

To decide from within pageflip handler if we are in case a) or b),
we check the current scanout position against the boundary of
front-porch. In non-VRR mode we just do what we did in the past.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 68 +++
 1 file changed, 55 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1a5be73e4172..e9ef42e86a73 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -263,6 +263,10 @@ static void dm_pflip_high_irq(void *interrupt_params)
struct common_irq_params *irq_params = interrupt_params;
struct amdgpu_device *adev = irq_params->adev;
unsigned long flags;
+   struct drm_pending_vblank_event *e;
+   struct dm_crtc_state *acrtc_state;
+   uint32_t vpos, hpos, v_blank_start, v_blank_end;
+   bool vrr_active;
 
amdgpu_crtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_PFLIP);
 
@@ -285,18 +289,57 @@ static void dm_pflip_high_irq(void *interrupt_params)
return;
}
 
-   /* Update to correct count(s) if racing with vblank irq */
-   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
+   /* page flip completed. */
+   e = amdgpu_crtc->event;
+   amdgpu_crtc->event = NULL;
 
-   /* wake up userspace */
-   if (amdgpu_crtc->event) {
-   drm_crtc_send_vblank_event(&amdgpu_crtc->base, 
amdgpu_crtc->event);
+   if (!e)
+   WARN_ON(1);
 
-   /* page flip completed. clean up */
-   amdgpu_crtc->event = NULL;
+   acrtc_state = to_dm_crtc_state(amdgpu_crtc->base.state);
+   vrr_active = amdgpu_dm_vrr_active(acrtc_state);
+
+   /* Fixed refresh rate, or VRR scanout position outside front-porch? */
+   if (!vrr_active ||
+   !dc_stream_get_scanoutpos(acrtc_state->stream, &v_blank_start,
+ &v_blank_end, &vpos, &hpos) ||
+   (vpos < v_blank_start)) {
+   /* Update to correct count and vblank timestamp if racing with
+* vblank irq. This also updates to the correct vblank timestamp
+* even in VRR mode, as scanout is past the front-porch atm.
+*/
+   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
 
-   } else
-   WARN_ON(1);
+   /* Wake up userspace by sending the pageflip event with proper
+* count and timestamp of vblank of flip completion.
+*/
+   if (e) {
+   drm_crtc_send_vblank_event(&amdgpu_crtc->base, e);
+
+   /* Event sent, so done with vblank for this flip */
+   drm_crtc_vblank_put(&amdgpu_crtc->base);
+   }
+   } else if (e) {
+   /* VRR active and inside front-porch: vblank count and
+* timestamp for pageflip event will only be up to date after
+* drm_crtc_handle_vblank() has been executed from late vblank
+

AMD Freesync patches v3

2019-03-29 Thread Mario Kleiner
The hopefully final patch series, with feedback applied and
r-b / acked-by tags added. Rebased to current agd5f/drm-5.2-wip
branch.

Numbering has slightly changed. Patches 3-5 are identical to last
series. Patch 2/5 v3 (the former 1/4 v2) trivially rebased on top of
the new 1/5.

Patch 1/5 is new and splits the VRR state updating to a part
early in commit_tail, before the vrr transition handling from
Patch 2/5, and a late part in the plane commit code, which must
happen after transition handling/during plane commit.

This should address all of Nicholas' remaining comments. Tested
and nicely working on DCN-1.0 / Raven as well.

Thanks,
-mario



[PATCH 1/5] drm/amd/display: Update VRR state earlier in atomic_commit_tail.

2019-03-29 Thread Mario Kleiner
We need the VRR active/inactive state info earlier in
the commit sequence, so VRR related setup functions like
amdgpu_dm_handle_vrr_transition() know the final VRR state
when they need to do their hw setup work.

Split update_freesync_state_on_stream() into an early part,
that can run at the beginning of commit tail before the
vrr transition handling, and a late part that must run after
vrr transition handling inside the commit planes code for
enabled crtc's.

Suggested by Nicholas Kazlauskas.
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 61 ++-
 1 file changed, 46 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 7d1c782072ee..6528258f8975 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4645,7 +4645,6 @@ static void update_freesync_state_on_stream(
 {
struct mod_vrr_params vrr_params = new_crtc_state->vrr_params;
struct dc_info_packet vrr_infopacket = {0};
-   struct mod_freesync_config config = new_crtc_state->freesync_config;
 
if (!new_stream)
return;
@@ -4658,20 +4657,6 @@ static void update_freesync_state_on_stream(
if (!new_stream->timing.h_total || !new_stream->timing.v_total)
return;
 
-   if (new_crtc_state->vrr_supported &&
-   config.min_refresh_in_uhz &&
-   config.max_refresh_in_uhz) {
-   config.state = new_crtc_state->base.vrr_enabled ?
-   VRR_STATE_ACTIVE_VARIABLE :
-   VRR_STATE_INACTIVE;
-   } else {
-   config.state = VRR_STATE_UNSUPPORTED;
-   }
-
-   mod_freesync_build_vrr_params(dm->freesync_module,
- new_stream,
- &config, &vrr_params);
-
if (surface) {
mod_freesync_handle_preflip(
dm->freesync_module,
@@ -4712,6 +4697,46 @@ static void update_freesync_state_on_stream(
  (int)vrr_params.state);
 }
 
+static void pre_update_freesync_state_on_stream(
+   struct amdgpu_display_manager *dm,
+   struct dm_crtc_state *new_crtc_state)
+{
+   struct dc_stream_state *new_stream = new_crtc_state->stream;
+   struct mod_vrr_params vrr_params = new_crtc_state->vrr_params;
+   struct mod_freesync_config config = new_crtc_state->freesync_config;
+
+   if (!new_stream)
+   return;
+
+   /*
+* TODO: Determine why min/max totals and vrefresh can be 0 here.
+* For now it's sufficient to just guard against these conditions.
+*/
+   if (!new_stream->timing.h_total || !new_stream->timing.v_total)
+   return;
+
+   if (new_crtc_state->vrr_supported &&
+   config.min_refresh_in_uhz &&
+   config.max_refresh_in_uhz) {
+   config.state = new_crtc_state->base.vrr_enabled ?
+   VRR_STATE_ACTIVE_VARIABLE :
+   VRR_STATE_INACTIVE;
+   } else {
+   config.state = VRR_STATE_UNSUPPORTED;
+   }
+
+   mod_freesync_build_vrr_params(dm->freesync_module,
+ new_stream,
+ &config, &vrr_params);
+
+   new_crtc_state->freesync_timing_changed |=
+   (memcmp(&new_crtc_state->vrr_params.adjust,
+   &vrr_params.adjust,
+   sizeof(vrr_params.adjust)) != 0);
+
+   new_crtc_state->vrr_params = vrr_params;
+}
+
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
struct dc_state *dc_state,
struct drm_device *dev,
@@ -5233,6 +5258,12 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
mutex_unlock(&dm->dc_lock);
}
 
+   /* Update freesync state before amdgpu_dm_handle_vrr_transition(). */
+   for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+   pre_update_freesync_state_on_stream(dm, dm_new_crtc_state);
+   }
+
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
/*
-- 
2.20.1


[PATCH 4/4] drm/amd/display: Make pageflip event delivery compatible with VRR.

2019-03-22 Thread Mario Kleiner
We want vblank counts and timestamps of flip completion as sent
in pageflip completion events to be consistent with the vblank
count and timestamp of the vblank of flip completion, like in non
VRR mode.

In VRR mode, drm_update_vblank_count() - and thereby vblank
count and timestamp updates - must be delayed until after the
end of front-porch of each vblank, as it is only safe to
calculate vblank timestamps outside of the front-porch, when
we actually know when the vblank will end or has ended.

The function drm_update_vblank_count() which updates timestamps
and counts gets called by drm_crtc_accurate_vblank_count() or by
drm_crtc_handle_vblank().

Therefore we must make sure that pageflip events for a completed
flip are only sent out after drm_crtc_accurate_vblank_count() or
drm_crtc_handle_vblank() is executed, after end of front-porch
for the vblank of flip completion.

Two cases:

a) Pageflip irq handler executes inside front-porch:
   In this case we must defer sending pageflip events until
   drm_crtc_handle_vblank() executes after end of front-porch,
   and thereby calculates proper vblank count and timestamp.
   IOW, the pflip irq handler must just arm a pageflip event
   to be sent out by drm_crtc_handle_vblank() later on.

b) Pageflip irq handler executes after end of front-porch, e.g.,
   after flip completion in back-porch or due to a massively
   delayed handler invocation into the active scanout of the new
   frame. In this case we can call drm_crtc_accurate_vblank_count()
   to safely force calculation of a proper vblank count and
   timestamp, and must send the pageflip completion event
   ourselves from the pageflip irq handler.

   This is the same behaviour as needed for standard fixed refresh
   rate mode.

To decide from within pageflip handler if we are in case a) or b),
we check the current scanout position against the boundary of
front-porch. In non-VRR mode we just do what we did in the past.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 68 +++
 1 file changed, 55 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index fe207988d0b2..c20e9f695f11 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -263,6 +263,10 @@ static void dm_pflip_high_irq(void *interrupt_params)
struct common_irq_params *irq_params = interrupt_params;
struct amdgpu_device *adev = irq_params->adev;
unsigned long flags;
+   struct drm_pending_vblank_event *e;
+   struct dm_crtc_state *acrtc_state;
+   uint32_t vpos, hpos, v_blank_start, v_blank_end;
+   bool vrr_active;
 
amdgpu_crtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
IRQ_TYPE_PFLIP);
 
@@ -285,18 +289,57 @@ static void dm_pflip_high_irq(void *interrupt_params)
return;
}
 
-   /* Update to correct count(s) if racing with vblank irq */
-   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
+   /* page flip completed. */
+   e = amdgpu_crtc->event;
+   amdgpu_crtc->event = NULL;
 
-   /* wake up userspace */
-   if (amdgpu_crtc->event) {
-   drm_crtc_send_vblank_event(&amdgpu_crtc->base, 
amdgpu_crtc->event);
+   if (!e)
+   WARN_ON(1);
 
-   /* page flip completed. clean up */
-   amdgpu_crtc->event = NULL;
+   acrtc_state = to_dm_crtc_state(amdgpu_crtc->base.state);
+   vrr_active = amdgpu_dm_vrr_active(acrtc_state);
+
+   /* Fixed refresh rate, or VRR scanout position outside front-porch? */
+   if (!vrr_active ||
+   !dc_stream_get_scanoutpos(acrtc_state->stream, &v_blank_start,
+ &v_blank_end, &vpos, &hpos) ||
+   (vpos < v_blank_start)) {
+   /* Update to correct count and vblank timestamp if racing with
+* vblank irq. This also updates to the correct vblank timestamp
+* even in VRR mode, as scanout is past the front-porch atm.
+*/
+   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
 
-   } else
-   WARN_ON(1);
+   /* Wake up userspace by sending the pageflip event with proper
+* count and timestamp of vblank of flip completion.
+*/
+   if (e) {
+   drm_crtc_send_vblank_event(&amdgpu_crtc->base, e);
+
+   /* Event sent, so done with vblank for this flip */
+   drm_crtc_vblank_put(&amdgpu_crtc->base);
+   }
+   } else if (e) {
+   /* VRR active and inside front-porch: vblank count and
+* timestamp for pageflip event will only be up to date after
+* drm_crtc_handle_vblank() has been executed from late vblank
+

[PATCH 1/4] drm/amd/display: Prevent vblank irq disable while VRR is active. (v2)

2019-03-22 Thread Mario Kleiner
During VRR mode we can not allow vblank irq dis-/enable
transitions, as an enable after a disable can happen at
an arbitrary time during the video refresh cycle, e.g.,
with a high likelihood inside vblank front-porch. An
enable during front-porch would cause vblank timestamp
updates/calculations which are completely bogus, given
the code can't know when the vblank will end as long
as we are in front-porch with no page flip completed.

Hold a permanent vblank reference on the crtc while
in active VRR mode to prevent a vblank disable, and
drop the reference again when switching back to fixed
refresh rate non-VRR mode.

v2: Make sure transition is also handled if vrr is
disabled and stream gets disabled in the same
atomic commit. Suggested by Nicholas.

Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 42531ed6ae75..b73313d6450f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -251,6 +251,12 @@ get_crtc_by_otg_inst(struct amdgpu_device *adev,
return NULL;
 }
 
+static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state)
+{
+   return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE ||
+  dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
+}
+
 static void dm_pflip_high_irq(void *interrupt_params)
 {
struct amdgpu_crtc *amdgpu_crtc;
@@ -4712,6 +4718,31 @@ static void update_freesync_state_on_stream(
  (int)vrr_params.state);
 }
 
+static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
+   struct dm_crtc_state *new_state)
+{
+   bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
+   bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
+
+   if (!old_vrr_active && new_vrr_active) {
+   /* Transition VRR inactive -> active:
+* While VRR is active, we must not disable vblank irq, as a
+* reenable after disable would compute bogus vblank/pflip
+* timestamps if it likely happened inside display front-porch.
+*/
+   drm_crtc_vblank_get(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   } else if (old_vrr_active && !new_vrr_active) {
+   /* Transition VRR active -> inactive:
+* Allow vblank irq disable again for fixed refresh rate.
+*/
+   drm_crtc_vblank_put(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   }
+}
+
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
struct dc_state *dc_state,
struct drm_device *dev,
@@ -5246,6 +5277,11 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
 
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+   /* Handle vrr on->off / off->on transitions */
+   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
+   dm_new_crtc_state);
+
modeset_needed = modeset_required(
new_crtc_state,
dm_new_crtc_state->stream,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

AMD Freesync patches v2

2019-03-22 Thread Mario Kleiner
The current patch series, with feedback from Paul, Nicholas and Harry
applied and r-b / acked-by tags added. Thanks for the feedback.
Rebased to current drm-5.2-wip branch.

Patch 1/4 is still the same though. Don't know if I or Nicholas could
fix it in a followup patch, or if this one needs more work directly, or
who does what. Nicholas?

thanks,
-mario



[PATCH 3/4] drm/amd/display: In VRR mode, do DRM core vblank handling at end of vblank. (v2)

2019-03-22 Thread Mario Kleiner
In VRR mode, proper vblank/pageflip timestamps can only be computed
after the display scanout position has left front-porch. Therefore
delay calls to drm_crtc_handle_vblank(), and thereby calls to
drm_update_vblank_count() and pageflip event delivery, to after the
end of front-porch when in VRR mode.

We add a new vupdate irq, which triggers at the end of the vupdate
interval, i.e. at the end of vblank, and calls the core vblank handler
function. The new irq handler is not executed in standard non-VRR
mode, so vblank handling for fixed refresh rate mode is identical
to the past implementation.

v2: Implement feedback by Nicholas and Paul Menzel.

Signed-off-by: Mario Kleiner 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   1 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 128 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   9 ++
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c |  22 +++
 .../dc/irq/dce110/irq_service_dce110.c|   7 +-
 .../dc/irq/dce120/irq_service_dce120.c|   7 +-
 .../display/dc/irq/dce80/irq_service_dce80.c  |   6 +-
 .../display/dc/irq/dcn10/irq_service_dcn10.c  |  40 --
 8 files changed, 204 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6e71749cb3bb..6294316f24c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -827,6 +827,7 @@ struct amdgpu_device {
/* For pre-DCE11. DCE11 and later are in "struct amdgpu_device->dm" */
struct work_struct  hotplug_work;
struct amdgpu_irq_src   crtc_irq;
+   struct amdgpu_irq_src   vupdate_irq;
struct amdgpu_irq_src   pageflip_irq;
struct amdgpu_irq_src   hpd_irq;
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index aabd23fa0766..fe207988d0b2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -315,6 +315,32 @@ static void dm_pflip_high_irq(void *interrupt_params)
	drm_crtc_vblank_put(&amdgpu_crtc->base);
 }
 
+static void dm_vupdate_high_irq(void *interrupt_params)
+{
+   struct common_irq_params *irq_params = interrupt_params;
+   struct amdgpu_device *adev = irq_params->adev;
+   struct amdgpu_crtc *acrtc;
+   struct dm_crtc_state *acrtc_state;
+
+   acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VUPDATE);
+
+   if (acrtc) {
+   acrtc_state = to_dm_crtc_state(acrtc->base.state);
+
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state));
+
+   /* Core vblank handling is done here after end of front-porch in
+* vrr mode, as vblank timestamping will give valid results
+* while now done after front-porch. This will also deliver
+* page-flip completion events that have been queued to us
+* if a pageflip happened inside front-porch.
+*/
+   if (amdgpu_dm_vrr_active(acrtc_state))
+   drm_crtc_handle_vblank(&acrtc->base);
+   }
+}
+
 static void dm_crtc_high_irq(void *interrupt_params)
 {
struct common_irq_params *irq_params = interrupt_params;
@@ -325,11 +351,24 @@ static void dm_crtc_high_irq(void *interrupt_params)
	acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VBLANK);
 
if (acrtc) {
-   drm_crtc_handle_vblank(&acrtc->base);
-   amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
-
acrtc_state = to_dm_crtc_state(acrtc->base.state);
 
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state));
+
+   /* Core vblank handling at start of front-porch is only possible
+* in non-vrr mode, as only there vblank timestamping will give
+* valid results while done in front-porch. Otherwise defer it
+* to dm_vupdate_high_irq after end of front-porch.
+*/
+   if (!amdgpu_dm_vrr_active(acrtc_state))
+   drm_crtc_handle_vblank(&acrtc->base);
+
+   /* Following stuff must happen at start of vblank, for crc
+* computation and below-the-range btr support in vrr mode.
+*/
+   amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
+
if (acrtc_state->stream &&
acrtc_state->vrr_params.supported &&
    acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
@@ -1447,6 +1486,27 @@ static int dce110_register_irq_handlers(struct amdgpu_device *adev)
  

[PATCH 2/4] drm/amd/display: Rework vrr flip throttling for late vblank irq.

2019-03-22 Thread Mario Kleiner
For throttling to work correctly, we always need a baseline vblank
count last_flip_vblank that increments at start of front-porch.

This is the case for drm_crtc_vblank_count() in non-VRR mode, where
the vblank irq fires at start of front-porch and triggers DRM core
vblank handling, but it is no longer the case in VRR mode, where
core vblank handling is done later, after end of front-porch.

Therefore drm_crtc_vblank_count() is no longer useful for this.
We also can't use drm_crtc_accurate_vblank_count(), as that would
screw up vblank timestamps in VRR mode when called in front-porch.

To solve this, use the cooked hardware vblank counter returned by
amdgpu_get_vblank_counter_kms() instead, as that one is cooked to
always increment at start of front-porch, independent of when
vblank related irq's fire.

This patch allows vblank irq handling to happen anywhere within
vblank or even after it, without a negative impact on flip
throttling, so followup patches can shift the vblank core
handling trigger point wherever they need it.

Signed-off-by: Mario Kleiner 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +++
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 889e4437..add238fe4b57 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -406,7 +406,7 @@ struct amdgpu_crtc {
struct amdgpu_flip_work *pflip_works;
enum amdgpu_flip_status pflip_status;
int deferred_flip_completion;
-   u64 last_flip_vblank;
+   u32 last_flip_vblank;
/* pll sharing */
struct amdgpu_atom_ss ss;
bool ss_enabled;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b73313d6450f..aabd23fa0766 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -286,7 +286,7 @@ static void dm_pflip_high_irq(void *interrupt_params)
}
 
/* Update to correct count(s) if racing with vblank irq */
-   amdgpu_crtc->last_flip_vblank = drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
+   drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
 
/* wake up userspace */
if (amdgpu_crtc->event) {
@@ -298,6 +298,14 @@ static void dm_pflip_high_irq(void *interrupt_params)
} else
WARN_ON(1);
 
+   /* Keep track of vblank of this flip for flip throttling. We use the
+* cooked hw counter, as that one incremented at start of this vblank
+* of pageflip completion, so last_flip_vblank is the forbidden count
+* for queueing new pageflips if vsync + VRR is enabled.
+*/
+   amdgpu_crtc->last_flip_vblank = amdgpu_get_vblank_counter_kms(adev->ddev,
+   amdgpu_crtc->crtc_id);
+
amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
	spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
 
@@ -4764,9 +4772,8 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
unsigned long flags;
struct amdgpu_bo *abo;
uint64_t tiling_flags;
-   uint32_t target, target_vblank;
-   uint64_t last_flip_vblank;
-   bool vrr_active = acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE;
+   uint32_t target_vblank, last_flip_vblank;
+   bool vrr_active = amdgpu_dm_vrr_active(acrtc_state);
bool pflip_present = false;
 
struct {
@@ -4910,7 +4917,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 * clients using the GLX_OML_sync_control extension or
 * DRI3/Present extension with defined target_msc.
 */
-   last_flip_vblank = drm_crtc_vblank_count(pcrtc);
+   last_flip_vblank = amdgpu_get_vblank_counter_kms(dm->ddev, acrtc_attach->crtc_id);
}
else {
/* For variable refresh rate mode only:
@@ -4926,11 +4933,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
	spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
}
 
-   target = (uint32_t)last_flip_vblank + wait_for_vblank;
-
-   /* Prepare wait for target vblank early - before the fence-waits */
-   target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) +
-   amdgpu_get_vblank_counter_kms(pcrtc->dev, acrtc_attach->crtc_id);
+   target_vblank = last_flip_vblank + wait_for_vblank;
 
/*
 * Wait until we're out of t

Re: [PATCH 3/4] drm/amd/display: In VRR mode, do DRM core vblank handling at end of vblank.

2019-03-21 Thread Mario Kleiner
On Wed, Mar 20, 2019 at 1:53 PM Kazlauskas, Nicholas
 wrote:
>
> On 3/20/19 3:51 AM, Mario Kleiner wrote:
> > Ok, fixed all the style issues and ran checkpatch over the patches. Thanks.
> >
> > On Tue, Mar 19, 2019 at 2:32 PM Kazlauskas, Nicholas
> >  wrote:
> >>
> >> On 3/19/19 9:23 AM, Kazlauskas, Nicholas wrote:
> >>> On 3/18/19 1:19 PM, Mario Kleiner wrote:
> >>>> In VRR mode, proper vblank/pageflip timestamps can only be computed
> >>>> after the display scanout position has left front-porch. Therefore
> >>>> delay calls to drm_crtc_handle_vblank(), and thereby calls to
> >>>> drm_update_vblank_count() and pageflip event delivery, to after the
> >>>> end of front-porch when in VRR mode.
> >>>>
> >>>> We add a new vupdate irq, which triggers at the end of the vupdate
> >>>> interval, ie. at the end of vblank, and calls the core vblank handler
> >>>> function. The new irq handler is not executed in standard non-VRR
> >>>> mode, so vblank handling for fixed refresh rate mode is identical
> >>>> to the past implementation.
> >>>>
> >>>> Signed-off-by: Mario Kleiner 
> >>
> >> Looks I lost some of my comments I wanted to send in my last email. Just
> >> a few nitpicks (including some things Paul mentioned).
> >>
> >> Also meant to CC Harry on this as well.
> >>
> >>>> ---
> >>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h|   1 +
> >>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 129 
> >>>> -
> >>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |   9 ++
> >>>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c  |  22 
> >>>> .../amd/display/dc/irq/dce110/irq_service_dce110.c |   7 +-
> >>>> .../amd/display/dc/irq/dce120/irq_service_dce120.c |   7 +-
> >>>> .../amd/display/dc/irq/dce80/irq_service_dce80.c   |   6 +-
> >>>> .../amd/display/dc/irq/dcn10/irq_service_dcn10.c   |  40 +--
> >>>> 8 files changed, 205 insertions(+), 16 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> index f88761a..64167dd 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> @@ -827,6 +827,7 @@ struct amdgpu_device {
> >>>>   /* For pre-DCE11. DCE11 and later are in "struct 
> >>>> amdgpu_device->dm" */
> >>>>   struct work_struct  hotplug_work;
> >>>>   struct amdgpu_irq_src   crtc_irq;
> >>>> +struct amdgpu_irq_src   vupdate_irq;
> >>>>   struct amdgpu_irq_src   pageflip_irq;
> >>>>   struct amdgpu_irq_src   hpd_irq;
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> >>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> index 85e4f87..555d9e9f 100644
> >>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> @@ -315,6 +315,32 @@ static void dm_pflip_high_irq(void 
> >>>> *interrupt_params)
> >>>>   drm_crtc_vblank_put(&amdgpu_crtc->base);
> >>>> }
> >>>>
> >>>> +static void dm_vupdate_high_irq(void *interrupt_params)
> >>>> +{
> >>>> +struct common_irq_params *irq_params = interrupt_params;
> >>>> +struct amdgpu_device *adev = irq_params->adev;
> >>>> +struct amdgpu_crtc *acrtc;
> >>>> +struct dm_crtc_state *acrtc_state;
> >>>> +
> >>>> +acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
> >>>> IRQ_TYPE_VUPDATE);
> >>>> +
> >>>> +if (acrtc) {
> >>>> +acrtc_state = to_dm_crtc_state(acrtc->base.state);
> >>>> +
> >>>> +DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", 
> >>>> acrtc->crtc_id,
> >>>> + amdgpu_dm_vrr_active(acrtc_state));
> >>>> +
> >>>> +/* Core vblank handl

Re: [PATCH 1/4] drm/amd/display: Prevent vblank irq disable while VRR is active. (v2)

2019-03-21 Thread Mario Kleiner
On Wed, Mar 20, 2019 at 2:11 PM Kazlauskas, Nicholas
 wrote:
>
> On 3/20/19 4:12 AM, Mario Kleiner wrote:
> > During VRR mode we can not allow vblank irq dis-/enable
> > transitions, as an enable after a disable can happen at
> > an arbitrary time during the video refresh cycle, e.g.,
> > with a high likelihood inside vblank front-porch. An
> > enable during front-porch would cause vblank timestamp
> > updates/calculations which are completely bogus, given
> > the code can't know when the vblank will end as long
> > as we are in front-porch with no page flip completed.
> >
> > Hold a permanent vblank reference on the crtc while
> > in active VRR mode to prevent a vblank disable, and
> > drop the reference again when switching back to fixed
> > refresh rate non-VRR mode.
> >
> > v2: Make sure transition is also handled if vrr is
> >  disabled and stream gets disabled in the same
> >  atomic commit. Suggested by Nicholas.
> >
> > Signed-off-by: Mario Kleiner 
> > ---
> >   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 
> > +++
> >   1 file changed, 36 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index a718ac2..1c83e80 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -251,6 +251,12 @@ get_crtc_by_otg_inst(struct amdgpu_device *adev,
> >   return NULL;
> >   }
> >
> > +static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state)
> > +{
> > + return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE ||
> > +dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
> > +}
> > +
> >   static void dm_pflip_high_irq(void *interrupt_params)
> >   {
> >   struct amdgpu_crtc *amdgpu_crtc;
> > @@ -4716,6 +4722,31 @@ static void update_freesync_state_on_stream(
> > (int)vrr_params.state);
> >   }
> >
> > +static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state 
> > *old_state,
> > + struct dm_crtc_state *new_state)
> > +{
> > + bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
> > + bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
> > +
> > + if (!old_vrr_active && new_vrr_active) {
> > + /* Transition VRR inactive -> active:
> > +  * While VRR is active, we must not disable vblank irq, as a
> > +  * reenable after disable would compute bogus vblank/pflip
> > +  * timestamps if it likely happened inside display 
> > front-porch.
> > +  */
> > + drm_crtc_vblank_get(new_state->base.crtc);
> > + DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
> > +  __func__, new_state->base.crtc->base.id);
> > + } else if (old_vrr_active && !new_vrr_active) {
> > + /* Transition VRR active -> inactive:
> > +  * Allow vblank irq disable again for fixed refresh rate.
> > +  */
> > + drm_crtc_vblank_put(new_state->base.crtc);
> > + DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
> > +  __func__, new_state->base.crtc->base.id);
> > + }
> > +}
> > +
> >   static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >   struct dc_state *dc_state,
> >   struct drm_device *dev,
> > @@ -5250,6 +5281,11 @@ static void amdgpu_dm_atomic_commit_tail(struct 
> > drm_atomic_state *state)
> >
> >   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
> >   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
> > +
> > + /* Handle vrr on->off / off->on transitions */
> > + amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
> > + dm_new_crtc_state);
> > +
>
> I guess there's actually another problem with this here - we won't have
> the actual dm_state->freesync_config.state until the commit following
> this one since it gets calculated below this transition handler.
>
> We need this handler to trigger before the new framebuffer address is
> written but afte

Re: [PATCH 1/4] drm/amd/display: Prevent vblank irq disable while VRR is active.

2019-03-20 Thread Mario Kleiner
On Mon, Mar 18, 2019 at 6:29 PM Kazlauskas, Nicholas
 wrote:
>
> On 3/18/19 1:19 PM, Mario Kleiner wrote:
> > During VRR mode we can not allow vblank irq dis-/enable
> > transitions, as an enable after a disable can happen at
> > an arbitrary time during the video refresh cycle, e.g.,
> > with a high likelihood inside vblank front-porch. An
> > enable during front-porch would cause vblank timestamp
> > updates/calculations which are completely bogus, given
> > the code can't know when the vblank will end as long
> > as we are in front-porch with no page flip completed.
> >
> > Hold a permanent vblank reference on the crtc while
> > in active VRR mode to prevent a vblank disable, and
> > drop the reference again when switching back to fixed
> > refresh rate non-VRR mode.
> >
> > Signed-off-by: Mario Kleiner 
> > ---
> >   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 35 
> > +++
> >   1 file changed, 35 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index a718ac2..c1c3815 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -251,6 +251,12 @@ get_crtc_by_otg_inst(struct amdgpu_device *adev,
> >   return NULL;
> >   }
> >
> > +static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state)
> > +{
> > + return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE ||
> > +dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
> > +}
> > +
> >   static void dm_pflip_high_irq(void *interrupt_params)
> >   {
> >   struct amdgpu_crtc *amdgpu_crtc;
> > @@ -4716,6 +4722,32 @@ static void update_freesync_state_on_stream(
> > (int)vrr_params.state);
> >   }
> >
> > +static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state 
> > *old_state,
> > + struct dm_crtc_state *new_state)
> > +{
> > + bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
> > + bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
> > +
> > + if (!old_vrr_active && new_vrr_active) {
> > + /* Transition VRR inactive -> active:
> > +  * While VRR is active, we must not disable vblank irq, as a
> > +  * reenable after disable would compute bogus vblank/pflip
> > +  * timestamps if it likely happened inside display 
> > front-porch.
> > +  */
> > + drm_crtc_vblank_get(new_state->base.crtc);
> > + DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
> > +  __func__, new_state->base.crtc->base.id);
> > + }
> > + else if (old_vrr_active && !new_vrr_active) {
> > + /* Transition VRR active -> inactive:
> > +  * Allow vblank irq disable again for fixed refresh rate.
> > +  */
> > + drm_crtc_vblank_put(new_state->base.crtc);
> > + DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
> > +  __func__, new_state->base.crtc->base.id);
> > + }
> > +}
> > +
> >   static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >   struct dc_state *dc_state,
> >   struct drm_device *dev,
> > @@ -4757,6 +4789,9 @@ static void amdgpu_dm_commit_planes(struct 
> > drm_atomic_state *state,
> >   goto cleanup;
> >   }
> >
> > + /* Take care to hold extra vblank ref for a crtc in VRR mode */
> > + amdgpu_dm_handle_vrr_transition(dm_old_crtc_state, acrtc_state);
>
> I think this forgets to drop the extra reference in the case where the
> stream is being disabled at the same time VRR is -
> amdgpu_dm_commit_planes is only called when when the stream is non-NULL.
>
> I think the logic will work if simply moved outside of the function into
> the loop that calls this.
>
> Nicholas Kazlauskas
>

Yep! v2 patch out, thanks.
-mario

[PATCH 1/4] drm/amd/display: Prevent vblank irq disable while VRR is active. (v2)

2019-03-20 Thread Mario Kleiner
During VRR mode we can not allow vblank irq dis-/enable
transitions, as an enable after a disable can happen at
an arbitrary time during the video refresh cycle, e.g.,
with a high likelihood inside vblank front-porch. An
enable during front-porch would cause vblank timestamp
updates/calculations which are completely bogus, given
the code can't know when the vblank will end as long
as we are in front-porch with no page flip completed.

Hold a permanent vblank reference on the crtc while
in active VRR mode to prevent a vblank disable, and
drop the reference again when switching back to fixed
refresh rate non-VRR mode.

v2: Make sure transition is also handled if vrr is
disabled and stream gets disabled in the same
atomic commit. Suggested by Nicholas.

Signed-off-by: Mario Kleiner 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a718ac2..1c83e80 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -251,6 +251,12 @@ get_crtc_by_otg_inst(struct amdgpu_device *adev,
return NULL;
 }
 
+static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state)
+{
+   return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE ||
+  dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED;
+}
+
 static void dm_pflip_high_irq(void *interrupt_params)
 {
struct amdgpu_crtc *amdgpu_crtc;
@@ -4716,6 +4722,31 @@ static void update_freesync_state_on_stream(
  (int)vrr_params.state);
 }
 
+static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
+   struct dm_crtc_state *new_state)
+{
+   bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
+   bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
+
+   if (!old_vrr_active && new_vrr_active) {
+   /* Transition VRR inactive -> active:
+* While VRR is active, we must not disable vblank irq, as a
+* reenable after disable would compute bogus vblank/pflip
+* timestamps if it likely happened inside display front-porch.
+*/
+   drm_crtc_vblank_get(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   } else if (old_vrr_active && !new_vrr_active) {
+   /* Transition VRR active -> inactive:
+* Allow vblank irq disable again for fixed refresh rate.
+*/
+   drm_crtc_vblank_put(new_state->base.crtc);
+   DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
+__func__, new_state->base.crtc->base.id);
+   }
+}
+
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
struct dc_state *dc_state,
struct drm_device *dev,
@@ -5250,6 +5281,11 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+   /* Handle vrr on->off / off->on transitions */
+   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
+   dm_new_crtc_state);
+
modeset_needed = modeset_required(
new_crtc_state,
dm_new_crtc_state->stream,
-- 
2.7.4

