Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-05-05 Thread Mario Kleiner
On Wed, Apr 28, 2021 at 11:22 PM Alex Deucher  wrote:
>
> On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
> >
> > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> >  wrote:
> > >
> > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > Would be great to get this in sooner than later.
> > >
> >
> > No objections from me.
> >
>
> I don't have any objections to merging this.  Are the IGT tests available?
>
> Alex
>

IGT patches are out now, already r-b by Ville, cc'd to you. As
mentioned in the cover letter for those, the new 16 bpc test cases on
top of IGT master for the kms_plane test now work nicely on my
RavenRidge, but I had to add hacks on top of the kms_plane test to make
it work at all on RV, i.e. get it to the point where it could execute
the tests for the new formats. Unmodified kms_plane from master
doesn't even work on RV with Linux 5.8. Seems IGT is quite a bit out
of date wrt. the kernel?

Things I had to do:

- Skip all tests for modifiers other than linear: the test
requirements wrt. tiling are not met. Seems all the modifier support for
DCC, DCC_RETILE on Vega+ is missing from IGT so far?

- Skip test for format DRM_FORMAT_RGB565. CRC mismatch. Probably
because a 5 bpc container can't represent the net 8 bpc content from
the reference test image? Maybe all tests for < 8 bpc formats should
be skipped?

- Skip tests for yuv planar formats with BT2020 color space: Limited
range unsupported by DC, full range causes CRC mismatch.

- Problems with crc vblank count expected vs. actual for planar YUV formats.

- If the tests try to test more than the primary plane,
igt_pipe_crc_start() fails to open the crtc/crc/data file with -EIO.
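
To illustrate the RGB565 point from the list above, here is a minimal C
sketch of what happens to an 8 bpc value pushed through a 5 bit channel.
It assumes a plain truncate-and-replicate conversion; the function names
are made up for illustration and this is not necessarily how IGT or the
hardware converts. If the reference CRC is computed from the 8 bpc
content, any value that does not survive the round trip would be enough
to cause a mismatch.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the top 5 bits of an 8 bit channel value. */
uint8_t reduce_8_to_5(uint8_t v8)
{
    return v8 >> 3;
}

/* Expand 5 bits back to 8 by replicating the top bits into the low bits.
 * (Assumed expansion rule, chosen only for illustration.) */
uint8_t expand_5_to_8(uint8_t v5)
{
    assert(v5 < 32);
    return (uint8_t)((v5 << 3) | (v5 >> 2));
}

/* What an 8 bpc value looks like after a trip through a 5 bit channel. */
uint8_t roundtrip_8_via_5(uint8_t v8)
{
    return expand_5_to_8(reduce_8_to_5(v8));
}
```

With this rule an input of 200 comes back as 206, while 0 and 255 are
preserved, so only content restricted to the 32 representable levels
would match exactly.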

See the attached patch with all the needed hacks. Not sure which of
these are limitations of the IGT test, and which are amdgpu bugs or hw
limitations, but applying this hack-patch on top of the patches for
the new formats makes kms_plane pass.

-mario





> > Alex
> >
> >
> > > Thanks and have a nice weekend,
> > > -mario
> > >
> > > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > > framebuffers to the core, and then an implementation for AMD gpu's
> > > > with DisplayCore.
> > > >
> > > > This is intended to allow for pageflipping to, and direct scanout of,
> > > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > > Link: 
> > > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > > >
> > > > My main motivation for this is squeezing every bit of precision
> > > > out of the hardware for scientific and medical research applications,
> > > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > > the hardware could do at least 12 bpc.
> > > >
> > > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > > on my hw, both running at 10 bpc DP output depth.
> > > >
> > > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > > > Apple Retina panel), all running at 10 bpc output depth.
> > > >
> > > > No malfunctions, visual artifacts or other oddities were observed
> > > > (apart from an adventurous mess of cables and adapters on my desk),
> > > > suggesting it works.
> > > >
> > > > I used my automatic photometer measurement procedure to verify the
> > > > effective output precision of 10 bpc DP native signal + spatial
> > > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > > for AMD display hw afaik.
> > > >
> > > > So it seems to work in the way i hoped :).
> > > >
> > > > Some open questions wrt. AMD DC, to be addressed in this patch series,
> > > > or follow-up patches if necessary:
> > > >
> > > > - For the atomic check for plane scaling, the current patch will
> > > > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > > > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > > > limits, because this is also a 64 bpp format? Or something new
> > > > entirely?
> > > >
> > > > - I haven't added the new fourcc to the DCC tables yet. Should i?
> > > >
> > > > - I had to change an assert for DCE to allow 36bpp linebuffers
> > > > (patch 4/5).
> > > > It looks to me as if that assert was inconsistent with other places
> > > > in the driver where COLOR_DEPTH121212 is supported, and looking at
> > > > the code, the change seems 

Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-05-05 Thread Mario Kleiner
On Tue, May 4, 2021 at 9:22 PM Alex Deucher  wrote:
>
> On Wed, Apr 28, 2021 at 5:21 PM Alex Deucher  wrote:
> >
> > On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
> > >
> > > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> > >  wrote:
> > > >
> > > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > > Would be great to get this in sooner than later.
> > > >
> > >
> > > No objections from me.
> > >
> >
> > I don't have any objections to merging this.  Are the IGT tests available?
> >
>
> Any preference on whether I merge this through the AMD tree or drm-misc?
>
> Alex
>

Hi Alex, in case the question is addressed to myself: I prefer
whatever gets it into drm-next asap, so we can sync the drm_fourcc.h
headers from drm-next to the IGT tests, libdrm, amdvlk etc.

Another thing: Unless this would still make it into the Linux 5.13
merge window, we'd also need a KMS_DRIVER_MINOR bump 41 -> 42. This
way amdgpu-pro's Vulkan driver could know about the new 16 bpc pixel
formats for the out of tree amdgpu-dkms package when running against
older kernels.
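
A rough sketch of the kind of check a userspace driver could then do,
in C. The minor number 42 is only the bump proposed above, the major
number 3 is my assumption for amdgpu, and the struct and function names
are hypothetical. A real client would fill in the numbers from
something like libdrm's drmGetVersion().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the KMS driver version reported by the kernel. */
struct kms_version {
    int major;
    int minor;
};

#define AMDGPU_KMS_MAJOR_ASSUMED        3   /* assumption, not from this thread */
#define AMDGPU_KMS_MINOR_16BPC_PROPOSED 42  /* the 41 -> 42 bump proposed above */

/* True if the reported version is at least the proposed one. */
bool kms_may_have_16bpc_formats(struct kms_version v)
{
    assert(v.major >= 0 && v.minor >= 0);
    if (v.major != AMDGPU_KMS_MAJOR_ASSUMED)
        return v.major > AMDGPU_KMS_MAJOR_ASSUMED;
    return v.minor >= AMDGPU_KMS_MINOR_16BPC_PROPOSED;
}
```

Checking the format list a plane actually advertises would be the more
robust test where that is possible; the version number only helps when
the format list can't be trusted on its own.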

thanks,
-mario

>
> > Alex
> >
> > > Alex
> > >
> > >
> > > > Thanks and have a nice weekend,
> > > > -mario
> > > >
> > > > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > > > framebuffers to the core, and then an implementation for AMD gpu's
> > > > > with DisplayCore.
> > > > >
> > > > > This is intended to allow for pageflipping to, and direct scanout of,
> > > > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > > > Link: 
> > > > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > > > >
> > > > > My main motivation for this is squeezing every bit of precision
> > > > > out of the hardware for scientific and medical research applications,
> > > > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > > > the hardware could do at least 12 bpc.
> > > > >
> > > > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > > > on my hw, both running at 10 bpc DP output depth.
> > > > >
> > > > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP
> > > > > 2880x1800@60Hz Apple Retina panel), all running at 10 bpc output depth.
> > > > >
> > > > > No malfunctions, visual artifacts or other oddities were observed
> > > > > (apart from an adventurous mess of cables and adapters on my desk),
> > > > > suggesting it works.
> > > > >
> > > > > I used my automatic photometer measurement procedure to verify the
> > > > > effective output precision of 10 bpc DP native signal + spatial
> > > > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > > > for AMD display hw afaik.
> > > > >
> > > > > So it seems to work in the way i hoped :).
> > > > >
> > > > > Some open questions wrt. AMD DC, to be addressed in this patch
> > > > > series, or follow-up patches if necessary:
> > > > >
> > > > > - For the atomic check for plane scaling, the current patch will
> > > > > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > > > > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > > > > limits, because this is also a 64 bpp format? Or something new
> > > > > entirely?
> > > > >
> > > > > - I haven't added the new fourcc to the DCC tables yet. Should i?
> > > > >
> > > > > - I had to change an assert for DCE to allow 36bpp linebuffers
> > > > > (patch 4/5).
> > > > > It looks to me as if that assert was inconsistent with other places
> > > > > in the driver where COLOR_DEPTH121212 is supported, and looking at
> > > > > the code, the change seems harmless. At least on DCE-11.2 the change
> > > > > didn't cause any noticeable (by myself) or measurable (by my
> > > > > equipment) problems on any of the 3 connected displays.
> > > > >
> > > > > - Related to that change, while i needed to increase lb pixelsize to
> > > > > 36bpp to get > 10 bpc effective precision on DCN, i didn't need to do
> > > > > that on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> > > > > to get > 10 bpc precision for fp16 framebuffers, so something seems to
> > > > > behave differently for floating point 16 vs. fixed point 16. This all
> > > > > seems to suggest one could leave lb pixelsize at the old 30 bpp 

Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-05-04 Thread Alex Deucher
On Wed, Apr 28, 2021 at 5:21 PM Alex Deucher  wrote:
>
> On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
> >
> > On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
> >  wrote:
> > >
> > > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > > Would be great to get this in sooner than later.
> > >
> >
> > No objections from me.
> >
>
> I don't have any objections to merging this.  Are the IGT tests available?
>

Any preference on whether I merge this through the AMD tree or drm-misc?

Alex


> Alex
>
> > Alex
> >
> >
> > > Thanks and have a nice weekend,
> > > -mario
> > >
> > > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > > framebuffers to the core, and then an implementation for AMD gpu's
> > > > with DisplayCore.
> > > >
> > > > This is intended to allow for pageflipping to, and direct scanout of,
> > > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > > Link: 
> > > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > > >
> > > > My main motivation for this is squeezing every bit of precision
> > > > out of the hardware for scientific and medical research applications,
> > > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > > the hardware could do at least 12 bpc.
> > > >
> > > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > > on my hw, both running at 10 bpc DP output depth.
> > > >
> > > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > > > Apple Retina panel), all running at 10 bpc output depth.
> > > >
> > > > No malfunctions, visual artifacts or other oddities were observed
> > > > (apart from an adventurous mess of cables and adapters on my desk),
> > > > suggesting it works.
> > > >
> > > > I used my automatic photometer measurement procedure to verify the
> > > > effective output precision of 10 bpc DP native signal + spatial
> > > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > > for AMD display hw afaik.
> > > >
> > > > So it seems to work in the way i hoped :).
> > > >
> > > > Some open questions wrt. AMD DC, to be addressed in this patch series,
> > > > or follow-up patches if necessary:
> > > >
> > > > - For the atomic check for plane scaling, the current patch will
> > > > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > > > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > > > limits, because this is also a 64 bpp format? Or something new
> > > > entirely?
> > > >
> > > > - I haven't added the new fourcc to the DCC tables yet. Should i?
> > > >
> > > > - I had to change an assert for DCE to allow 36bpp linebuffers
> > > > (patch 4/5).
> > > > It looks to me as if that assert was inconsistent with other places
> > > > in the driver where COLOR_DEPTH121212 is supported, and looking at
> > > > the code, the change seems harmless. At least on DCE-11.2 the change
> > > > didn't cause any noticeable (by myself) or measurable (by my equipment)
> > > > problems on any of the 3 connected displays.
> > > >
> > > > - Related to that change, while i needed to increase lb pixelsize to
> > > > 36bpp to get > 10 bpc effective precision on DCN, i didn't need to do
> > > > that on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> > > > to get > 10 bpc precision for fp16 framebuffers, so something seems to
> > > > behave differently for floating point 16 vs. fixed point 16. This all
> > > > seems to suggest one could leave lb pixelsize at the old 30 bpp value
> > > > on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> > > > to avoid the changes of patch 4/5.
> > > >
> > > > Thanks,
> > > > -mario
> > > >
> > > >
> > > ___
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-28 Thread Alex Deucher
On Tue, Apr 20, 2021 at 5:25 PM Alex Deucher  wrote:
>
> On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
>  wrote:
> >
> > Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> > Would be great to get this in sooner than later.
> >
>
> No objections from me.
>

I don't have any objections to merging this.  Are the IGT tests available?

Alex

> Alex
>
>
> > Thanks and have a nice weekend,
> > -mario
> >
> > On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
> >  wrote:
> > >
> > > Hi,
> > >
> > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > framebuffers to the core, and then an implementation for AMD gpu's
> > > with DisplayCore.
> > >
> > > This is intended to allow for pageflipping to, and direct scanout of,
> > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > Link: 
> > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> > >
> > > My main motivation for this is squeezing every bit of precision
> > > out of the hardware for scientific and medical research applications,
> > > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > > precision in the upper half [0.5;1.0] of the unorm range, although
> > > the hardware could do at least 12 bpc.
> > >
> > > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > > on my hw, both running at 10 bpc DP output depth.
> > >
> > > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > > Apple Retina panel), all running at 10 bpc output depth.
> > >
> > > No malfunctions, visual artifacts or other oddities were observed
> > > (apart from an adventurous mess of cables and adapters on my desk),
> > > suggesting it works.
> > >
> > > I used my automatic photometer measurement procedure to verify the
> > > effective output precision of 10 bpc DP native signal + spatial
> > > dithering in the gpu as enabled by the amdgpu driver. Results show
> > > the expected 12 bpc precision i hoped for -- the current upper limit
> > > for AMD display hw afaik.
> > >
> > > So it seems to work in the way i hoped :).
> > >
> > > Some open questions wrt. AMD DC, to be addressed in this patch series, or
> > > follow-up patches if necessary:
> > >
> > > - For the atomic check for plane scaling, the current patch will
> > > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > > limits, because this is also a 64 bpp format? Or something new
> > > entirely?
> > >
> > > - I haven't added the new fourcc to the DCC tables yet. Should i?
> > >
> > > - I had to change an assert for DCE to allow 36bpp linebuffers
> > > (patch 4/5).
> > > It looks to me as if that assert was inconsistent with other places
> > > in the driver where COLOR_DEPTH121212 is supported, and looking at
> > > the code, the change seems harmless. At least on DCE-11.2 the change
> > > didn't cause any noticeable (by myself) or measurable (by my equipment)
> > > problems on any of the 3 connected displays.
> > >
> > > - Related to that change, while i needed to increase lb pixelsize to 36bpp
> > > to get > 10 bpc effective precision on DCN, i didn't need to do that
> > > on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> > > to get > 10 bpc precision for fp16 framebuffers, so something seems to
> > > behave differently for floating point 16 vs. fixed point 16. This all
> > > seems to suggest one could leave lb pixelsize at the old 30 bpp value
> > > on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> > > to avoid the changes of patch 4/5.
> > >
> > > Thanks,
> > > -mario
> > >
> > >


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-20 Thread Alex Deucher
On Fri, Apr 16, 2021 at 12:29 PM Mario Kleiner
 wrote:
>
> Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
> Would be great to get this in sooner than later.
>

No objections from me.

Alex


> Thanks and have a nice weekend,
> -mario
>
> On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
>  wrote:
> >
> > Hi,
> >
> > this patch series adds the fourcc's for 16 bit fixed point unorm
> > framebuffers to the core, and then an implementation for AMD gpu's
> > with DisplayCore.
> >
> > This is intended to allow for pageflipping to, and direct scanout of,
> > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > Link: 
> > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> >
> > My main motivation for this is squeezing every bit of precision
> > out of the hardware for scientific and medical research applications,
> > where fp16 in the unorm range is limited to ~11 bpc effective linear
> > precision in the upper half [0.5;1.0] of the unorm range, although
> > the hardware could do at least 12 bpc.
> >
> > It has been successfully tested on AMD RavenRidge (DCN-1), and with
> > Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> > (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> > on my hw, both running at 10 bpc DP output depth.
> >
> > Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> > 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> > Apple Retina panel), all running at 10 bpc output depth.
> >
> > No malfunctions, visual artifacts or other oddities were observed
> > (apart from an adventurous mess of cables and adapters on my desk),
> > suggesting it works.
> >
> > I used my automatic photometer measurement procedure to verify the
> > effective output precision of 10 bpc DP native signal + spatial
> > dithering in the gpu as enabled by the amdgpu driver. Results show
> > the expected 12 bpc precision i hoped for -- the current upper limit
> > for AMD display hw afaik.
> >
> > So it seems to work in the way i hoped :).
> >
> > Some open questions wrt. AMD DC, to be addressed in this patch series, or
> > follow-up patches if necessary:
> >
> > - For the atomic check for plane scaling, the current patch will
> > apply the same hw limits as for other rgb fixed point fb's, e.g.,
> > for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> > limits, because this is also a 64 bpp format? Or something new
> > entirely?
> >
> > - I haven't added the new fourcc to the DCC tables yet. Should i?
> >
> > - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> > It looks to me as if that assert was inconsistent with other places
> > in the driver where COLOR_DEPTH121212 is supported, and looking at
> > the code, the change seems harmless. At least on DCE-11.2 the change
> > didn't cause any noticeable (by myself) or measurable (by my equipment)
> > problems on any of the 3 connected displays.
> >
> > - Related to that change, while i needed to increase lb pixelsize to 36bpp
> > to get > 10 bpc effective precision on DCN, i didn't need to do that
> > on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> > to get > 10 bpc precision for fp16 framebuffers, so something seems to
> > behave differently for floating point 16 vs. fixed point 16. This all
> > seems to suggest one could leave lb pixelsize at the old 30 bpp value
> > on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> > to avoid the changes of patch 4/5.
> >
> > Thanks,
> > -mario
> >
> >


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-16 Thread Ville Syrjälä
On Fri, Apr 16, 2021 at 06:27:23PM +0200, Mario Kleiner wrote:
> On Mon, Mar 22, 2021 at 4:52 PM Ville Syrjälä
>  wrote:
> >
> > On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> > > Hi,
> > >
> > > this patch series adds the fourcc's for 16 bit fixed point unorm
> > > framebuffers to the core, and then an implementation for AMD gpu's
> > > with DisplayCore.
> > >
> > > This is intended to allow for pageflipping to, and direct scanout of,
> > > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > > Link: 
> > > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
> >
> > > We should also add support for these formats into igt. Should
> > be semi-easy by just adding the suitable float<->uint16
> > conversion stuff.
> >
> 
> Hi Ville,
> 
> Could you point me to a specific test case / file that I should look
> at for adding this?

lib/igt_fb.c is the main thing. It has a bunch of conversion magic
to support rendering into all kinds of weird framebuffer formats
via cairo. 

This should be mostly a matter of adding convert_uint16_to_float()
and convert_float_to_uint16(), plugging those into fb_convert(),
and declaring the new formats in format_desc[]. There might be
a few little extra details I'm forgetting though.

Once igt_fb has the required stuff kms_plane/pixel-format*
should automagically pick it up if the kernel reports the
format as supported.

Oh, and you need some >1.17 version of cairo for the float
support.
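
As a rough sketch of the per-channel math such conversion functions
would need (in C, with made-up helper names; the real
convert_uint16_to_float()/convert_float_to_uint16() would walk whole
framebuffers inside fb_convert(), and their signatures are not spelled
out here):

```c
#include <assert.h>
#include <stdint.h>

/* Map a 16 bit unorm code to [0.0, 1.0]. */
float unorm16_to_float(uint16_t v)
{
    return (float)v / 65535.0f;
}

/* Map a float back to a 16 bit unorm code, clamping and rounding to nearest. */
uint16_t float_to_unorm16(float f)
{
    if (!(f > 0.0f))    /* negative values, zero and NaN all map to 0 */
        return 0;
    if (f >= 1.0f)
        return 65535;
    return (uint16_t)(f * 65535.0f + 0.5f);
}

/* Sanity check: every 16 bit code should survive a round trip via float. */
void check_unorm16_roundtrip(void)
{
    for (uint32_t v = 0; v <= 65535; v++)
        assert(float_to_unorm16(unorm16_to_float((uint16_t)v)) == v);
}
```

A 32 bit float has a 24 bit significand, so all 65536 codes should make
the round trip unchanged, which is what the check function verifies.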

-- 
Ville Syrjälä
Intel


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-16 Thread Mario Kleiner
Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
Would be great to get this in sooner rather than later.

Thanks and have a nice weekend,
-mario

On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
 wrote:
>
> Hi,
>
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
>
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: 
> https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> My main motivation for this is squeezing every bit of precision
> out of the hardware for scientific and medical research applications,
> where fp16 in the unorm range is limited to ~11 bpc effective linear
> precision in the upper half [0.5;1.0] of the unorm range, although
> the hardware could do at least 12 bpc.
>
> It has been successfully tested on AMD RavenRidge (DCN-1), and with
> Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> (DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
> on my hw, both running at 10 bpc DP output depth.
>
> Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
> 2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
> Apple Retina panel), all running at 10 bpc output depth.
>
> No malfunctions, visual artifacts or other oddities were observed
> (apart from an adventurous mess of cables and adapters on my desk),
> suggesting it works.
>
> I used my automatic photometer measurement procedure to verify the
> effective output precision of 10 bpc DP native signal + spatial
> dithering in the gpu as enabled by the amdgpu driver. Results show
> the expected 12 bpc precision i hoped for -- the current upper limit
> for AMD display hw afaik.
>
> So it seems to work in the way i hoped :).
>
> Some open questions wrt. AMD DC, to be addressed in this patch series, or
> follow-up patches if necessary:
>
> - For the atomic check for plane scaling, the current patch will
> apply the same hw limits as for other rgb fixed point fb's, e.g.,
> for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> limits, because this is also a 64 bpp format? Or something new
> entirely?
>
> - I haven't added the new fourcc to the DCC tables yet. Should i?
>
> - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> It looks to me as if that assert was inconsistent with other places
> in the driver where COLOR_DEPTH121212 is supported, and looking at
> the code, the change seems harmless. At least on DCE-11.2 the change
> didn't cause any noticeable (by myself) or measurable (by my equipment)
> problems on any of the 3 connected displays.
>
> - Related to that change, while i needed to increase lb pixelsize to 36bpp
> to get > 10 bpc effective precision on DCN, i didn't need to do that
> on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> to get > 10 bpc precision for fp16 framebuffers, so something seems to
> behave differently for floating point 16 vs. fixed point 16. This all
> seems to suggest one could leave lb pixelsize at the old 30 bpp value
> on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> to avoid the changes of patch 4/5.
>
> Thanks,
> -mario
>
>


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-04-16 Thread Mario Kleiner
On Mon, Mar 22, 2021 at 4:52 PM Ville Syrjälä
 wrote:
>
> On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> > Hi,
> >
> > this patch series adds the fourcc's for 16 bit fixed point unorm
> > framebuffers to the core, and then an implementation for AMD gpu's
> > with DisplayCore.
> >
> > This is intended to allow for pageflipping to, and direct scanout of,
> > Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> > I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> > for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> > Link: 
> > https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
>
> We should also add support for these formats into igt. Should
> be semi-easy by just adding the suitable float<->uint16
> conversion stuff.
>

Hi Ville,

Could you point me to a specific test case / file that I should look
at for adding this?

thanks,
-mario

> --
> Ville Syrjälä
> Intel


Re: 16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-03-22 Thread Ville Syrjälä
On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> Hi,
> 
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
> 
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: 
> https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4

We should also add support for these formats into igt. Should
be semi-easy by just adding the suitable float<->uint16
conversion stuff.

-- 
Ville Syrjälä
Intel


16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

2021-03-19 Thread Mario Kleiner
Hi,

this patch series adds the fourccs for 16 bit fixed point unorm
framebuffers to the core, and then an implementation for AMD GPUs
with DisplayCore.

This is intended to allow for pageflipping to, and direct scanout of,
Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
for swapchains, mapping to DRM_FORMAT_XBGR16161616:
Link: 
https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
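
For orientation, a small C sketch of how such a fourcc and pixel are put
together. The helper names are made up; the character code 'X','B','4','8'
and the "[63:0] x:B:G:R 16:16:16:16 little endian" layout are what I
recall for DRM_FORMAT_XBGR16161616 in drm_fourcc.h, so treat both as
assumptions to check against the merged header.

```c
#include <assert.h>
#include <stdint.h>

/* Same construction as the fourcc_code() macro in drm_fourcc.h. */
uint32_t make_fourcc(char a, char b, char c, char d)
{
    assert(a != 0 && b != 0 && c != 0 && d != 0);   /* codes are non-NUL chars */
    return (uint32_t)(unsigned char)a |
           ((uint32_t)(unsigned char)b << 8) |
           ((uint32_t)(unsigned char)c << 16) |
           ((uint32_t)(unsigned char)d << 24);
}

/* Pack one pixel as x:B:G:R, 16 bits each: R in the lowest 16 bits,
 * the top 16 bits unused. On a little endian machine this puts R first
 * in memory, matching the R,G,B,A component order of the Vulkan format. */
uint64_t pack_xbgr16161616(uint16_t r, uint16_t g, uint16_t b)
{
    return (uint64_t)r | ((uint64_t)g << 16) | ((uint64_t)b << 32);
}
```

Each pixel is 8 bytes, so this is a 64 bpp format like fp16, just with
fixed point instead of floating point components.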

My main motivation for this is squeezing every bit of precision
out of the hardware for scientific and medical research applications,
where fp16 in the unorm range is limited to ~11 bpc effective linear
precision in the upper half [0.5;1.0] of the unorm range, although
the hardware could do at least 12 bpc.
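
The ~11 bpc figure can be checked with a few lines of C. fp16 has 10
explicit mantissa bits, so the 1024 values in [0.5;1.0) are spaced
2^-11 apart, which is the step size of an 11 bit linear ramp over
[0;1]; a 12 bpc ramp would need steps of about 1/4095. The decoder
below is a minimal sketch for normal fp16 values only, with a made-up
name:

```c
#include <assert.h>
#include <stdint.h>

/* Decode a normal (non-zero, non-denormal, finite) fp16 bit pattern. */
double half_to_double(uint16_t h)
{
    unsigned exp = (h >> 10) & 0x1f;    /* 5 bit exponent, bias 15 */
    unsigned mant = h & 0x3ff;          /* 10 explicit mantissa bits */
    double v = 1.0 + mant / 1024.0;     /* implicit leading 1 */

    assert(exp != 0 && exp != 31);      /* normals only in this sketch */
    for (unsigned e = exp; e < 15; e++)
        v *= 0.5;
    for (unsigned e = exp; e > 15; e--)
        v *= 2.0;
    return (h & 0x8000) ? -v : v;
}

/* Spacing between 0.5 (0x3800) and the next representable fp16 value. */
double fp16_step_at_half(void)
{
    return half_to_double(0x3801) - half_to_double(0x3800);
}
```

fp16_step_at_half() comes out as exactly 1/2048, while a 16 bit unorm
component has a uniform step of 1/65535 over the whole range.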

It has been successfully tested on AMD RavenRidge (DCN-1), and with
Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
(DP 2560x1440@144Hz + HDMI 2560x1440@120Hz), the maximum supported
on my hw, both running at 10 bpc DP output depth.

Up to three displays were active on the Polaris (DP 2560x1440@144Hz +
2560x1440@100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800@60Hz
Apple Retina panel), all running at 10 bpc output depth.

No malfunctions, visual artifacts or other oddities were observed
(apart from an adventurous mess of cables and adapters on my desk),
suggesting it works.

I used my automatic photometer measurement procedure to verify the
effective output precision of 10 bpc DP native signal + spatial
dithering in the gpu as enabled by the amdgpu driver. Results show
the expected 12 bpc precision I hoped for -- the current upper limit
for AMD display hw afaik.
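
For readers unfamiliar with the idea, this is roughly how spatial
dithering can carry 12 bits of information over a 10 bpc link. It is a
generic 2x2 ordered dither in C, chosen only for illustration, and not
a description of the algorithm the AMD hardware actually uses; all
names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* 2x2 ordered dither thresholds; each of 0..3 appears exactly once. */
static const uint8_t dither_2x2[2][2] = { { 0, 2 }, { 3, 1 } };

/* Reduce a 12 bit value to 10 bits at pixel position (x, y). The two
 * dropped bits decide in how many of the 4 positions of a 2x2 block
 * the output is bumped up by one 10 bit step. */
uint16_t dither_12_to_10(uint16_t v12, unsigned x, unsigned y)
{
    uint16_t base, frac, out;

    assert(v12 <= 4095);
    base = v12 >> 2;
    frac = v12 & 3;
    out = base + (frac > dither_2x2[y & 1][x & 1] ? 1 : 0);
    return out > 1023 ? 1023 : out;     /* clamp at the 10 bit maximum */
}

/* Sum of the four 10 bit outputs of one 2x2 block showing a constant v12. */
unsigned dither_block_sum(uint16_t v12)
{
    return dither_12_to_10(v12, 0, 0) + dither_12_to_10(v12, 1, 0) +
           dither_12_to_10(v12, 0, 1) + dither_12_to_10(v12, 1, 1);
}
```

Away from the clamped top end, the block sum equals the original 12 bit
value, so the spatial average over the block reproduces the 12 bit
level, which is the kind of thing a photometer averaging over an area
would pick up.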

So it seems to work in the way I hoped :).

Some open questions wrt. AMD DC, to be addressed in this patch series, or
follow-up patches if necessary:

- For the atomic check for plane scaling, the current patch will
apply the same hw limits as for other rgb fixed point fb's, e.g.,
for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
limits, because this is also a 64 bpp format? Or something new
entirely?

- I haven't added the new fourcc to the DCC tables yet. Should I?

- I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
It looks to me as if that assert was inconsistent with other places
in the driver where COLOR_DEPTH121212 is supported, and looking at
the code, the change seems harmless. At least on DCE-11.2 the change
didn't cause any noticeable (by myself) or measurable (by my equipment)
problems on any of the 3 connected displays.

- Related to that change, while I needed to increase lb pixelsize to 36bpp
to get > 10 bpc effective precision on DCN, I didn't need to do that
on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
to get > 10 bpc precision for fp16 framebuffers, so something seems to
behave differently for floating point 16 vs. fixed point 16. This all
seems to suggest one could leave lb pixelsize at the old 30 bpp value
on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
to avoid the changes of patch 4/5.

Thanks,
-mario

