Re: wl_subsurface vs xdg_popup?
On Thu, Mar 07, 2024 at 04:31:47PM -0500, Shawn W wrote: > I've been looking at the protocol docs on http://wayland.app and something > that's stood out to me is wl_subsurface and xdg_popup. If I want a pop up > menu, which one should I go for? I would guess xdg_popup, but it seems like > some compositors may not support repositioning if they don't support version > 3 of the interface, and positioning a popup seems a little complicated. Then > I look at wl_subsurface, and unless I'm misunderstanding it, it seems like it > could also be used for generating popups, and is required for compositors to > support, but it's not clear to me whether the compositor will actually show a > wl_subsurface, since it wouldn't show a regular wl_surface. Would someone be > able to clarify the difference between these? Subsurfaces should in general be used for constructing a part of your "window", and not for auxiliary windows, like popup menus and tooltips. The reason for this is that unlike subsurfaces, popups have necessary grab semantics that is usually needed (click outside to to dismiss, etc), as well as logic to allow it to be positioned within e.g the work area, using xdg_positioner rules. I suggest allowing repositioning your popups only when the compositor support it, and in other, unmap it and map a new one at a new position, if you need to. Jonas
Re: identifying active application/window under wayland
Thanks. This all seems unfortunate. I think I'll wait until I am forced to support wayland and then sort it out. At the moment, it's too difficult. On Tue, 2024-03-19 at 11:04 +0200, Marius Vlad wrote: > If the program is a client you won't be find this information, as > that's > inherent part of the Wayland architecture (avoid one client > snooping on another's client data). The program needs to be part of > the compositor to be able to have that information. As an example, > with Weston's kiosk shell, you can move/map windows on different > outputs > based on their appid: > https://wayland.pages.freedesktop.org/weston/toc/kiosk-shell.html > > Depending on the compositor some of the give you the option to use > plug-ins, or use their libraries to add additional code. > > Ideally, I'd not need to target multiple back-ends, but from what I see, this is not a concern that's held by the wayland model. > > The Wayland architecture incorporates the display server, window > manager > and compositor into one, there's no DE similarly to what you have > with X. > > I'm not aware of a unified way of doing this, but maybe use-reusing > XDG portals (https://flatpak.github.io/xdg-desktop-portal/docs/) to > achieve the same functionality could be a way of doing that? > I didn't see anything in that that looks like it will support what I need.
Re: identifying active application/window under wayland
On Fri, Mar 15, 2024 at 05:07:24AM +, Dan Kortschak wrote: > Hello, Hi, > > Apologies if this is not the correct forum for this. If this is the > case, please direct me to a more appropriate place. > > I have searched the archive and cannot find any topic that appears to > cover this. > > I am the author of a program that in part needs to be able to obtain > the identity of the active application and window (including its title) If the program is a client you won't be find this information, as that's inherent part of the Wayland architecture (avoid one client snooping on another's client data). The program needs to be part of the compositor to be able to have that information. As an example, with Weston's kiosk shell, you can move/map windows on different outputs based on their appid: https://wayland.pages.freedesktop.org/weston/toc/kiosk-shell.html Depending on the compositor some of the give you the option to use plug-ins, or use their libraries to add additional code. > being used by a user. This is required for context-dependent > configuration of an external device and also, if configured by the > user, logging of user activity for time tracking. > > This is readily achievable on the platforms that are currently > supported (X11 and MacOS), but as far as I can see there does not > appear to be a unified approach to achieving this under Wayland. > > Is it the case that there is no way to achieve this, or am I just > missing things that are possible? Is it the case that I may need to > only cover a subset of DEs and special-case code for each that is > supported? Are there plans to support this kind of use? The Wayland architecture incorporates the display server, window manager and compositor into one, there's no DE similarly to what you have with X. I'm not aware of a unified way of doing this, but maybe use-reusing XDG portals (https://flatpak.github.io/xdg-desktop-portal/docs/) to achieve the same functionality could be a way of doing that? > > thanks > Dan > signature.asc Description: PGP signature
wl_subsurface vs xdg_popup?
I've been looking at the protocol docs on http://wayland.app and something that's stood out to me is wl_subsurface and xdg_popup. If I want a pop up menu, which one should I go for? I would guess xdg_popup, but it seems like some compositors may not support repositioning if they don't support version 3 of the interface, and positioning a popup seems a little complicated. Then I look at wl_subsurface, and unless I'm misunderstanding it, it seems like it could also be used for generating popups, and is required for compositors to support, but it's not clear to me whether the compositor will actually show a wl_subsurface, since it wouldn't show a regular wl_surface. Would someone be able to clarify the difference between these?
Re: [RFC PATCH v4 23/42] drm/vkms: add 3x4 matrix in color pipeline
On Mon, 26 Feb 2024 16:10:37 -0500 Harry Wentland wrote: > We add two 3x4 matrices into the VKMS color pipeline. The reason > we're adding matrices is so that we can test that application > of a matrix and its inverse yields an output equal to the input > image. You will test also cases where the matrix configuration will not produce output equal to input, right? Otherwise one can accidentally pass the matrix test by ignoring all matrices. > One complication with the matrix implementation has to do with > the fact that the matrix entries are in signed-magnitude fixed > point, whereas the drm_fixed.h implementation uses 2s-complement. > The latter one is the one that we want for easy addition and > subtraction, so we convert all entries to 2s-complement. > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/vkms/vkms_colorop.c | 32 +++- > drivers/gpu/drm/vkms/vkms_composer.c | 27 +++ > include/drm/drm_colorop.h| 2 ++ > 3 files changed, 60 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/vkms/vkms_colorop.c > b/drivers/gpu/drm/vkms/vkms_colorop.c > index d2db366da6d3..a0e54b2c1f7a 100644 > --- a/drivers/gpu/drm/vkms/vkms_colorop.c > +++ b/drivers/gpu/drm/vkms/vkms_colorop.c > @@ -35,7 +35,37 @@ const int vkms_initialize_color_pipeline(struct drm_plane > *plane, struct drm_pro > > prev_op = op; > > - /* 2nd op: 1d curve */ > + /* 2nd op: 3x4 matrix */ > + op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL); > + if (!op) { > + DRM_ERROR("KMS: Failed to allocate colorop\n"); > + return -ENOMEM; > + } > + > + ret = drm_colorop_ctm_3x4_init(dev, op, plane); > + if (ret) > + return ret; > + > + drm_colorop_set_next_property(prev_op, op); > + > + prev_op = op; > + > + /* 3rd op: 3x4 matrix */ > + op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL); > + if (!op) { > + DRM_ERROR("KMS: Failed to allocate colorop\n"); > + return -ENOMEM; > + } > + > + ret = drm_colorop_ctm_3x4_init(dev, op, plane); > + if (ret) > + return ret; > + > + drm_colorop_set_next_property(prev_op, op); > + > + prev_op = op; > + > + /* 4th op: 1d curve */ > op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL); > if (!op) { > DRM_ERROR("KMS: Failed to allocate colorop\n"); > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c > b/drivers/gpu/drm/vkms/vkms_composer.c > index d2101fa55aa3..8bbfce651526 100644 > --- a/drivers/gpu/drm/vkms/vkms_composer.c > +++ b/drivers/gpu/drm/vkms/vkms_composer.c > @@ -164,6 +164,30 @@ static void apply_lut(const struct vkms_crtc_state > *crtc_state, struct line_buff > } > } > > +static void apply_3x4_matrix(struct pixel_argb_s32 *pixel, const struct > drm_color_ctm_3x4 *matrix) > +{ > + s64 rf, gf, bf; > + > + rf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[0]), > drm_int2fixp(pixel->r)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[1]), > drm_int2fixp(pixel->g)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[2]), > drm_int2fixp(pixel->b)) + > + drm_sm2fixp(matrix->matrix[3]); It would be nice if the driver had its private data for the matrix colorop state, so it could convert the matrix only once when userspace sets it. Thanks, pq > + > + gf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[4]), > drm_int2fixp(pixel->r)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[5]), > drm_int2fixp(pixel->g)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[6]), > drm_int2fixp(pixel->b)) + > + drm_sm2fixp(matrix->matrix[7]); > + > + bf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[8]), > drm_int2fixp(pixel->r)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[9]), > drm_int2fixp(pixel->g)) + > + drm_fixp_mul(drm_sm2fixp(matrix->matrix[10]), > drm_int2fixp(pixel->b)) + > + drm_sm2fixp(matrix->matrix[11]); > + > + pixel->r = drm_fixp2int(rf); > + pixel->g = drm_fixp2int(gf); > + pixel->b = drm_fixp2int(bf); > +} > + > static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop > *colorop) > { > /* TODO is this right? */ > @@ -185,6 +209,9 @@ static void apply_colorop(struct pixel_argb_s32 *pixel, > struct drm_colorop *colo > DRM_DEBUG_DRIVER("unkown colorop 1D curve type > %d\n", colorop_state->curve_1d_type); > break; > } > + } else if (colorop->type == DRM_COLOROP_CTM_3X4) { > + if (colorop_state->data) > + apply_3x4_matrix(pixel, (struct drm_color_ctm_3x4 *) > colorop_state->data->data); > } > > } > diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h > index 4aee29e161d6..8710e550790c 100644 > --- a/include/drm/drm_colorop.h > +++ b/include/drm/drm_colorop.h > @@ -224,6 +224,8 @@ int drm_col
Re: [RFC PATCH v4 06/42] drm/vkms: Add kunit tests for VKMS LUT handling
On Mon, 26 Feb 2024 16:10:20 -0500 Harry Wentland wrote: > Debugging LUT math is much easier when we can unit test > it. Add kunit functionality to VKMS and add tests for > - get_lut_index > - lerp_u16 > > v4: > - Test the critical points of the lerp function (Pekka) > > v3: > - Use include way of testing static functions (Arthur) > > Signed-off-by: Harry Wentland > Cc: Arthur Grillo > --- > drivers/gpu/drm/vkms/Kconfig | 5 + > drivers/gpu/drm/vkms/tests/.kunitconfig | 4 + > drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 163 ++ > drivers/gpu/drm/vkms/vkms_composer.c | 8 +- > 4 files changed, 178 insertions(+), 2 deletions(-) > create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig > create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c > Hi Harry, since fixed point math is hard, please allow me to nitpick so maybe these patches could be landed ahead of time. > diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig > index b9ecdebecb0b..c1f8b343ff0e 100644 > --- a/drivers/gpu/drm/vkms/Kconfig > +++ b/drivers/gpu/drm/vkms/Kconfig > @@ -13,3 +13,8 @@ config DRM_VKMS > a VKMS. > > If M is selected the module will be called vkms. > + > +config DRM_VKMS_KUNIT_TESTS > + tristate "Tests for VKMS" if !KUNIT_ALL_TESTS > + depends on DRM_VKMS && KUNIT > + default KUNIT_ALL_TESTS > diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig > b/drivers/gpu/drm/vkms/tests/.kunitconfig > new file mode 100644 > index ..70e378228cbd > --- /dev/null > +++ b/drivers/gpu/drm/vkms/tests/.kunitconfig > @@ -0,0 +1,4 @@ > +CONFIG_KUNIT=y > +CONFIG_DRM=y > +CONFIG_DRM_VKMS=y > +CONFIG_DRM_VKMS_KUNIT_TESTS=y > diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > new file mode 100644 > index ..fc73e48aa57c > --- /dev/null > +++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > @@ -0,0 +1,163 @@ > +/* SPDX-License-Identifier: GPL-2.0+ */ > + > +#include > + > +#include > + > +#define TEST_LUT_SIZE 16 > + > +static struct drm_color_lut test_linear_array[TEST_LUT_SIZE] = { > + { 0x0, 0x0, 0x0, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > + { 0x, 0x, 0x, 0 }, > +}; > + > +const struct vkms_color_lut test_linear_lut = { > + .base = test_linear_array, > + .lut_length = TEST_LUT_SIZE, > + .channel_value2index_ratio = 0xf000fll Where does 0xf000f come from? Could it be computed from DRM_FIXED_ONE and ARRAY_LENGTH()? > +}; > + > + > +static void vkms_color_test_get_lut_index(struct kunit *test) > +{ > + int i; > + > + KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&test_linear_lut, > test_linear_array[0].red)), 0); > + > + for (i = 0; i < TEST_LUT_SIZE; i++) > + KUNIT_EXPECT_EQ(test, > drm_fixp2int_ceil(get_lut_index(&test_linear_lut, test_linear_array[i].red)), > i); Why this instead of + for (i = 0; i < TEST_LUT_SIZE; i++) + KUNIT_EXPECT_EQ(test, get_lut_index(&test_linear_lut, test_linear_array[i].red), drm_int2fixp(i)); and + for (i = 0; i < 0x; i++) + KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&test_linear_lut, i)), i / 0x); ? I think your original form is leaving quite much room for error in precision. > +} > + > +static void vkms_color_test_lerp(struct kunit *test) > +{ > + /*** half-way round down ***/ > + s64 t = 0x8000 - 1; > + KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, t), 0x8); > + > + /* odd a */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x1, 0x10, t), 0x8); > + > + /* odd b */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x1, 0xf, t), 0x8); > + > + /* b = a */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x10, 0x10, t), 0x10); > + > + /* b = a + 1 */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x10, 0x11, t), 0x10); > + > + > + /*** half-way round up ***/ > + t = 0x8000; > + KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, t), 0x8); > + > + /* odd a */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x1, 0x10, t), 0x9); > + > + /* odd b */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x1, 0xf, t), 0x8); > + > + /* b = a */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x10, 0x10, t), 0x10); > + > + /* b = a + 1 */ > + KUNIT_EXPECT_EQ(test, lerp_u16(0x10, 0x11, t), 0x11); > + > + /*** t = 0.0 ***/ > + t = 0x0; > + KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, t), 0x
Re: [RFC PATCH v4 10/42] drm/colorop: Add TYPE property
On Tue, 12 Mar 2024 15:15:13 + Simon Ser wrote: > On Tuesday, March 12th, 2024 at 16:02, Pekka Paalanen > wrote: > > > This list here is the list of all values the enum could take, right? > > Should it not be just the one value it's going to have? It'll never > > change, and it can never be changed. > > Listing all possible values is how other properties behave. Example: > > "type" (immutable): enum {Overlay, Primary, Cursor} = Primary Yeah, that's weird, now that you pointed it out. Userspace does nothing with the knowledge of what this specific object's property "could" be since it's immutable. Only the kernel internals and UAPI docs need the full list, and userspace needs to hardcode the full list in general so it can recognize each value. But on the property instance on a specific object, I see no use for the full list. Thanks, pq pgpfCwiH7ug5G.pgp Description: OpenPGP digital signature
Re: [RFC PATCH v4 17/42] drm/vkms: Add enumerated 1D curve colorop
On Mon, 26 Feb 2024 16:10:31 -0500 Harry Wentland wrote: > This patch introduces a VKMS color pipeline that includes two > drm_colorops for named transfer functions. For now the only ones > supported are sRGB EOTF, sRGB Inverse EOTF, and a Linear TF. > We will expand this in the future but I don't want to do so > without accompanying IGT tests. > > We introduce a new vkms_luts.c file that hard-codes sRGB EOTF, > sRGB Inverse EOTF, and a linear EOTF LUT. These have been > generated with 256 entries each as IGT is currently testing > only 8 bpc surfaces. We will likely need higher precision > but I'm reluctant to make that change without clear indication > that we need it. We'll revisit and, if necessary, regenerate > the LUTs when we have IGT tests for higher precision buffers. > > v4: > - Drop _tf_ from color_pipeline init function > - Pass supported TFs into colorop init > - Create bypass pipeline in DRM helper (Pekka) > > v2: > - Add commit description > - Fix sRGB EOTF LUT definition > - Add linear and sRGB inverse EOTF LUTs > > Signed-off-by: Harry Wentland > Signed-off-by: Alex Hung > --- > drivers/gpu/drm/vkms/Makefile| 4 +- > drivers/gpu/drm/vkms/vkms_colorop.c | 70 +++ > drivers/gpu/drm/vkms/vkms_composer.c | 45 ++ > drivers/gpu/drm/vkms/vkms_drv.h | 4 + > drivers/gpu/drm/vkms/vkms_luts.c | 802 +++ > drivers/gpu/drm/vkms/vkms_luts.h | 12 + > drivers/gpu/drm/vkms/vkms_plane.c| 2 + > 7 files changed, 938 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c > create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c > create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h ... > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c > b/drivers/gpu/drm/vkms/vkms_composer.c > index b90e446d5954..9493bdb1ba3f 100644 > --- a/drivers/gpu/drm/vkms/vkms_composer.c > +++ b/drivers/gpu/drm/vkms/vkms_composer.c > @@ -12,6 +12,7 @@ > #include > > #include "vkms_drv.h" > +#include "vkms_luts.h" > > static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha) > { > @@ -163,6 +164,47 @@ static void apply_lut(const struct vkms_crtc_state > *crtc_state, struct line_buff > } > } > > +static void pre_blend_color_transform(const struct vkms_plane_state > *plane_state, struct line_buffer *output_buffer) > +{ > + struct drm_colorop *colorop = plane_state->base.base.color_pipeline; > + > + while (colorop) { I think this would be easier to read if you used for (; colorop; colorop = colorop->next) { and > + struct drm_colorop_state *colorop_state; > + > + if (!colorop) > + return; > + > + /* TODO this is probably wrong */ > + colorop_state = colorop->state; > + > + if (!colorop_state) > + return; if (colorop_state->bypass) continue; Something about 'switch (colorop->type)' to pick a function pointer to call, but hard to see at this point of the series how that would work. However, you can pick between srgb_inv_eotf and srgb_eotf already here. Then inside the loop you can just call one set of apply_lut_to_channel_value() and not need conditionals and avoid indentation levels. > + > + for (size_t x = 0; x < output_buffer->n_pixels; x++) { > + struct pixel_argb_u16 *pixel = > &output_buffer->pixels[x]; > + > + if (colorop->type == DRM_COLOROP_1D_CURVE && > + colorop_state->bypass == false) { > + switch (colorop_state->curve_1d_type) { > + case DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF: > + pixel->r = > apply_lut_to_channel_value(&srgb_inv_eotf, pixel->r, LUT_RED); > + pixel->g = > apply_lut_to_channel_value(&srgb_inv_eotf, pixel->g, LUT_GREEN); > + pixel->b = > apply_lut_to_channel_value(&srgb_inv_eotf, pixel->b, LUT_BLUE); > + break; > + case DRM_COLOROP_1D_CURVE_SRGB_EOTF: > + default: > + pixel->r = > apply_lut_to_channel_value(&srgb_eotf, pixel->r, LUT_RED); > + pixel->g = > apply_lut_to_channel_value(&srgb_eotf, pixel->g, LUT_GREEN); > + pixel->b = > apply_lut_to_channel_value(&srgb_eotf, pixel->b, LUT_BLUE); > + break; > + } > + } else { aaargh_unknown_colorop(); } > + } > + > + colorop = colorop->next; > + } > +} ... > diff --git a/drivers/gpu/drm/vkms/vkms_luts.c > b/drivers/gpu/drm/vkms/vkms_luts.c > new file mod
Re: [RFC PATCH v4 10/42] drm/colorop: Add TYPE property
On Mon, 26 Feb 2024 16:10:24 -0500 Harry Wentland wrote: > Add a read-only TYPE property. The TYPE specifies the colorop > type, such as enumerated curve, 1D LUT, CTM, 3D LUT, PWL LUT, > etc. > > v4: > - Use enum property for TYPE (Pekka) > > v3: > - Make TYPE a range property > - Move enum drm_colorop_type to uapi header > - Fix drm_get_colorop_type_name description > > For now we're only introducing an enumerated 1D LUT type to > illustrate the concept. > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/drm_atomic.c | 4 +-- > drivers/gpu/drm/drm_atomic_uapi.c | 8 +- > drivers/gpu/drm/drm_colorop.c | 44 ++- > include/drm/drm_colorop.h | 17 +++- > include/uapi/drm/drm_mode.h | 4 +++ > 5 files changed, 72 insertions(+), 5 deletions(-) ... > diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c > index a295ab96aee1..3a07a8fe2842 100644 > --- a/drivers/gpu/drm/drm_colorop.c > +++ b/drivers/gpu/drm/drm_colorop.c > @@ -32,12 +32,17 @@ > > /* TODO big colorop doc, including properties, etc. */ > > +static const struct drm_prop_enum_list drm_colorop_type_enum_list[] = { > + { DRM_COLOROP_1D_CURVE, "1D Curve" }, > +}; > + > /* Init Helpers */ > > int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop, > - struct drm_plane *plane) > + struct drm_plane *plane, enum drm_colorop_type type) > { > struct drm_mode_config *config = &dev->mode_config; > + struct drm_property *prop; > int ret = 0; > > ret = drm_mode_object_add(dev, &colorop->base, DRM_MODE_OBJECT_COLOROP); > @@ -46,12 +51,29 @@ int drm_colorop_init(struct drm_device *dev, struct > drm_colorop *colorop, > > colorop->base.properties = &colorop->properties; > colorop->dev = dev; > + colorop->type = type; > colorop->plane = plane; > > list_add_tail(&colorop->head, &config->colorop_list); > colorop->index = config->num_colorop++; > > /* add properties */ > + > + /* type */ > + prop = drm_property_create_enum(dev, > + DRM_MODE_PROP_IMMUTABLE, > + "TYPE", drm_colorop_type_enum_list, > + ARRAY_SIZE(drm_colorop_type_enum_list)); This list here is the list of all values the enum could take, right? Should it not be just the one value it's going to have? It'll never change, and it can never be changed. > + > + if (!prop) > + return -ENOMEM; > + > + colorop->type_property = prop; > + > + drm_object_attach_property(&colorop->base, > +colorop->type_property, > +colorop->type); > + > return ret; > } > EXPORT_SYMBOL(drm_colorop_init); Thanks, pq pgp6sbEwX7uwQ.pgp Description: OpenPGP digital signature
Re: [RFC PATCH v4 24/42] drm/tests: Add a few tests around drm_fixed.h
On Mon, 26 Feb 2024 16:10:38 -0500 Harry Wentland wrote: > While working on the CTM implementation of VKMS I had to ascertain > myself of a few assumptions. One of those is whether drm_fixed.h > treats its numbers using signed-magnitude or twos-complement. It is > twos-complement. > > In order to make someone else's day easier I am adding the > drm_test_int2fixp test that validates the above assumption. > > I am also adding a test for the new sm2fixp function that converts > from a signed-magnitude fixed point to the twos-complement fixed > point. > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/tests/Makefile| 3 +- > drivers/gpu/drm/tests/drm_fixp_test.c | 69 +++ > 2 files changed, 71 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c > > diff --git a/drivers/gpu/drm/tests/Makefile b/drivers/gpu/drm/tests/Makefile > index d6183b3d7688..98468f7908dd 100644 > --- a/drivers/gpu/drm/tests/Makefile > +++ b/drivers/gpu/drm/tests/Makefile > @@ -19,6 +19,7 @@ obj-$(CONFIG_DRM_KUNIT_TEST) += \ > drm_modes_test.o \ > drm_plane_helper_test.o \ > drm_probe_helper_test.o \ > - drm_rect_test.o > + drm_rect_test.o \ > + drm_fixp_test.o > > CFLAGS_drm_mm_test.o := $(DISABLE_STRUCTLEAK_PLUGIN) > diff --git a/drivers/gpu/drm/tests/drm_fixp_test.c > b/drivers/gpu/drm/tests/drm_fixp_test.c > new file mode 100644 > index ..f420f173ff66 > --- /dev/null > +++ b/drivers/gpu/drm/tests/drm_fixp_test.c > @@ -0,0 +1,69 @@ > +// SPDX-License-Identifier: MIT > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + */ > + > +#include > +#include > + > +static void drm_test_sm2fixp(struct kunit *test) > +{ > + KUNIT_EXPECT_EQ(test, 0x7fffll, ((1LL << 63) - 1)); > + > + /* 1 */ > + KUNIT_EXPECT_EQ(test, drm_int2fixp(1), drm_sm2fixp(1ull << > DRM_FIXED_POINT)); This sounds like confusing two different APIs. DRM_FIXED_POINT is for the two's complement representation, and sm is not. It does not look like there is a #define for the signed-magnitude UAPI construct, otherwise drm_color_ctm_s31_32_to_qm_n() would be using it. It's just an accident that DRM_FIXED_POINT has the right value for sm. Hm, drm_color_ctm_s31_32_to_qm_n() produces two's complement. Why not use that for implementing drm_sm2fixp()? Thanks, pq > + > + /* -1 */ > + KUNIT_EXPECT_EQ(test, drm_int2fixp(-1), drm_sm2fixp((1ull << 63) | > (1ull << DRM_FIXED_POINT))); > + > + /* 0.5 */ > + KUNIT_EXPECT_EQ(test, drm_fixp_from_fraction(1, 2), drm_sm2fixp(1ull << > (DRM_FIXED_POINT - 1))); > + > + /* -0.5 */ > + KUNIT_EXPECT_EQ(test, drm_fixp_from_fraction(-1, 2), drm_sm2fixp((1ull > << 63) | (1ull << (DRM_FIXED_POINT - 1; > + > +} > + > +static void drm_test_int2fixp(struct kunit *test) > +{ > + /* 1 */ > + KUNIT_EXPECT_EQ(test, 1ll << 32, drm_int2fixp(1)); > + > + /* -1 */ > + KUNIT_EXPECT_EQ(test, -(1ll << 32), drm_int2fixp(-1)); > + > + /* 1 + (-1) = 0 */ > + KUNIT_EXPECT_EQ(test, 0, drm_int2fixp(1) + drm_int2fixp(-1)); > + > + /* 1 / 2 */ > + KUNIT_EXPECT_EQ(test, 1ll << 31, drm_fixp_from_fraction(1, 2)); > + > + /* -0.5 */ > + KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(-1, 2)); > + > + /* (1 / 2) + (-1) = 0.5 */ > + KUNIT_EXPECT_EQ(test, 1ll << 31, drm_fixp_from_fraction(-1, 2) + > drm_int2fixp(1)); > + > + /* (1 / 2) - 1) = 0.5 */ > + KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(1, 2) + > drm_int2fixp(-1)); > + > + /* (1 / 2) - 1) = 0.5 */ > + KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(1, 2) - > drm_int2fixp(1)); > + > +} > + > +static struct kunit_case drm_fixp_tests[] = { > + KUNIT_CASE(drm_test_int2fixp), > + KUNIT_CASE(drm_test_sm2fixp), > + { } > +}; > + > +static struct kunit_suite drm_rect_test_suite = { > + .name = "drm_fixp", > + .test_cases = drm_fixp_tests, > +}; > + > +kunit_test_suite(drm_rect_test_suite); > + > +MODULE_AUTHOR("AMD"); > +MODULE_LICENSE("GPL and additional rights"); > \ No newline at end of file pgpieK3eeAbs_.pgp Description: OpenPGP digital signature
Re: Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston
On 05/03/2024 12:26, Pekka Paalanen wrote: On Mon, 4 Mar 2024 17:59:25 + Terry Barnaby wrote: On 04/03/2024 15:50, Pekka Paalanen wrote: On Mon, 4 Mar 2024 14:51:52 + Terry Barnaby wrote: On 04/03/2024 14:14, Pekka Paalanen wrote: On Mon, 4 Mar 2024 13:24:56 + Terry Barnaby wrote: On 04/03/2024 09:41, Pekka Paalanen wrote: On Mon, 4 Mar 2024 08:12:10 + Terry Barnaby wrote: While I am trying to investigate my issue in the QtWayland arena via the Qt Jira Bug system, I thought I would try taking Qt out of the equation to simplify the application a bit more to try and gain some understanding of what is going on and how this should all work. So I have created a pure GStreamer/Wayland/Weston application to test out how this should work. This is at: https://portal.beam.ltd.uk/public//test022-wayland-video-example.tar.gz This tries to implement a C++ Widget style application using native Wayland. It is rough and could easily be doing things wrong wrt Wayland. However it does work to a reasonable degree. However, I appear to see the same sort of issue I see with my Qt based system in that when a subsurface of a subsurface is used, the Gstreamer video is not seen. This example normally (UseWidgetTop=0) has a top level xdg_toplevel desktop surface (Gui), a subsurface to that (Video) and then waylandsink creates a subsurface to that which it sets to de-sync mode. When this example is run with UseWidgetTop=0 the video frames from gstreamer are only shown shown when the top subsurface is manually committed with gvideo->update() every second, otherwise the video pipeline is stalled. This is intentional. From wl_subsurface specification: Even if a sub-surface is in desynchronized mode, it will behave as in synchronized mode, if its parent surface behaves as in synchronized mode. This rule is applied recursively throughout the tree of surfaces. This means, that one can set a sub-surface into synchronized mode, and then assume that all its child and grand-child sub-surfaces are synchronized, too, without explicitly setting them. This is derived from the design decision that a wl_surface and its immediate sub-surfaces form a seamlessly integrated unit that works like a single wl_surface without sub-surfaces would. wl_subsurface state is state in the sub-surface's parent, so that the parent controls everything as if there was just a single wl_surface. If the parent sets its sub-surface as desynchronized, it explicitly gives the sub-surface the permission to update on screen regardless of the parent's updates. When the sub-surface is in synchronized mode, the parent surface wants to be updated in sync with the sub-surface in an atomic fashion. When your surface stack looks like: - main surface A, top-level, root surface (implicitly desynchronized) - sub-surface B, synchronized - sub-surface C, desynchronized Updates to surface C are immediately made part of surface B, because surface C is in desynchronized mode. If B was the root surface, all C updates would simply go through. However, surface B is a part of surface A, and surface B is in synchronized mode. This means that the client wants surface A updates to be explicit and atomic. Nothing must change on screen until A is explicitly committed itself. So any update to surface B requires a commit on surface A to become visible. Surface C does not get to override the atomicity requirement of surface A updates. This has been designed so that software component A can control surface A, and delegate a part of surface A to component B which happens to the using a sub-surface: surface B. If surface B parts are further delegated to another component C, then component A can still be sure that nothing updates on surface A until it says so. Component A sets surface B to synchronized to ensure that. That's the rationale behind the Wayland design. Thanks, pq Ah, thanks for the info, that may be why this is not working even in Qt then. This seems a dropoff in Wayland to me. If a software module wants to display Video into an area on the screen at its own rate, setting that surface to de-synced mode is no use in the general case with this policy. It is of use, if you don't have unnecessary sub-surfaces in synchronized mode in between, or you set all those extra sub-surfaces to desynchronized as well. Well they may not be necessary from the Wayland perspective, but from the higher level software they are useful to modularise/separate/provide a join for the software modules especially when software modules are separate like Qt and GStreamer. Sorry to hear that. I would have thought that if a subsurface was explicitly set to de-synced mode then that would be honoured. I can't see a usage case for it to be ignored and its commits synchronised up the tree ? Resizing the window is the main use case. In order to resize surface A, you also need to resi
Re: [RFC PATCH v4 22/42] drm/vkms: Use s32 for internal color pipeline precision
On Mon, 26 Feb 2024 16:10:36 -0500 Harry Wentland wrote: > Certain operations require us to preserve values below 0.0 and > above 1.0 (0x0 and 0x respectively in 16 bpc unorm). One > such operation is a BT709 encoding operation followed by its > decoding operation, or the reverse. > > We'll use s32 values as intermediate in and outputs of our > color operations, for the operations where it matters. > > For now this won't apply to LUT operations. We might want to > update those to work on s32 as well, but it's unclear how > that should work for unorm LUT definitions. We'll revisit > that once we add LUT + CTM tests. > > In order to allow for this we'll also invert the nesting of our > colorop processing loops. We now use the pixel iteration loop > on the outside and the colorop iteration on the inside. I always go through the same thought process: - putting the pixel iteration loop outermost is going to be horrible for performance - maybe should turn the temporary line buffer to argb_s32 for all of VKMS - that's going to be a lot of memory traffic... maybe your way is not horrible for performance - maybe processing pixel by pixel is good, if you can prepare the operation in advance, so you don't need to analyze colorops again and again on each pixel - eh, maybe it's not a gain after all, needs benchmarking My comment on patch 17 was like the step 0 of this train of thought. I just always get the same comments in my mind when seeing the same code again. Thanks, pq > > v4: > - Clarify that we're packing 16-bit UNORM into s32, not >converting values to a different representation (Pekka) > > v3: > - Use new colorop->next pointer > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/vkms/vkms_composer.c | 57 +--- > drivers/gpu/drm/vkms/vkms_drv.h | 4 ++ > 2 files changed, 48 insertions(+), 13 deletions(-) > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c > b/drivers/gpu/drm/vkms/vkms_composer.c > index 25b786b49db0..d2101fa55aa3 100644 > --- a/drivers/gpu/drm/vkms/vkms_composer.c > +++ b/drivers/gpu/drm/vkms/vkms_composer.c > @@ -164,7 +164,7 @@ static void apply_lut(const struct vkms_crtc_state > *crtc_state, struct line_buff > } > } > > -static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop > *colorop) > +static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop > *colorop) > { > /* TODO is this right? */ > struct drm_colorop_state *colorop_state = colorop->state; > @@ -191,25 +191,56 @@ static void apply_colorop(struct pixel_argb_u16 *pixel, > struct drm_colorop *colo > > static void pre_blend_color_transform(const struct vkms_plane_state > *plane_state, struct line_buffer *output_buffer) > { > - struct drm_colorop *colorop = plane_state->base.base.color_pipeline; > + struct drm_colorop *colorop; > + struct pixel_argb_s32 pixel; > > - while (colorop) { > - struct drm_colorop_state *colorop_state; > + for (size_t x = 0; x < output_buffer->n_pixels; x++) { > > - if (!colorop) > - return; > + /* > + * Some operations, such as applying a BT709 encoding matrix, > + * followed by a decoding matrix, require that we preserve > + * values above 1.0 and below 0.0 until the end of the pipeline. > + * > + * Pack the 16-bit UNORM values into s32 to give us head-room to > + * avoid clipping until we're at the end of the pipeline. Clip > + * intentionally at the end of the pipeline before packing > + * UNORM values back into u16. > + */ > + pixel.a = output_buffer->pixels[x].a; > + pixel.r = output_buffer->pixels[x].r; > + pixel.g = output_buffer->pixels[x].g; > + pixel.b = output_buffer->pixels[x].b; > > - /* TODO this is probably wrong */ > - colorop_state = colorop->state; > + colorop = plane_state->base.base.color_pipeline; > + while (colorop) { > + struct drm_colorop_state *colorop_state; > > - if (!colorop_state) > - return; > + if (!colorop) > + return; > + > + /* TODO this is probably wrong */ > + colorop_state = colorop->state; > + > + if (!colorop_state) > + return; > > - for (size_t x = 0; x < output_buffer->n_pixels; x++) > if (!colorop_state->bypass) > - apply_colorop(&output_buffer->pixels[x], > colorop); > + apply_colorop(&pixel, colorop); > > - colorop = colorop->next; > + colorop = colorop->next; > + } > + > + /* clamp pixel */ > +
Re: [RFC PATCH v4 03/42] drm: Correctly round for fixp2int_round
On Mon, 26 Feb 2024 16:10:17 -0500 Harry Wentland wrote: > A value of 0x8000 and higher should round up, and > below should round down. VKMS Kunit tests for lerp_u16 > showed that this is not the case. Fix it. > > 1 << (DRM_FIXED_POINT_HALF - 1) = > 1 << 15 = 0x8000 > > This is not 0.5, but 0.0762939453125. > > Instead of some smart math use a simple if/else to > round up or down. This helps people like me to understand > what the function does. > > Signed-off-by: Harry Wentland > --- > include/drm/drm_fixed.h | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/include/drm/drm_fixed.h b/include/drm/drm_fixed.h > index cb842ba80ddd..8ee549f68537 100644 > --- a/include/drm/drm_fixed.h > +++ b/include/drm/drm_fixed.h > @@ -77,6 +77,8 @@ static inline u32 dfixed_div(fixed20_12 A, fixed20_12 B) > #define DRM_FIXED_DIGITS_MASK(~DRM_FIXED_DECIMAL_MASK) > #define DRM_FIXED_EPSILON1LL > #define DRM_FIXED_ALMOST_ONE (DRM_FIXED_ONE - DRM_FIXED_EPSILON) > +#define DRM_FIXED_FRACTIONAL 0xll > +#define DRM_FIXED_HALF 0x8000ll > > /** > * @drm_sm2fixp > @@ -106,11 +108,6 @@ static inline int drm_fixp2int(s64 a) > return ((s64)a) >> DRM_FIXED_POINT; > } > > -static inline int drm_fixp2int_round(s64 a) > -{ > - return drm_fixp2int(a + (1 << (DRM_FIXED_POINT_HALF - 1))); > -} > - > static inline int drm_fixp2int_ceil(s64 a) > { > if (a >= 0) > @@ -119,6 +116,14 @@ static inline int drm_fixp2int_ceil(s64 a) > return drm_fixp2int(a - DRM_FIXED_ALMOST_ONE); > } > > +static inline int drm_fixp2int_round(s64 a) > +{ > + if ((a & DRM_FIXED_FRACTIONAL) < DRM_FIXED_HALF) > + return drm_fixp2int(a); So, if we take -epsilon (which is -1 in raw s64 value, the largest possible value less than zero), this would return -1.0? That does not sound right. However, the "add 0.5 and truncate" trick always works, so I'd recommend sticking that. Thanks, pq > + else > + return drm_fixp2int_ceil(a); > +} > + > static inline unsigned drm_fixp_msbset(s64 a) > { > unsigned shift, sign = (a >> 63) & 1; pgpiDqZZCHweZ.pgp Description: OpenPGP digital signature
identifying active application/window under wayland
Hello, Apologies if this is not the correct forum for this. If this is the case, please direct me to a more appropriate place. I have searched the archive and cannot find any topic that appears to cover this. I am the author of a program that in part needs to be able to obtain the identity of the active application and window (including its title) being used by a user. This is required for context-dependent configuration of an external device and also, if configured by the user, logging of user activity for time tracking. This is readily achievable on the platforms that are currently supported (X11 and MacOS), but as far as I can see there does not appear to be a unified approach to achieving this under Wayland. Is it the case that there is no way to achieve this, or am I just missing things that are possible? Is it the case that I may need to only cover a subset of DEs and special-case code for each that is supported? Are there plans to support this kind of use? thanks Dan
Re: [RFC PATCH v4 25/42] drm/vkms: Add tests for CTM handling
On Mon, 26 Feb 2024 16:10:39 -0500 Harry Wentland wrote: > A whole slew of tests for CTM handling that greatly helped in > debugging the CTM code. The extent of tests might seem a bit > silly but they're fast and might someday help save someone > else's day when debugging this. > > v4: > - Comment on origin of bt709_enc matrix (Pekka) > - Use full opaque alpha (Pekka) > - Add additional check for Y < 0x (Pekka) > - Remove unused code (Pekka) > - Rename red, green, blue to Y, U, V where applicable > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 251 ++ > drivers/gpu/drm/vkms/vkms_composer.c | 2 +- > 2 files changed, 252 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > index e6ac01dee830..83d07f7bae37 100644 > --- a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > +++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c > @@ -3,6 +3,7 @@ > #include > > #include > +#include > > #define TEST_LUT_SIZE 16 > > @@ -181,11 +182,261 @@ static void vkms_color_srgb_inv_srgb(struct kunit > *test) > } > } > > +#define FIXPT_HALF(DRM_FIXED_ONE >> 1) > +#define FIXPT_QUARTER (DRM_FIXED_ONE >> 2) > + > +const struct drm_color_ctm_3x4 test_matrix_3x4_50_desat = { { > + FIXPT_HALF, FIXPT_QUARTER, FIXPT_QUARTER, 0, > + FIXPT_QUARTER, FIXPT_HALF, FIXPT_QUARTER, 0, > + FIXPT_QUARTER, FIXPT_QUARTER, FIXPT_HALF, 0 These are supposed to be sign-magnitude, not fixed-point, to my understanding. It just happens that these specific values have the same bit pattern in both representations. > +} }; > + > +static void vkms_color_ctm_3x4_50_desat(struct kunit *test) > +{ > + struct pixel_argb_s32 ref, out; > + > + /* full white */ > + ref.a = 0x; > + ref.r = 0x; > + ref.g = 0x; > + ref.b = 0x; > + > + memcpy(&out, &ref, sizeof(out)); > + apply_3x4_matrix(&out, &test_matrix_3x4_50_desat); > + > + KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out)); > + > + /* full black */ > + ref.a = 0x; > + ref.r = 0x0; > + ref.g = 0x0; > + ref.b = 0x0; > + > + memcpy(&out, &ref, sizeof(out)); > + apply_3x4_matrix(&out, &test_matrix_3x4_50_desat); > + > + KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out)); > + > + /* 50% grey */ > + ref.a = 0x; > + ref.r = 0x8000; > + ref.g = 0x8000; > + ref.b = 0x8000; > + > + memcpy(&out, &ref, sizeof(out)); > + apply_3x4_matrix(&out, &test_matrix_3x4_50_desat); > + > + KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out)); > + > + /* full red to 50% desat */ > + ref.a = 0x; > + ref.r = 0x7fff; > + ref.g = 0x3fff; > + ref.b = 0x3fff; > + > + out.a = 0x; > + out.r = 0x; > + out.g = 0x0; > + out.b = 0x0; > + > + apply_3x4_matrix(&out, &test_matrix_3x4_50_desat); > + > + KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out)); > +} > + > +/* > + * BT.709 encoding matrix > + * > + * Values printed from within IGT when converting > + * igt_matrix_3x4_bt709_enc to the fixed-point format expected > + * by DRM/KMS. Shouldn't that be sign-magnitude, not fixed point? I was hoping to get a reference to a spec like BT.709 and a formula for getting the blow out of it. IGT can change over time, and I don't know where IGT for that matrix from. But ok, this is a test with an arbitrary matrix, not necessarily the BT.709 matrix specifically. Btw. which BT.709 matrix is this? YUV->RGB, RGB->XYZ, or the inverse of one of those? Limited or full quantization range? Judging by the tests, I guess RGB->YUV full range. > + */ > +const struct drm_color_ctm_3x4 test_matrix_3x4_bt709_enc = { { > + 0x366cf400ull, 0xb7175900ull, 0x000127bb300ull, 0, > + 0x80001993b3a0ull, 0x80005609fe80ull, 0x6f9db200ull, 0, > + 0x9d70a400ull, 0x80008f011100ull, 0x8e6f9330ull, 0 > +} }; > + > +static void vkms_color_ctm_3x4_bt709(struct kunit *test) > +{ > + struct pixel_argb_s32 out; > + > + /* full white to bt709 */ > + out.a = 0x; > + out.r = 0x; > + out.g = 0x; > + out.b = 0x; > + > + apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc); > + > + /* Y 255 */ > + KUNIT_EXPECT_GT(test, out.r, 0xfe00); > + KUNIT_EXPECT_LT(test, out.r, 0x1); > + > + /* U 0 */ > + KUNIT_EXPECT_LT(test, out.g, 0x0100); For all of these U/V too, you may want to enforce a range. The value must be approximately 0x100. Something like 0x17 would very wrong. Hmm, wait... Neutral color should have U and V neutral, that is, zero, right? So what is zero in this integer representation? You don't seem to be testing negative U or V... Thanks, pq > + > + /* V 0 */ > + KUNIT_EXPECT_LT(test, out.b, 0x0100); > + > + /* full black to bt709 */ > + out.a = 0xf
Re: [RFC PATCH v4 08/42] drm/doc/rfc: Describe why prescriptive color pipeline is needed
On Mon, 26 Feb 2024 16:10:22 -0500 Harry Wentland wrote: > v4: > - Drop IOCTL docs since we dropped the IOCTLs (Pekka) > - Clarify reading and setting of COLOR_PIPELINE prop (Pekka) > - Add blurb about not requiring to reject a pipeline due to >incompatible ops, as long as op can be bypassed (Pekka) > - Dropped informational strings (such as input CSC) as they're >not actually intended to be advertised (Pekka) > > v3: > - Describe DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE (Sebastian) > - Ask for clear documentation of colorop behavior (Sebastian) > > v2: > - Update colorop visualizations to match reality (Sebastian, Alex Hung) > - Updated wording (Pekka) > - Change BYPASS wording to make it non-mandatory (Sebastian) > - Drop cover-letter-like paragraph from COLOR_PIPELINE Plane Property >section (Pekka) > - Use PQ EOTF instead of its inverse in Pipeline Programming example > (Melissa) > - Add "Driver Implementer's Guide" section (Pekka) > - Add "Driver Forward/Backward Compatibility" section (Sebastian, Pekka) > > Signed-off-by: Harry Wentland > --- > Documentation/gpu/rfc/color_pipeline.rst | 360 +++ > 1 file changed, 360 insertions(+) > create mode 100644 Documentation/gpu/rfc/color_pipeline.rst > > diff --git a/Documentation/gpu/rfc/color_pipeline.rst > b/Documentation/gpu/rfc/color_pipeline.rst > new file mode 100644 > index ..6c653e17054a > --- /dev/null > +++ b/Documentation/gpu/rfc/color_pipeline.rst > @@ -0,0 +1,360 @@ > + > +Linux Color Pipeline API > + > + > +What problem are we solving? > + > + > +We would like to support pre-, and post-blending complex color > +transformations in display controller hardware in order to allow for > +HW-supported HDR use-cases, as well as to provide support to > +color-managed applications, such as video or image editors. > + > +It is possible to support an HDR output on HW supporting the Colorspace > +and HDR Metadata drm_connector properties, but that requires the > +compositor or application to render and compose the content into one > +final buffer intended for display. Doing so is costly. > + > +Most modern display HW offers various 1D LUTs, 3D LUTs, matrices, and other > +operations to support color transformations. These operations are often > +implemented in fixed-function HW and therefore much more power efficient than > +performing similar operations via shaders or CPU. > + > +We would like to make use of this HW functionality to support complex color > +transformations with no, or minimal CPU or shader load. > + > + > +How are other OSes solving this problem? > + > + > +The most widely supported use-cases regard HDR content, whether video or > +gaming. > + > +Most OSes will specify the source content format (color gamut, encoding > transfer > +function, and other metadata, such as max and average light levels) to a > driver. > +Drivers will then program their fixed-function HW accordingly to map from a > +source content buffer's space to a display's space. > + > +When fixed-function HW is not available the compositor will assemble a > shader to > +ask the GPU to perform the transformation from the source content format to > the > +display's format. > + > +A compositor's mapping function and a driver's mapping function are usually > +entirely separate concepts. On OSes where a HW vendor has no insight into > +closed-source compositor code such a vendor will tune their color management > +code to visually match the compositor's. On other OSes, where both mapping > +functions are open to an implementer they will ensure both mappings match. > + > +This results in mapping algorithm lock-in, meaning that no-one alone can > +experiment with or introduce new mapping algorithms and achieve > +consistent results regardless of which implementation path is taken. > + > +Why is Linux different? > +=== > + > +Unlike other OSes, where there is one compositor for one or more drivers, on > +Linux we have a many-to-many relationship. Many compositors; many drivers. > +In addition each compositor vendor or community has their own view of how > +color management should be done. This is what makes Linux so beautiful. > + > +This means that a HW vendor can now no longer tune their driver to one > +compositor, as tuning it to one could make it look fairly different from > +another compositor's color mapping. > + > +We need a better solution. > + > + > +Descriptive API > +=== > + > +An API that describes the source and destination colorspaces is a descriptive > +API. It describes the input and output color spaces but does not describe > +how precisely they should be mapped. Such a mapping includes many minute > +design decision that can greatly affect the look of the final result. > + > +It is not feasible to describe such mapping with enough detail to ensure the >