Re: [PATCH] drm/tidss: Add MIT license along with GPL-2.0
Hi Laurent, Thanks for the quick review. On 13/09/24 13:54, Laurent Pinchart wrote: > Hi Devarsh, > > On Thu, Sep 12, 2024 at 10:41:42PM +0530, Devarsh Thakkar wrote: >> Modify license to include dual licensing as GPL-2.0-only OR MIT license for >> tidss display driver. This allows other operating system ecosystems such as >> Zephyr and also the commercial firmwares to refer and derive code from this > > GPL-2.0 isn't incompatible with "commercial". I think you mean > "proprietary" here. > Yes, GPL-2.0 is not incompatible to commercial but there is an enforecement that derivative code needs to be GPL-2.0 licensed which may not fit well for projects which are not using GPL-2.0 license. But yes MIT will also help proprietary code, so I agree it is good to mention the same in commit message. >> display driver in a more permissive manner. > > How do you envision that to work ? Zephyr doesn't have KMS, so you can't > use the driver as-is. What exactly would TI want to use from the Linux > kernel driver ? > Not the DRM/KMS part, but the tidss specific display controller programming is the main point of interest. For e.g. At this point, mostly I see that the TI customers are interested to re-use/derive code from u-boot tidss driver [1] which is quite simple and use it in their test application or RTOS based offerring which is non-GPL code. Since their test application has much more code apart from the display part, which is non-GPL, they can't use GPL license. Now since the u-boot tidss driver is derived from kernel tidss driver, my understanding was that we need to change the license of kernel tidss driver first before changing the u-boot tidss driver. > Personally, there's a reason why I contribute code to the kernel under > the GPL-2.0 license, it is to make sure the code will remain open. While > I can accept other licenses on a case-by-case basis, I don't like the > casual approach of this patch that seem (to me) to imply that the > license is a mere detail. For a start I would like to know what > "commercial firmwares" you're thinking about. > Understood, let me know if above information suffice or any further information is needed. [1]: https://gitlab.com/u-boot/u-boot/-/blob/master/drivers/video/tidss/tidss_drv.c?ref_type=heads Regards Devarsh
[PATCH] drm/tidss: Add MIT license along with GPL-2.0
Modify license to include dual licensing as GPL-2.0-only OR MIT license for tidss display driver. This allows other operating system ecosystems such as Zephyr and also the commercial firmwares to refer and derive code from this display driver in a more permissive manner. Signed-off-by: Devarsh Thakkar --- drivers/gpu/drm/tidss/Makefile| 2 +- drivers/gpu/drm/tidss/tidss_crtc.c| 2 +- drivers/gpu/drm/tidss/tidss_crtc.h| 2 +- drivers/gpu/drm/tidss/tidss_dispc.c | 2 +- drivers/gpu/drm/tidss/tidss_dispc.h | 2 +- drivers/gpu/drm/tidss/tidss_dispc_regs.h | 2 +- drivers/gpu/drm/tidss/tidss_drv.c | 2 +- drivers/gpu/drm/tidss/tidss_drv.h | 2 +- drivers/gpu/drm/tidss/tidss_encoder.c | 2 +- drivers/gpu/drm/tidss/tidss_encoder.h | 2 +- drivers/gpu/drm/tidss/tidss_irq.c | 2 +- drivers/gpu/drm/tidss/tidss_irq.h | 2 +- drivers/gpu/drm/tidss/tidss_kms.c | 2 +- drivers/gpu/drm/tidss/tidss_kms.h | 2 +- drivers/gpu/drm/tidss/tidss_plane.c | 2 +- drivers/gpu/drm/tidss/tidss_plane.h | 2 +- drivers/gpu/drm/tidss/tidss_scale_coefs.c | 2 +- drivers/gpu/drm/tidss/tidss_scale_coefs.h | 2 +- 18 files changed, 18 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/tidss/Makefile b/drivers/gpu/drm/tidss/Makefile index 312645271014..c67ff32d02e1 100644 --- a/drivers/gpu/drm/tidss/Makefile +++ b/drivers/gpu/drm/tidss/Makefile @@ -1,4 +1,4 @@ -# SPDX-License-Identifier: GPL-2.0 +# SPDX-License-Identifier: GPL-2.0 OR MIT tidss-y := tidss_crtc.o \ tidss_drv.o \ diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c index 94f8e3178df5..43dfbead9fa9 100644 --- a/drivers/gpu/drm/tidss/tidss_crtc.c +++ b/drivers/gpu/drm/tidss/tidss_crtc.c @@ -1,4 +1,4 @@ -// SPDX-License-Identifier: GPL-2.0 +// SPDX-License-Identifier: GPL-2.0 OR MIT /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm/tidss/tidss_crtc.h b/drivers/gpu/drm/tidss/tidss_crtc.h index 040d1205496b..da03873e2ef0 100644 --- a/drivers/gpu/drm/tidss/tidss_crtc.h +++ b/drivers/gpu/drm/tidss/tidss_crtc.h @@ -1,4 +1,4 @@ -/* SPDX-License-Identifier: GPL-2.0 */ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c index 1ad711f8d2a8..3321a1c731b1 100644 --- a/drivers/gpu/drm/tidss/tidss_dispc.c +++ b/drivers/gpu/drm/tidss/tidss_dispc.c @@ -1,4 +1,4 @@ -// SPDX-License-Identifier: GPL-2.0 +// SPDX-License-Identifier: GPL-2.0 OR MIT /* * Copyright (C) 2016-2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Jyri Sarha diff --git a/drivers/gpu/drm/tidss/tidss_dispc.h b/drivers/gpu/drm/tidss/tidss_dispc.h index 086327d51a90..e6e4396a0d63 100644 --- a/drivers/gpu/drm/tidss/tidss_dispc.h +++ b/drivers/gpu/drm/tidss/tidss_dispc.h @@ -1,4 +1,4 @@ -/* SPDX-License-Identifier: GPL-2.0 */ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm/tidss/tidss_dispc_regs.h b/drivers/gpu/drm/tidss/tidss_dispc_regs.h index 13feedfe5d6d..6e27b6d444ab 100644 --- a/drivers/gpu/drm/tidss/tidss_dispc_regs.h +++ b/drivers/gpu/drm/tidss/tidss_dispc_regs.h @@ -1,4 +1,4 @@ -/* SPDX-License-Identifier: GPL-2.0 */ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ /* * Copyright (C) 2016-2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Jyri Sarha diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c index d15f836dca95..b060e420ddec 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.c +++ b/drivers/gpu/drm/tidss/tidss_drv.c @@ -1,4 +1,4 @@ -// SPDX-License-Identifier: GPL-2.0 +// SPDX-License-Identifier: GPL-2.0 OR MIT /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h index d7f27b0b0315..d4209234f59c 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.h +++ b/drivers/gpu/drm/tidss/tidss_drv.h @@ -1,4 +1,4 @@ -/* SPDX-License-Identifier: GPL-2.0 */ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm/tidss/tidss_encoder.c b/drivers/gpu/drm/tidss/tidss_encoder.c index 17a86bed8054..9749fbc0e056 100644 --- a/drivers/gpu/drm/tidss/tidss_encoder.c +++ b/drivers/gpu/drm/tidss/tidss_encoder.c @@ -1,4 +1,4 @@ -// SPDX-License-Identifier: GPL-2.0 +// SPDX-License-Identifier: GPL-2.0 OR MIT /* * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/ * Author: Tomi Valkeinen diff --git a/drivers/gpu/drm
[PATCH v13 13/13] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V9-V13: No change V8: Update commit message V1->V7 : (No change, patch introduced in V7) drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v13 12/13] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." Link: https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst [1] Signed-off-by: Devarsh Thakkar --- V9->V13: No change V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index 4b6fbe99d5ff..ea96b1bc5315 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v13 11/13] lib: math_kunit: Add tests for new macros related to rounding to nearest value
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1] : Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1] Signed-off-by: Devarsh Thakkar Acked-by: Andy Shevchenko --- V1->V13 (No change, patch introduced in V8) lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index be27f2afb8e4..05022f010be6 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v13 10/13] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow Reviewed-by: Andy Shevchenko [devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v11: * Add Reviewed-by tag Changes since v10: * Include headers per IWYU principle and add module description Changes since v9: * Added Kconfig dependency for KUNIT Changes since v8: * Remove unrequired header file linux/kernel.h Changes since v7: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v6: * No change Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 14 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 294 ++ 3 files changed, 309 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..f738d8a433ea 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,17 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + depends on KUNIT + default KUNIT_ALL_TESTS + + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..be27f2afb8e4 --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,294 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
[PATCH v13 09/13] Documentation: core-api: Add math.h macros and functions
Add documentation for rounding, scaling, absolute value and 32-bit division related macros and functions exported by math.h header file. Signed-off-by: Devarsh Thakkar Reviewed-by: Andy Shevchenko --- V13: No change V12: Add Reviewed-by V11: Fix title for math function header V10: Patch introduced V1->V9 (No change) Documentation/core-api/kernel-api.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst index ae92a2571388..7de494e76fa6 100644 --- a/Documentation/core-api/kernel-api.rst +++ b/Documentation/core-api/kernel-api.rst @@ -185,6 +185,12 @@ Division Functions .. kernel-doc:: lib/math/gcd.c :export: +Rounding, absolute value, division and 32-bit scaling functions +--- + +.. kernel-doc:: include/linux/math.h + :internal: + UUID/GUID - -- 2.39.1
[PATCH v13 07/13] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar Acked-by: Andy Shevchenko --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. Link: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 [1] V13: - No change V12: - Add Acked-by V11: - Fix commenting style per review comments and remove extra whitespace V10: - Update example comment to fix formatting issues as observed with html docs V9: - No change V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested: V1->V6 (No change, patch introduced in V7) include/linux/math.h | 63 1 file changed, 63 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..79e3dfda77fc 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,52 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_up(17, 4) = 16 + * * round_closest_up(15, 4) = 16 + * * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_down(17, 4) = 16 + * * round_closest_down(15, 4) = 16 + * * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +123,23 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples: + * * roundclosest(21, 5) = 20 + * * roundclosest(19, 5) = 20 + * * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v13 00/13] Add V4L2 M2M Driver for E5010 JPEG Encoder
ttps://gist.github.com/devarsht/298790af819f299a0a05fec89371097b V3->V4 Range diff : https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272 Previous patch series: V12: https://lore.kernel.org/all/20240604104001.2235082-1-devar...@ti.com/ V11: https://lore.kernel.org/all/20240531170229.1270828-1-devar...@ti.com/ V10: https://lore.kernel.org/all/20240530165925.2715837-1-devar...@ti.com/ V9: https://lore.kernel.org/all/20240526175655.1093707-1-devar...@ti.com/ V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/ V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1): lib: add basic KUnit test for lib/math Devarsh Thakkar (12): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: Documentation: Document v4l2-jpeg helper functions media: imagination: Use exported tables from v4l2-jpeg core media: verisilicon : Use exported tables from v4l2-jpeg for hantro codec math.h: Add macros for rounding to closest value math.h: Use kernel-doc syntax for divison macros Documentation: core-api: Add math.h macros and functions lib: math_kunit: Add tests for new macros related to rounding to nearest value media: imagination: Round to closest multiple for cropping region gpu: ipu-v3: Use generic macro for rounding closest to specified value Documentation/core-api/kernel-api.rst |6 + .../bindings/media/img,e5010-jpeg-enc.yaml| 75 + Documentation/driver-api/media/v4l2-core.rst |1 + Documentation/driver-api/media/v4l2-jpeg.rst | 10 + MAINTAINERS |7 + drivers/gpu/ipu-v3/ipu-image-convert.c|4 +- drivers/media/platform/Kconfig|1 + drivers/media/platform/Makefile |1 + drivers/media/platform/imagination/Kconfig| 13 + drivers/media/platform/imagination/Makefile |3 + .../platform/imagination/e5010-core-regs.h| 585 ++ .../platform/imagination/e5010-jpeg-enc-hw.c | 267 +++ .../platform/imagination/e5010-jpeg-enc-hw.h | 42 + .../platform/imagination/e5010-jpeg-enc.c | 1641 + .../platform/imagination/e5010-jpeg-enc.h | 168 ++ .../platform/imagination/e5010-mmu-regs.h | 311 drivers/media/platform/verisilicon/Kconfig|1 + .../media/platform/verisilicon/hantro_jpeg.c | 128 +- drivers/media/v4l2-core/v4l2-jpeg.c | 162 +- include/linux/math.h | 86 +- include/media/v4l2-jpeg.h | 28 + lib/math/Kconfig | 14 + lib/math/Makefile |1 + lib/math/math_kunit.c | 329 24 files changed, 3760 insertions(+), 124 deletions(-) create mode 100644 Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml create mode 100644 Documentation/driver-api/media/v4l2-jpeg.rst create mode 100644 drivers/media/platform/imagination/Kconfig create mode 100644 drivers/media/platform/imagination/Makefile create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h create mode 100644 lib/math/math_kunit.c -- 2.39.1
Re: [PATCH v12 12/13] media: imagination: Round to closest multiple for cropping region
Hi Sebastian Thanks for the update. On 06/06/24 17:14, Sebastian Fricke wrote: > Hey, > > On 04.06.2024 16:23, Devarsh Thakkar wrote: >> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up >> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest >> multiple of requested value while updating the crop rectangle coordinates. >> >> Use the rounding macro which gives preference to rounding down in case two >> nearest values (high and low) are possible to raise the probability of >> cropping rectangle falling inside the bound region. >> >> This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl >> description as documented in v4l uapi [1] which specifies that driver >> should choose crop rectangle as close as possible if no flags are passed by >> user-space, as quoted below : >> >> "``0`` - The driver can adjust the rectangle size freely and shall choose a >> crop/compose rectangle as close as possible to the requested >> one." >> >> Link: >> https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst >> [1] >> Signed-off-by: Devarsh Thakkar > > Acked-by: Sebastian Fricke > > Can, whoever picks up the math changes, pick up this change as well? > I will send 1-6 via the media subsystem. > For [PATCH 1/13] to [PATCH 6/13] patches from the series, I see few warnings reported to me offline for some of the patches which were caught from using smatch/sparse related automation scripts which were somehow missed by my equivalent test script. The fixes should be trivial though and I will be rolling out a v13 soon to fix them up. The rest of the patches (PATCH 7/13 to PATCH 13/13) no smatch/sparse related warnings were caught for these though, and are good to go in. Although, I can still include them in V13 too just to avoid any confusion. Regards Devarsh
[PATCH v12 13/13] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V12: No change V11: No change V10: No change V9: No change V8: Update commit message V1->V7 : (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v12 12/13] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." Link: https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst [1] Signed-off-by: Devarsh Thakkar --- V12: No change V11: No change V10: No change V9: No change V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index e701d573a26a..d65646f0c38c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v12 11/13] lib: math_kunit: Add tests for new macros related to rounding to nearest value
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1] : Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1] Signed-off-by: Devarsh Thakkar Acked-by: Andy Shevchenko --- V1->V12 (No change, patch introduced in V8) --- lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index be27f2afb8e4..05022f010be6 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v12 10/13] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow Reviewed-by: Andy Shevchenko [devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v11: * Add Reviewed-by tag Changes since v10: * Include headers per IWYU principle and add module description Changes since v9: * Added Kconfig dependency for KUNIT Changes since v8: * Remove unrequired header file linux/kernel.h Changes since v7: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v6: * No change Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 14 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 294 ++ 3 files changed, 309 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..f738d8a433ea 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,17 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + depends on KUNIT + default KUNIT_ALL_TESTS + + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..be27f2afb8e4 --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,294 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
[PATCH v12 09/13] Documentation: core-api: Add math.h macros and functions
Add documentation for rounding, scaling, absolute value and 32-bit division related macros and functions exported by math.h header file. Signed-off-by: Devarsh Thakkar Reviewed-by: Andy Shevchenko --- V12: Add Reviewed-by V11: Fix title for math function header V10: Patch introduced V1->V9 (No change) --- Documentation/core-api/kernel-api.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst index ae92a2571388..7de494e76fa6 100644 --- a/Documentation/core-api/kernel-api.rst +++ b/Documentation/core-api/kernel-api.rst @@ -185,6 +185,12 @@ Division Functions .. kernel-doc:: lib/math/gcd.c :export: +Rounding, absolute value, division and 32-bit scaling functions +--- + +.. kernel-doc:: include/linux/math.h + :internal: + UUID/GUID - -- 2.39.1
[PATCH v12 07/13] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar Acked-by: Andy Shevchenko --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. Link: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 [1] V12: - Add Acked-by V11: - Fix commenting style per review comments and remove extra whitespace V10: - Update example comment to fix formatting issues as observed with html docs V9: - No change V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested: V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 63 1 file changed, 63 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..79e3dfda77fc 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,52 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_up(17, 4) = 16 + * * round_closest_up(15, 4) = 16 + * * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_down(17, 4) = 16 + * * round_closest_down(15, 4) = 16 + * * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +123,23 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples: + * * roundclosest(21, 5) = 20 + * * roundclosest(19, 5) = 20 + * * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v12 00/13] Add V4L2 M2M Driver for E5010 JPEG Encoder
744d999080de6e813bcfb5a596272 Previous patch series: V11: https://lore.kernel.org/all/20240531170229.1270828-1-devar...@ti.com/ V10: https://lore.kernel.org/all/20240530165925.2715837-1-devar...@ti.com/ V9: https://lore.kernel.org/all/20240526175655.1093707-1-devar...@ti.com/ V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/ V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1): lib: add basic KUnit test for lib/math Devarsh Thakkar (12): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: Documentation: Document v4l2-jpeg helper macros media: imagination: Use exported tables from v4l2-jpeg core media: verisilicon : Use exported tables from v4l2-jpeg for hantro codec math.h: Add macros for rounding to closest value math.h: Use kernel-doc syntax for divison functions Documentation: core-api: Add math.h macros and functions lib: math_kunit: Add tests for new macros related to rounding to nearest value media: imagination: Round to closest multiple for cropping region gpu: ipu-v3: Use generic macro for rounding closest to specified value Documentation/core-api/kernel-api.rst |6 + .../bindings/media/img,e5010-jpeg-enc.yaml| 75 + Documentation/driver-api/media/v4l2-core.rst |1 + Documentation/driver-api/media/v4l2-jpeg.rst | 10 + MAINTAINERS |7 + drivers/gpu/ipu-v3/ipu-image-convert.c|4 +- drivers/media/platform/Kconfig|1 + drivers/media/platform/Makefile |1 + drivers/media/platform/imagination/Kconfig| 12 + drivers/media/platform/imagination/Makefile |3 + .../platform/imagination/e5010-core-regs.h| 585 ++ .../platform/imagination/e5010-jpeg-enc-hw.c | 267 +++ .../platform/imagination/e5010-jpeg-enc-hw.h | 42 + .../platform/imagination/e5010-jpeg-enc.c | 1644 + .../platform/imagination/e5010-jpeg-enc.h | 168 ++ .../platform/imagination/e5010-mmu-regs.h | 311 .../media/platform/verisilicon/hantro_jpeg.c | 128 +- drivers/media/v4l2-core/v4l2-jpeg.c | 162 +- include/linux/math.h | 86 +- include/media/v4l2-jpeg.h | 28 + lib/math/Kconfig | 14 + lib/math/Makefile |1 + lib/math/math_kunit.c | 329 23 files changed, 3761 insertions(+), 124 deletions(-) create mode 100644 Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml create mode 100644 Documentation/driver-api/media/v4l2-jpeg.rst create mode 100644 drivers/media/platform/imagination/Kconfig create mode 100644 drivers/media/platform/imagination/Makefile create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h create mode 100644 lib/math/math_kunit.c -- 2.39.1
Re: [PATCH v11 07/11] Documentation: core-api: Add math.h macros and functions
On 01/06/24 00:01, Randy Dunlap wrote: > Hi, > > On 5/31/24 10:12 AM, Devarsh Thakkar wrote: >> Add documentation for rounding, scaling, absolute value and difference, >> 32-bit division related macros and functions exported by math.h header >> file. >> > > I don't see any kernel-doc for division functions in this header file. > > Do some division functions get rendered somehow? > Good catch. I see couple of them having adequate documentation just missing the sphynx syntax, will enable for DIV_ROUND_CLOSEST and DIV_ROUND_CLOSEST_ULL. Regards Devarsh
[PATCH v11 11/11] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V11: No change V10: No change V9: No change V8: Update commit message V1->V7 : (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v11 10/11] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." [1] : https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst Signed-off-by: Devarsh Thakkar --- V11: No change V10: No change V9: No change V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index e701d573a26a..d65646f0c38c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v11 09/11] lib: math_kunit: Add tests for new macros related to rounding to nearest value
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1]. [1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 Signed-off-by: Devarsh Thakkar --- V1->V11 (No change, patch introduced in V8) --- lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index be27f2afb8e4..05022f010be6 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v11 08/11] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow [devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v10: * Include headers per IWYU principle and add module description Changes since v9: * Added Kconfig dependency for KUNIT Changes since v8: * Remove unrequired header file linux/kernel.h Changes since v7: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v6: * No change Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 14 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 294 ++ 3 files changed, 309 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..f738d8a433ea 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,17 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + depends on KUNIT + default KUNIT_ALL_TESTS + + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..be27f2afb8e4 --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,294 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL); + KUNI
[PATCH v11 07/11] Documentation: core-api: Add math.h macros and functions
Add documentation for rounding, scaling, absolute value and difference, 32-bit division related macros and functions exported by math.h header file. Signed-off-by: Devarsh Thakkar --- V11: Fix title for math function header V10: Patch introduced V1->V9 (No change) --- Documentation/core-api/kernel-api.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst index ae92a2571388..7de494e76fa6 100644 --- a/Documentation/core-api/kernel-api.rst +++ b/Documentation/core-api/kernel-api.rst @@ -185,6 +185,12 @@ Division Functions .. kernel-doc:: lib/math/gcd.c :export: +Rounding, absolute value, division and 32-bit scaling functions +--- + +.. kernel-doc:: include/linux/math.h + :internal: + UUID/GUID - -- 2.39.1
[PATCH v11 06/11] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 V11: - Fix commenting style per review comments and remove extra whitespace V10: - Update example comment to fix formatting issues as observed with html docs V9: - No change V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested: V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 63 1 file changed, 63 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..79e3dfda77fc 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,52 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_up(17, 4) = 16 + * * round_closest_up(15, 4) = 16 + * * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * * round_closest_down(17, 4) = 16 + * * round_closest_down(15, 4) = 16 + * * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +123,23 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples: + * * roundclosest(21, 5) = 20 + * * roundclosest(19, 5) = 20 + * * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v11 00/11] Add V4L2 M2M Driver for E5010 JPEG Encoder
el.org/all/20240517171532.748684-1-devar...@ti.com/ V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1): lib: add basic KUnit test for lib/math Devarsh Thakkar (10): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: imagination: Use exported tables from v4l2-jpeg core media: verisilicon : Use exported tables from v4l2-jpeg for hantro codec math.h: Add macros for rounding to closest value Documentation: core-api: Add math.h macros and functions lib: math_kunit: Add tests for new macros related to rounding to nearest value media: imagination: Round to closest multiple for cropping region gpu: ipu-v3: Use generic macro for rounding closest to specified value Documentation/core-api/kernel-api.rst |6 + .../bindings/media/img,e5010-jpeg-enc.yaml| 75 + MAINTAINERS |7 + drivers/gpu/ipu-v3/ipu-image-convert.c|4 +- drivers/media/platform/Kconfig|1 + drivers/media/platform/Makefile |1 + drivers/media/platform/imagination/Kconfig| 12 + drivers/media/platform/imagination/Makefile |3 + .../platform/imagination/e5010-core-regs.h| 585 ++ .../platform/imagination/e5010-jpeg-enc-hw.c | 267 +++ .../platform/imagination/e5010-jpeg-enc-hw.h | 42 + .../platform/imagination/e5010-jpeg-enc.c | 1644 + .../platform/imagination/e5010-jpeg-enc.h | 168 ++ .../platform/imagination/e5010-mmu-regs.h | 311 .../media/platform/verisilicon/hantro_jpeg.c | 128 +- drivers/media/v4l2-core/v4l2-jpeg.c | 162 +- include/linux/math.h | 63 + include/media/v4l2-jpeg.h | 28 + lib/math/Kconfig | 14 + lib/math/Makefile |1 + lib/math/math_kunit.c | 329 21 files changed, 3733 insertions(+), 118 deletions(-) create mode 100644 Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml create mode 100644 drivers/media/platform/imagination/Kconfig create mode 100644 drivers/media/platform/imagination/Makefile create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h create mode 100644 lib/math/math_kunit.c -- 2.39.1
Re: [PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions
Hi Randy, Thanks for the review. On 31/05/24 04:14, Randy Dunlap wrote: > > > On 5/30/24 10:17 AM, Devarsh Thakkar wrote: >> Add documentation for rounding, scaling, absolute value and difference, >> 32-bit division related macros and functions exported by math.h header >> file. >> >> Signed-off-by: Devarsh Thakkar >> --- >> V1->V9 (No change) >> V10: Patch introduced >> --- >> Documentation/core-api/kernel-api.rst | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/Documentation/core-api/kernel-api.rst >> b/Documentation/core-api/kernel-api.rst >> index ae92a2571388..fb467783d491 100644 >> --- a/Documentation/core-api/kernel-api.rst >> +++ b/Documentation/core-api/kernel-api.rst >> @@ -185,6 +185,12 @@ Division Functions >> .. kernel-doc:: lib/math/gcd.c >> :export: >> >> +Rounding, absolute value, scaling and 32bit division functions > > 32-bit > please. > Good catch. Also I see some division functions supporting non-32bit functions too, so I would make it as below : Rounding, absolute value, division and 32-bit scaling functions --- Regards Devarsh
Re: [PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions
On 31/05/24 00:51, Andy Shevchenko wrote: > On Thu, May 30, 2024 at 10:47:40PM +0530, Devarsh Thakkar wrote: >> Add documentation for rounding, scaling, absolute value and difference, >> 32-bit division related macros and functions exported by math.h header >> file. > > ... > >> +Rounding, absolute value, scaling and 32bit division functions >> +-- >> + >> +.. kernel-doc:: include/linux/math.h >> + :internal: > > Please, double check that this is correct keyword in this case. > Yes, this is inline as per what is described here [1] as there are no export symbols in the header. Also the rendered output looks good. [1] : https://docs.kernel.org/doc-guide/kernel-doc.html#:~:text=net/mac80211/*.c-,internal,-%3A%20%5Bsource%2Dpattern Regards Devarsh
Re: [PATCH v10 08/11] lib: add basic KUnit test for lib/math
On 31/05/24 00:53, Andy Shevchenko wrote: > On Thu, May 30, 2024 at 10:48:10PM +0530, Devarsh Thakkar wrote: >> From: Daniel Latypov >> >> Add basic test coverage for files that don't require any config options: >> * part of math.h (what seem to be the most commonly used macros) >> * gcd.c >> * lcm.c >> * int_sqrt.c >> * reciprocal_div.c >> (Ignored int_pow.c since it's a simple textbook algorithm.) >> >> These tests aren't particularly interesting, but they >> * provide short and simple examples of parameterized tests >> * provide a place to add tests for any new files in this dir >> * are written so adding new test cases to cover edge cases should be >> easy >> * looking at code coverage, we hit all the branches in the .c files > > ... > >> +#include >> +#include >> +#include >> +#include > > Really, you ignored my comment a second (?) time? This is road to nowhere. > You need to update the inclusion bloc in accordance with IWYU principle. > I see a few headers are missing. > Sorry I forgot to add those somehow. I will add module.h and math.h in v11. Regards Devarsh
Re: [PATCH v10 06/11] math.h: Add macros for rounding to closest value
Hi Andy, Thanks for the review. On 31/05/24 00:49, Andy Shevchenko wrote: > On Thu, May 30, 2024 at 10:42:25PM +0530, Devarsh Thakkar wrote: >> Add below rounding related macros: >> >> round_closest_up(x, y) : Rounds x to closest multiple of y where y is a >> power of 2, with a preference to round up in case two nearest values are >> possible. >> >> round_closest_down(x, y) : Rounds x to closest multiple of y where y is a >> power of 2, with a preference to round down in case two nearest values are >> possible. >> >> roundclosest(x, y) : Rounds x to closest multiple of y, this macro should >> generally be used only when y is not multiple of 2 as otherwise >> round_closest* macros should be used which are much faster. >> >> Examples: >> * round_closest_up(17, 4) = 16 >> * round_closest_up(15, 4) = 16 >> * round_closest_up(14, 4) = 16 >> * round_closest_down(17, 4) = 16 >> * round_closest_down(15, 4) = 16 >> * round_closest_down(14, 4) = 12 >> * roundclosest(21, 5) = 20 >> * roundclosest(19, 5) = 20 >> * roundclosest(17, 5) = 15 > > ... > >> + * Examples : > > It's inconsistent with the other one below. > >> + * round_closest_up(17, 4) = 16 >> + * >> + * round_closest_up(15, 4) = 16 >> + * >> + * round_closest_up(14, 4) = 16 > > The three have TABs/spaces mixture. > > I believe you wanted: > > * Examples:: > * * round_closest_up(17, 4) = 16 > * * round_closest_up(15, 4) = 16 > * * round_closest_up(14, 4) = 16 > I initially referred the style from this link [1] but probably missed to remove extra space from my patch. But what you suggested looks better, I will go with what you suggested. [1] https://elixir.bootlin.com/linux/v6.9/source/include/linux/int_log.h#L22 Regards Devarsh
[PATCH v10 11/11] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V10: No change V9: No change V8: Update commit message V1->V7 : (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v10 10/11] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." [1] : https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst Signed-off-by: Devarsh Thakkar --- V10: No change V9: No change V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index e701d573a26a..d65646f0c38c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v10 09/11] lib: math_kunit: Add tests for new macros related to rounding to nearest value
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1]. [1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 Signed-off-by: Devarsh Thakkar --- V1->V10 (No change, patch introduced in V8) --- lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index 16263e30a292..60b66aa41121 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -67,6 +67,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -92,6 +112,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -269,8 +301,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v10 08/11] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow [devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v9: * Added Kconfig dependency for KUNIT Changes since v8: * Remove unrequired header file linux/kernel.h Changes since v7: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v6: * No change Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 14 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 290 ++ 3 files changed, 305 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..f738d8a433ea 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,17 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + depends on KUNIT + default KUNIT_ALL_TESTS + + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..16263e30a292 --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,290 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15); +} + +static void round_up_test(struct kun
[PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions
Add documentation for rounding, scaling, absolute value and difference, 32-bit division related macros and functions exported by math.h header file. Signed-off-by: Devarsh Thakkar --- V1->V9 (No change) V10: Patch introduced --- Documentation/core-api/kernel-api.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst index ae92a2571388..fb467783d491 100644 --- a/Documentation/core-api/kernel-api.rst +++ b/Documentation/core-api/kernel-api.rst @@ -185,6 +185,12 @@ Division Functions .. kernel-doc:: lib/math/gcd.c :export: +Rounding, absolute value, scaling and 32bit division functions +-- + +.. kernel-doc:: include/linux/math.h + :internal: + UUID/GUID - -- 2.39.1
[PATCH v10 06/11] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 V10: - Update example comment to fix formatting issues as observed with html docs V9: - No change V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested: V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 72 1 file changed, 72 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..1f6177191b66 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,58 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples : + * + * round_closest_up(17, 4) = 16 + * + * round_closest_up(15, 4) = 16 + * + * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples: + * + * round_closest_down(17, 4) = 16 + * + * round_closest_down(15, 4) = 16 + * + * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +129,26 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples : + * + * roundclosest(21, 5) = 20 + * + * roundclosest(19, 5) = 20 + * + * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v10 00/11] Add V4L2 M2M Driver for E5010 JPEG Encoder
/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1): lib: add basic KUnit test for lib/math Devarsh Thakkar (10): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: imagination: Use exported tables from v4l2-jpeg core media: verisilicon : Use exported tables from v4l2-jpeg for hantro codec math.h: Add macros for rounding to closest value Documentation: core-api: Add math.h macros and functions lib: math_kunit: Add tests for new macros related to rounding to nearest value media: imagination: Round to closest multiple for cropping region gpu: ipu-v3: Use generic macro for rounding closest to specified value Documentation/core-api/kernel-api.rst |6 + .../bindings/media/img,e5010-jpeg-enc.yaml| 75 + MAINTAINERS |7 + drivers/gpu/ipu-v3/ipu-image-convert.c|4 +- drivers/media/platform/Kconfig|1 + drivers/media/platform/Makefile |1 + drivers/media/platform/imagination/Kconfig| 12 + drivers/media/platform/imagination/Makefile |3 + .../platform/imagination/e5010-core-regs.h| 585 ++ .../platform/imagination/e5010-jpeg-enc-hw.c | 267 +++ .../platform/imagination/e5010-jpeg-enc-hw.h | 42 + .../platform/imagination/e5010-jpeg-enc.c | 1644 + .../platform/imagination/e5010-jpeg-enc.h | 168 ++ .../platform/imagination/e5010-mmu-regs.h | 311 .../media/platform/verisilicon/hantro_jpeg.c | 128 +- drivers/media/v4l2-core/v4l2-jpeg.c | 162 +- include/linux/math.h | 72 + include/media/v4l2-jpeg.h | 28 + lib/math/Kconfig | 14 + lib/math/Makefile |1 + lib/math/math_kunit.c | 325 21 files changed, 3738 insertions(+), 118 deletions(-) create mode 100644 Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml create mode 100644 drivers/media/platform/imagination/Kconfig create mode 100644 drivers/media/platform/imagination/Makefile create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h create mode 100644 lib/math/math_kunit.c -- 2.39.1
Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing
Hi Javier, Maxime, Daniel, Sorry for the delay. Please find response inline. On 16/05/24 18:21, Daniel Vetter wrote: > On Wed, May 15, 2024 at 04:45:09PM +0200, Javier Martinez Canillas wrote: >> Devarsh Thakkar writes: [..] >> >> If I understand you correctly, for now the only real use case is when the >> the RTOS owns / manages the complete display pipeline and Linux can only >> own video planes. >> Not exactly, What I mean is that this is the default configuration/example we intend to provide to customer as an out-of-box demo . But flexibility is provided to customer to modify the display sharing configuration per their use-case, for e.g at RTOS side in place of device-tree, we have a sysconfig menu [1] using which they can select the desired configuration, furthermore they can go ahead and edit the code too, so Linux driver is expected to be flexible to support different configurations as supported by the HW. I have a limited view of all possible use-cases which customer may try out with different configurations but few examples are shared below part from the one discussed earlier : [Examples]: 1) Customer is running Linux as main OS but using RTOS to control some external peripherals like temperature sensor, motion sensor e.t.c. In that case if they want to display the sensor data too on the same monitor, then they can use the configuration where RTOS use single plane and Linux as the DSS master. 2) Another configuration could be where RTOS want to control one full end-to-end pipeline going to one connector and Linux want to control full end-to-end pipeline going to another connector, that can be supported too using this scheme (as shared in this series). 3) Also I think, this device-tree based scheme could be leveraged in virtualization too with static partitioning based scheme using Xen for e.g. we split the DSS resources between host (DOM0) and the guest (DOMU). >> The opposite is supported by the DSS hardware (thanks to its feature that >> allows partitioning the register space and having multiple per-host IRQs) >> but it's not a real use case yet. The reason why this case is added to the >> DT binding is as you said for flexiblity and make the design future-proof. >> Not really, as explained above we are documenting all possible configurations which hardware supports as supported in software in the SDK and that's what we are aiming for upstream too. [..]>>>> I'm probably missing something then here, but if the Linux side of >>>> things is expected to keep the current configuration and keep it active >>>> for it to work, what use-case would it be useful for? >>>> >>> >>> It's just one of the partitioning possibilities that I mentioned here, that >>> Linux is in control of DSS as a whole and the user want the other host (be >>> it >>> RTOS or any other core) to control a single plane. For e.g it could be Linux >>> (with GPU rendering) displaying the graphics and RTOS overlaying a real time >>> clock or any other signs which need to be displayed in real-time. >>> But more than the use-case this is inspired by the fact that we want to be >>> flexible and support in the linux driver whatever partitioning scheme >>> possibilities are there which are supported in hardware and we let user >>> decide >>> on the partitioning scheme. >>> >> >> A possible use case here could be if Linux is safer than the other host >> owning a single plane, right? Then in that case the RTOS could fail but >> the display pipeline won't be teared down. >> >> That is, if your safety tell-tales would be driven by Linux and having >> other OS dislay the GPU-rendered QT based application on another plane. >> >> But as said, for now that's a theorethical use case since the one you >> mentioned is the opposite. >> >> [] >> Yes that could be a possible use-case too, we want to provide customer the flexibility in their app design to select different configuration, as we say all these configurations as supported in the driver. [..] >>>>> If there is a more complex use-case which requires dynamic >>>>> assignment/arbitration of resources then I agree those require some sort >>>>> of >>>>> IPC scheme but this is not what we target with these series. This series >>>>> is >>>>> simply to support static partitioning feature (separate register space, >>>>> separate irq, firewalling support etc) of TI DSS hardware across the >>>>> multiple >>>>> hosts and there are use-cases too for which this scheme suffices. >>>> >>>&
Re: [PATCH v9 07/10] lib: add basic KUnit test for lib/math
On 28/05/24 02:08, Andy Shevchenko wrote: > On Mon, May 27, 2024 at 11:37:20PM +0300, Andy Shevchenko wrote: >> On Sun, May 26, 2024 at 11:39:33PM +0530, Devarsh Thakkar wrote: > > ... > >>> +MODULE_LICENSE("GPL"); >> >> modpost validator won't be happy about this, i.e. missing >> MODULE_DESCRIPTION(). > > And obviously + module.h in the inclusion block. > The module.h is already included under include/kunit/test.h and that's the reason compiler did not give any error. But I can still include it under math.h for better readability as you suggested as anyway compiler will not re-include if already included by another header file. Also I see we were missing a dependency between math_kunit and kunit modules, so adding a dependency there too. Regards Devarsh
Re: [PATCH v9 07/10] lib: add basic KUnit test for lib/math
Hi Andy, Thanks for the review. On 28/05/24 02:07, Andy Shevchenko wrote: [..] >> +#include >> +#include >> +#include > > + math.h (where abs()/DIV_ROUND_*()/etc come from?) > I believe I mentioned that. > I did compile and test this, so math.h was indirectly getting included via some other header file already included but I would not rely on that and include math.h separately as you suggested. >> +#include > > ... > >> +MODULE_LICENSE("GPL"); > > modpost validator won't be happy about this, i.e. missing > MODULE_DESCRIPTION(). > Indeed, it gives below logs, let me add that too. WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/math_kunit.o Regards Devarsh
Re: [PATCH v9 06/10] math.h: Add macros for rounding to closest value
Hi Andy, Thanks for the review. On 28/05/24 02:02, Andy Shevchenko wrote: > On Sun, May 26, 2024 at 11:38:56PM +0530, Devarsh Thakkar wrote: ... >> +/** >> + * round_closest_up - round closest to be multiple of specified value >> (which is >> + *power of 2) with preference to rounding up >> + > > Not that big deal, but missing '*' here. Personally I would not even put > a blank line between Summary and Field Descriptions. > My bad. Yes I would remove the blank line here. This is picked up as warning from kernel-doc too. >> + * @x: the value to round >> + * @y: multiple to round closest to (must be a power of 2) >> + * >> + * Rounds @x to closest multiple of @y (which must be a power of 2). >> + * The value can be either rounded up or rounded down depending upon rounded >> + * value's closeness to the specified value. If there are two closest >> possible >> + * values, i.e. the difference between the specified value and it's rounded >> up >> + * and rounded down values is same then preference is given to rounded up >> + * value. >> + * >> + * To perform arbitrary rounding to closest value (not multiple of 2), use >> + * roundclosest(). >> + * >> + * Examples : > > What is this suppose to be rendered to? > The file math.h is not rendered as part of kernel-doc right now. I can put this under Documentation/core-api/kernel-api.rst perhaps I can create a new section as below: Rounding, absolute diff and 32bit division macros - under the section: CRC and Math Functions in Linux === is that okay ? >> + * round_closest_up(17, 4) = 16 >> + * round_closest_up(15, 4) = 16 >> + * round_closest_up(14, 4) = 16 > > Btw, is kernel-doc validator happy about all kernel docs you added? > Yes, except the aforementioned blank line. Regards Devarsh
[PATCH v9 10/10] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V9: No change V8: Update commit message V1->V7 : (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v9 09/10] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." [1] : https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst Signed-off-by: Devarsh Thakkar --- V9: No change V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index e701d573a26a..d65646f0c38c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v9 08/10] lib: math_kunit: Add tests for new macros related to rounding to nearest value
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1]. [1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 Signed-off-by: Devarsh Thakkar --- V1->V9 (No change, patch introduced in V8) --- lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index 16263e30a292..60b66aa41121 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -67,6 +67,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -92,6 +112,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -269,8 +301,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v9 07/10] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow [devarsht: Rebase to 6.9, remove kernel.h and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v8: * Remove unrequired header file linux/kernel.h Changes since v7: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v6: * No change Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 11 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 290 ++ 3 files changed, 302 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..832c6989ca13 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,14 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..16263e30a292 --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,290 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15); +} + +static void round_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_up(0, 1), 0); + KUNIT_EXPECT_EQ(test, round_up(1, 2), 2); + KUNIT_EXPECT_EQ(test, round_up(3,
[PATCH v9 06/10] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 V9: - No change V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested: V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 65 1 file changed, 65 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..e2cc3769ed0e 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,54 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples : + * round_closest_up(17, 4) = 16 + * round_closest_up(15, 4) = 16 + * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples : + * round_closest_down(17, 4) = 16 + * round_closest_down(15, 4) = 16 + * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +125,23 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples : + * roundclosest(21, 5) = 20 + * roundclosest(19, 5) = 20 + * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v9 00/10] Add V4L2 M2M Driver for E5010 JPEG Encoder
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder which is a stateful JPEG encoder from Imagination technologies and is present in TI AM62A SoC. While adding support for it, following additional framework changes were made: - Moved reference quantization and huffman tables provided in ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1]. - Add macros to round to closest integer (either higher or lower) while rounding in order of 2. - Add KUnit tests for math functions. v4l2-compliance test : Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49 E5010 JPEG Encoder Manual tests : Performance: Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b Functionality: Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce Compression Quality: Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655 Multi Instance: Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a Crop support: Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd Runtime PM: Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd Math lib KUnit tests: Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1]: https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/ Changelog: V8->V9: - Remove kernel.h header file - Remove stale filler data on jpeg header in E5010 jpeg driver V7->V8: - Add KUnit tests for math functions - Add roundclosest() for supporting rounding for non-multiple of 2 - Update commit message as suggested - Add Reviewed-by and Acked-by tags to patches as received V6->V7: - Fix cropping support - Move reference huffman and quantization tables to v4l2-jpeg.c - Fix suspend/resume use-case - Add Reviewed-by V5->V6: - Fix sparse warnings V4->V5: - Sort the #includes in driver file alphabetically - Rename huffman and quantization tables to not use '_' - Add Reviewed-by tag V3->V4: - Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in dt-binding - Remove clock-names as only single clock in dt-binding - Fix issue with default params setting - Correct v4l2 error prints - Simplify register write functions with single statement return values - Remove unrequired error checks from get_queue() - Drop explicit device_caps setting as it is already taken care by v4l2 core - Remove unrequired multiplanar checks and memset from s_fmt, g_fmt callback functions - Fix try_fmt callback to not update the queues - Remove unrequired contiguous format attribute from queue_init - Use dynamic allocation for video_device and remove unrequired assignments in probe() - Remove unrequired checks from queue_setup function - Return queued buffers back if start_streaming fails - Use ARRAY_SIZE in place of hard-coding - Use huffman and quantization tables from reference header file V2->V3: - Add DONOTMERGE patches for dts and defconfig - Update driver with below changes : - Correct license headers - Use more generic name core instead of jasper for base registers - Add Comment for forward declarations - Simplify quantization table calculations - Use v4l2_apply_frmsize_constraints for updating framesize and remove unrequired functions - Place TODO at top of file and in commit message too - Use dev_err_probe helper in probe function - Fix return value checking for failure scenarios in probe function - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers - Fix unexpected indentation - Correct commit message - Update dt-bindings with below changes : - Add vendor specific compatible - Fix commit title and message - Update reg names - Update clocks to 1 - Fix dts example with proper naming V1->V2: - Send dt-bindings and driver together Patch-Diff between the series : V8->V9 Range diff : https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74 V7->V8 Range diff : https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74 V6->V7 Range diff : https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e V5->V6 Range diff : https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19 V4->V5 Range diff : https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b V3->V4 Range diff : https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272 Previous patch series: V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/ V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1
Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math
On 20/05/24 17:52, Andy Shevchenko wrote: > On Mon, May 20, 2024 at 05:11:18PM +0530, Devarsh Thakkar wrote: >> On 18/05/24 01:44, Andy Shevchenko wrote: >>> On Fri, May 17, 2024 at 11:06:07PM +0530, Devarsh Thakkar wrote: > > [..] > [..] > Yes, and one should follow IWYU principle and not cargo cult or whatever > arbitrary lists. > Agreed. >>>> +#include >>> >>> + math.h // obviously >>> + module.h >>> >>>> +#include >>> >>> + types.h >> >> All the above headers are already included as part of kernel.h > > Yes, that's why you should not use "proxy" headers. > Have you read the top comment in the kernel.h? > Yes, it says it is not recommended to include this inside another header file. Although here we are adding it inside c file, but I can still try avoid it and include only the required headers instead of kernel.h as you recommended. Regards Devarsh
Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math
Hi Andy, Daniel, On 18/05/24 01:44, Andy Shevchenko wrote: > On Fri, May 17, 2024 at 11:06:07PM +0530, Devarsh Thakkar wrote: [..] > >> [devarsht: Rebase to 6.9 and change license to GPL] > > I'm not sure that you may change license. It needs the author's confirmation. > As per latest licensing doc [1], It quotes that GPL is same as GPL v2 and GPL v2 exists only for historical reasons as shared below : “GPL v2 Same as “GPL”. It exists for historic reasons." So I think it should be fine and more apt to use GPL. This is also highlighted in below commit: bf7fbeeae6db644ef5995085de2bc5c6121f8c8d module: Cure the MODULE_LICENSE "GPL" vs. "GPL v2" bogosity >> --- >> Changes since v6: >> * Rebase to linux-next, change license to GPL as suggested by checkpatch. > > Note, checkpatch.pl is not false positives free. Be careful > with what it suggests. > >> +#include >> +#include > >> +#include > > Do you know why this header is included? > It includes all the other required headers (including those you mentioned below), and header list is probably copied from other files in same directory. But it does suffice the requirements as I have verified the compilation. >> +#include > > + math.h // obviously > + module.h > >> +#include > > + types.h > All the above headers are already included as part of kernel.h Kindly let me know if any queries. [1]: https://docs.kernel.org/process/license-rules.html Regards Devarsh
[PATCH v8 10/10] gpu: ipu-v3: Use generic macro for rounding closest to specified value
Use generic macro round_closest_up() for rounding closest to specified value instead of using local macro round_closest(). There is no change from functionality point of view as round_closest_up() is functionally same as the previously used local macro round_closest(). Signed-off-by: Devarsh Thakkar --- V8: Update commit message V1->V7 : (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH v8 09/10] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which specifies that driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." [1] : https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst Signed-off-by: Devarsh Thakkar --- V8: Update commit message with specification reference V1->V7 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index 189e2a99c43d..abd66bc9b96c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v8 08/10] lib: math_kunit: Add tests for new rounding related macros
Add tests for round_closest_up/down and roundclosest macros which round to nearest multiple of specified argument. These are tested with kunit tool as shared here [1]. [1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 Signed-off-by: Devarsh Thakkar --- V1->V8 (No change, patch introduced in V8) --- lib/math/math_kunit.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c index 1b00e4195a1a..1955dcd5655d 100644 --- a/lib/math/math_kunit.c +++ b/lib/math/math_kunit.c @@ -68,6 +68,26 @@ static void round_down_test(struct kunit *test) KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29); } +static void round_closest_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30); +} + +static void round_closest_down_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16); + KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 30); + KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 2); +} + /* These versions can round to numbers that aren't a power of two */ static void roundup_test(struct kunit *test) { @@ -93,6 +113,18 @@ static void rounddown_test(struct kunit *test) KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3); } +static void roundclosest_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20); + KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1); + KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30); + + KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3); + KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6); +} + static void div_round_up_test(struct kunit *test) { KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0); @@ -270,8 +302,11 @@ static struct kunit_case math_test_cases[] = { KUNIT_CASE(int_sqrt_test), KUNIT_CASE(round_up_test), KUNIT_CASE(round_down_test), + KUNIT_CASE(round_closest_up_test), + KUNIT_CASE(round_closest_down_test), KUNIT_CASE(roundup_test), KUNIT_CASE(rounddown_test), + KUNIT_CASE(roundclosest_test), KUNIT_CASE(div_round_up_test), KUNIT_CASE(div_round_closest_test), KUNIT_CASE_PARAM(gcd_test, gcd_gen_params), -- 2.39.1
[PATCH v8 07/10] lib: add basic KUnit test for lib/math
From: Daniel Latypov Add basic test coverage for files that don't require any config options: * part of math.h (what seem to be the most commonly used macros) * gcd.c * lcm.c * int_sqrt.c * reciprocal_div.c (Ignored int_pow.c since it's a simple textbook algorithm.) These tests aren't particularly interesting, but they * provide short and simple examples of parameterized tests * provide a place to add tests for any new files in this dir * are written so adding new test cases to cover edge cases should be easy * looking at code coverage, we hit all the branches in the .c files Signed-off-by: Daniel Latypov Reviewed-by: David Gow [devarsht: Rebase to 6.9 and change license to GPL] Signed-off-by: Devarsh Thakkar --- Changes since v6: * Rebase to linux-next, change license to GPL as suggested by checkpatch. Changes since v5: * add in test cases for roundup/rounddown * address misc comments from David Changes since v4: * add in test cases for some math.h macros (abs, round_up/round_down, div_round_down/closest) * use parameterized testing less to keep things terser Changes since v3: * fix `checkpatch.pl --strict` warnings * add test cases for gcd(0,0) and lcm(0,0) * minor: don't test both gcd(a,b) and gcd(b,a) when a == b Changes since v2: mv math_test.c => math_kunit.c Changes since v1: * Rebase and rewrite to use the new parameterized testing support. * misc: fix overflow in literal and inline int_sqrt format string. * related: commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") was merged explaining the patterns shown here. * there's an in-flight patch to update it for parameterized testing. --- lib/math/Kconfig | 11 ++ lib/math/Makefile | 1 + lib/math/math_kunit.c | 291 ++ 3 files changed, 303 insertions(+) create mode 100644 lib/math/math_kunit.c diff --git a/lib/math/Kconfig b/lib/math/Kconfig index 0634b428d0cb..832c6989ca13 100644 --- a/lib/math/Kconfig +++ b/lib/math/Kconfig @@ -15,3 +15,14 @@ config PRIME_NUMBERS config RATIONAL tristate + +config MATH_KUNIT_TEST + tristate "KUnit test for math helper functions" + help + This builds unit test for math helper functions as defined in lib/math + and math.h. + + For more information on KUNIT and unit tests in general, please refer + to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. diff --git a/lib/math/Makefile b/lib/math/Makefile index 91fcdb0c9efe..dcf65d10dab2 100644 --- a/lib/math/Makefile +++ b/lib/math/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL) += rational.o obj-$(CONFIG_TEST_DIV64) += test_div64.o obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o +obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c new file mode 100644 index ..1b00e4195a1a --- /dev/null +++ b/lib/math/math_kunit.c @@ -0,0 +1,291 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple KUnit suite for math helper funcs that are always enabled. + * + * Copyright (C) 2020, Google LLC. + * Author: Daniel Latypov + */ + +#include +#include +#include +#include +#include + +static void abs_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, abs((char)0), (char)0); + KUNIT_EXPECT_EQ(test, abs((char)42), (char)42); + KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42); + + /* The expression in the macro is actually promoted to an int. */ + KUNIT_EXPECT_EQ(test, abs((short)0), 0); + KUNIT_EXPECT_EQ(test, abs((short)42), 42); + KUNIT_EXPECT_EQ(test, abs((short)-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0), 0); + KUNIT_EXPECT_EQ(test, abs(42), 42); + KUNIT_EXPECT_EQ(test, abs(-42), 42); + + KUNIT_EXPECT_EQ(test, abs(0L), 0L); + KUNIT_EXPECT_EQ(test, abs(42L), 42L); + KUNIT_EXPECT_EQ(test, abs(-42L), 42L); + + KUNIT_EXPECT_EQ(test, abs(0LL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42LL), 42LL); + KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL); + + /* Unsigned types get casted to signed. */ + KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL); + KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL); +} + +static void int_sqrt_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL); + KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL); + KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15); +} + +static void round_up_test(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, round_up(0, 1), 0); + KUNIT_EXPECT_EQ(test, round_up(1, 2), 2); + KUNIT_EXPECT_EQ(test, round_up(3, 2), 4); + KUNIT_EXPECT_EQ(test, round_up((1 << 30) - 1, 2), 1 << 30); + KUNIT_
[PATCH v8 06/10] math.h: Add macros for rounding to closest value
Add below rounding related macros: round_closest_up(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round up in case two nearest values are possible. round_closest_down(x, y) : Rounds x to closest multiple of y where y is a power of 2, with a preference to round down in case two nearest values are possible. roundclosest(x, y) : Rounds x to closest multiple of y, this macro should generally be used only when y is not multiple of 2 as otherwise round_closest* macros should be used which are much faster. Examples: * round_closest_up(17, 4) = 16 * round_closest_up(15, 4) = 16 * round_closest_up(14, 4) = 16 * round_closest_down(17, 4) = 16 * round_closest_down(15, 4) = 16 * round_closest_down(14, 4) = 12 * roundclosest(21, 5) = 20 * roundclosest(19, 5) = 20 * roundclosest(17, 5) = 15 Signed-off-by: Devarsh Thakkar --- NOTE: This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which is updated in further patch in the series to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 V8: - Add new macro to round to nearest value for non-multiple of 2 - Update commit message as suggested V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 65 1 file changed, 65 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..e2cc3769ed0e 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,54 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round closest to be multiple of specified value (which is + *power of 2) with preference to rounding up + + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples : + * round_closest_up(17, 4) = 16 + * round_closest_up(15, 4) = 16 + * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round closest to be multiple of specified value (which + * is power of 2) with preference to rounding down + * + * @x: the value to round + * @y: multiple to round closest to (must be a power of 2) + * + * Rounds @x to closest multiple of @y (which must be a power of 2). + * The value can be either rounded up or rounded down depending upon rounded + * value's closeness to the specified value. If there are two closest possible + * values, i.e. the difference between the specified value and it's rounded up + * and rounded down values is same then preference is given to rounded up + * value. + * + * To perform arbitrary rounding to closest value (not multiple of 2), use + * roundclosest(). + * + * Examples : + * round_closest_down(17, 4) = 16 + * round_closest_down(15, 4) = 16 + * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ @@ -77,6 +125,23 @@ } \ ) +/** + * roundclosest - round to nearest multiple + * @x: the value to round + * @y: multiple to round nearest to + * + * Rounds @x to nearest multiple of @y. + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If @y will always be a power of 2, consider + * using the faster round_closest_up() or round_closest_down(). + * + * Examples : + * roundclosest(21, 5) = 20 + * roundclosest(19, 5) = 20 + * roundclosest(17, 5) = 15 + */ +#define roundclosest(x, y) rounddown((x) + (y) / 2, (y)) + /* * Divide positive or negative dividend by positive or negative divisor * and round to closest integer. Result is undefined for negative -- 2.39.1
[PATCH v8 00/10] Add V4L2 M2M Driver for E5010 JPEG Encoder
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder which is a stateful JPEG encoder from Imagination technologies and is present in TI AM62A SoC. While adding support for it, following additional framework changes were made: - Moved reference quantization and huffman tables provided in ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1]. - Add macros to round to closest integer (either higher or lower) while rounding in order of 2. - Add KUnit tests for math functions. v4l2-compliance test : Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49 E5010 JPEG Encoder Manual tests : Performance: Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b Functionality: Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce Compression Quality: Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655 Multi Instance: Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a Crop support: Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd Runtime PM: Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd Math lib KUnit tests: Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1]: https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/ Changelog: V7->V8: - Add KUnit tests for math functions - Add roundclosest() for supporting rounding for non-multiple of 2 - Update commit message as suggested - Add Reviewed-by and Acked-by tags to patches as received V6->V7: - Fix cropping support - Move reference huffman and quantization tables to v4l2-jpeg.c - Fix suspend/resume use-case - Add Reviewed-by V5->V6: - Fix sparse warnings V4->V5: - Sort the #includes in driver file alphabetically - Rename huffman and quantization tables to not use '_' - Add Reviewed-by tag V3->V4: - Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in dt-binding - Remove clock-names as only single clock in dt-binding - Fix issue with default params setting - Correct v4l2 error prints - Simplify register write functions with single statement return values - Remove unrequired error checks from get_queue() - Drop explicit device_caps setting as it is already taken care by v4l2 core - Remove unrequired multiplanar checks and memset from s_fmt, g_fmt callback functions - Fix try_fmt callback to not update the queues - Remove unrequired contiguous format attribute from queue_init - Use dynamic allocation for video_device and remove unrequired assignments in probe() - Remove unrequired checks from queue_setup function - Return queued buffers back if start_streaming fails - Use ARRAY_SIZE in place of hard-coding - Use huffman and quantization tables from reference header file V2->V3: - Add DONOTMERGE patches for dts and defconfig - Update driver with below changes : - Correct license headers - Use more generic name core instead of jasper for base registers - Add Comment for forward declarations - Simplify quantization table calculations - Use v4l2_apply_frmsize_constraints for updating framesize and remove unrequired functions - Place TODO at top of file and in commit message too - Use dev_err_probe helper in probe function - Fix return value checking for failure scenarios in probe function - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers - Fix unexpected indentation - Correct commit message - Update dt-bindings with below changes : - Add vendor specific compatible - Fix commit title and message - Update reg names - Update clocks to 1 - Fix dts example with proper naming V1->V2: - Send dt-bindings and driver together Patch-Diff between the series : V7->V8 Range diff : https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74 V6->V7 Range diff : https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e V5->V6 Range diff : https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19 V4->V5 Range diff : https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b V3->V4 Range diff : https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272 Previous patch series: V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ Daniel Latypov (1): lib: add basic KUnit test for lib/math Devarsh Thakkar (9): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: imagination: U
Re: [PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region
Hi Nicolas, Thanks for the review. On 15/05/24 01:52, Nicolas Dufresne wrote: > Le samedi 11 mai 2024 à 22:38 +0530, Devarsh Thakkar a écrit : >> Hi Andy, >> >> Thanks for the quick review. >> On 10/05/24 20:40, Andy Shevchenko wrote: >>> On Fri, May 10, 2024 at 12:10:01AM +0530, Devarsh Thakkar wrote: >>>> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up >>>> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest >>>> multiple of requested value while updating the crop rectangle coordinates. >>>> >>>> Use the rounding macro which gives preference to rounding down in case two >>>> nearest values (high and low) are possible to raise the probability of >>>> cropping rectangle falling inside the bound region. >>> >>> This is arguable. How do we know that the bigger range is supported? >>> The safest side is to go smaller than bigger. >>> >> >> Yes and that's what the driver does when do when application passes >> while doing the selection. If application does not >> specify explicitly whether to round down or round up the cropping >> parameters requested by it (i.e app is neither passing V4L2_SEL_FLAG_LE >> nor V4L2_SEL_FLAG_GE flags), then it is preferred by driver to round the >> cropping parameters to nearest possible value by either rounding down or >> rounding up to align with hardware requirements. >> >> For e.g. If requested width for cropping region is 127 and HW requires >> width to be multiple of 64 then we would prefer to round it up to 128 >> rather than rounding down to a more distant value (i.e. 64), but if >> requested cropping width is 129 then we would prefer to instead round it >> down to 128. But if requested cropping width is 160 then there are two >> nearest possible values 160 - 32 = 128 and 160 + 32 = 192 and in which >> case we prefer the smaller value as you suggested and that's why the >> driver uses round_closest_down. >> >> For any reason, if still the cropping rectangle falls beyond the bound >> region, then driver will return out of range error (-ERANGE) to >> application. > > I would appreciate if this change was based on specification text, meaning > improving the next if that behaviour is undefined. We might not be able to fix > it everywhere, but we can recommend something. > Yes, this change is based on specification text. This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in v4l uapi [1] which means driver should choose crop rectangle as close as possible if no flags are passed by user-space, as quoted below : "``0`` - The driver can adjust the rectangle size freely and shall choose a crop/compose rectangle as close as possible to the requested one." If the user-space has specific requirement to either round down, round up or honor exact values, it should pass V4L2_SEL_FLAG_LE, V4L2_SEL_FLAG_GE or V4L2_SEL_FLAG_LE | V4L2_SEL_FLAG_GE flags respectively. [1] https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst#:~:text=compose%20rectangle%20as-,close,-as%20possible%20to Regards Devarsh
Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2
Hi Andy, On 13/05/24 17:55, Andy Shevchenko wrote: > On Mon, May 13, 2024 at 04:55:58PM +0530, Devarsh Thakkar wrote: >> On 13/05/24 14:29, Andy Shevchenko wrote: >>> On Sat, May 11, 2024 at 11:11:14PM +0530, Devarsh Thakkar wrote: >>>> On 10/05/24 20:45, Jani Nikula wrote: > > [...] > - align naming (with the existing round*() macros) I think round_closest_up/round_closest_down align already and inspired by the existing naming convention used for round*() and DIV_ROUND_CLOSEST() macros in math.h as explained below (copied from my previous reply [1]) "Coming back to naming, this is as per existing convention used for naming round_up, round_down (notice the `_` being used for macros working with pow of 2) and DIV_ROUND_CLOSEST (notice the work `closest` used to specify the answer to be nearest to specified value)" But do let me know if you have any other suggestions for naming? > - add examples into commit message of the math.h patch Agreed > - add test cases (you need to create lib/math_kunit.c for that) Agreed. > - elaborate in the commit message of the IPU3 change what you posted above > (some kind of a summary) Agreed. [1]: https://lore.kernel.org/all/zkig0-01pz632...@smile.fi.intel.com/ Regards Devarsh
Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2
Hi Andy, On 13/05/24 14:29, Andy Shevchenko wrote: > On Sat, May 11, 2024 at 11:11:14PM +0530, Devarsh Thakkar wrote: >> On 10/05/24 20:45, Jani Nikula wrote: > > [...] > >>> Moreover, I think the naming of round_up() and round_down() should have >>> reflected the fact that they operate on powers of 2. It's unfortunate >>> that the difference to roundup() and rounddown() is just the underscore! >>> That's just a trap. >>> >>> So let's perhaps not repeat the same with round_closest_up() and >>> round_closest_down()? >> >> Yes the naming is inspired by existing macros i.e. round_up, round_down >> (which round up/down to next pow of 2) and DIV_ROUND_CLOSEST (which >> round the divided value to closest value) and there are already a lot of >> users for these API's : >> >> linux-next git:(heads/next-20240509) ✗ grep -nr round_up drivers | wc >> 7304261 74775 >> >> linux-next git:(heads/next-20240509) ✗ grep -nr round_down drivers | wc >> 2261293 22194 >> >> linux-next git:(heads/next-20240509) ✗ grep -nr DIV_ROUND_CLOSEST >> drivers | wc >>12077461 111822 > > Side note, discover `git grep ...`: it's much much faster on Git index, > than classic one on a working copy. > >> so I thought to align with existing naming convention assuming >> developers are already familiar with this. >> >> But if a wider consensus is to go with a newer naming convention then I >> am open to it, although a challenge there would be to keep it short. For >> e.g. this one is already 3 words, if we go with more explicit >> "round_closest_up_pow_2" it looks quite long in my opinion :) . > > You need properly name the macros. Again, round_up() / roundup() and > roundup_pow_of_two() are three _different_ macros, and it's not clear > why you can't use one of them in your case. > I can't use any of these because these macros either round up or round down, whereas I want to round to closest value for the argument specified by the user, be it achieved either by rounding up or rounding down depending upon whichever makes the answer closer to the user-specified argument. To make it clear, I have already included the examples in the macro description [2], copying it here, maybe I can put the same examples in the commit message too to avoid confusions : round_closest_up(17, 4) = 16 round_closest_up(15, 4) = 16 round_closest_up(14, 4) = 16 round_closest_down(17, 4) = 16 round_closest_down(15, 4) = 16 round_closest_down(14, 4) = 12 Coming back to naming, this is as per existing convention used for naming round_up, round_down (notice the `_` being used for macros working with pow of 2) and DIV_ROUND_CLOSEST (notice the work `closest` used to specify the answer to be nearest to specified value). Naming is a bit subjective, but I personally don't think it is a good idea to go away with the existing naming convention or go with longer names. > The patch that changes those to a new one are doubtful to begin with. > I.e. need a careful review on the arithmetics side of the change > including HW capabilities of handling "closest" results. > This is already tested from my side, in-fact I have posted some of the results in cover-letter with these macros [1] : Regarding hardware capabilities, it uses existing round_up, round_down macros underneath which are optimized to handle pow of 2 after modifying the user provided argument using addition/subtraction and division, so I don't think it should generally a problem with the hardware. And I see other macros DIV_ROUND_CLOSEST [3] already using similar operations i.e. addition/subtraction and division so don't think it should be a problem to keep similar other macros in the same file. [1]: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd#file-v7_jpeg_encoder_crop_validation-L204 [2]: https://lore.kernel.org/all/20240509183952.4064331-1-devar...@ti.com/ [3]: https://elixir.bootlin.com/linux/latest/source/include/linux/math.h#L86 Regards Devarsh
Re: [PATCH v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple
Hi Andy, Thanks for the quick review. On 10/05/24 20:33, Andy Shevchenko wrote: > On Fri, May 10, 2024 at 12:10:10AM +0530, Devarsh Thakkar wrote: >> Use generic macro round_closest_up for rounding to nearest multiple instead > > round_closest_up() > > We refer to the functions as func(). > Agreed. Will fix commit msg to use round_closest_up() >> of using local function. > > ... > >> @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx >> *ctx, >> * The closest input sample position that we could actually >> * start the input tile at, 19.13 fixed point. >> */ >> -in_pos_aligned = round_closest(in_pos, 8192U * in_align); >> +in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); >> /* Convert 19.13 fixed point to integer */ >> in_pos_rounded = in_pos_aligned / 8192U; > > Oh, these seems to be better to use either ALIGN*(), or PFN_*() / PAGE_*() > families of macros. What the semantic of 8192 is? > As Laurent mentioned, it looks like the fractional part of the integer. But functionality wise, there is no change with the introduction of this patch. round_closest_up() does exactly the same thing as what the local function round_closest used to do before this patch. Regards Devarsh
Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2
Hi Jani, Thanks for the quick review. On 10/05/24 20:45, Jani Nikula wrote: [...] > Moreover, I think the naming of round_up() and round_down() should have > reflected the fact that they operate on powers of 2. It's unfortunate > that the difference to roundup() and rounddown() is just the underscore! > That's just a trap. > > So let's perhaps not repeat the same with round_closest_up() and > round_closest_down()? > Yes the naming is inspired by existing macros i.e. round_up, round_down (which round up/down to next pow of 2) and DIV_ROUND_CLOSEST (which round the divided value to closest value) and there are already a lot of users for these API's : linux-next git:(heads/next-20240509) ✗ grep -nr round_up drivers | wc 7304261 74775 linux-next git:(heads/next-20240509) ✗ grep -nr round_down drivers | wc 2261293 22194 linux-next git:(heads/next-20240509) ✗ grep -nr DIV_ROUND_CLOSEST drivers | wc 12077461 111822 so I thought to align with existing naming convention assuming developers are already familiar with this. But if a wider consensus is to go with a newer naming convention then I am open to it, although a challenge there would be to keep it short. For e.g. this one is already 3 words, if we go with more explicit "round_closest_up_pow_2" it looks quite long in my opinion :) . Regards Devarsh
Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2
Hi Andy, Thanks for the quick review. On 10/05/24 20:31, Andy Shevchenko wrote: > On Fri, May 10, 2024 at 12:09:52AM +0530, Devarsh Thakkar wrote: >> Add macros to round to nearest specified power of 2. > > This is not what they are doing. For the above we already have macros defined. > Sorry I did not understand this comment, if you are talking about existing macros round_up & round_down they either round "up" and round "down" to specified power of 2 as specified here [1]. whereas the macros introduced in this patch round to "nearest" specified power of 2. >> Two macros are added : > > (Yes, after I wrapped to comment this line looks better on its own, > so whatever will be the first sentence, this line should be separated > from.) > Agreed. >> round_closest_up and round_closest_down which round up to nearest multiple > > round_closest_up() and round_closest_down() > > >> of 2 with a preference to round up or round down respectively if there are >> two possible nearest values to the given number. > > You should reformulate, because AFAICS there is the crucial difference > from these and existing round_*_pow_of_two(). > In math.h, we already have round_up/round_down macros which rounded up/down to next specified power of 2. Then we had the DIV_ROUND_CLOSEST which used the suffix _CLOSEST to imply the meaning that divided value will be rounded to nearest/closest int value either by rounding up or rounding down. So inspired from naming convention of this macros given developers are already familiar with them, I used round_closest for specifying the rounded value to nearest/closest value which can be achieved either by rounding up/down. And I also wanted user to have finer control for the scenarios where there are two possible nearest values : For e.g round(160, 64) can be either 192 and 128 and both are 32 away from 160. And that's the reason I went ahead with two macros instead i.e round_closest_up, round_closest_down so that developers can choose exactly what they want. Regards Devarsh
Re: [PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region
Hi Andy, Thanks for the quick review. On 10/05/24 20:40, Andy Shevchenko wrote: > On Fri, May 10, 2024 at 12:10:01AM +0530, Devarsh Thakkar wrote: >> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up >> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest >> multiple of requested value while updating the crop rectangle coordinates. >> >> Use the rounding macro which gives preference to rounding down in case two >> nearest values (high and low) are possible to raise the probability of >> cropping rectangle falling inside the bound region. > > This is arguable. How do we know that the bigger range is supported? > The safest side is to go smaller than bigger. > Yes and that's what the driver does when do when application passes V4L2_SEL_FLAG_LE while doing the selection. If application does not specify explicitly whether to round down or round up the cropping parameters requested by it (i.e app is neither passing V4L2_SEL_FLAG_LE nor V4L2_SEL_FLAG_GE flags), then it is preferred by driver to round the cropping parameters to nearest possible value by either rounding down or rounding up to align with hardware requirements. For e.g. If requested width for cropping region is 127 and HW requires width to be multiple of 64 then we would prefer to round it up to 128 rather than rounding down to a more distant value (i.e. 64), but if requested cropping width is 129 then we would prefer to instead round it down to 128. But if requested cropping width is 160 then there are two nearest possible values 160 - 32 = 128 and 160 + 32 = 192 and in which case we prefer the smaller value as you suggested and that's why the driver uses round_closest_down. For any reason, if still the cropping rectangle falls beyond the bound region, then driver will return out of range error (-ERANGE) to application. Regards Devarsh
[PATCH RESEND v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple
Use generic macro round_closest_up for rounding to nearest multiple instead of using local function. Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
[PATCH RESEND v7 7/8] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index 189e2a99c43d..abd66bc9b96c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH RESEND v7 6/8] math.h Add macros to round to closest specified power of 2
Add macros to round to nearest specified power of 2. Two macros are added : round_closest_up and round_closest_down which round up to nearest multiple of 2 with a preference to round up or round down respectively if there are two possible nearest values to the given number. This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which can be updated to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 36 1 file changed, 36 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..82ee3265c910 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,42 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round to nearest specified power of 2 with preference + * to rounding up + * @x: the value to round + * @y: multiple to round to (must be a power of 2) + * + * Rounds @x to nearest multiple of @y (which must be a power of 2). + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If there are two nearest possible values, + * then preference will be given to the greater value. + * + * Examples : + * round_closest_up(17, 4) = 16 + * round_closest_up(15, 4) = 16 + * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round to nearest specified power of 2 with preference + * to rounding down + * @x: the value to round + * @y: multiple to round down to (must be a power of 2) + * + * Rounds @x to nearest multiple of @y (which must be a power of 2). + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If there are two nearest possible values, + * then preference will be given to the lesser value. + * + * Examples : + * round_closest_down(17, 4) = 16 + * round_closest_down(15, 4) = 16 + * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ -- 2.39.1
[PATCH RESEND v7 0/8] Add V4L2 M2M Driver for E5010 JPEG Encoder
Resending this V7 series to have proper linking of patches in the series with cover-letter while doing git send-email. Original cover letter: This adds support for V4L2 M2M based driver for E5010 JPEG Encoder which is a stateful JPEG encoder from Imagination technologies and is present in TI AM62A SoC. While adding support for it, following additional framework changes were made: - Moved reference quantization and huffman tables provided in ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1]. - Add macros to round to closest integer (either higher or lower) while rounding in order of 2. v4l2-compliance test : Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49 E5010 JPEG Encoder Manual tests : Performance: Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b Functionality: Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce Compression Quality: Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655 Multi Instance: Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a Crop support: Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd Runtime PM: Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd [1]: https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/ Changelog: V6->V7: - Fix cropping support - Move reference huffman and quantization tables to v4l2-jpeg.c - Fix suspend/resume use-case - Add Reviewed-by V5->V6: - Fix sparse warnings V4->V5: - Sort the #includes in driver file alphabetically - Rename huffman and quantization tables to not use '_' - Add Reviewed-by tag V3->V4: - Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in dt-binding - Remove clock-names as only single clock in dt-binding - Fix issue with default params setting - Correct v4l2 error prints - Simplify register write functions with single statement return values - Remove unrequired error checks from get_queue() - Drop explicit device_caps setting as it is already taken care by v4l2 core - Remove unrequired multiplanar checks and memset from s_fmt, g_fmt callback functions - Fix try_fmt callback to not update the queues - Remove unrequired contiguous format attribute from queue_init - Use dynamic allocation for video_device and remove unrequired assignments in probe() - Remove unrequired checks from queue_setup function - Return queued buffers back if start_streaming fails - Use ARRAY_SIZE in place of hard-coding - Use huffman and quantization tables from reference header file V2->V3: - Add DONOTMERGE patches for dts and defconfig - Update driver with below changes : - Correct license headers - Use more generic name core instead of jasper for base registers - Add Comment for forward declarations - Simplify quantization table calculations - Use v4l2_apply_frmsize_constraints for updating framesize and remove unrequired functions - Place TODO at top of file and in commit message too - Use dev_err_probe helper in probe function - Fix return value checking for failure scenarios in probe function - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers - Fix unexpected indentation - Correct commit message - Update dt-bindings with below changes : - Add vendor specific compatible - Fix commit title and message - Update reg names - Update clocks to 1 - Fix dts example with proper naming V1->V2: - Send dt-bindings and driver together Patch-Diff between the series : V3->V4 Range diff : https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272 V4->V5 Range diff : https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b V5->V6 Range diff : https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19 V6->V7 Range diff : https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e Previous patch series: V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/ V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/ V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/ V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/ V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/ Devarsh Thakkar (8): media: dt-bindings: Add Imagination E5010 JPEG Encoder media: imagination: Add E5010 JPEG Encoder driver media: v4l2-jpeg: Export reference quantization and huffman tables media: imagination: Use exported tables from v4l2-jpeg core media: verisilcon : Use exported tables from v4l2-jpeg for hantro codec math.h Add macros to round to closest specified power of 2 media: imagination: Round to closest multiple for cropping region gpu: ipu-v3: Use generic macro for rounding to nearest multiple .../bindings/media/img,e5010-jpeg-enc.yaml| 75 + MAINTAINERS |7 +
[PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest multiple of requested value while updating the crop rectangle coordinates. Use the rounding macro which gives preference to rounding down in case two nearest values (high and low) are possible to raise the probability of cropping rectangle falling inside the bound region. Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c b/drivers/media/platform/imagination/e5010-jpeg-enc.c index 189e2a99c43d..abd66bc9b96c 100644 --- a/drivers/media/platform/imagination/e5010-jpeg-enc.c +++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c @@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, struct v4l2_selection switch (s->flags) { case 0: - s->r.width = round_down(s->r.width, queue->fmt->frmsize.step_width); - s->r.height = round_down(s->r.height, queue->fmt->frmsize.step_height); - s->r.left = round_down(s->r.left, queue->fmt->frmsize.step_width); - s->r.top = round_down(s->r.top, 2); + s->r.width = round_closest_down(s->r.width, queue->fmt->frmsize.step_width); + s->r.height = round_closest_down(s->r.height, queue->fmt->frmsize.step_height); + s->r.left = round_closest_down(s->r.left, queue->fmt->frmsize.step_width); + s->r.top = round_closest_down(s->r.top, 2); if (s->r.left + s->r.width > queue->width) s->r.width = round_down(s->r.width + s->r.left - queue->width, -- 2.39.1
[PATCH v7 6/8] math.h Add macros to round to closest specified power of 2
Add macros to round to nearest specified power of 2. Two macros are added : round_closest_up and round_closest_down which round up to nearest multiple of 2 with a preference to round up or round down respectively if there are two possible nearest values to the given number. This patch is inspired from the Mentor Graphics IPU driver [1] which uses similar macro locally and which can be updated to use this generic macro instead along with other drivers having similar requirements. [1]: https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480 Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- include/linux/math.h | 36 1 file changed, 36 insertions(+) diff --git a/include/linux/math.h b/include/linux/math.h index dd4152711de7..82ee3265c910 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -34,6 +34,42 @@ */ #define round_down(x, y) ((x) & ~__round_mask(x, y)) +/** + * round_closest_up - round to nearest specified power of 2 with preference + * to rounding up + * @x: the value to round + * @y: multiple to round to (must be a power of 2) + * + * Rounds @x to nearest multiple of @y (which must be a power of 2). + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If there are two nearest possible values, + * then preference will be given to the greater value. + * + * Examples : + * round_closest_up(17, 4) = 16 + * round_closest_up(15, 4) = 16 + * round_closest_up(14, 4) = 16 + */ +#define round_closest_up(x, y) round_down((x) + (y) / 2, (y)) + +/** + * round_closest_down - round to nearest specified power of 2 with preference + * to rounding down + * @x: the value to round + * @y: multiple to round down to (must be a power of 2) + * + * Rounds @x to nearest multiple of @y (which must be a power of 2). + * The rounded value can be greater than or less than @x depending + * upon it's nearness to @x. If there are two nearest possible values, + * then preference will be given to the lesser value. + * + * Examples : + * round_closest_down(17, 4) = 16 + * round_closest_down(15, 4) = 16 + * round_closest_down(14, 4) = 12 + */ +#define round_closest_down(x, y) round_up((x) - (y) / 2, (y)) + #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP #define DIV_ROUND_DOWN_ULL(ll, d) \ -- 2.39.1
[PATCH v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple
Use generic macro round_closest_up for rounding to nearest multiple instead of using local function. Signed-off-by: Devarsh Thakkar --- V1->V6 (No change, patch introduced in V7) --- drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index 841316582ea9..5192a8b5c02c 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct ipu_image_convert_ctx *ctx, return 0; } -#define round_closest(x, y) round_down((x) + (y)/2, (y)) - /* * Find the best aligned seam position for the given column / row index. * Rotation and image offsets are out of scope. @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, * The closest input sample position that we could actually * start the input tile at, 19.13 fixed point. */ - in_pos_aligned = round_closest(in_pos, 8192U * in_align); + in_pos_aligned = round_closest_up(in_pos, 8192U * in_align); /* Convert 19.13 fixed point to integer */ in_pos_rounded = in_pos_aligned / 8192U; -- 2.39.1
Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing
Hi Maxime, On 14/03/24 20:04, Maxime Ripard wrote: > Hi, > > On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote: >> On 13/02/24 19:34, Maxime Ripard wrote: >>> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote: >>>> On 26/01/24 17:45, Maxime Ripard wrote: >>>>> Hi, >>>>> >>>>> Thanks a lot for working on that. >>>>> >>>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote: >>>>>> Display subsystem present in TI Keystone family of devices supports >>>>>> sharing >>>>>> of display between multiple hosts as it provides separate register space >>>>>> (common* region) for each host to programming display controller and >>>>>> also a >>>>>> unique interrupt line for each host. >>>>>> >>>>>> This adds support for display sharing, by allowing partitioning of >>>>>> resources either at video port level or at video plane level as >>>>>> described below : >>>>>> >>>>>> 1) Linux can own (i.e have write access) completely one or more of video >>>>>> ports along with corresponding resources (viz. overlay managers, >>>>>> video planes) used by Linux in context of those video ports. >>>>>> Even if Linux is owning >>>>>> these video ports it can still share this video port with a remote core >>>>>> which can own one or more video planes associated with this video port. >>>>>> >>>>>> 2) Linux owns one or more of the video planes with video port >>>>>> (along with corresponding overlay manager) associated with these planes >>>>>> being owned and controlled by a remote core. Linux still has read-only >>>>>> access to the associated video port and overlay managers so that it can >>>>>> parse the settings made by remote core. >>>>> >>>>> So, just to make sure we're on the same page. 1) means Linux drives the >>>>> whole display engine, but can lend planes to the R5? How does that work, >>>>> is Linux aware of the workload being there (plane size, format, etc) ? >>>>> >>>> >>>> Well, there is no dynamic procedure being followed for lending. The >>>> partitioning scheme is decided and known before hand, and the remote >>>> core firmware updated and compiled accordingly, and similarly the >>>> device-tree overlay for Linux is also updated with partitioning >>>> information before bootup. >>>> >>>> What would happen here is that Linux will know before-hand this >>>> partitioning information via device-tree properties and won't enumerate >>>> the plane owned by RTOS, but it will enumerate the rest of the display >>>> components and initialize the DSS, after which user can load the DSS >>>> firmware on remote core and this firmware will only have control of >>>> plane as it was compiled with that configuration. >>> >>> Right. If the RTOS is in control of a single plane, how it is expected >>> to deal with Linux shutting the CRTC down, or enforcing a configuration >>> that isn't compatible with what the RTOS expects (like a plane with a >>> higher zpos masking its plane), what is the mechanism to reconcile it? >>> >> >> Just for the note, for this "RTOS control single plane" mode, we don't have a >> firmware available to test (right now we are only supporting example for >> "RTOS >> controlling the display mode" as shared here [1]) and hence this is not >> validated but the idea was to keep dt-bindings generic enough to support them >> in future and that's why I referred to it here. >> >> Coming back to your questions, with the current scheme the Linux (tidss) >> would >> be expected to make sure the CRTC being shared with RTOS is never shutdown >> and >> the RTOS plane should never gets masked. > > I'm probably missing something then here, but if the Linux side of > things is expected to keep the current configuration and keep it active > for it to work, what use-case would it be useful for? > It's just one of the partitioning possibilities that I mentioned here, that Linux is in control of DSS as a whole and the user want the other host (be it RTOS or any other core) to control a single plane. For e.g it could be Linux (with GPU rend
[PATCH v5 3/4] arm64: dts: ti: Add common1 register space for AM62x SoC
This adds common1 register space for AM62x SoC which is using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS") Signed-off-by: Devarsh Thakkar --- V1->V4 : - No change (this was part of "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs ) V5 : - Split this as a separate patch from "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs" - Remove Reviewed-By tag as patch is split now --- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi index fe0cc4a9a501..8cee4d94cdd3 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi @@ -779,9 +779,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&dss_vp1_clk>, -- 2.34.1
[PATCH v5 1/4] dt-bindings: display: ti, am65x-dss: Add support for common1 region
TI keystone display subsystem present in AM65, AM62 and AM62A SoC support two separate register spaces namely "common" and "common1" which can be used by two separate hosts to program the display controller as described in respective Technical Reference Manuals [1]. The common1 register space has similar set of configuration registers as supported in common register space except the global configuration registers which are exclusive to common region. This adds binding for "common1" register region too as supported by the hardware. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) AM62A TRM: https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9 Display Subsystem Registers) Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml binding") Signed-off-by: Devarsh Thakkar Reviewed-by: Aradhya Bhatia Acked-by: Conor Dooley --- V2: Add Acked-by tag V3: Add Fixes tag V4: Add Reviewed-by and AM62A TRM link V5: No change --- .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index b6767ef0d24d..55e3e490d0e6 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -37,6 +37,7 @@ properties: - description: OVR2 overlay manager for vp2 - description: VP1 video port 1 - description: VP2 video port 2 + - description: common1 DSS register area reg-names: items: @@ -47,6 +48,7 @@ properties: - const: ovr2 - const: vp1 - const: vp2 + - const: common1 clocks: items: @@ -147,9 +149,10 @@ examples: <0x04a07000 0x1000>, /* ovr1 */ <0x04a08000 0x1000>, /* ovr2 */ <0x04a0a000 0x1000>, /* vp1 */ -<0x04a0b000 0x1000>; /* vp2 */ +<0x04a0b000 0x1000>, /* vp2 */ +<0x04a01000 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", -"ovr1", "ovr2", "vp1", "vp2"; +"ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; clocks =<&k3_clks 67 1>, -- 2.34.1
[PATCH v5 2/4] arm64: dts: ti: Add common1 register space for AM65x SoC
This adds common1 register space for AM65x SoC which is using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node") Signed-off-by: Devarsh Thakkar --- V1->V4 : - No change (this was part of "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs ) V5 : - Split this as a separate patch from "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs" - Remove Reviewed-By tag as patch is split now --- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi index 07010d31350e..ff857117d719 100644 --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi @@ -991,9 +991,10 @@ dss: dss@4a0 { <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; -- 2.34.1
[PATCH v5 0/4] Add common1 region for AM62, AM62A & AM65x
This adds DSS common1 region for respective SoCs supporting it. Changelog: V2 : Remove do-not-merge tag and add am62a dss common1 reion V3 : Add Fixes tag to each commit V4 : Add Reviewed-by tag and AM62A SoC TRM Link V5 : Split dts patch to separate patches for each SoC Devarsh Thakkar (4): dt-bindings: display: ti,am65x-dss: Add support for common1 region arm64: dts: ti: Add common1 register space for AM65x SoC arm64: dts: ti: Add common1 register space for AM62x SoC arm64: dts: ti: Add common1 register space for AM62A SoC .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 4 files changed, 14 insertions(+), 8 deletions(-) -- 2.34.1
[PATCH v5 4/4] arm64: dts: ti: Add common1 register space for AM62A SoC
This adds common1 register space for AM62A SoC which is using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display SubSystem (DSS)") Signed-off-by: Devarsh Thakkar --- V1->V4 : - No change (this was part of "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs ) V5 : - Split this as a separate patch from "arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs" - Remove Reviewed-By tag as this is split from older revision --- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi index 2749533cf4fd..dbdd6bec4667 100644 --- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi @@ -994,9 +994,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the SoC */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&k3_clks 186 0>, -- 2.34.1
[PATCH v4 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region
TI keystone display subsystem present in AM65 and other SoCs such as AM62 support two separate register spaces namely "common" and "common1" which can be used by two separate hosts to program the display controller as described in respective Technical Reference Manuals [1]. The common1 register space has similar set of configuration registers as supported in common register space except the global configuration registers which are exclusive to common region. This adds binding for "common1" register region too as supported by the hardware. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) AM62A TRM: https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9 Display Subsystem Registers) Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml binding") Signed-off-by: Devarsh Thakkar Reviewed-by: Aradhya Bhatia Acked-by: Conor Dooley --- V2: Add Acked-by tag V3: Add Fixes tag V4: Add Reviewed-by and AM62A TRM link --- .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index b6767ef0d24d..55e3e490d0e6 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -37,6 +37,7 @@ properties: - description: OVR2 overlay manager for vp2 - description: VP1 video port 1 - description: VP2 video port 2 + - description: common1 DSS register area reg-names: items: @@ -47,6 +48,7 @@ properties: - const: ovr2 - const: vp1 - const: vp2 + - const: common1 clocks: items: @@ -147,9 +149,10 @@ examples: <0x04a07000 0x1000>, /* ovr1 */ <0x04a08000 0x1000>, /* ovr2 */ <0x04a0a000 0x1000>, /* vp1 */ -<0x04a0b000 0x1000>; /* vp2 */ +<0x04a0b000 0x1000>, /* vp2 */ +<0x04a01000 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", -"ovr1", "ovr2", "vp1", "vp2"; +"ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; clocks =<&k3_clks 67 1>, -- 2.34.1
[PATCH v4 0/2] Add common1 region for AM62, AM62A & AM65x
This adds DSS common1 region for respective SoCs supporting it. Changelog: V2 : Remove do-not-merge tag and add am62a dss common1 reion V3 : Add Fixes tag to each commit V4 : Add Reviewed-by tag and AM62A SoC TRM Link Devarsh Thakkar (2): dt-bindings: display: ti,am65x-dss: Add support for common1 region arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 4 files changed, 14 insertions(+), 8 deletions(-) -- 2.34.1
[PATCH v4 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display SubSystem (DSS)") Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS") Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node") Signed-off-by: Devarsh Thakkar Reviewed-by: Aradhya Bhatia --- V2: Add common1 region for AM62A SoC too V3: Add Fixes tag V4: Add Reviewed-by --- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi index fe0cc4a9a501..8cee4d94cdd3 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi @@ -779,9 +779,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&dss_vp1_clk>, diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi index 972971159a62..f475daea548e 100644 --- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi @@ -994,9 +994,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the SoC */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&k3_clks 186 0>, diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi index 07010d31350e..ff857117d719 100644 --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi @@ -991,9 +991,10 @@ dss: dss@4a0 { <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; -- 2.34.1
[PATCH v3 0/2] Add common1 region for AM62, AM62A & AM65x
This adds DSS common1 region for respective SoCs supporting it. Changelog: V2 : Remove do-not-merge tag and add am62a dss common1 reion V3 : Add Fixes tag to each commit Devarsh Thakkar (2): dt-bindings: display: ti,am65x-dss: Add support for common1 region arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 4 files changed, 14 insertions(+), 8 deletions(-) -- 2.34.1
[PATCH v3 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region
TI keystone display subsystem present in AM65 and other SoCs such as AM62 support two separate register spaces namely "common" and "common1" which can be used by two separate hosts to program the display controller as described in respective Technical Reference Manuals [1]. The common1 register space has similar set of configuration registers as supported in common register space except the global configuration registers which are exclusive to common region. This adds binding for "common1" register region too as supported by the hardware. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml binding") Signed-off-by: Devarsh Thakkar Acked-by: Conor Dooley --- V2: Add Acked-by tag V3: Add Fixes tag --- .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index b6767ef0d24d..55e3e490d0e6 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -37,6 +37,7 @@ properties: - description: OVR2 overlay manager for vp2 - description: VP1 video port 1 - description: VP2 video port 2 + - description: common1 DSS register area reg-names: items: @@ -47,6 +48,7 @@ properties: - const: ovr2 - const: vp1 - const: vp2 + - const: common1 clocks: items: @@ -147,9 +149,10 @@ examples: <0x04a07000 0x1000>, /* ovr1 */ <0x04a08000 0x1000>, /* ovr2 */ <0x04a0a000 0x1000>, /* vp1 */ -<0x04a0b000 0x1000>; /* vp2 */ +<0x04a0b000 0x1000>, /* vp2 */ +<0x04a01000 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", -"ovr1", "ovr2", "vp1", "vp2"; +"ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; clocks =<&k3_clks 67 1>, -- 2.34.1
[PATCH v3 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display SubSystem (DSS)") Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS") Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node") Signed-off-by: Devarsh Thakkar --- V2: Add common1 region for AM62A SoC too V3: Add Fixes tag --- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi index fe0cc4a9a501..8cee4d94cdd3 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi @@ -779,9 +779,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&dss_vp1_clk>, diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi index 972971159a62..f475daea548e 100644 --- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi @@ -994,9 +994,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the SoC */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&k3_clks 186 0>, diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi index 07010d31350e..ff857117d719 100644 --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi @@ -991,9 +991,10 @@ dss: dss@4a0 { <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; -- 2.34.1
Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing
Hi Maxime, Thanks for the quick reply. On 13/02/24 19:34, Maxime Ripard wrote: > Hi Devarsh, > > On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote: >> Hi Maxime, >> >> Thanks a lot for checking on this. >> >> On 26/01/24 17:45, Maxime Ripard wrote: >>> Hi, >>> >>> Thanks a lot for working on that. >>> >>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote: >>>> Display subsystem present in TI Keystone family of devices supports sharing >>>> of display between multiple hosts as it provides separate register space >>>> (common* region) for each host to programming display controller and also a >>>> unique interrupt line for each host. >>>> >>>> This adds support for display sharing, by allowing partitioning of >>>> resources either at video port level or at video plane level as >>>> described below : >>>> >>>> 1) Linux can own (i.e have write access) completely one or more of video >>>> ports along with corresponding resources (viz. overlay managers, >>>> video planes) used by Linux in context of those video ports. >>>> Even if Linux is owning >>>> these video ports it can still share this video port with a remote core >>>> which can own one or more video planes associated with this video port. >>>> >>>> 2) Linux owns one or more of the video planes with video port >>>> (along with corresponding overlay manager) associated with these planes >>>> being owned and controlled by a remote core. Linux still has read-only >>>> access to the associated video port and overlay managers so that it can >>>> parse the settings made by remote core. >>> >>> So, just to make sure we're on the same page. 1) means Linux drives the >>> whole display engine, but can lend planes to the R5? How does that work, >>> is Linux aware of the workload being there (plane size, format, etc) ? >>> >> >> Well, there is no dynamic procedure being followed for lending. The >> partitioning scheme is decided and known before hand, and the remote >> core firmware updated and compiled accordingly, and similarly the >> device-tree overlay for Linux is also updated with partitioning >> information before bootup. >> >> What would happen here is that Linux will know before-hand this >> partitioning information via device-tree properties and won't enumerate >> the plane owned by RTOS, but it will enumerate the rest of the display >> components and initialize the DSS, after which user can load the DSS >> firmware on remote core and this firmware will only have control of >> plane as it was compiled with that configuration. > > Right. If the RTOS is in control of a single plane, how it is expected > to deal with Linux shutting the CRTC down, or enforcing a configuration > that isn't compatible with what the RTOS expects (like a plane with a > higher zpos masking its plane), what is the mechanism to reconcile it? > Just for the note, for this "RTOS control single plane" mode, we don't have a firmware available to test (right now we are only supporting example for "RTOS controlling the display mode" as shared here [1]) and hence this is not validated but the idea was to keep dt-bindings generic enough to support them in future and that's why I referred to it here. Coming back to your questions, with the current scheme the Linux (tidss) would be expected to make sure the CRTC being shared with RTOS is never shutdown and the RTOS plane should never gets masked. I think the IPC based scheme would have been mainly needed for the case where you have a single entity controlling the display for e.g you have a single display controller register space and a single IRQ but you have multiple planes and say you want to divide these planes to different host processors. In that case you want a single entity to act as a main entity and be in control of DSS and rest of the processors communicate with the "main entity" to request display resources and plane updates and main entity also programs dss on their behalf. But unlike above, TI DSS7 is designed to support static partitioning of display resources among multiple hosts, where each host can program the display hardware independently using separate register space and having a separate irq and without requirement of any communication between the hosts. Now as this feature is unique to TI DSS7 we want to support this feature in tidss driver. The DSS resource partitioning feature is described in detail here [2] >>> And 2) would mean that the display e
[PATCH v2 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml Signed-off-by: Devarsh Thakkar --- V2: Add common1 region for AM62A SoC too --- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi index fe0cc4a9a501..8cee4d94cdd3 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi @@ -779,9 +779,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&dss_vp1_clk>, diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi index 972971159a62..f475daea548e 100644 --- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi @@ -994,9 +994,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the SoC */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&k3_clks 186 0>, diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi index 07010d31350e..ff857117d719 100644 --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi @@ -991,9 +991,10 @@ dss: dss@4a0 { <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; -- 2.34.1
[PATCH v2 0/2] Add common1 region for AM62, AM62A & AM65x
This adds DSS common1 region for respective SoCs supporting it. Devarsh Thakkar (2): dt-bindings: display: ti,am65x-dss: Add support for common1 region arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 4 files changed, 14 insertions(+), 8 deletions(-) -- 2.34.1
[PATCH v2 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region
TI keystone display subsystem present in AM65 and other SoCs such as AM62 support two separate register spaces namely "common" and "common1" which can be used by two separate hosts to program the display controller as described in respective Technical Reference Manuals [1]. The common1 register space has similar set of configuration registers as supported in common register space except the global configuration registers which are exclusive to common region. This adds binding for "common1" register region too as supported by the hardware. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) AM62A TRM: https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9.1 DSS Registers) Signed-off-by: Devarsh Thakkar Acked-by: Conor Dooley --- V2: Add Acked-by tag --- .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index b6767ef0d24d..55e3e490d0e6 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -37,6 +37,7 @@ properties: - description: OVR2 overlay manager for vp2 - description: VP1 video port 1 - description: VP2 video port 2 + - description: common1 DSS register area reg-names: items: @@ -47,6 +48,7 @@ properties: - const: ovr2 - const: vp1 - const: vp2 + - const: common1 clocks: items: @@ -147,9 +149,10 @@ examples: <0x04a07000 0x1000>, /* ovr1 */ <0x04a08000 0x1000>, /* ovr2 */ <0x04a0a000 0x1000>, /* vp1 */ -<0x04a0b000 0x1000>; /* vp2 */ +<0x04a0b000 0x1000>, /* vp2 */ +<0x04a01000 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", -"ovr1", "ovr2", "vp1", "vp2"; +"ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; clocks =<&k3_clks 67 1>, -- 2.34.1
Re: [PATCH 1/2] dt-bindings: display: ti,am65x-dss: Add support for common1 region
Hi Tomi, Vignesh, On 14/02/24 14:53, Tomi Valkeinen wrote: > On 14/02/2024 11:10, Tomi Valkeinen wrote: >> Hi, >> >> On 15/01/2024 14:57, Devarsh Thakkar wrote: >>> TI keystone display subsystem present in AM65 and other SoCs such as AM62 >>> support two separate register spaces namely "common" and "common1" which >>> can be used by two separate hosts to program the display controller as >>> described in respective Technical Reference Manuals [1]. >>> >>> The common1 register space has similar set of configuration registers as >>> supported in common register space except the global configuration >>> registers which are exclusive to common region. >>> >>> This adds binding for "common1" register region too as supported by the >>> hardware. >>> >>> [1]: >>> AM62x TRM: >>> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) >>> >>> AM65x TRM: >>> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) >>> >>> Signed-off-by: Devarsh Thakkar >>> --- >>> .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- >>> 1 file changed, 5 insertions(+), 2 deletions(-) >>> >>> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>> index b6767ef0d24d..55e3e490d0e6 100644 >>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>> @@ -37,6 +37,7 @@ properties: >>> - description: OVR2 overlay manager for vp2 >>> - description: VP1 video port 1 >>> - description: VP2 video port 2 >>> + - description: common1 DSS register area >>> reg-names: >>> items: >>> @@ -47,6 +48,7 @@ properties: >>> - const: ovr2 >>> - const: vp1 >>> - const: vp2 >>> + - const: common1 >>> clocks: >>> items: >>> @@ -147,9 +149,10 @@ examples: >>> <0x04a07000 0x1000>, /* ovr1 */ >>> <0x04a08000 0x1000>, /* ovr2 */ >>> <0x04a0a000 0x1000>, /* vp1 */ >>> - <0x04a0b000 0x1000>; /* vp2 */ >>> + <0x04a0b000 0x1000>, /* vp2 */ >>> + <0x04a01000 0x1000>; /* common1 */ >>> reg-names = "common", "vidl1", "vid", >>> - "ovr1", "ovr2", "vp1", "vp2"; >>> + "ovr1", "ovr2", "vp1", "vp2", "common1"; >>> ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; >>> power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; >>> clocks = <&k3_clks 67 1>, >> >> Looks fine to me, I'll apply to drm-misc-next. > > Hmm, now thinking about this, doesn't this cause dtb checks to start failing, > as the dtbs are missing one entry? Is it better to merge these kind of changes > with the dts changes? Or does it matter? > Yes if one get's applied and other doesn't then there will be such issues. I am sending shortly both the dt-binding and device-tree patches together, as long as both look fine and ready to be accepted by respective maintainers, I think both can get merged to respective trees and land in linux-next without causing any issues. Regards Devarsh > Tomi >
Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing
Hi Maxime, Thanks a lot for checking on this. On 26/01/24 17:45, Maxime Ripard wrote: > Hi, > > Thanks a lot for working on that. > > On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote: >> Display subsystem present in TI Keystone family of devices supports sharing >> of display between multiple hosts as it provides separate register space >> (common* region) for each host to programming display controller and also a >> unique interrupt line for each host. >> >> This adds support for display sharing, by allowing partitioning of >> resources either at video port level or at video plane level as >> described below : >> >> 1) Linux can own (i.e have write access) completely one or more of video >> ports along with corresponding resources (viz. overlay managers, >> video planes) used by Linux in context of those video ports. >> Even if Linux is owning >> these video ports it can still share this video port with a remote core >> which can own one or more video planes associated with this video port. >> >> 2) Linux owns one or more of the video planes with video port >> (along with corresponding overlay manager) associated with these planes >> being owned and controlled by a remote core. Linux still has read-only >> access to the associated video port and overlay managers so that it can >> parse the settings made by remote core. > > So, just to make sure we're on the same page. 1) means Linux drives the > whole display engine, but can lend planes to the R5? How does that work, > is Linux aware of the workload being there (plane size, format, etc) ? > Well, there is no dynamic procedure being followed for lending. The partitioning scheme is decided and known before hand, and the remote core firmware updated and compiled accordingly, and similarly the device-tree overlay for Linux is also updated with partitioning information before bootup. What would happen here is that Linux will know before-hand this partitioning information via device-tree properties and won't enumerate the plane owned by RTOS, but it will enumerate the rest of the display components and initialize the DSS, after which user can load the DSS firmware on remote core and this firmware will only have control of plane as it was compiled with that configuration. > And 2) would mean that the display engine is under the R5 control and > Linux only gets to fill the plane and let the firmware know of what it > wants? > Here too the partitioning information is pre-decided and remote core firmware and device-tree overlay for Linux updated accordingly. But in this case as remote core firmware owns the display (minus the plane owned by Linux) it is started and initialized during the bootloader phase itself where it initializes the DSS and starts rendering using the plane owned by it and Linux just latches to the DSS without re-initializing it, with write access only to the plane that is owned by Linux. You can refer [1] for more details on this. > If so, do we even need the tidss driver in the second case? We could > just write a fwkms driver of some sorts that could be used by multiple > implementations of the same "defer to firmware" logic. > This feature of static partitioning of DSS resources is specific to DSS7 hardware (which is controlled by tidss driver) which supports dedicated register space and interrupt line for each of the hosts [0], so that multiple hosts can drive the display controller simultaneously as per the desired static partitioning of resources, and so I don't think a separate driver is required here and tidss seems the right place to support this, where using this device-tree approach different resource partitioning schemas can be achieved as described here [1]. This was also aligned with Tomi too where we discussed that tidss is the right place to support this as we are simply leveraging the DSS hardware capabilities of static partitioning here. >> For both the cases, the resources used in context of processing core >> running Linux along with ownership information are exposed by user as >> part of device-tree blob and driver uses an updated feature list tailored >> for this shared mode accordingly. The driver also auto-populates >> matching overlay managers and output types from shared video >> port list provided in device-tree blob. >> In dispc_feature struct remove const access specfier for output_type >> array as it is required to be updated dynamically in run-time for shared >> mode. > > I'm also not entirely sure that the device tree is the right path there. > Surely the firmware capabilities will evolve over time, while the device > tree won't. Is there some way to make it discoverable at probe time by &g
Re: [RFC PATCH 3/3] arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing mode
Hi Tomi, Thanks for the review. On 23/01/24 13:59, Tomi Valkeinen wrote: > Hi, > > On 16/01/2024 15:41, Devarsh Thakkar wrote: >> This overlay needs to be used with display sharing supported device >> manager firmware only. >> >> Remote core running this firmware has write access to "common" register >> space, VIDL pipeline, OVR1 overlay and VP1 videoport. >> >> The processing core running Linux is provided write access to VID >> pipeline and "common1" register space. >> >> The VP1 video port is shared between Linux and remote core with remote >> core configuring the overlay manager to set Zorder 1 for VID pipeline >> and Zorder 2 for VIDL pipeline. >> >> Add reserved memory region for framebuffer region used by remote core in >> dss shared mode overlay file so that Linux does not re-use the same >> while allocating memory. > > I don't understand this one. Why is RAM used by RTOS accessible by Linux > in the first place? > Well, I think the R5 SPL initializes full DDR before starting firmwares on remote cores and the regions used by this remote cores be it for IPC or Code/data are marked as reserved both in Linux as well as U-boot so that Linux/U-boot does not use it [1]. Same scheme is followed here w.r. t RTOS framebuffer too. [1] : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?h=v6.7#n63 Regards Devarsh > Tomi > >> Also add a label for reserved memory region in base device-tree file so >> that it can be referred back in overlay file. >> >> Signed-off-by: Devarsh Thakkar >> --- >> arch/arm64/boot/dts/ti/Makefile | 1 + >> .../arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 2 +- >> .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso | 33 +++ >> 3 files changed, 35 insertions(+), 1 deletion(-) >> create mode 100644 >> arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso >> >> diff --git a/arch/arm64/boot/dts/ti/Makefile >> b/arch/arm64/boot/dts/ti/Makefile >> index 52c1dc910308..ff832741b367 100644 >> --- a/arch/arm64/boot/dts/ti/Makefile >> +++ b/arch/arm64/boot/dts/ti/Makefile >> @@ -35,6 +35,7 @@ dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-ov5640.dtbo >> dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-tevi-ov5640.dtbo >> dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-imx219.dtbo >> dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-hdmi-audio.dtbo >> +dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-dss-shared-mode.dtbo >> # Boards with AM64x SoC >> dtb-$(CONFIG_ARCH_K3) += k3-am642-evm.dtb >> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi >> b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi >> index 33768c02d8eb..8b55ca10102f 100644 >> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi >> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi >> @@ -34,7 +34,7 @@ memory@8000 { >> reg = <0x 0x8000 0x 0x8000>; >> }; >> - reserved-memory { >> + reserved_memory: reserved-memory { >> #address-cells = <2>; >> #size-cells = <2>; >> ranges; >> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso >> b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso >> new file mode 100644 >> index ..02153748a5c2 >> --- /dev/null >> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso >> @@ -0,0 +1,33 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> +/** >> + * DT overlay to enable display sharing mode for AM62P DSS0 >> + * This is compatible with custom AM62 Device Manager firmware >> + * >> + * Copyright (C) 2023 Texas Instruments Incorporated - >> http://www.ti.com/ >> + */ >> + >> +/dts-v1/; >> +/plugin/; >> + >> +#include >> +#include >> + >> +&dss0 { >> + ti,dss-shared-mode; >> + ti,dss-shared-mode-vp = "vp1"; >> + ti,dss-shared-mode-vp-owned = <0>; >> + ti,dss-shared-mode-common = "common1"; >> + ti,dss-shared-mode-planes = "vid"; >> + ti,dss-shared-mode-plane-zorder = <0>; >> + interrupts = ; >> +}; >> + >> +&reserved_memory { >> + #address-cells = <2>; >> + #size-cells = <2>; >> + rtos_framebuffer_memory_region: rtos-framebuffer-memory@9480 { >> + compatible = "shared-dma-pool"; >> + reg = <0x00 0x9480 0x00 0x0800>; >> + no-map; >> + }; >> +}; > >
Re: [RFC PATCH 1/3] dt-bindings: display: ti,am65x-dss: Add support for display sharing mode
Hi Rob, Thanks for the review. On 19/01/24 19:25, Rob Herring wrote: > On Thu, Jan 18, 2024 at 7:59 AM Devarsh Thakkar wrote: >> >> Hi Rob, >> >> Thanks for the quick review. >> >> On 18/01/24 01:43, Rob Herring wrote: >>> On Tue, Jan 16, 2024 at 07:11:40PM +0530, Devarsh Thakkar wrote: >>>> Add support for using TI Keystone DSS hardware present in display >>>> sharing mode. >>>> >>>> TI Keystone DSS hardware supports partitioning of resources between >>>> multiple hosts as it provides separate register space and unique >>>> interrupt line to each host. >>>> >>>> The DSS hardware can be used in shared mode in such a way that one or >>>> more of video planes can be owned by Linux wherease other planes can be >>>> owned by remote cores. >>>> >>>> One or more of the video ports can be dedicated exclusively to a >>>> processing core, wherease some of the video ports can be shared between >>>> two hosts too with only one of them having write access. >>>> >>>> Signed-off-by: Devarsh Thakkar >>>> --- >>>> .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++ >>>> 1 file changed, 82 insertions(+) >>>> >>>> diff --git >>>> a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>>> index 55e3e490d0e6..d9bc69fbf1fb 100644 >>>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >>>> @@ -112,6 +112,86 @@ properties: >>>>Input memory (from main memory to dispc) bandwidth limit in >>>>bytes per second >>>> >>>> + ti,dss-shared-mode: >>>> +type: boolean >>>> +description: >>>> + TI DSS7 supports sharing of display between multiple hosts >>>> + as it provides separate register space for display configuration and >>>> + unique interrupt line to each host. >>> >>> If you care about line breaks, you need '|'. >>> >> >> Noted. >> >>>> + One of the host is provided access to the global display >>>> + configuration labelled as "common" region of DSS allows that host >>>> + exclusive access to global registers of DSS while other host can >>>> + configure the display for it's usage using a separate register >>>> + space labelled as "common1". >>>> + The DSS resources can be partitioned in such a way that one or more >>>> + of the video planes are owned by Linux whereas other video planes >>> >>> Your h/w can only run Linux? >>> >>> What if you want to use this same binding to define the configuration to >>> the 'remote processor'? You can easily s/Linux/the OS/, but it all >>> should be reworded to describe things in terms of the local processor. >>> >> >> It can run both Linux and RTOS or for that matter any other OS too. But yes I >> got your point, will reword accordingly. >> >>>> + can be owned by a remote core. >>>> + The video port controlling these planes acts as a shared video port >>>> + and it can be configured with write access either by Linux or the >>>> + remote core in which case Linux only has read-only access to that >>>> + video port. >>> >>> What is the purpose of this property when all the other properties are >>> required? >>> >> >> The ti,dss-shared-mode and below group of properties are optional. But >> if ti,dss-shared-mode is set then only driver should parse below set of >> properties. > > Let me re-phrase. Drop this property. It serves no purpose since all > the properties have to be present anyway. > Noted. >>>> + ti,dss-shared-mode-planes: >>>> +description: >>>> + The video layer that is owned by processing core running Linux. >>>> + The display driver running from Linux has exclusive write access to >>>> + this video layer. >>>> +$ref: /schemas/types.yaml#/definitions/string >>>> +enum: [vidl, vid] >>>> + >>>> + ti,dss-shared-mode-vp: >>>> +description: >>>> + The video port that
Re: [RFC PATCH 1/3] dt-bindings: display: ti,am65x-dss: Add support for display sharing mode
Hi Rob, Thanks for the quick review. On 18/01/24 01:43, Rob Herring wrote: > On Tue, Jan 16, 2024 at 07:11:40PM +0530, Devarsh Thakkar wrote: >> Add support for using TI Keystone DSS hardware present in display >> sharing mode. >> >> TI Keystone DSS hardware supports partitioning of resources between >> multiple hosts as it provides separate register space and unique >> interrupt line to each host. >> >> The DSS hardware can be used in shared mode in such a way that one or >> more of video planes can be owned by Linux wherease other planes can be >> owned by remote cores. >> >> One or more of the video ports can be dedicated exclusively to a >> processing core, wherease some of the video ports can be shared between >> two hosts too with only one of them having write access. >> >> Signed-off-by: Devarsh Thakkar >> --- >> .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++ >> 1 file changed, 82 insertions(+) >> >> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> index 55e3e490d0e6..d9bc69fbf1fb 100644 >> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> @@ -112,6 +112,86 @@ properties: >>Input memory (from main memory to dispc) bandwidth limit in >>bytes per second >> >> + ti,dss-shared-mode: >> +type: boolean >> +description: >> + TI DSS7 supports sharing of display between multiple hosts >> + as it provides separate register space for display configuration and >> + unique interrupt line to each host. > > If you care about line breaks, you need '|'. > Noted. >> + One of the host is provided access to the global display >> + configuration labelled as "common" region of DSS allows that host >> + exclusive access to global registers of DSS while other host can >> + configure the display for it's usage using a separate register >> + space labelled as "common1". >> + The DSS resources can be partitioned in such a way that one or more >> + of the video planes are owned by Linux whereas other video planes > > Your h/w can only run Linux? > > What if you want to use this same binding to define the configuration to > the 'remote processor'? You can easily s/Linux/the OS/, but it all > should be reworded to describe things in terms of the local processor. > It can run both Linux and RTOS or for that matter any other OS too. But yes I got your point, will reword accordingly. >> + can be owned by a remote core. >> + The video port controlling these planes acts as a shared video port >> + and it can be configured with write access either by Linux or the >> + remote core in which case Linux only has read-only access to that >> + video port. > > What is the purpose of this property when all the other properties are > required? > The ti,dss-shared-mode and below group of properties are optional. But if ti,dss-shared-mode is set then only driver should parse below set of properties. >> + >> + ti,dss-shared-mode-planes: >> +description: >> + The video layer that is owned by processing core running Linux. >> + The display driver running from Linux has exclusive write access to >> + this video layer. >> +$ref: /schemas/types.yaml#/definitions/string >> +enum: [vidl, vid] >> + >> + ti,dss-shared-mode-vp: >> +description: >> + The video port that is being used in context of processing core >> + running Linux with display susbsytem being used in shared mode. >> + This can be owned either by the processing core running Linux in >> + which case Linux has the write access and the responsibility to >> + configure this video port and the associated overlay manager or >> + it can be shared between core running Linux and a remote core >> + with remote core provided with write access to this video port and >> + associated overlay managers and remote core configures and drives >> + this video port also feeding data from one or more of the >> + video planes owned by Linux, with Linux only having read-only access >> + to this video port and associated overlay managers. >> + >> +$ref: /schemas/types.yaml#/definitions/string >> +enum: [vp1, vp2] >> + >> + ti,dss-shared-mode-common: >>
[RFC PATCH 3/3] arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing mode
This overlay needs to be used with display sharing supported device manager firmware only. Remote core running this firmware has write access to "common" register space, VIDL pipeline, OVR1 overlay and VP1 videoport. The processing core running Linux is provided write access to VID pipeline and "common1" register space. The VP1 video port is shared between Linux and remote core with remote core configuring the overlay manager to set Zorder 1 for VID pipeline and Zorder 2 for VIDL pipeline. Add reserved memory region for framebuffer region used by remote core in dss shared mode overlay file so that Linux does not re-use the same while allocating memory. Also add a label for reserved memory region in base device-tree file so that it can be referred back in overlay file. Signed-off-by: Devarsh Thakkar --- arch/arm64/boot/dts/ti/Makefile | 1 + .../arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 2 +- .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso | 33 +++ 3 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso diff --git a/arch/arm64/boot/dts/ti/Makefile b/arch/arm64/boot/dts/ti/Makefile index 52c1dc910308..ff832741b367 100644 --- a/arch/arm64/boot/dts/ti/Makefile +++ b/arch/arm64/boot/dts/ti/Makefile @@ -35,6 +35,7 @@ dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-ov5640.dtbo dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-tevi-ov5640.dtbo dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-imx219.dtbo dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-hdmi-audio.dtbo +dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-dss-shared-mode.dtbo # Boards with AM64x SoC dtb-$(CONFIG_ARCH_K3) += k3-am642-evm.dtb diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi index 33768c02d8eb..8b55ca10102f 100644 --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi @@ -34,7 +34,7 @@ memory@8000 { reg = <0x 0x8000 0x 0x8000>; }; - reserved-memory { + reserved_memory: reserved-memory { #address-cells = <2>; #size-cells = <2>; ranges; diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso new file mode 100644 index ..02153748a5c2 --- /dev/null +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/** + * DT overlay to enable display sharing mode for AM62P DSS0 + * This is compatible with custom AM62 Device Manager firmware + * + * Copyright (C) 2023 Texas Instruments Incorporated - http://www.ti.com/ + */ + +/dts-v1/; +/plugin/; + +#include +#include + +&dss0 { + ti,dss-shared-mode; + ti,dss-shared-mode-vp = "vp1"; + ti,dss-shared-mode-vp-owned = <0>; + ti,dss-shared-mode-common = "common1"; + ti,dss-shared-mode-planes = "vid"; + ti,dss-shared-mode-plane-zorder = <0>; + interrupts = ; +}; + +&reserved_memory { + #address-cells = <2>; + #size-cells = <2>; + rtos_framebuffer_memory_region: rtos-framebuffer-memory@9480 { + compatible = "shared-dma-pool"; + reg = <0x00 0x9480 0x00 0x0800>; + no-map; + }; +}; -- 2.34.1
[RFC PATCH 1/3] dt-bindings: display: ti, am65x-dss: Add support for display sharing mode
Add support for using TI Keystone DSS hardware present in display sharing mode. TI Keystone DSS hardware supports partitioning of resources between multiple hosts as it provides separate register space and unique interrupt line to each host. The DSS hardware can be used in shared mode in such a way that one or more of video planes can be owned by Linux wherease other planes can be owned by remote cores. One or more of the video ports can be dedicated exclusively to a processing core, wherease some of the video ports can be shared between two hosts too with only one of them having write access. Signed-off-by: Devarsh Thakkar --- .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++ 1 file changed, 82 insertions(+) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index 55e3e490d0e6..d9bc69fbf1fb 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -112,6 +112,86 @@ properties: Input memory (from main memory to dispc) bandwidth limit in bytes per second + ti,dss-shared-mode: +type: boolean +description: + TI DSS7 supports sharing of display between multiple hosts + as it provides separate register space for display configuration and + unique interrupt line to each host. + One of the host is provided access to the global display + configuration labelled as "common" region of DSS allows that host + exclusive access to global registers of DSS while other host can + configure the display for it's usage using a separate register + space labelled as "common1". + The DSS resources can be partitioned in such a way that one or more + of the video planes are owned by Linux whereas other video planes + can be owned by a remote core. + The video port controlling these planes acts as a shared video port + and it can be configured with write access either by Linux or the + remote core in which case Linux only has read-only access to that + video port. + + ti,dss-shared-mode-planes: +description: + The video layer that is owned by processing core running Linux. + The display driver running from Linux has exclusive write access to + this video layer. +$ref: /schemas/types.yaml#/definitions/string +enum: [vidl, vid] + + ti,dss-shared-mode-vp: +description: + The video port that is being used in context of processing core + running Linux with display susbsytem being used in shared mode. + This can be owned either by the processing core running Linux in + which case Linux has the write access and the responsibility to + configure this video port and the associated overlay manager or + it can be shared between core running Linux and a remote core + with remote core provided with write access to this video port and + associated overlay managers and remote core configures and drives + this video port also feeding data from one or more of the + video planes owned by Linux, with Linux only having read-only access + to this video port and associated overlay managers. + +$ref: /schemas/types.yaml#/definitions/string +enum: [vp1, vp2] + + ti,dss-shared-mode-common: +description: + The DSS register region owned by processing core running Linux. +$ref: /schemas/types.yaml#/definitions/string +enum: [common, common1] + + ti,dss-shared-mode-vp-owned: +description: + This tells whether processing core running Linux has write access to + the video ports enlisted in ti,dss-shared-mode-vps. +$ref: /schemas/types.yaml#/definitions/uint32 +enum: [0, 1] + + ti,dss-shared-mode-plane-zorder: +description: + The zorder of the planes owned by Linux. + For the scenario where Linux is not having write access to associated + video port, this field is just for + informational purpose to enumerate the zorder configuration + being used by remote core. + +$ref: /schemas/types.yaml#/definitions/uint32 +enum: [0, 1] + +dependencies: + ti,dss-shared-mode: [ 'ti,dss-shared-mode-planes', 'ti,dss-shared-mode-vp', +'ti,dss-shared-mode-plane-zorder', 'ti,dss-shared-mode-vp-owned'] + ti,dss-shared-mode-vp: ['ti,dss-shared-mode', 'ti,dss-shared-mode-planes', + 'ti,dss-shared-mode-plane-zorder', 'ti,dss-shared-mode-vp-owned'] + ti,dss-shared-mode-planes: ['ti,dss-shared-mode', 'ti,dss-shared-mode-vp', + 'ti,dss-shared-mode-plane-zorder', 'ti,dss-shared-mode-vp-owned'] + ti,dss-shared-mode-plane-zorder: ['ti,dss-shared-mode-planes', 'ti,dss-shared-
[RFC PATCH 2/3] drm/tidss: Add support for display sharing
Display subsystem present in TI Keystone family of devices supports sharing of display between multiple hosts as it provides separate register space (common* region) for each host to programming display controller and also a unique interrupt line for each host. This adds support for display sharing, by allowing partitioning of resources either at video port level or at video plane level as described below : 1) Linux can own (i.e have write access) completely one or more of video ports along with corresponding resources (viz. overlay managers, video planes) used by Linux in context of those video ports. Even if Linux is owning these video ports it can still share this video port with a remote core which can own one or more video planes associated with this video port. 2) Linux owns one or more of the video planes with video port (along with corresponding overlay manager) associated with these planes being owned and controlled by a remote core. Linux still has read-only access to the associated video port and overlay managers so that it can parse the settings made by remote core. For both the cases, the resources used in context of processing core running Linux along with ownership information are exposed by user as part of device-tree blob and driver uses an updated feature list tailored for this shared mode accordingly. The driver also auto-populates matching overlay managers and output types from shared video port list provided in device-tree blob. In dispc_feature struct remove const access specfier for output_type array as it is required to be updated dynamically in run-time for shared mode. For 2) where Linux is only owning a set of video planes with corresponding video port and overlay manager controlled by a remote core, separate set of CRTC callbacks are used which just latch on to the preset mode set by remote core, thus avoiding any reconfiguration of associated video ports, overlay managers and clocks. For this case, it is also checked that Linux controlled video planes don't exceed screen size set by remote core while running the display. Display clocks and OLDI related fields are also not populated for this scenario as remote core is owning those resources. For 1), where Linux owns only a set of video port and associated planes with rest of resources owned completely by remote cores, only those set of resources are exposed to Linux and programmed using traditional CRTC helpers and rest of video ports and associated resources are removed from feature list accordingly. Signed-off-by: Devarsh Thakkar --- drivers/gpu/drm/tidss/tidss_crtc.c | 120 - drivers/gpu/drm/tidss/tidss_dispc.c | 254 +--- drivers/gpu/drm/tidss/tidss_dispc.h | 2 +- drivers/gpu/drm/tidss/tidss_drv.c | 33 ++-- drivers/gpu/drm/tidss/tidss_drv.h | 6 + 5 files changed, 375 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c index 5f838980c7a1..f6a877ff4c6c 100644 --- a/drivers/gpu/drm/tidss/tidss_crtc.c +++ b/drivers/gpu/drm/tidss/tidss_crtc.c @@ -31,13 +31,19 @@ static void tidss_crtc_finish_page_flip(struct tidss_crtc *tcrtc) /* * New settings are taken into use at VFP, and GO bit is cleared at * the same time. This happens before the vertical blank interrupt. -* So there is a small change that the driver sets GO bit after VFP, but +* So there is a small chance that the driver sets GO bit after VFP, but * before vblank, and we have to check for that case here. +* +* For a video port shared between Linux and remote core but owned by remote core, +* this is not required since Linux just attaches to mode that was preset by remote +* core with which display is being shared. */ - busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport); - if (busy) { - spin_unlock_irqrestore(&ddev->event_lock, flags); - return; + if (!tidss->shared_mode || tidss->shared_mode_owned_vps[tcrtc->hw_videoport]) { + busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport); + if (busy) { + spin_unlock_irqrestore(&ddev->event_lock, flags); + return; + } } event = tcrtc->event; @@ -208,6 +214,44 @@ static void tidss_crtc_atomic_flush(struct drm_crtc *crtc, spin_unlock_irqrestore(&ddev->event_lock, flags); } +static void tidss_shared_vp_crtc_atomic_flush(struct drm_crtc *crtc, + struct drm_atomic_state *state) +{ + struct tidss_crtc *tcrtc = to_tidss_crtc(crtc); + struct drm_device *ddev = crtc->dev; + unsigned long flags; + + dev_dbg(ddev->dev, + "%s: %s enabled %d, needs modeset %d, event %p\n", __func__, + crtc->n
[RFC PATCH 0/3] Add display sharing support in tidss
This adds display sharing support in tidss display driver along with an example overlay devicetree file using which can be used to enable display sharing on AM62x devices with device manager core i.e. R5F expected to run a custom firmware which supports corresponding display sharing configuration. As resources can be partitioned at different levels a flexible scheme is followed while designing devicetree bindings and driver changes keeping in mind possible scenarios in which resources can be partitioned and various DSS hardware IP's supported across different devices. A rebased version of this patch has been tested on AM62P SoC which also supports same DSS IP as AM62x and the patch with display sharing functionality is already available in vendor-specific kernel source tree [1] along with documentation which explains the design [2] and DM firmware support [3]. NOTE1: This is marked as RFC for upstream since AM62P is not yet supported in upstream tree and for AM62x SoC which is the target SoC for this patch, the dss sharing functionality is not validated on upstream tree due to missing OLDI support and display sharing DM firmware support, but we still wanted to get some feedback on this. NOTE2: This series depends on : https://lore.kernel.org/all/20240115125716.560363-1-devar...@ti.com/ [1] : https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=d805270609cfb6b2e67bd2fd5959d71f48393196 https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=93d751a94cbf9ad07c7f658e78aa510d919d7cd6 https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=665c17837dc8bed27e8d63388f8f7f7a85c0cd94 https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.1.y-cicd&id=f8d7f1a9617862af922c6bc10e0765ba4f4857d6 [2] : https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/Display/DSS7.html#driver-features (Display Sharing mode feature description) https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/How_to_Guides/Target/How_to_enable_display_sharing_between_remotecore_and_Linux.html (How To Guide) https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/system/Demo_User_Guides/Display_Cluster_User_Guide.html (Display Cluster user guide with demo details) [3] : https://git.ti.com/cgit/processor-firmware/ti-linux-firmware/tree/ti-dm/am62pxx/dss_display_share.wkup-r5f0_0.release.strip.out?h=ti-linux-firmware-next Devarsh Thakkar (3): dt-bindings: display: ti,am65x-dss: Add support for display sharing mode drm/tidss: Add support for display sharing arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing mode .../bindings/display/ti/ti,am65x-dss.yaml | 82 ++ arch/arm64/boot/dts/ti/Makefile | 1 + .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso | 23 ++ drivers/gpu/drm/tidss/tidss_crtc.c| 120 - drivers/gpu/drm/tidss/tidss_dispc.c | 254 -- drivers/gpu/drm/tidss/tidss_dispc.h | 2 +- drivers/gpu/drm/tidss/tidss_drv.c | 33 ++- drivers/gpu/drm/tidss/tidss_drv.h | 6 + 8 files changed, 481 insertions(+), 40 deletions(-) create mode 100644 arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso -- 2.34.1
Re: [DO NOT MERGE PATCH 2/2] arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs
Hi Conor, Thanks for the review. On 15/01/24 21:44, Conor Dooley wrote: > On Mon, Jan 15, 2024 at 06:27:16PM +0530, Devarsh Thakkar wrote: >> This adds common1 register space for AM62x and AM65x SoC's which are using >> TI's Keystone display hardware and supporting it as described in >> Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml. >> >> This region is documented in respective Technical Reference Manuals [1]. >> >> [1]: >> AM62x TRM: >> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) >> >> AM65x TRM: >> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) >> >> Signed-off-by: Devarsh Thakkar >> --- > > "[DO NOT MERGE PATCH 2/2]" but no rationale here as to why this cannot > be merged? What's the problem with it? > No problem as such from my point of view, but this is the process I follow since maintainer trees for device-tree file and bindings are different. I generally mark a [DO NOT MERGE] tag for device-tree file patches until binding patch gets merged so that the device-tree patches don't get applied by mistake if binding patch has some pending comments. Once binding patch gets merged, I re-send the device-tree file patches again to respective list. Regards Devarsh > Cheers, > Conor. > >> arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- >> arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- >> 2 files changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi >> b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi >> index 464b7565d085..298bf8d5de8c 100644 >> --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi >> +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi >> @@ -779,9 +779,10 @@ dss: dss@3020 { >><0x00 0x30207000 0x00 0x1000>, /* ovr1 */ >><0x00 0x30208000 0x00 0x1000>, /* ovr2 */ >><0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ >> - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ >> + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ >> + <0x00 0x30201000 0x00 0x1000>; /* common1 */ >> reg-names = "common", "vidl1", "vid", >> -"ovr1", "ovr2", "vp1", "vp2"; >> +"ovr1", "ovr2", "vp1", "vp2", "common1"; >> power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; >> clocks = <&k3_clks 186 6>, >> <&dss_vp1_clk>, >> diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi >> b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi >> index fcea54465636..5b2d4365b911 100644 >> --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi >> +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi >> @@ -1019,9 +1019,10 @@ dss: dss@4a0 { >><0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ >><0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ >><0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ >> - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ >> + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ >> + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ >> reg-names = "common", "vidl1", "vid", >> -"ovr1", "ovr2", "vp1", "vp2"; >> +"ovr1", "ovr2", "vp1", "vp2", "common1"; >> >> ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; >> >> -- >> 2.34.1 >>
Re: [PATCH 1/2] dt-bindings: display: ti,am65x-dss: Add support for common1 region
Hi Conor, Thanks for the review. On 15/01/24 21:47, Conor Dooley wrote: > On Mon, Jan 15, 2024 at 06:27:15PM +0530, Devarsh Thakkar wrote: >> TI keystone display subsystem present in AM65 and other SoCs such as AM62 > > Do all 3 SoCs supported by this binding (am625 am62a7 am65x) have this > common1 register? If not, you should limit it the platforms that do have > it. > Yes all 3 SoCs supported by binding have common1 register space supported. AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) AM62A TRM: https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9.1 DSS Registers) Regards Devarsh > Thanks, > Conor. > >> support two separate register spaces namely "common" and "common1" which >> can be used by two separate hosts to program the display controller as >> described in respective Technical Reference Manuals [1]. >> >> The common1 register space has similar set of configuration registers as >> supported in common register space except the global configuration >> registers which are exclusive to common region. >> >> This adds binding for "common1" register region too as supported by the >> hardware. >> >> [1]: >> AM62x TRM: >> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) >> >> AM65x TRM: >> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) >> >> Signed-off-by: Devarsh Thakkar >> --- >> .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> index b6767ef0d24d..55e3e490d0e6 100644 >> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml >> @@ -37,6 +37,7 @@ properties: >>- description: OVR2 overlay manager for vp2 >>- description: VP1 video port 1 >>- description: VP2 video port 2 >> + - description: common1 DSS register area >> >>reg-names: >> items: >> @@ -47,6 +48,7 @@ properties: >>- const: ovr2 >>- const: vp1 >>- const: vp2 >> + - const: common1 >> >>clocks: >> items: >> @@ -147,9 +149,10 @@ examples: >> <0x04a07000 0x1000>, /* ovr1 */ >> <0x04a08000 0x1000>, /* ovr2 */ >> <0x04a0a000 0x1000>, /* vp1 */ >> -<0x04a0b000 0x1000>; /* vp2 */ >> +<0x04a0b000 0x1000>, /* vp2 */ >> +<0x04a01000 0x1000>; /* common1 */ >> reg-names = "common", "vidl1", "vid", >> -"ovr1", "ovr2", "vp1", "vp2"; >> +"ovr1", "ovr2", "vp1", "vp2", "common1"; >> ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; >> power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; >> clocks =<&k3_clks 67 1>, >> -- >> 2.34.1 >>
[PATCH 0/2] Add common1 register space for TI Keystone displays
Add common1 register space for SoC's supporting TI Keystone displays present in AM65x and AM62x SoCs. This is required to support use-cases where Linux may want to use common1 region instead of common region with the latter being controlled by another processing core. The enumeration of common1 region in device-tree bindings seem to be a miss as ideally bindings should enumerate all supported register spaces as done in Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml which also uses TI Keystone display subsystem albeit with some more features. Devarsh Thakkar (2): dt-bindings: display: ti,am65x-dss: Add support for common1 region arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 3 files changed, 11 insertions(+), 6 deletions(-) -- 2.34.1
[DO NOT MERGE PATCH 2/2] arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs
This adds common1 register space for AM62x and AM65x SoC's which are using TI's Keystone display hardware and supporting it as described in Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml. This region is documented in respective Technical Reference Manuals [1]. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) Signed-off-by: Devarsh Thakkar --- arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++-- arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi index 464b7565d085..298bf8d5de8c 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi @@ -779,9 +779,10 @@ dss: dss@3020 { <0x00 0x30207000 0x00 0x1000>, /* ovr1 */ <0x00 0x30208000 0x00 0x1000>, /* ovr2 */ <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */ - <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */ + <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */ + <0x00 0x30201000 0x00 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>; clocks = <&k3_clks 186 6>, <&dss_vp1_clk>, diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi index fcea54465636..5b2d4365b911 100644 --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi @@ -1019,9 +1019,10 @@ dss: dss@4a0 { <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */ <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */ <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */ - <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */ + <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */ + <0x0 0x04a01000 0x0 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", - "ovr1", "ovr2", "vp1", "vp2"; + "ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; -- 2.34.1
[PATCH 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region
TI keystone display subsystem present in AM65 and other SoCs such as AM62 support two separate register spaces namely "common" and "common1" which can be used by two separate hosts to program the display controller as described in respective Technical Reference Manuals [1]. The common1 register space has similar set of configuration registers as supported in common register space except the global configuration registers which are exclusive to common region. This adds binding for "common1" register region too as supported by the hardware. [1]: AM62x TRM: https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers) AM65x TRM: https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers) Signed-off-by: Devarsh Thakkar --- .../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index b6767ef0d24d..55e3e490d0e6 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -37,6 +37,7 @@ properties: - description: OVR2 overlay manager for vp2 - description: VP1 video port 1 - description: VP2 video port 2 + - description: common1 DSS register area reg-names: items: @@ -47,6 +48,7 @@ properties: - const: ovr2 - const: vp1 - const: vp2 + - const: common1 clocks: items: @@ -147,9 +149,10 @@ examples: <0x04a07000 0x1000>, /* ovr1 */ <0x04a08000 0x1000>, /* ovr2 */ <0x04a0a000 0x1000>, /* vp1 */ -<0x04a0b000 0x1000>; /* vp2 */ +<0x04a0b000 0x1000>, /* vp2 */ +<0x04a01000 0x1000>; /* common1 */ reg-names = "common", "vidl1", "vid", -"ovr1", "ovr2", "vp1", "vp2"; +"ovr1", "ovr2", "vp1", "vp2", "common1"; ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>; power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>; clocks =<&k3_clks 67 1>, -- 2.34.1
Re: [PATCH] drm/tidss: Power up attached PM domains on probe
Hi Maxime, On 12/10/23 18:10, Maxime Ripard wrote: > Hi, > > On Thu, Oct 12, 2023 at 05:10:06PM +0530, Devarsh Thakkar wrote: >> On 09/10/23 13:20, Devarsh Thakkar wrote: >>> Some SoC's such as AM62P have dedicated power domains >>> for OLDI which need to be powered on separetely along >>> with display controller. >>> >>> So during driver probe, power up all attached PM domains >>> enumerated in devicetree node for DSS. >>> >>> This also prepares base to add display support for AM62P. >>> >> >> NAK, for this patch, as discussed with team there are already plans >> to have separate OLDI bridge driver which should eventually handle >> the additional power domains. > > Just for the record in case your current plan doesn't work out and we > need to revisit this patch: my point was that it's something that > deviates by a margin from what drivers are usually expected to do, so we > should document why that deviation is there. > Sure, thanks for suggesting, I agree that if going with this logic, we should definitely put a comment in driver regarding same. > The patch itself looks reasonable to me otherwise. Yes, it's just that we are planning a separate driver for OLDI. We will see if we need similar logic in that driver too. Regards Devarsh > > Maxime
Re: [PATCH] drm/tidss: Power up attached PM domains on probe
On 09/10/23 13:20, Devarsh Thakkar wrote: Some SoC's such as AM62P have dedicated power domains for OLDI which need to be powered on separetely along with display controller. So during driver probe, power up all attached PM domains enumerated in devicetree node for DSS. This also prepares base to add display support for AM62P. NAK, for this patch, as discussed with team there are already plans to have separate OLDI bridge driver which should eventually handle the additional power domains. Sorry for the noise. Regards Devarsh Signed-off-by: Devarsh Thakkar > --- drivers/gpu/drm/tidss/tidss_drv.c | 76 +++ drivers/gpu/drm/tidss/tidss_drv.h | 5 ++ 2 files changed, 81 insertions(+) diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c index 4d063eb9cd0b..a703a27d17bf 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.c +++ b/drivers/gpu/drm/tidss/tidss_drv.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -114,6 +115,72 @@ static const struct drm_driver tidss_driver = { .minor = 0, }; +static int tidss_detach_pm_domains(struct tidss_device *tidss) +{ + int i; + + if (tidss->num_domains <= 1) + return 0; + + for (i = 0; i < tidss->num_domains; i++) { + if (tidss->pd_link[i] && !IS_ERR(tidss->pd_link[i])) + device_link_del(tidss->pd_link[i]); + if (tidss->pd_dev[i] && !IS_ERR(tidss->pd_dev[i])) + dev_pm_domain_detach(tidss->pd_dev[i], true); + tidss->pd_dev[i] = NULL; + tidss->pd_link[i] = NULL; + } + + return 0; +} + +static int tidss_attach_pm_domains(struct tidss_device *tidss) +{ + struct device *dev = tidss->dev; + int i; + int ret; + struct platform_device *pdev = to_platform_device(dev); + struct device_node *np = pdev->dev.of_node; + + tidss->num_domains = of_count_phandle_with_args(np, "power-domains", + "#power-domain-cells"); + if (tidss->num_domains <= 1) { + dev_dbg(dev, "One or less power domains, no need to do attach domains\n"); + return 0; + } + + tidss->pd_dev = devm_kmalloc_array(dev, tidss->num_domains, + sizeof(*tidss->pd_dev), GFP_KERNEL); + if (!tidss->pd_dev) + return -ENOMEM; + + tidss->pd_link = devm_kmalloc_array(dev, tidss->num_domains, + sizeof(*tidss->pd_link), GFP_KERNEL); + if (!tidss->pd_link) + return -ENOMEM; + + for (i = 0; i < tidss->num_domains; i++) { + tidss->pd_dev[i] = dev_pm_domain_attach_by_id(dev, i); + if (IS_ERR(tidss->pd_dev[i])) { + ret = PTR_ERR(tidss->pd_dev[i]); + goto fail; + } + + tidss->pd_link[i] = device_link_add(dev, tidss->pd_dev[i], + DL_FLAG_STATELESS | + DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE); + if (!tidss->pd_link[i]) { + ret = -EINVAL; + goto fail; + } + } + + return 0; +fail: + tidss_detach_pm_domains(tidss); + return ret; +} + static int tidss_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -136,6 +203,13 @@ static int tidss_probe(struct platform_device *pdev) platform_set_drvdata(pdev, tidss); + /* powering up associated OLDI domains */ + ret = tidss_attach_pm_domains(tidss); + if (ret < 0) { + dev_err(dev, "failed to attach power domains %d\n", ret); + return ret; + } + ret = dispc_init(tidss); if (ret) { dev_err(dev, "failed to initialize dispc: %d\n", ret); @@ -193,6 +267,7 @@ static int tidss_probe(struct platform_device *pdev) dispc_runtime_suspend(tidss->dispc); #endif pm_runtime_disable(dev); + tidss_detach_pm_domains(tidss); return ret; } @@ -220,6 +295,7 @@ static void tidss_remove(struct platform_device *pdev) /* devm allocated dispc goes away with the dev so mark it NULL */ dispc_remove(tidss); + tidss_detach_pm_domains(tidss); dev_dbg(dev, "%s done\n", __func__); } diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h index d7f27b0b0315..3c8b37b3aba6 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.h +++ b/drivers/gpu/drm/tidss/tidss_drv.h @@ -31,6 +31,11 @@ struct tidss_dev