Re: [PATCH] drm/tidss: Add MIT license along with GPL-2.0

2024-09-16 Thread Devarsh Thakkar
Hi Laurent,

Thanks for the quick review.

On 13/09/24 13:54, Laurent Pinchart wrote:
> Hi Devarsh,
> 
> On Thu, Sep 12, 2024 at 10:41:42PM +0530, Devarsh Thakkar wrote:
>> Modify license to include dual licensing as GPL-2.0-only OR MIT license for
>> tidss display driver. This allows other operating system ecosystems such as
>> Zephyr and also the commercial firmwares to refer and derive code from this
> 
> GPL-2.0 isn't incompatible with "commercial". I think you mean
> "proprietary" here.
> 

Yes, GPL-2.0 is not incompatible to commercial but there is an enforecement
that derivative code needs to be GPL-2.0 licensed which may not fit well for
projects which are not using GPL-2.0 license. But yes MIT will also help
proprietary code, so I agree it is good to mention the same in commit message.

>> display driver in a more permissive manner.
> 
> How do you envision that to work ? Zephyr doesn't have KMS, so you can't
> use the driver as-is. What exactly would TI want to use from the Linux
> kernel driver ?
> 

Not the DRM/KMS part, but the tidss specific display controller programming is
the main point of interest. For e.g. At this point, mostly I see that the TI
customers are interested to re-use/derive code from u-boot tidss driver [1]
which is quite simple and use it in their test application or RTOS based
offerring which is non-GPL code. Since their test application has much more
code apart from the display part, which is non-GPL, they can't use GPL license.

Now since the u-boot tidss driver is derived from kernel tidss driver, my
understanding was that we need to change the license of kernel tidss driver
first before changing the u-boot tidss driver.

> Personally, there's a reason why I contribute code to the kernel under
> the GPL-2.0 license, it is to make sure the code will remain open. While
> I can accept other licenses on a case-by-case basis, I don't like the
> casual approach of this patch that seem (to me) to imply that the
> license is a mere detail. For a start I would like to know what
> "commercial firmwares" you're thinking about.
> 

Understood, let me know if above information suffice or any further
information is needed.

[1]:
https://gitlab.com/u-boot/u-boot/-/blob/master/drivers/video/tidss/tidss_drv.c?ref_type=heads

Regards
Devarsh


[PATCH] drm/tidss: Add MIT license along with GPL-2.0

2024-09-12 Thread Devarsh Thakkar
Modify license to include dual licensing as GPL-2.0-only OR MIT license for
tidss display driver. This allows other operating system ecosystems such as
Zephyr and also the commercial firmwares to refer and derive code from this
display driver in a more permissive manner.

Signed-off-by: Devarsh Thakkar 
---
 drivers/gpu/drm/tidss/Makefile| 2 +-
 drivers/gpu/drm/tidss/tidss_crtc.c| 2 +-
 drivers/gpu/drm/tidss/tidss_crtc.h| 2 +-
 drivers/gpu/drm/tidss/tidss_dispc.c   | 2 +-
 drivers/gpu/drm/tidss/tidss_dispc.h   | 2 +-
 drivers/gpu/drm/tidss/tidss_dispc_regs.h  | 2 +-
 drivers/gpu/drm/tidss/tidss_drv.c | 2 +-
 drivers/gpu/drm/tidss/tidss_drv.h | 2 +-
 drivers/gpu/drm/tidss/tidss_encoder.c | 2 +-
 drivers/gpu/drm/tidss/tidss_encoder.h | 2 +-
 drivers/gpu/drm/tidss/tidss_irq.c | 2 +-
 drivers/gpu/drm/tidss/tidss_irq.h | 2 +-
 drivers/gpu/drm/tidss/tidss_kms.c | 2 +-
 drivers/gpu/drm/tidss/tidss_kms.h | 2 +-
 drivers/gpu/drm/tidss/tidss_plane.c   | 2 +-
 drivers/gpu/drm/tidss/tidss_plane.h   | 2 +-
 drivers/gpu/drm/tidss/tidss_scale_coefs.c | 2 +-
 drivers/gpu/drm/tidss/tidss_scale_coefs.h | 2 +-
 18 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/tidss/Makefile b/drivers/gpu/drm/tidss/Makefile
index 312645271014..c67ff32d02e1 100644
--- a/drivers/gpu/drm/tidss/Makefile
+++ b/drivers/gpu/drm/tidss/Makefile
@@ -1,4 +1,4 @@
-# SPDX-License-Identifier: GPL-2.0
+# SPDX-License-Identifier: GPL-2.0 OR MIT
 
 tidss-y := tidss_crtc.o \
tidss_drv.o \
diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c 
b/drivers/gpu/drm/tidss/tidss_crtc.c
index 94f8e3178df5..43dfbead9fa9 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.c
+++ b/drivers/gpu/drm/tidss/tidss_crtc.c
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm/tidss/tidss_crtc.h 
b/drivers/gpu/drm/tidss/tidss_crtc.h
index 040d1205496b..da03873e2ef0 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.h
+++ b/drivers/gpu/drm/tidss/tidss_crtc.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c 
b/drivers/gpu/drm/tidss/tidss_dispc.c
index 1ad711f8d2a8..3321a1c731b1 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
  * Copyright (C) 2016-2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Jyri Sarha 
diff --git a/drivers/gpu/drm/tidss/tidss_dispc.h 
b/drivers/gpu/drm/tidss/tidss_dispc.h
index 086327d51a90..e6e4396a0d63 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.h
+++ b/drivers/gpu/drm/tidss/tidss_dispc.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm/tidss/tidss_dispc_regs.h 
b/drivers/gpu/drm/tidss/tidss_dispc_regs.h
index 13feedfe5d6d..6e27b6d444ab 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc_regs.h
+++ b/drivers/gpu/drm/tidss/tidss_dispc_regs.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
 /*
  * Copyright (C) 2016-2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Jyri Sarha 
diff --git a/drivers/gpu/drm/tidss/tidss_drv.c 
b/drivers/gpu/drm/tidss/tidss_drv.c
index d15f836dca95..b060e420ddec 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm/tidss/tidss_drv.h 
b/drivers/gpu/drm/tidss/tidss_drv.h
index d7f27b0b0315..d4209234f59c 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.h
+++ b/drivers/gpu/drm/tidss/tidss_drv.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm/tidss/tidss_encoder.c 
b/drivers/gpu/drm/tidss/tidss_encoder.c
index 17a86bed8054..9749fbc0e056 100644
--- a/drivers/gpu/drm/tidss/tidss_encoder.c
+++ b/drivers/gpu/drm/tidss/tidss_encoder.c
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
  * Copyright (C) 2018 Texas Instruments Incorporated - https://www.ti.com/
  * Author: Tomi Valkeinen 
diff --git a/drivers/gpu/drm

[PATCH v13 13/13] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-06-07 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V9-V13: No change
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v13 12/13] media: imagination: Round to closest multiple for cropping region

2024-06-07 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

Link: 
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst
 [1]
Signed-off-by: Devarsh Thakkar 
---
V9->V13: No change
V8: Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index 4b6fbe99d5ff..ea96b1bc5315 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v13 11/13] lib: math_kunit: Add tests for new macros related to rounding to nearest value

2024-06-07 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1] :

Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1]
Signed-off-by: Devarsh Thakkar 
Acked-by: Andy Shevchenko 
---
V1->V13 (No change, patch introduced in V8)
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index be27f2afb8e4..05022f010be6 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v13 10/13] lib: add basic KUnit test for lib/math

2024-06-07 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
Reviewed-by: Andy Shevchenko 
[devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to 
GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v11:
* Add Reviewed-by tag

Changes since v10:
* Include headers per IWYU principle and add module description

Changes since v9:
* Added Kconfig dependency for KUNIT

Changes since v8:
* Remove unrequired header file linux/kernel.h

Changes since v7:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v6:
* No change

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  14 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 294 ++
 3 files changed, 309 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..f738d8a433ea 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,17 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..be27f2afb8e4
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);

[PATCH v13 09/13] Documentation: core-api: Add math.h macros and functions

2024-06-07 Thread Devarsh Thakkar
Add documentation for rounding, scaling, absolute value and 32-bit division
related macros and functions exported by math.h header file.

Signed-off-by: Devarsh Thakkar 
Reviewed-by: Andy Shevchenko 
---
V13: No change
V12: Add Reviewed-by
V11: Fix title for math function header
V10: Patch introduced
V1->V9 (No change)
 Documentation/core-api/kernel-api.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/core-api/kernel-api.rst 
b/Documentation/core-api/kernel-api.rst
index ae92a2571388..7de494e76fa6 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -185,6 +185,12 @@ Division Functions
 .. kernel-doc:: lib/math/gcd.c
:export:
 
+Rounding, absolute value, division and 32-bit scaling functions
+---
+
+.. kernel-doc:: include/linux/math.h
+   :internal:
+
 UUID/GUID
 -
 
-- 
2.39.1



[PATCH v13 07/13] math.h: Add macros for rounding to closest value

2024-06-07 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
Acked-by: Andy Shevchenko 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

Link: 
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480
 [1]

V13:
- No change

V12:
- Add Acked-by

V11:
- Fix commenting style per review comments and remove extra whitespace

V10:
- Update example comment to fix formatting issues as observed with html docs

V9:
- No change

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested:

V1->V6 (No change, patch introduced in V7)
 include/linux/math.h | 63 
 1 file changed, 63 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..79e3dfda77fc 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,52 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_up(17, 4) = 16
+ * * round_closest_up(15, 4) = 16
+ * * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_down(17, 4) = 16
+ * * round_closest_down(15, 4) = 16
+ * * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +123,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples:
+ * * roundclosest(21, 5) = 20
+ * * roundclosest(19, 5) = 20
+ * * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v13 00/13] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-06-07 Thread Devarsh Thakkar
ttps://gist.github.com/devarsht/298790af819f299a0a05fec89371097b

V3->V4 Range diff :
https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272

Previous patch series:
V12: https://lore.kernel.org/all/20240604104001.2235082-1-devar...@ti.com/
V11: https://lore.kernel.org/all/20240531170229.1270828-1-devar...@ti.com/
V10: https://lore.kernel.org/all/20240530165925.2715837-1-devar...@ti.com/
V9: https://lore.kernel.org/all/20240526175655.1093707-1-devar...@ti.com/
V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/


Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (12):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: Documentation: Document v4l2-jpeg helper functions
  media: imagination: Use exported tables from v4l2-jpeg core
  media: verisilicon : Use exported tables from v4l2-jpeg for hantro
codec
  math.h: Add macros for rounding to closest value
  math.h: Use kernel-doc syntax for divison macros
  Documentation: core-api: Add math.h macros and functions
  lib: math_kunit: Add tests for new macros related to rounding to
nearest value
  media: imagination: Round to closest multiple for cropping region
  gpu: ipu-v3: Use generic macro for rounding closest to specified value

 Documentation/core-api/kernel-api.rst |6 +
 .../bindings/media/img,e5010-jpeg-enc.yaml|   75 +
 Documentation/driver-api/media/v4l2-core.rst  |1 +
 Documentation/driver-api/media/v4l2-jpeg.rst  |   10 +
 MAINTAINERS   |7 +
 drivers/gpu/ipu-v3/ipu-image-convert.c|4 +-
 drivers/media/platform/Kconfig|1 +
 drivers/media/platform/Makefile   |1 +
 drivers/media/platform/imagination/Kconfig|   13 +
 drivers/media/platform/imagination/Makefile   |3 +
 .../platform/imagination/e5010-core-regs.h|  585 ++
 .../platform/imagination/e5010-jpeg-enc-hw.c  |  267 +++
 .../platform/imagination/e5010-jpeg-enc-hw.h  |   42 +
 .../platform/imagination/e5010-jpeg-enc.c | 1641 +
 .../platform/imagination/e5010-jpeg-enc.h |  168 ++
 .../platform/imagination/e5010-mmu-regs.h |  311 
 drivers/media/platform/verisilicon/Kconfig|1 +
 .../media/platform/verisilicon/hantro_jpeg.c  |  128 +-
 drivers/media/v4l2-core/v4l2-jpeg.c   |  162 +-
 include/linux/math.h  |   86 +-
 include/media/v4l2-jpeg.h |   28 +
 lib/math/Kconfig  |   14 +
 lib/math/Makefile |1 +
 lib/math/math_kunit.c |  329 
 24 files changed, 3760 insertions(+), 124 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml
 create mode 100644 Documentation/driver-api/media/v4l2-jpeg.rst
 create mode 100644 drivers/media/platform/imagination/Kconfig
 create mode 100644 drivers/media/platform/imagination/Makefile
 create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h
 create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h
 create mode 100644 lib/math/math_kunit.c

-- 
2.39.1



Re: [PATCH v12 12/13] media: imagination: Round to closest multiple for cropping region

2024-06-06 Thread Devarsh Thakkar
Hi Sebastian

Thanks for the update.
On 06/06/24 17:14, Sebastian Fricke wrote:
> Hey,
> 
> On 04.06.2024 16:23, Devarsh Thakkar wrote:
>> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
>> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
>> multiple of requested value while updating the crop rectangle coordinates.
>>
>> Use the rounding macro which gives preference to rounding down in case two
>> nearest values (high and low) are possible to raise the probability of
>> cropping rectangle falling inside the bound region.
>>
>> This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
>> description as documented in v4l uapi [1] which specifies that driver
>> should choose crop rectangle as close as possible if no flags are passed by
>> user-space, as quoted below :
>>
>> "``0`` - The driver can adjust the rectangle size freely and shall choose a
>> crop/compose rectangle as close as possible to the requested
>> one."
>>
>> Link:
>> https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst
>>  [1]
>> Signed-off-by: Devarsh Thakkar 
> 
> Acked-by: Sebastian Fricke 
> 
> Can, whoever picks up the math changes, pick up this change as well?
> I will send 1-6 via the media subsystem.
> 

For [PATCH 1/13] to [PATCH 6/13] patches from the series,
I see few warnings reported to me offline for some of the patches which were
caught from using smatch/sparse related automation scripts which were somehow
missed by my equivalent test script. The fixes should be trivial though and I
will be rolling out a v13 soon to fix them up.

The rest of the patches (PATCH 7/13 to PATCH 13/13) no smatch/sparse related
warnings were caught for these though, and are good to go in. Although, I can
still include them in V13 too just to avoid any confusion.

Regards
Devarsh


[PATCH v12 13/13] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-06-04 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V12: No change
V11: No change
V10: No change
V9: No change
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v12 12/13] media: imagination: Round to closest multiple for cropping region

2024-06-04 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

Link: 
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst
 [1]
Signed-off-by: Devarsh Thakkar 
---
V12: No change
V11: No change
V10: No change
V9:  No change
V8:  Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index e701d573a26a..d65646f0c38c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v12 11/13] lib: math_kunit: Add tests for new macros related to rounding to nearest value

2024-06-04 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1] :

Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876 [1]
Signed-off-by: Devarsh Thakkar 
Acked-by: Andy Shevchenko 
---
V1->V12 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index be27f2afb8e4..05022f010be6 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v12 10/13] lib: add basic KUnit test for lib/math

2024-06-04 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
Reviewed-by: Andy Shevchenko 
[devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to 
GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v11:
* Add Reviewed-by tag

Changes since v10:
* Include headers per IWYU principle and add module description

Changes since v9:
* Added Kconfig dependency for KUNIT

Changes since v8:
* Remove unrequired header file linux/kernel.h

Changes since v7:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v6:
* No change

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  14 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 294 ++
 3 files changed, 309 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..f738d8a433ea 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,17 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..be27f2afb8e4
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);

[PATCH v12 09/13] Documentation: core-api: Add math.h macros and functions

2024-06-04 Thread Devarsh Thakkar
Add documentation for rounding, scaling, absolute value and 32-bit division
related macros and functions exported by math.h header file.

Signed-off-by: Devarsh Thakkar 
Reviewed-by: Andy Shevchenko 
---
V12: Add Reviewed-by
V11: Fix title for math function header
V10: Patch introduced
V1->V9 (No change)
---
 Documentation/core-api/kernel-api.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/core-api/kernel-api.rst 
b/Documentation/core-api/kernel-api.rst
index ae92a2571388..7de494e76fa6 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -185,6 +185,12 @@ Division Functions
 .. kernel-doc:: lib/math/gcd.c
:export:
 
+Rounding, absolute value, division and 32-bit scaling functions
+---
+
+.. kernel-doc:: include/linux/math.h
+   :internal:
+
 UUID/GUID
 -
 
-- 
2.39.1



[PATCH v12 07/13] math.h: Add macros for rounding to closest value

2024-06-04 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
Acked-by: Andy Shevchenko 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

Link: 
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480
 [1]
V12:
- Add Acked-by
V11:
- Fix commenting style per review comments and remove extra whitespace

V10:
- Update example comment to fix formatting issues as observed with html docs

V9:
- No change

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested:

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 63 
 1 file changed, 63 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..79e3dfda77fc 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,52 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_up(17, 4) = 16
+ * * round_closest_up(15, 4) = 16
+ * * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_down(17, 4) = 16
+ * * round_closest_down(15, 4) = 16
+ * * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +123,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples:
+ * * roundclosest(21, 5) = 20
+ * * roundclosest(19, 5) = 20
+ * * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v12 00/13] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-06-04 Thread Devarsh Thakkar
744d999080de6e813bcfb5a596272

Previous patch series:
V11: https://lore.kernel.org/all/20240531170229.1270828-1-devar...@ti.com/
V10: https://lore.kernel.org/all/20240530165925.2715837-1-devar...@ti.com/
V9: https://lore.kernel.org/all/20240526175655.1093707-1-devar...@ti.com/
V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/


Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (12):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: Documentation: Document v4l2-jpeg helper macros
  media: imagination: Use exported tables from v4l2-jpeg core
  media: verisilicon : Use exported tables from v4l2-jpeg for hantro
codec
  math.h: Add macros for rounding to closest value
  math.h: Use kernel-doc syntax for divison functions
  Documentation: core-api: Add math.h macros and functions
  lib: math_kunit: Add tests for new macros related to rounding to
nearest value
  media: imagination: Round to closest multiple for cropping region
  gpu: ipu-v3: Use generic macro for rounding closest to specified value

 Documentation/core-api/kernel-api.rst |6 +
 .../bindings/media/img,e5010-jpeg-enc.yaml|   75 +
 Documentation/driver-api/media/v4l2-core.rst  |1 +
 Documentation/driver-api/media/v4l2-jpeg.rst  |   10 +
 MAINTAINERS   |7 +
 drivers/gpu/ipu-v3/ipu-image-convert.c|4 +-
 drivers/media/platform/Kconfig|1 +
 drivers/media/platform/Makefile   |1 +
 drivers/media/platform/imagination/Kconfig|   12 +
 drivers/media/platform/imagination/Makefile   |3 +
 .../platform/imagination/e5010-core-regs.h|  585 ++
 .../platform/imagination/e5010-jpeg-enc-hw.c  |  267 +++
 .../platform/imagination/e5010-jpeg-enc-hw.h  |   42 +
 .../platform/imagination/e5010-jpeg-enc.c | 1644 +
 .../platform/imagination/e5010-jpeg-enc.h |  168 ++
 .../platform/imagination/e5010-mmu-regs.h |  311 
 .../media/platform/verisilicon/hantro_jpeg.c  |  128 +-
 drivers/media/v4l2-core/v4l2-jpeg.c   |  162 +-
 include/linux/math.h  |   86 +-
 include/media/v4l2-jpeg.h |   28 +
 lib/math/Kconfig  |   14 +
 lib/math/Makefile |1 +
 lib/math/math_kunit.c |  329 
 23 files changed, 3761 insertions(+), 124 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml
 create mode 100644 Documentation/driver-api/media/v4l2-jpeg.rst
 create mode 100644 drivers/media/platform/imagination/Kconfig
 create mode 100644 drivers/media/platform/imagination/Makefile
 create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h
 create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h
 create mode 100644 lib/math/math_kunit.c

-- 
2.39.1



Re: [PATCH v11 07/11] Documentation: core-api: Add math.h macros and functions

2024-06-01 Thread Devarsh Thakkar
On 01/06/24 00:01, Randy Dunlap wrote:
> Hi,
> 
> On 5/31/24 10:12 AM, Devarsh Thakkar wrote:
>> Add documentation for rounding, scaling, absolute value and difference,
>> 32-bit division related macros and functions exported by math.h header
>> file.
>>
> 
> I don't see any kernel-doc for division functions in this header file.
> 
> Do some division functions get rendered somehow?
> 

Good catch. I see couple of them having adequate documentation just
missing the sphynx syntax, will enable for DIV_ROUND_CLOSEST and
DIV_ROUND_CLOSEST_ULL.

Regards
Devarsh


[PATCH v11 11/11] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-05-31 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V11: No change
V10: No change
V9: No change
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v11 10/11] media: imagination: Round to closest multiple for cropping region

2024-05-31 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

[1] :
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst

Signed-off-by: Devarsh Thakkar 
---
V11: No change
V10: No change
V9:  No change
V8:  Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index e701d573a26a..d65646f0c38c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v11 09/11] lib: math_kunit: Add tests for new macros related to rounding to nearest value

2024-05-31 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1].

[1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

Signed-off-by: Devarsh Thakkar 
---
V1->V11 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index be27f2afb8e4..05022f010be6 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -70,6 +70,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -95,6 +115,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -272,8 +304,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v11 08/11] lib: add basic KUnit test for lib/math

2024-05-31 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
[devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to 
GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v10:
* Include headers per IWYU principle and add module description

Changes since v9:
* Added Kconfig dependency for KUNIT

Changes since v8:
* Remove unrequired header file linux/kernel.h

Changes since v7:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v6:
* No change

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  14 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 294 ++
 3 files changed, 309 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..f738d8a433ea 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,17 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..be27f2afb8e4
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL);
+   KUNI

[PATCH v11 07/11] Documentation: core-api: Add math.h macros and functions

2024-05-31 Thread Devarsh Thakkar
Add documentation for rounding, scaling, absolute value and difference,
32-bit division related macros and functions exported by math.h header
file.

Signed-off-by: Devarsh Thakkar 
---
V11: Fix title for math function header
V10: Patch introduced
V1->V9 (No change)
---
 Documentation/core-api/kernel-api.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/core-api/kernel-api.rst 
b/Documentation/core-api/kernel-api.rst
index ae92a2571388..7de494e76fa6 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -185,6 +185,12 @@ Division Functions
 .. kernel-doc:: lib/math/gcd.c
:export:
 
+Rounding, absolute value, division and 32-bit scaling functions
+---
+
+.. kernel-doc:: include/linux/math.h
+   :internal:
+
 UUID/GUID
 -
 
-- 
2.39.1



[PATCH v11 06/11] math.h: Add macros for rounding to closest value

2024-05-31 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480
V11:
- Fix commenting style per review comments and remove extra whitespace

V10:
- Update example comment to fix formatting issues as observed with html docs

V9:
- No change

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested:

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 63 
 1 file changed, 63 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..79e3dfda77fc 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,52 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_up(17, 4) = 16
+ * * round_closest_up(15, 4) = 16
+ * * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ * * round_closest_down(17, 4) = 16
+ * * round_closest_down(15, 4) = 16
+ * * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +123,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples:
+ * * roundclosest(21, 5) = 20
+ * * roundclosest(19, 5) = 20
+ * * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v11 00/11] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-31 Thread Devarsh Thakkar
el.org/all/20240517171532.748684-1-devar...@ti.com/
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/


Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (10):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: imagination: Use exported tables from v4l2-jpeg core
  media: verisilicon : Use exported tables from v4l2-jpeg for hantro
codec
  math.h: Add macros for rounding to closest value
  Documentation: core-api: Add math.h macros and functions
  lib: math_kunit: Add tests for new macros related to rounding to
nearest value
  media: imagination: Round to closest multiple for cropping region
  gpu: ipu-v3: Use generic macro for rounding closest to specified value

 Documentation/core-api/kernel-api.rst |6 +
 .../bindings/media/img,e5010-jpeg-enc.yaml|   75 +
 MAINTAINERS   |7 +
 drivers/gpu/ipu-v3/ipu-image-convert.c|4 +-
 drivers/media/platform/Kconfig|1 +
 drivers/media/platform/Makefile   |1 +
 drivers/media/platform/imagination/Kconfig|   12 +
 drivers/media/platform/imagination/Makefile   |3 +
 .../platform/imagination/e5010-core-regs.h|  585 ++
 .../platform/imagination/e5010-jpeg-enc-hw.c  |  267 +++
 .../platform/imagination/e5010-jpeg-enc-hw.h  |   42 +
 .../platform/imagination/e5010-jpeg-enc.c | 1644 +
 .../platform/imagination/e5010-jpeg-enc.h |  168 ++
 .../platform/imagination/e5010-mmu-regs.h |  311 
 .../media/platform/verisilicon/hantro_jpeg.c  |  128 +-
 drivers/media/v4l2-core/v4l2-jpeg.c   |  162 +-
 include/linux/math.h  |   63 +
 include/media/v4l2-jpeg.h |   28 +
 lib/math/Kconfig  |   14 +
 lib/math/Makefile |1 +
 lib/math/math_kunit.c |  329 
 21 files changed, 3733 insertions(+), 118 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml
 create mode 100644 drivers/media/platform/imagination/Kconfig
 create mode 100644 drivers/media/platform/imagination/Makefile
 create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h
 create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h
 create mode 100644 lib/math/math_kunit.c

-- 
2.39.1



Re: [PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions

2024-05-31 Thread Devarsh Thakkar
Hi Randy,

Thanks for the review.

On 31/05/24 04:14, Randy Dunlap wrote:
> 
> 
> On 5/30/24 10:17 AM, Devarsh Thakkar wrote:
>> Add documentation for rounding, scaling, absolute value and difference,
>> 32-bit division related macros and functions exported by math.h header
>> file.
>>
>> Signed-off-by: Devarsh Thakkar 
>> ---
>> V1->V9 (No change)
>> V10: Patch introduced
>> ---
>>  Documentation/core-api/kernel-api.rst | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Documentation/core-api/kernel-api.rst 
>> b/Documentation/core-api/kernel-api.rst
>> index ae92a2571388..fb467783d491 100644
>> --- a/Documentation/core-api/kernel-api.rst
>> +++ b/Documentation/core-api/kernel-api.rst
>> @@ -185,6 +185,12 @@ Division Functions
>>  .. kernel-doc:: lib/math/gcd.c
>> :export:
>>  
>> +Rounding, absolute value, scaling and 32bit division functions
> 
>  32-bit
> please.
> 

Good catch. Also I see some division functions supporting non-32bit functions
too, so I would make it as below :
Rounding, absolute value, division and 32-bit scaling functions
---

Regards
Devarsh


Re: [PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions

2024-05-31 Thread Devarsh Thakkar



On 31/05/24 00:51, Andy Shevchenko wrote:
> On Thu, May 30, 2024 at 10:47:40PM +0530, Devarsh Thakkar wrote:
>> Add documentation for rounding, scaling, absolute value and difference,
>> 32-bit division related macros and functions exported by math.h header
>> file.
> 
> ...
> 
>> +Rounding, absolute value, scaling and 32bit division functions
>> +--
>> +
>> +.. kernel-doc:: include/linux/math.h
>> +   :internal:
> 
> Please, double check that this is correct keyword in this case.
> 

Yes, this is inline as per what is described here [1] as there are no export
symbols in the header. Also the rendered output looks good.

[1] :
https://docs.kernel.org/doc-guide/kernel-doc.html#:~:text=net/mac80211/*.c-,internal,-%3A%20%5Bsource%2Dpattern

Regards
Devarsh


Re: [PATCH v10 08/11] lib: add basic KUnit test for lib/math

2024-05-31 Thread Devarsh Thakkar



On 31/05/24 00:53, Andy Shevchenko wrote:
> On Thu, May 30, 2024 at 10:48:10PM +0530, Devarsh Thakkar wrote:
>> From: Daniel Latypov 
>>
>> Add basic test coverage for files that don't require any config options:
>> * part of math.h (what seem to be the most commonly used macros)
>> * gcd.c
>> * lcm.c
>> * int_sqrt.c
>> * reciprocal_div.c
>> (Ignored int_pow.c since it's a simple textbook algorithm.)
>>
>> These tests aren't particularly interesting, but they
>> * provide short and simple examples of parameterized tests
>> * provide a place to add tests for any new files in this dir
>> * are written so adding new test cases to cover edge cases should be
>>   easy
>>   * looking at code coverage, we hit all the branches in the .c files
> 
> ...
> 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
> 
> Really, you ignored my comment a second (?) time? This is road to nowhere.
> You need to update the inclusion bloc in accordance with IWYU principle.
> I see a few headers are missing.
> 

Sorry I forgot to add those somehow. I will add module.h and math.h in v11.

Regards
Devarsh


Re: [PATCH v10 06/11] math.h: Add macros for rounding to closest value

2024-05-31 Thread Devarsh Thakkar
Hi Andy,

Thanks for the review.

On 31/05/24 00:49, Andy Shevchenko wrote:
> On Thu, May 30, 2024 at 10:42:25PM +0530, Devarsh Thakkar wrote:
>> Add below rounding related macros:
>>
>> round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
>> power of 2, with a preference to round up in case two nearest values are
>> possible.
>>
>> round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
>> power of 2, with a preference to round down in case two nearest values are
>> possible.
>>
>> roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
>> generally be used only when y is not multiple of 2 as otherwise
>> round_closest* macros should be used which are much faster.
>>
>> Examples:
>>  * round_closest_up(17, 4) = 16
>>  * round_closest_up(15, 4) = 16
>>  * round_closest_up(14, 4) = 16
>>  * round_closest_down(17, 4) = 16
>>  * round_closest_down(15, 4) = 16
>>  * round_closest_down(14, 4) = 12
>>  * roundclosest(21, 5) = 20
>>  * roundclosest(19, 5) = 20
>>  * roundclosest(17, 5) = 15
> 
> ...
> 
>> + * Examples :
> 
> It's inconsistent with the other one below.
> 
>> + *  round_closest_up(17, 4) = 16
>> + *
>> + *  round_closest_up(15, 4) = 16
>> + *
>> + *  round_closest_up(14, 4) = 16
> 
> The three have TABs/spaces mixture.
> 
> I believe you wanted:
> 
>  * Examples::
>  * * round_closest_up(17, 4) = 16
>  * * round_closest_up(15, 4) = 16
>  * * round_closest_up(14, 4) = 16
> 

I initially referred the style from this link [1] but probably missed to
remove extra space from my patch.

But what you suggested looks better, I will go with what you suggested.

[1]  https://elixir.bootlin.com/linux/v6.9/source/include/linux/int_log.h#L22

Regards
Devarsh


[PATCH v10 11/11] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-05-30 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V10: No change
V9: No change
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v10 10/11] media: imagination: Round to closest multiple for cropping region

2024-05-30 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

[1] :
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst

Signed-off-by: Devarsh Thakkar 
---
V10: No change
V9:  No change
V8:  Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index e701d573a26a..d65646f0c38c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v10 09/11] lib: math_kunit: Add tests for new macros related to rounding to nearest value

2024-05-30 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1].

[1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

Signed-off-by: Devarsh Thakkar 
---
V1->V10 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index 16263e30a292..60b66aa41121 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -67,6 +67,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -92,6 +112,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -269,8 +301,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v10 08/11] lib: add basic KUnit test for lib/math

2024-05-30 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
[devarsht: Rebase to 6.9, remove kernel.h, update Kconfig and change license to 
GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v9:
* Added Kconfig dependency for KUNIT

Changes since v8:
* Remove unrequired header file linux/kernel.h

Changes since v7:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v6:
* No change

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  14 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 290 ++
 3 files changed, 305 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..f738d8a433ea 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,17 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..16263e30a292
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,290 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15);
+}
+
+static void round_up_test(struct kun

[PATCH v10 07/11] Documentation: core-api: Add math.h macros and functions

2024-05-30 Thread Devarsh Thakkar
Add documentation for rounding, scaling, absolute value and difference,
32-bit division related macros and functions exported by math.h header
file.

Signed-off-by: Devarsh Thakkar 
---
V1->V9 (No change)
V10: Patch introduced
---
 Documentation/core-api/kernel-api.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/core-api/kernel-api.rst 
b/Documentation/core-api/kernel-api.rst
index ae92a2571388..fb467783d491 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -185,6 +185,12 @@ Division Functions
 .. kernel-doc:: lib/math/gcd.c
:export:
 
+Rounding, absolute value, scaling and 32bit division functions
+--
+
+.. kernel-doc:: include/linux/math.h
+   :internal:
+
 UUID/GUID
 -
 
-- 
2.39.1



[PATCH v10 06/11] math.h: Add macros for rounding to closest value

2024-05-30 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

V10:
- Update example comment to fix formatting issues as observed with html docs

V9:
- No change

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested:

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 72 
 1 file changed, 72 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..1f6177191b66 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,58 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ *
+ * round_closest_up(17, 4) = 16
+ *
+ * round_closest_up(15, 4) = 16
+ *
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples:
+ *
+ * round_closest_down(17, 4) = 16
+ *
+ * round_closest_down(15, 4) = 16
+ *
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +129,26 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples :
+ *
+ * roundclosest(21, 5) = 20
+ *
+ * roundclosest(19, 5) = 20
+ *
+ * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v10 00/11] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-30 Thread Devarsh Thakkar
/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/


Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (10):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: imagination: Use exported tables from v4l2-jpeg core
  media: verisilicon : Use exported tables from v4l2-jpeg for hantro
codec
  math.h: Add macros for rounding to closest value
  Documentation: core-api: Add math.h macros and functions
  lib: math_kunit: Add tests for new macros related to rounding to
nearest value
  media: imagination: Round to closest multiple for cropping region
  gpu: ipu-v3: Use generic macro for rounding closest to specified value

 Documentation/core-api/kernel-api.rst |6 +
 .../bindings/media/img,e5010-jpeg-enc.yaml|   75 +
 MAINTAINERS   |7 +
 drivers/gpu/ipu-v3/ipu-image-convert.c|4 +-
 drivers/media/platform/Kconfig|1 +
 drivers/media/platform/Makefile   |1 +
 drivers/media/platform/imagination/Kconfig|   12 +
 drivers/media/platform/imagination/Makefile   |3 +
 .../platform/imagination/e5010-core-regs.h|  585 ++
 .../platform/imagination/e5010-jpeg-enc-hw.c  |  267 +++
 .../platform/imagination/e5010-jpeg-enc-hw.h  |   42 +
 .../platform/imagination/e5010-jpeg-enc.c | 1644 +
 .../platform/imagination/e5010-jpeg-enc.h |  168 ++
 .../platform/imagination/e5010-mmu-regs.h |  311 
 .../media/platform/verisilicon/hantro_jpeg.c  |  128 +-
 drivers/media/v4l2-core/v4l2-jpeg.c   |  162 +-
 include/linux/math.h  |   72 +
 include/media/v4l2-jpeg.h |   28 +
 lib/math/Kconfig  |   14 +
 lib/math/Makefile |1 +
 lib/math/math_kunit.c |  325 
 21 files changed, 3738 insertions(+), 118 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/media/img,e5010-jpeg-enc.yaml
 create mode 100644 drivers/media/platform/imagination/Kconfig
 create mode 100644 drivers/media/platform/imagination/Makefile
 create mode 100644 drivers/media/platform/imagination/e5010-core-regs.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc-hw.h
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.c
 create mode 100644 drivers/media/platform/imagination/e5010-jpeg-enc.h
 create mode 100644 drivers/media/platform/imagination/e5010-mmu-regs.h
 create mode 100644 lib/math/math_kunit.c

-- 
2.39.1



Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-05-29 Thread Devarsh Thakkar
Hi Javier, Maxime, Daniel,

Sorry for the delay. Please find response inline.

On 16/05/24 18:21, Daniel Vetter wrote:
> On Wed, May 15, 2024 at 04:45:09PM +0200, Javier Martinez Canillas wrote:
>> Devarsh Thakkar  writes:

[..]

>>
>> If I understand you correctly, for now the only real use case is when the
>> the RTOS owns / manages the complete display pipeline and Linux can only
>> own video planes.
>>

Not exactly, What I mean is that this is the default configuration/example we
intend to provide to customer as an out-of-box demo . But flexibility is
provided to customer to modify the display sharing configuration per their
use-case, for e.g at RTOS side in place of device-tree, we have a sysconfig
menu [1] using which they can select the desired configuration, furthermore
they can go ahead and edit the code too, so Linux driver is expected to be
flexible to support different configurations as supported by the HW.

I have a limited view of all possible use-cases which customer may try out
with different configurations but few examples are shared below part from the
one discussed earlier :

[Examples]:
1) Customer is running Linux as main OS but using RTOS to control some
external peripherals like temperature sensor, motion sensor e.t.c. In that
case if they want to display the sensor data too on the same monitor, then
they can use the configuration where RTOS use single plane and Linux as the
DSS master.

2) Another configuration could be where RTOS want to control one full
end-to-end pipeline going to one connector and Linux want to control full
end-to-end pipeline going to another connector, that can be supported too
using this scheme (as shared in this series).

3) Also I think, this device-tree based scheme could be leveraged in
virtualization too with static partitioning based scheme using Xen for e.g. we
split the DSS resources between host (DOM0) and the guest (DOMU).

>> The opposite is supported by the DSS hardware (thanks to its feature that
>> allows partitioning the register space and having multiple per-host IRQs) 
>> but it's not a real use case yet. The reason why this case is added to the
>> DT binding is as you said for flexiblity and make the design future-proof.
>>

Not really, as explained above we are documenting all possible configurations
which hardware supports as supported in software in the SDK and that's what we
are aiming for upstream too.

[..]>>>> I'm probably missing something then here, but if the Linux side of
>>>> things is expected to keep the current configuration and keep it active
>>>> for it to work, what use-case would it be useful for?
>>>>
>>>
>>> It's just one of the partitioning possibilities that I mentioned here, that
>>> Linux is in control of DSS as a whole and the user want the other host (be 
>>> it
>>> RTOS or any other core) to control a single plane. For e.g it could be Linux
>>> (with GPU rendering) displaying the graphics and RTOS overlaying a real time
>>> clock or any other signs which need to be displayed in real-time.
>>> But more than the use-case this is inspired by the fact that we want to be
>>> flexible and support in the linux driver whatever partitioning scheme
>>> possibilities are there which are supported in hardware and we let user 
>>> decide
>>> on the partitioning scheme.
>>>
>>
>> A possible use case here could be if Linux is safer than the other host
>> owning a single plane, right? Then in that case the RTOS could fail but
>> the display pipeline won't be teared down.
>>
>> That is, if your safety tell-tales would be driven by Linux and having
>> other OS dislay the GPU-rendered QT based application on another plane.
>>
>> But as said, for now that's a theorethical use case since the one you
>> mentioned is the opposite.
>>
>> []
>>

Yes that could be a possible use-case too, we want to provide customer the
flexibility in their app design to select different configuration, as we say
all these configurations as supported in the driver.

[..]
>>>>> If there is a more complex use-case which requires dynamic
>>>>> assignment/arbitration of resources then I agree those require some sort 
>>>>> of
>>>>> IPC scheme but this is not what we target with these series. This series 
>>>>> is
>>>>> simply to support static partitioning feature (separate register space,
>>>>> separate irq, firewalling support etc) of TI DSS hardware across the 
>>>>> multiple
>>>>> hosts and there are use-cases too for which this scheme suffices.
>>>>
>>>&

Re: [PATCH v9 07/10] lib: add basic KUnit test for lib/math

2024-05-28 Thread Devarsh Thakkar



On 28/05/24 02:08, Andy Shevchenko wrote:
> On Mon, May 27, 2024 at 11:37:20PM +0300, Andy Shevchenko wrote:
>> On Sun, May 26, 2024 at 11:39:33PM +0530, Devarsh Thakkar wrote:
> 
> ...
> 
>>> +MODULE_LICENSE("GPL");
>>
>> modpost validator won't be happy about this, i.e. missing 
>> MODULE_DESCRIPTION().
> 
> And obviously + module.h in the inclusion block.
> 

The module.h is already included under include/kunit/test.h and that's the
reason compiler did not give any error. But I can still include it under
math.h for better readability as you suggested as anyway compiler will not
re-include if already included by another header file.

Also I see we were missing a dependency between math_kunit and kunit modules,
so adding a dependency there too.

Regards
Devarsh


Re: [PATCH v9 07/10] lib: add basic KUnit test for lib/math

2024-05-28 Thread Devarsh Thakkar
Hi Andy,

Thanks for the review.

On 28/05/24 02:07, Andy Shevchenko wrote:

[..]
>> +#include 
>> +#include 
>> +#include 
> 
> + math.h (where abs()/DIV_ROUND_*()/etc come from?)
> I believe I mentioned that.
> 

I did compile and test this, so math.h was indirectly getting included via
some other header file already included but I would not rely on that and
include math.h separately as you suggested.

>> +#include 
> 
> ...
> 
>> +MODULE_LICENSE("GPL");
> 
> modpost validator won't be happy about this, i.e. missing 
> MODULE_DESCRIPTION().
> 

Indeed, it gives below logs, let me add that too.
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/math_kunit.o


Regards
Devarsh


Re: [PATCH v9 06/10] math.h: Add macros for rounding to closest value

2024-05-28 Thread Devarsh Thakkar
Hi Andy,

Thanks for the review.

On 28/05/24 02:02, Andy Shevchenko wrote:
> On Sun, May 26, 2024 at 11:38:56PM +0530, Devarsh Thakkar wrote:
...

>> +/**
>> + * round_closest_up - round closest to be multiple of specified value 
>> (which is
>> + *power of 2) with preference to rounding up
>> +
> 
> Not that big deal, but missing '*' here. Personally I would not even put
> a blank line between Summary and Field Descriptions.
> 

My bad. Yes I would remove the blank line here. This is picked up as warning
from kernel-doc too.

>> + * @x: the value to round
>> + * @y: multiple to round closest to (must be a power of 2)
>> + *
>> + * Rounds @x to closest multiple of @y (which must be a power of 2).
>> + * The value can be either rounded up or rounded down depending upon rounded
>> + * value's closeness to the specified value. If there are two closest 
>> possible
>> + * values, i.e. the difference between the specified value and it's rounded 
>> up
>> + * and rounded down values is same then preference is given to rounded up
>> + * value.
>> + *
>> + * To perform arbitrary rounding to closest value (not multiple of 2), use
>> + * roundclosest().
>> + *
>> + * Examples :
> 
> What is this suppose to be rendered to?
> 

The file math.h is not rendered as part of kernel-doc right now. I can put
this under Documentation/core-api/kernel-api.rst perhaps I can create a new
section as below:

Rounding, absolute diff and 32bit division macros
-

under the section:
CRC and Math Functions in Linux

===

is that okay ?

>> + * round_closest_up(17, 4) = 16
>> + * round_closest_up(15, 4) = 16
>> + * round_closest_up(14, 4) = 16
> 
> Btw, is kernel-doc validator happy about all kernel docs you added?
> 

Yes, except the aforementioned blank line.

Regards
Devarsh


[PATCH v9 10/10] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-05-26 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V9: No change
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v9 09/10] media: imagination: Round to closest multiple for cropping region

2024-05-26 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

[1] :
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst

Signed-off-by: Devarsh Thakkar 
---
V9: No change
V8: Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index e701d573a26a..d65646f0c38c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v9 08/10] lib: math_kunit: Add tests for new macros related to rounding to nearest value

2024-05-26 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1].

[1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

Signed-off-by: Devarsh Thakkar 
---
V1->V9 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index 16263e30a292..60b66aa41121 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -67,6 +67,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -92,6 +112,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -269,8 +301,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v9 07/10] lib: add basic KUnit test for lib/math

2024-05-26 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
[devarsht: Rebase to 6.9, remove kernel.h and change license to GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v8:
* Remove unrequired header file linux/kernel.h

Changes since v7:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v6:
* No change

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  11 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 290 ++
 3 files changed, 302 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..832c6989ca13 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,14 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..16263e30a292
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,290 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15);
+}
+
+static void round_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_up(0, 1), 0);
+   KUNIT_EXPECT_EQ(test, round_up(1, 2), 2);
+   KUNIT_EXPECT_EQ(test, round_up(3, 

[PATCH v9 06/10] math.h: Add macros for rounding to closest value

2024-05-26 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

V9:
- No change

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested:

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 65 
 1 file changed, 65 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..e2cc3769ed0e 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,54 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_up(17, 4) = 16
+ * round_closest_up(15, 4) = 16
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ *
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_down(17, 4) = 16
+ * round_closest_down(15, 4) = 16
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +125,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples :
+ * roundclosest(21, 5) = 20
+ * roundclosest(19, 5) = 20
+ * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v9 00/10] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-26 Thread Devarsh Thakkar
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder
which is a stateful JPEG encoder from Imagination technologies
and is present in TI AM62A SoC.

While adding support for it, following additional framework changes were
made:
 - Moved reference quantization and huffman tables provided in
   ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1].
 - Add macros to round to closest integer (either higher or lower) while
   rounding in order of 2.
 - Add KUnit tests for math functions.

v4l2-compliance test :
Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49

E5010 JPEG Encoder Manual tests :

Performance:
Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b

Functionality:
Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce

Compression Quality:
Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655

Multi Instance:
Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a

Crop support:
Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd

Runtime PM:
Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd

Math lib KUnit tests:
Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

[1]: 
https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/

Changelog:
V8->V9:
 - Remove kernel.h header file
 - Remove stale filler data on jpeg header in E5010 jpeg driver

V7->V8:
 - Add KUnit tests for math functions
 - Add roundclosest() for supporting rounding for non-multiple of 2
 - Update commit message as suggested
 - Add Reviewed-by and Acked-by tags to patches as received

V6->V7:
 - Fix cropping support
 - Move reference huffman and quantization tables to v4l2-jpeg.c
 - Fix suspend/resume use-case
 - Add Reviewed-by

V5->V6:
 - Fix sparse warnings

V4->V5:
 - Sort the #includes in driver file alphabetically
 - Rename huffman and quantization tables to not use '_'
 - Add Reviewed-by tag

V3->V4:
- Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in
  dt-binding
- Remove clock-names as only single clock in dt-binding
- Fix issue with default params setting
- Correct v4l2 error prints
- Simplify register write functions with single statement return values
- Remove unrequired error checks from get_queue()
- Drop explicit device_caps setting as it is already taken care by v4l2
  core
- Remove unrequired multiplanar checks and memset from s_fmt, g_fmt
  callback functions
- Fix try_fmt callback to not update the queues
- Remove unrequired contiguous format attribute from queue_init
- Use dynamic allocation for video_device and remove unrequired
  assignments in probe()
- Remove unrequired checks from queue_setup function
- Return queued buffers back if start_streaming fails
- Use ARRAY_SIZE in place of hard-coding
- Use huffman and quantization tables from reference header file

V2->V3:
- Add DONOTMERGE patches for dts and defconfig
- Update driver with below changes :
  - Correct license headers
  - Use more generic name core instead of jasper for base registers
  - Add Comment for forward declarations
  - Simplify quantization table calculations
  - Use v4l2_apply_frmsize_constraints for updating framesize and remove
unrequired functions
  - Place TODO at top of file and in commit message too
  - Use dev_err_probe helper in probe function
  - Fix return value checking for failure scenarios in probe function
  - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers
  - Fix unexpected indentation
  - Correct commit message
- Update dt-bindings with below changes :
  - Add vendor specific compatible 
  - Fix commit title and message
  - Update reg names
  - Update clocks to 1
  - Fix dts example with proper naming

V1->V2:
 - Send dt-bindings and driver together

Patch-Diff between the series :
V8->V9 Range diff :
https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74

V7->V8 Range diff :
https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74

V6->V7 Range diff :
https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e

V5->V6 Range diff :
https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19

V4->V5 Range diff :
https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b

V3->V4 Range diff :
https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272

Previous patch series:
V8: https://lore.kernel.org/all/20240517171532.748684-1-devar...@ti.com/
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/

Daniel Latypov (1

Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-20 Thread Devarsh Thakkar
On 20/05/24 17:52, Andy Shevchenko wrote:
> On Mon, May 20, 2024 at 05:11:18PM +0530, Devarsh Thakkar wrote:
>> On 18/05/24 01:44, Andy Shevchenko wrote:
>>> On Fri, May 17, 2024 at 11:06:07PM +0530, Devarsh Thakkar wrote:
> 
> [..]
> 

[..]
> Yes, and one should follow IWYU principle and not cargo cult or whatever
> arbitrary lists.
> 

Agreed.

>>>> +#include 
>>>
>>> + math.h // obviously
>>> + module.h
>>>
>>>> +#include 
>>>
>>> + types.h
>>
>> All the above headers are already included as part of kernel.h
> 
> Yes, that's why you should not use "proxy" headers.
> Have you read the top comment in the kernel.h?
> 

Yes, it says it is not recommended to include this inside another header file.
Although here we are adding it inside c file, but I can still try avoid it and
include only the required headers instead of kernel.h as you recommended.

Regards
Devarsh


Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-20 Thread Devarsh Thakkar
Hi Andy, Daniel,

On 18/05/24 01:44, Andy Shevchenko wrote:
> On Fri, May 17, 2024 at 11:06:07PM +0530, Devarsh Thakkar wrote:
[..]
> 
>> [devarsht: Rebase to 6.9 and change license to GPL]
> 
> I'm not sure that you may change license. It needs the author's confirmation.
> 

As per latest licensing doc [1], It quotes that GPL  is same as GPL v2 and GPL
v2 exists only for historical reasons as shared below :

“GPL v2 Same as “GPL”. It exists for historic reasons."

So I think it should be fine and more apt to use GPL. This is also highlighted
in below commit:
bf7fbeeae6db644ef5995085de2bc5c6121f8c8d module: Cure the MODULE_LICENSE "GPL"
vs. "GPL v2" bogosity

>> ---
>> Changes since v6:
>> * Rebase to linux-next, change license to GPL as suggested by checkpatch.
> 
> Note, checkpatch.pl is not false positives free. Be careful
> with what it suggests.
> 
>> +#include 
>> +#include 
> 
>> +#include 
> 
> Do you know why this header is included?
> 

It includes all the other required headers (including those you mentioned
below), and header list is probably copied from other files in same directory.
But it does suffice the requirements as I have verified the compilation.

>> +#include 
> 
> + math.h // obviously
> + module.h
> 
>> +#include 
> 
> + types.h
> 

All the above headers are already included as part of kernel.h

Kindly let me know if any queries.

[1]: https://docs.kernel.org/process/license-rules.html

Regards
Devarsh


[PATCH v8 10/10] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-05-17 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v8 09/10] media: imagination: Round to closest multiple for cropping region

2024-05-17 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

[1] :
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst

Signed-off-by: Devarsh Thakkar 
---
V8: Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index 189e2a99c43d..abd66bc9b96c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v8 08/10] lib: math_kunit: Add tests for new rounding related macros

2024-05-17 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1].

[1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

Signed-off-by: Devarsh Thakkar 
---
V1->V8 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index 1b00e4195a1a..1955dcd5655d 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -68,6 +68,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -93,6 +113,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -270,8 +302,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



[PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-17 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
[devarsht: Rebase to 6.9 and change license to GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v6:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  11 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 291 ++
 3 files changed, 303 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..832c6989ca13 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,14 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..1b00e4195a1a
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15);
+}
+
+static void round_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_up(0, 1), 0);
+   KUNIT_EXPECT_EQ(test, round_up(1, 2), 2);
+   KUNIT_EXPECT_EQ(test, round_up(3, 2), 4);
+   KUNIT_EXPECT_EQ(test, round_up((1 << 30) - 1, 2), 1 << 30);
+   KUNIT_

[PATCH v8 06/10] math.h: Add macros for rounding to closest value

2024-05-17 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 65 
 1 file changed, 65 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..e2cc3769ed0e 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,54 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_up(17, 4) = 16
+ * round_closest_up(15, 4) = 16
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ *
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_down(17, 4) = 16
+ * round_closest_down(15, 4) = 16
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +125,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples :
+ * roundclosest(21, 5) = 20
+ * roundclosest(19, 5) = 20
+ * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



[PATCH v8 00/10] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-17 Thread Devarsh Thakkar
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder
which is a stateful JPEG encoder from Imagination technologies
and is present in TI AM62A SoC.

While adding support for it, following additional framework changes were
made:
 - Moved reference quantization and huffman tables provided in
   ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1].
 - Add macros to round to closest integer (either higher or lower) while
   rounding in order of 2.
 - Add KUnit tests for math functions.

v4l2-compliance test :
Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49

E5010 JPEG Encoder Manual tests :

Performance:
Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b

Functionality:
Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce

Compression Quality:
Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655

Multi Instance:
Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a

Crop support:
Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd

Runtime PM:
Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd

Math lib KUnit tests:
Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

[1]: 
https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/

Changelog:
V7->V8:
 - Add KUnit tests for math functions
 - Add roundclosest() for supporting rounding for non-multiple of 2
 - Update commit message as suggested
 - Add Reviewed-by and Acked-by tags to patches as received

V6->V7:
 - Fix cropping support
 - Move reference huffman and quantization tables to v4l2-jpeg.c
 - Fix suspend/resume use-case
 - Add Reviewed-by

V5->V6:
 - Fix sparse warnings

V4->V5:
 - Sort the #includes in driver file alphabetically
 - Rename huffman and quantization tables to not use '_'
 - Add Reviewed-by tag

V3->V4:
- Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in
  dt-binding
- Remove clock-names as only single clock in dt-binding
- Fix issue with default params setting
- Correct v4l2 error prints
- Simplify register write functions with single statement return values
- Remove unrequired error checks from get_queue()
- Drop explicit device_caps setting as it is already taken care by v4l2
  core
- Remove unrequired multiplanar checks and memset from s_fmt, g_fmt
  callback functions
- Fix try_fmt callback to not update the queues
- Remove unrequired contiguous format attribute from queue_init
- Use dynamic allocation for video_device and remove unrequired
  assignments in probe()
- Remove unrequired checks from queue_setup function
- Return queued buffers back if start_streaming fails
- Use ARRAY_SIZE in place of hard-coding
- Use huffman and quantization tables from reference header file

V2->V3:
- Add DONOTMERGE patches for dts and defconfig
- Update driver with below changes :
  - Correct license headers
  - Use more generic name core instead of jasper for base registers
  - Add Comment for forward declarations
  - Simplify quantization table calculations
  - Use v4l2_apply_frmsize_constraints for updating framesize and remove
unrequired functions
  - Place TODO at top of file and in commit message too
  - Use dev_err_probe helper in probe function
  - Fix return value checking for failure scenarios in probe function
  - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers
  - Fix unexpected indentation
  - Correct commit message
- Update dt-bindings with below changes :
  - Add vendor specific compatible 
  - Fix commit title and message
  - Update reg names
  - Update clocks to 1
  - Fix dts example with proper naming

V1->V2:
 - Send dt-bindings and driver together

Patch-Diff between the series :
V7->V8 Range diff :
https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74

V6->V7 Range diff :
https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e

V5->V6 Range diff :
https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19

V4->V5 Range diff :
https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b

V3->V4 Range diff :
https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272

Previous patch series:
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/

Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (9):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: imagination: U

Re: [PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region

2024-05-14 Thread Devarsh Thakkar
Hi Nicolas,

Thanks for the review.

On 15/05/24 01:52, Nicolas Dufresne wrote:
> Le samedi 11 mai 2024 à 22:38 +0530, Devarsh Thakkar a écrit :
>> Hi Andy,
>>
>> Thanks for the quick review.
>> On 10/05/24 20:40, Andy Shevchenko wrote:
>>> On Fri, May 10, 2024 at 12:10:01AM +0530, Devarsh Thakkar wrote:
>>>> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
>>>> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
>>>> multiple of requested value while updating the crop rectangle coordinates.
>>>>
>>>> Use the rounding macro which gives preference to rounding down in case two
>>>> nearest values (high and low) are possible to raise the probability of
>>>> cropping rectangle falling inside the bound region.
>>>
>>> This is arguable. How do we know that the bigger range is supported?
>>> The safest side is to go smaller than bigger.
>>>
>>
>> Yes and that's what the driver does when do when application passes
>>  while doing the selection. If application does not
>> specify explicitly whether to round down or round up the cropping
>> parameters requested by it (i.e app is neither passing V4L2_SEL_FLAG_LE
>> nor V4L2_SEL_FLAG_GE flags), then it is preferred by driver to round the
>> cropping parameters to nearest possible value by either rounding down or
>> rounding up to align with hardware requirements.
>>
>> For e.g. If requested width for cropping region is 127 and HW requires
>> width to be multiple of 64 then we would prefer to round it up to 128
>> rather than rounding down to a more distant value (i.e. 64), but if
>> requested cropping width is 129 then we would prefer to instead round it
>> down to 128. But if requested cropping width is 160 then there are two
>> nearest possible values 160 - 32 = 128 and 160 + 32 = 192 and in which
>> case we prefer the smaller value as you suggested and that's why the
>> driver uses round_closest_down.
>>
>> For any reason, if still the cropping rectangle falls beyond the bound
>> region, then driver will return out of range error (-ERANGE) to
>> application.
> 
> I would appreciate if this change was based on specification text, meaning
> improving the next if that behaviour is undefined. We might not be able to fix
> it everywhere, but we can recommend something.
> 

Yes, this change is based on specification text. This complies with the
VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl description as documented in
v4l uapi [1] which means driver should choose crop rectangle as close as
possible if no flags are passed by user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall
 choose a crop/compose rectangle as close as possible to the requested
 one."

If the user-space has specific requirement to either round down, round up or
honor exact values, it should pass V4L2_SEL_FLAG_LE, V4L2_SEL_FLAG_GE or
V4L2_SEL_FLAG_LE | V4L2_SEL_FLAG_GE flags respectively.

[1]
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst#:~:text=compose%20rectangle%20as-,close,-as%20possible%20to

Regards
Devarsh


Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-13 Thread Devarsh Thakkar
Hi Andy,

On 13/05/24 17:55, Andy Shevchenko wrote:
> On Mon, May 13, 2024 at 04:55:58PM +0530, Devarsh Thakkar wrote:
>> On 13/05/24 14:29, Andy Shevchenko wrote:
>>> On Sat, May 11, 2024 at 11:11:14PM +0530, Devarsh Thakkar wrote:
>>>> On 10/05/24 20:45, Jani Nikula wrote:
> 
> [...]
> - align naming (with the existing round*() macros)

I think round_closest_up/round_closest_down align already and inspired by the
existing naming convention used for round*() and DIV_ROUND_CLOSEST() macros in
math.h as explained below (copied from my previous reply [1])

"Coming back to naming, this is as per existing convention used for naming
round_up, round_down (notice the `_` being used for macros working with pow of
2) and DIV_ROUND_CLOSEST (notice the work `closest` used to specify the answer
 to be nearest to specified value)"

But do let me know if you have any other suggestions for naming?

> - add examples into commit message of the math.h patch

Agreed
> - add test cases (you need to create lib/math_kunit.c for that)
Agreed.
> - elaborate in the commit message of the IPU3 change what you posted above
>   (some kind of a summary)
Agreed.


[1]: https://lore.kernel.org/all/zkig0-01pz632...@smile.fi.intel.com/

Regards
Devarsh


Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-13 Thread Devarsh Thakkar
Hi Andy,

On 13/05/24 14:29, Andy Shevchenko wrote:
> On Sat, May 11, 2024 at 11:11:14PM +0530, Devarsh Thakkar wrote:
>> On 10/05/24 20:45, Jani Nikula wrote:
> 
> [...]
> 
>>> Moreover, I think the naming of round_up() and round_down() should have
>>> reflected the fact that they operate on powers of 2. It's unfortunate
>>> that the difference to roundup() and rounddown() is just the underscore!
>>> That's just a trap.
>>>
>>> So let's perhaps not repeat the same with round_closest_up() and
>>> round_closest_down()?
>>
>> Yes the naming is inspired by existing macros i.e. round_up, round_down
>> (which round up/down to next pow of 2) and DIV_ROUND_CLOSEST (which
>> round the divided value to closest value) and there are already a lot of
>> users for these API's :
>>
>>   linux-next git:(heads/next-20240509) ✗ grep -nr round_up drivers | wc
>> 7304261   74775
>>
>>   linux-next git:(heads/next-20240509) ✗ grep -nr round_down drivers | wc
>> 2261293   22194
>>
>>  linux-next git:(heads/next-20240509) ✗ grep -nr DIV_ROUND_CLOSEST
>> drivers | wc
>>12077461  111822
> 
> Side note, discover `git grep ...`: it's much much faster on Git index,
> than classic one on a working copy.
> 
>> so I thought to align with existing naming convention assuming
>> developers are already familiar with this.
>>
>> But if a wider consensus is to go with a newer naming convention then I
>> am open to it, although a challenge there would be to keep it short. For
>> e.g. this one is already 3 words, if we go with more explicit
>> "round_closest_up_pow_2" it looks quite long in my opinion :) .
> 
> You need properly name the macros. Again, round_up() / roundup() and
> roundup_pow_of_two() are three _different_ macros, and it's not clear
> why you can't use one of them in your case.
> 

I can't use any of these because these macros either round up or round down,
whereas I want to round to closest value for the argument specified by the
user, be it achieved either by rounding up or rounding down depending upon
whichever makes the answer closer to the user-specified argument.

To make it clear, I have already included the examples in the macro
description [2], copying it here, maybe I can put the same examples in the
commit message too to avoid confusions :

round_closest_up(17, 4) = 16
round_closest_up(15, 4) = 16
round_closest_up(14, 4) = 16

round_closest_down(17, 4) = 16
round_closest_down(15, 4) = 16
round_closest_down(14, 4) = 12

Coming back to naming, this is as per existing convention used for naming
round_up, round_down (notice the `_` being used for macros working with pow of
2) and DIV_ROUND_CLOSEST (notice the work `closest` used to specify the answer
to be nearest to specified value). Naming is a bit subjective, but I
personally don't think it is a good idea to go away with the existing naming
convention or go with longer names.

> The patch that changes those to a new one are doubtful to begin with.
> I.e. need a careful review on the arithmetics side of the change
> including HW capabilities of handling "closest" results.
> 

This is already tested from my side, in-fact I have posted some of the results
in cover-letter with these macros [1] :

Regarding hardware capabilities, it uses existing round_up, round_down macros
underneath which are optimized to handle pow of 2 after modifying the user
provided argument using addition/subtraction and division, so I don't think it
should generally a problem with the hardware.
And I see other macros DIV_ROUND_CLOSEST [3] already using similar operations
i.e. addition/subtraction and division so don't think it should be a problem
to keep similar other macros in the same file.

[1]:
https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd#file-v7_jpeg_encoder_crop_validation-L204
[2]: https://lore.kernel.org/all/20240509183952.4064331-1-devar...@ti.com/
[3]: https://elixir.bootlin.com/linux/latest/source/include/linux/math.h#L86

Regards
Devarsh


Re: [PATCH v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple

2024-05-11 Thread Devarsh Thakkar
Hi Andy,

Thanks for the quick review.

On 10/05/24 20:33, Andy Shevchenko wrote:
> On Fri, May 10, 2024 at 12:10:10AM +0530, Devarsh Thakkar wrote:
>> Use generic macro round_closest_up for rounding to nearest multiple instead
> 
> round_closest_up()
> 
> We refer to the functions as func().
> 

Agreed. Will fix commit msg to use round_closest_up()

>> of using local function.
> 
> ...
> 
>> @@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
>> *ctx,
>>   * The closest input sample position that we could actually
>>   * start the input tile at, 19.13 fixed point.
>>   */
>> -in_pos_aligned = round_closest(in_pos, 8192U * in_align);
>> +in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
>>  /* Convert 19.13 fixed point to integer */
>>  in_pos_rounded = in_pos_aligned / 8192U;
> 
> Oh, these seems to be better to use either ALIGN*(), or PFN_*() / PAGE_*()
> families of macros. What the semantic of 8192 is?
> 

As Laurent mentioned, it looks like the fractional part of the integer.
But functionality wise, there is no change with the introduction of this
patch. round_closest_up() does exactly the same thing as what the local
function round_closest used to do before this patch.

Regards
Devarsh


Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-11 Thread Devarsh Thakkar
Hi Jani,

Thanks for the quick review.

On 10/05/24 20:45, Jani Nikula wrote:

[...]
> Moreover, I think the naming of round_up() and round_down() should have
> reflected the fact that they operate on powers of 2. It's unfortunate
> that the difference to roundup() and rounddown() is just the underscore!
> That's just a trap.
> 
> So let's perhaps not repeat the same with round_closest_up() and
> round_closest_down()?
> 

Yes the naming is inspired by existing macros i.e. round_up, round_down
(which round up/down to next pow of 2) and DIV_ROUND_CLOSEST (which
round the divided value to closest value) and there are already a lot of
users for these API's :

  linux-next git:(heads/next-20240509) ✗ grep -nr round_up drivers | wc
7304261   74775

  linux-next git:(heads/next-20240509) ✗ grep -nr round_down drivers | wc
2261293   22194

 linux-next git:(heads/next-20240509) ✗ grep -nr DIV_ROUND_CLOSEST
drivers | wc
   12077461  111822


so I thought to align with existing naming convention assuming
developers are already familiar with this.

But if a wider consensus is to go with a newer naming convention then I
am open to it, although a challenge there would be to keep it short. For
e.g. this one is already 3 words, if we go with more explicit
"round_closest_up_pow_2" it looks quite long in my opinion :) .

Regards
Devarsh


Re: [PATCH v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-11 Thread Devarsh Thakkar
Hi Andy,

Thanks for the quick review.

On 10/05/24 20:31, Andy Shevchenko wrote:
> On Fri, May 10, 2024 at 12:09:52AM +0530, Devarsh Thakkar wrote:
>> Add macros to round to nearest specified power of 2.
> 
> This is not what they are doing. For the above we already have macros defined.
> 

Sorry I did not understand this comment, if you are talking about
existing macros round_up & round_down they either round "up" and round
"down" to specified power of 2 as specified here [1].
whereas the macros introduced in this patch round to "nearest" specified
power of 2.

>> Two macros are added :
> 
> (Yes, after I wrapped to comment this line looks better on its own,
>  so whatever will be the first sentence, this line should be separated
>  from.)
> 

Agreed.

>> round_closest_up and round_closest_down which round up to nearest multiple
> 
> round_closest_up() and round_closest_down()
> 
> 
>> of 2 with a preference to round up or round down respectively if there are
>> two possible nearest values to the given number.
> 
> You should reformulate, because AFAICS there is the crucial difference
> from these and existing round_*_pow_of_two().
> 

In math.h, we already have round_up/round_down macros which rounded
up/down to next specified power of 2. Then we had the DIV_ROUND_CLOSEST
which used the suffix _CLOSEST to imply the meaning that divided value
will be rounded to nearest/closest int value either by rounding up or
rounding down.

So inspired from naming convention of this macros given developers are
already familiar with them, I used round_closest for specifying the
rounded value to nearest/closest value which can be achieved either by
rounding up/down. And I also wanted user to have finer control for the
scenarios where there are two possible nearest values :

For e.g round(160, 64) can be either 192 and 128 and both are 32 away
from 160.

And that's the reason I went ahead with two macros instead i.e
round_closest_up, round_closest_down so that developers can choose
exactly what they want.

Regards
Devarsh


Re: [PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region

2024-05-11 Thread Devarsh Thakkar
Hi Andy,

Thanks for the quick review.
On 10/05/24 20:40, Andy Shevchenko wrote:
> On Fri, May 10, 2024 at 12:10:01AM +0530, Devarsh Thakkar wrote:
>> If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
>> (V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
>> multiple of requested value while updating the crop rectangle coordinates.
>>
>> Use the rounding macro which gives preference to rounding down in case two
>> nearest values (high and low) are possible to raise the probability of
>> cropping rectangle falling inside the bound region.
> 
> This is arguable. How do we know that the bigger range is supported?
> The safest side is to go smaller than bigger.
> 

Yes and that's what the driver does when do when application passes
V4L2_SEL_FLAG_LE while doing the selection. If application does not
specify explicitly whether to round down or round up the cropping
parameters requested by it (i.e app is neither passing V4L2_SEL_FLAG_LE
nor V4L2_SEL_FLAG_GE flags), then it is preferred by driver to round the
cropping parameters to nearest possible value by either rounding down or
rounding up to align with hardware requirements.

For e.g. If requested width for cropping region is 127 and HW requires
width to be multiple of 64 then we would prefer to round it up to 128
rather than rounding down to a more distant value (i.e. 64), but if
requested cropping width is 129 then we would prefer to instead round it
down to 128. But if requested cropping width is 160 then there are two
nearest possible values 160 - 32 = 128 and 160 + 32 = 192 and in which
case we prefer the smaller value as you suggested and that's why the
driver uses round_closest_down.

For any reason, if still the cropping rectangle falls beyond the bound
region, then driver will return out of range error (-ERANGE) to
application.

Regards
Devarsh




[PATCH RESEND v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple

2024-05-10 Thread Devarsh Thakkar
Use generic macro round_closest_up for rounding to nearest multiple instead
of using local function.

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH RESEND v7 7/8] media: imagination: Round to closest multiple for cropping region

2024-05-10 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index 189e2a99c43d..abd66bc9b96c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH RESEND v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-10 Thread Devarsh Thakkar
Add macros to round to nearest specified power of 2. Two macros are added :
round_closest_up and round_closest_down which round up to nearest multiple
of 2 with a preference to round up or round down respectively if there are
two possible nearest values to the given number.

This patch is inspired from the Mentor Graphics IPU driver [1] which uses
similar macro locally and which can be updated to use this generic macro
instead along with other drivers having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 36 
 1 file changed, 36 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..82ee3265c910 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,42 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round to nearest specified power of 2 with preference
+ *   to rounding up
+ * @x: the value to round
+ * @y: multiple to round to (must be a power of 2)
+ *
+ * Rounds @x to nearest multiple of @y (which must be a power of 2).
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If there are two nearest possible values,
+ * then preference will be given to the greater value.
+ *
+ * Examples :
+ * round_closest_up(17, 4) = 16
+ * round_closest_up(15, 4) = 16
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round to nearest specified power of 2 with preference
+ * to rounding down
+ * @x: the value to round
+ * @y: multiple to round down to (must be a power of 2)
+ *
+ * Rounds @x to nearest multiple of @y (which must be a power of 2).
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If there are two nearest possible values,
+ * then preference will be given to the lesser value.
+ *
+ * Examples :
+ * round_closest_down(17, 4) = 16
+ * round_closest_down(15, 4) = 16
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
-- 
2.39.1



[PATCH RESEND v7 0/8] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-10 Thread Devarsh Thakkar
Resending this V7 series to have proper linking of patches in the series
with cover-letter while doing git send-email.

Original cover letter: 
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder
which is a stateful JPEG encoder from Imagination technologies
and is present in TI AM62A SoC.

While adding support for it, following additional framework changes were
made:
 - Moved reference quantization and huffman tables provided in
   ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1].
 - Add macros to round to closest integer (either higher or lower) while
   rounding in order of 2.

v4l2-compliance test :
Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49

E5010 JPEG Encoder Manual tests :

Performance:
Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b

Functionality:
Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce

Compression Quality:
Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655

Multi Instance:
Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a

Crop support:
Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd

Runtime PM:
Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd

[1]: 
https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/

Changelog:
V6->V7:
 - Fix cropping support
 - Move reference huffman and quantization tables to v4l2-jpeg.c
 - Fix suspend/resume use-case
 - Add Reviewed-by

V5->V6:
 - Fix sparse warnings

V4->V5:
 - Sort the #includes in driver file alphabetically
 - Rename huffman and quantization tables to not use '_'
 - Add Reviewed-by tag

V3->V4:
- Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in
  dt-binding
- Remove clock-names as only single clock in dt-binding
- Fix issue with default params setting
- Correct v4l2 error prints
- Simplify register write functions with single statement return values
- Remove unrequired error checks from get_queue()
- Drop explicit device_caps setting as it is already taken care by v4l2
  core
- Remove unrequired multiplanar checks and memset from s_fmt, g_fmt
  callback functions
- Fix try_fmt callback to not update the queues
- Remove unrequired contiguous format attribute from queue_init
- Use dynamic allocation for video_device and remove unrequired
  assignments in probe()
- Remove unrequired checks from queue_setup function
- Return queued buffers back if start_streaming fails
- Use ARRAY_SIZE in place of hard-coding
- Use huffman and quantization tables from reference header file

V2->V3:
- Add DONOTMERGE patches for dts and defconfig
- Update driver with below changes :
  - Correct license headers
  - Use more generic name core instead of jasper for base registers
  - Add Comment for forward declarations
  - Simplify quantization table calculations
  - Use v4l2_apply_frmsize_constraints for updating framesize and remove
unrequired functions
  - Place TODO at top of file and in commit message too
  - Use dev_err_probe helper in probe function
  - Fix return value checking for failure scenarios in probe function
  - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers
  - Fix unexpected indentation
  - Correct commit message
- Update dt-bindings with below changes :
  - Add vendor specific compatible 
  - Fix commit title and message
  - Update reg names
  - Update clocks to 1
  - Fix dts example with proper naming

V1->V2:
 - Send dt-bindings and driver together

Patch-Diff between the series :

V3->V4 Range diff :
https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272

V4->V5 Range diff :
https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b

V5->V6 Range diff :
https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19

V6->V7 Range diff :
https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e

Previous patch series:
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/

Devarsh Thakkar (8):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: imagination: Use exported tables from v4l2-jpeg core
  media: verisilcon : Use exported tables from v4l2-jpeg for hantro
codec
  math.h Add macros to round to closest specified power of 2
  media: imagination: Round to closest multiple for cropping region
  gpu: ipu-v3: Use generic macro for rounding to nearest multiple

 .../bindings/media/img,e5010-jpeg-enc.yaml|   75 +
 MAINTAINERS   |7 +

[PATCH v7 7/8] media: imagination: Round to closest multiple for cropping region

2024-05-09 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index 189e2a99c43d..abd66bc9b96c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v7 6/8] math.h Add macros to round to closest specified power of 2

2024-05-09 Thread Devarsh Thakkar
Add macros to round to nearest specified power of 2. Two macros are added :
round_closest_up and round_closest_down which round up to nearest multiple
of 2 with a preference to round up or round down respectively if there are
two possible nearest values to the given number.

This patch is inspired from the Mentor Graphics IPU driver [1] which uses
similar macro locally and which can be updated to use this generic macro
instead along with other drivers having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 36 
 1 file changed, 36 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..82ee3265c910 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,42 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round to nearest specified power of 2 with preference
+ *   to rounding up
+ * @x: the value to round
+ * @y: multiple to round to (must be a power of 2)
+ *
+ * Rounds @x to nearest multiple of @y (which must be a power of 2).
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If there are two nearest possible values,
+ * then preference will be given to the greater value.
+ *
+ * Examples :
+ * round_closest_up(17, 4) = 16
+ * round_closest_up(15, 4) = 16
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round to nearest specified power of 2 with preference
+ * to rounding down
+ * @x: the value to round
+ * @y: multiple to round down to (must be a power of 2)
+ *
+ * Rounds @x to nearest multiple of @y (which must be a power of 2).
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If there are two nearest possible values,
+ * then preference will be given to the lesser value.
+ *
+ * Examples :
+ * round_closest_down(17, 4) = 16
+ * round_closest_down(15, 4) = 16
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
-- 
2.39.1



[PATCH v7 8/8] gpu: ipu-v3: Use generic macro for rounding to nearest multiple

2024-05-09 Thread Devarsh Thakkar
Use generic macro round_closest_up for rounding to nearest multiple instead
of using local function.

Signed-off-by: Devarsh Thakkar 
---
V1->V6 (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-03-22 Thread Devarsh Thakkar
Hi Maxime,

On 14/03/24 20:04, Maxime Ripard wrote:
> Hi,
> 
> On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote:
>> On 13/02/24 19:34, Maxime Ripard wrote:
>>> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
>>>> On 26/01/24 17:45, Maxime Ripard wrote:
>>>>> Hi,
>>>>>
>>>>> Thanks a lot for working on that.
>>>>>
>>>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>>>>>> Display subsystem present in TI Keystone family of devices supports 
>>>>>> sharing
>>>>>> of display between multiple hosts as it provides separate register space
>>>>>> (common* region) for each host to programming display controller and 
>>>>>> also a
>>>>>> unique interrupt line for each host.
>>>>>>
>>>>>> This adds support for display sharing, by allowing partitioning of
>>>>>> resources either at video port level or at video plane level as
>>>>>> described below :
>>>>>>
>>>>>> 1) Linux can own (i.e have write access) completely one or more of video
>>>>>> ports along with corresponding resources (viz. overlay managers,
>>>>>> video planes) used by Linux in context of those video ports.
>>>>>> Even if Linux is owning
>>>>>> these video ports it can still share this video port with a remote core
>>>>>> which can own one or more video planes associated with this video port.
>>>>>>
>>>>>> 2) Linux owns one or more of the video planes with video port
>>>>>> (along with corresponding overlay manager) associated with these planes
>>>>>> being owned and controlled by a remote core. Linux still has read-only
>>>>>> access to the associated video port and overlay managers so that it can
>>>>>> parse the settings made by remote core.
>>>>>
>>>>> So, just to make sure we're on the same page. 1) means Linux drives the
>>>>> whole display engine, but can lend planes to the R5? How does that work,
>>>>> is Linux aware of the workload being there (plane size, format, etc) ?
>>>>>
>>>>
>>>> Well, there is no dynamic procedure being followed for lending. The
>>>> partitioning scheme is decided and known before hand, and the remote
>>>> core firmware updated and compiled accordingly, and similarly the
>>>> device-tree overlay for Linux is also updated with partitioning
>>>> information before bootup.
>>>>
>>>> What would happen here is that Linux will know before-hand this
>>>> partitioning information via device-tree properties and won't enumerate
>>>> the plane owned by RTOS, but it will enumerate the rest of the display
>>>> components and initialize the DSS, after which user can load the DSS
>>>> firmware on remote core and this firmware will only have control of
>>>> plane as it was compiled with that configuration.
>>>
>>> Right. If the RTOS is in control of a single plane, how it is expected
>>> to deal with Linux shutting the CRTC down, or enforcing a configuration
>>> that isn't compatible with what the RTOS expects (like a plane with a
>>> higher zpos masking its plane), what is the mechanism to reconcile it?
>>>
>>
>> Just for the note, for this "RTOS control single plane" mode, we don't have a
>> firmware available to test (right now we are only supporting example for 
>> "RTOS
>> controlling the display mode" as shared here [1]) and hence this is not
>> validated but the idea was to keep dt-bindings generic enough to support them
>> in future and that's why I referred to it here.
>>
>> Coming back to your questions, with the current scheme the Linux (tidss) 
>> would
>> be expected to make sure the CRTC being shared with RTOS is never shutdown 
>> and
>> the RTOS plane should never gets masked.
> 
> I'm probably missing something then here, but if the Linux side of
> things is expected to keep the current configuration and keep it active
> for it to work, what use-case would it be useful for?
> 

It's just one of the partitioning possibilities that I mentioned here, that
Linux is in control of DSS as a whole and the user want the other host (be it
RTOS or any other core) to control a single plane. For e.g it could be Linux
(with GPU rend

[PATCH v5 3/4] arm64: dts: ti: Add common1 register space for AM62x SoC

2024-02-15 Thread Devarsh Thakkar
This adds common1 register space for AM62x SoC which is using TI's Keystone
display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS")
Signed-off-by: Devarsh Thakkar 
---
V1->V4 :
- No change (this was part of "arm64: dts: ti: Add common1 register
  space for AM62x, AM62A & AM65x SoCs )

V5 :
- Split this as a separate patch from "arm64: dts: ti: Add common1
  register space for AM62x, AM62A & AM65x SoCs"
- Remove Reviewed-By tag as patch is split now
---
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
index fe0cc4a9a501..8cee4d94cdd3 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
@@ -779,9 +779,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&dss_vp1_clk>,
-- 
2.34.1



[PATCH v5 1/4] dt-bindings: display: ti, am65x-dss: Add support for common1 region

2024-02-15 Thread Devarsh Thakkar
TI keystone display subsystem present in AM65, AM62 and AM62A SoC support
two separate register spaces namely "common" and "common1" which can be
used by two separate hosts to program the display controller as described
in respective Technical Reference Manuals [1].

The common1 register space has similar set of configuration registers as
supported in common register space except the global configuration
registers which are exclusive to common region.

This adds binding for "common1" register region too as supported by the
hardware.

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

AM62A TRM:
https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9 Display Subsystem Registers)

Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml 
binding")
Signed-off-by: Devarsh Thakkar 
Reviewed-by: Aradhya Bhatia 
Acked-by: Conor Dooley 
---
V2: Add Acked-by tag
V3: Add Fixes tag
V4: Add Reviewed-by and AM62A TRM link
V5: No change
---
 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index b6767ef0d24d..55e3e490d0e6 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -37,6 +37,7 @@ properties:
   - description: OVR2 overlay manager for vp2
   - description: VP1 video port 1
   - description: VP2 video port 2
+  - description: common1 DSS register area
 
   reg-names:
 items:
@@ -47,6 +48,7 @@ properties:
   - const: ovr2
   - const: vp1
   - const: vp2
+  - const: common1
 
   clocks:
 items:
@@ -147,9 +149,10 @@ examples:
 <0x04a07000 0x1000>, /* ovr1 */
 <0x04a08000 0x1000>, /* ovr2 */
 <0x04a0a000 0x1000>, /* vp1 */
-<0x04a0b000 0x1000>; /* vp2 */
+<0x04a0b000 0x1000>, /* vp2 */
+<0x04a01000 0x1000>; /* common1 */
 reg-names = "common", "vidl1", "vid",
-"ovr1", "ovr2", "vp1", "vp2";
+"ovr1", "ovr2", "vp1", "vp2", "common1";
 ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
 clocks =<&k3_clks 67 1>,
-- 
2.34.1



[PATCH v5 2/4] arm64: dts: ti: Add common1 register space for AM65x SoC

2024-02-15 Thread Devarsh Thakkar
This adds common1 register space for AM65x SoC which is using TI's Keystone
display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node")
Signed-off-by: Devarsh Thakkar 
---
V1->V4 :
- No change (this was part of "arm64: dts: ti: Add common1 register
  space for AM62x, AM62A & AM65x SoCs )

V5 :
- Split this as a separate patch from "arm64: dts: ti: Add common1
  register space for AM62x, AM62A & AM65x SoCs"
- Remove Reviewed-By tag as patch is split now
---
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
index 07010d31350e..ff857117d719 100644
--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
@@ -991,9 +991,10 @@ dss: dss@4a0 {
  <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
  <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
  <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
- <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
+ <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
+ <0x0 0x04a01000 0x0 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
 
ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 
-- 
2.34.1



[PATCH v5 0/4] Add common1 region for AM62, AM62A & AM65x

2024-02-15 Thread Devarsh Thakkar
This adds DSS common1 region for respective SoCs supporting it.

Changelog:
V2 : Remove do-not-merge tag and add am62a dss common1 reion
V3 : Add Fixes tag to each commit
V4 : Add Reviewed-by tag and AM62A SoC TRM Link
V5 : Split dts patch to separate patches for each SoC

Devarsh Thakkar (4):
  dt-bindings: display: ti,am65x-dss: Add support for common1 region
  arm64: dts: ti: Add common1 register space for AM65x SoC
  arm64: dts: ti: Add common1 register space for AM62x SoC
  arm64: dts: ti: Add common1 register space for AM62A SoC

 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi   | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   | 5 +++--
 4 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.34.1



[PATCH v5 4/4] arm64: dts: ti: Add common1 register space for AM62A SoC

2024-02-15 Thread Devarsh Thakkar
This adds common1 register space for AM62A SoC which is using TI's Keystone
display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display 
SubSystem (DSS)")
Signed-off-by: Devarsh Thakkar 
---
V1->V4 :
- No change (this was part of "arm64: dts: ti: Add common1 register
  space for AM62x, AM62A & AM65x SoCs )

V5 :
- Split this as a separate patch from "arm64: dts: ti: Add common1
  register space for AM62x, AM62A & AM65x SoCs"
- Remove Reviewed-By tag as this is split from older revision
---
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
index 2749533cf4fd..dbdd6bec4667 100644
--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
@@ -994,9 +994,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the 
SoC */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&k3_clks 186 0>,
-- 
2.34.1



[PATCH v4 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region

2024-02-15 Thread Devarsh Thakkar
TI keystone display subsystem present in AM65 and other SoCs such as AM62
support two separate register spaces namely "common" and "common1" which
can be used by two separate hosts to program the display controller as
described in respective Technical Reference Manuals [1].

The common1 register space has similar set of configuration registers as
supported in common register space except the global configuration
registers which are exclusive to common region.

This adds binding for "common1" register region too as supported by the
hardware.

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

AM62A TRM:
https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9 Display Subsystem Registers)

Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml 
binding")
Signed-off-by: Devarsh Thakkar 
Reviewed-by: Aradhya Bhatia 
Acked-by: Conor Dooley 
---
V2: Add Acked-by tag
V3: Add Fixes tag
V4: Add Reviewed-by and AM62A TRM link
---
 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index b6767ef0d24d..55e3e490d0e6 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -37,6 +37,7 @@ properties:
   - description: OVR2 overlay manager for vp2
   - description: VP1 video port 1
   - description: VP2 video port 2
+  - description: common1 DSS register area
 
   reg-names:
 items:
@@ -47,6 +48,7 @@ properties:
   - const: ovr2
   - const: vp1
   - const: vp2
+  - const: common1
 
   clocks:
 items:
@@ -147,9 +149,10 @@ examples:
 <0x04a07000 0x1000>, /* ovr1 */
 <0x04a08000 0x1000>, /* ovr2 */
 <0x04a0a000 0x1000>, /* vp1 */
-<0x04a0b000 0x1000>; /* vp2 */
+<0x04a0b000 0x1000>, /* vp2 */
+<0x04a01000 0x1000>; /* common1 */
 reg-names = "common", "vidl1", "vid",
-"ovr1", "ovr2", "vp1", "vp2";
+"ovr1", "ovr2", "vp1", "vp2", "common1";
 ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
 clocks =<&k3_clks 67 1>,
-- 
2.34.1



[PATCH v4 0/2] Add common1 region for AM62, AM62A & AM65x

2024-02-15 Thread Devarsh Thakkar
This adds DSS common1 region for respective SoCs supporting it.

Changelog:
V2 : Remove do-not-merge tag and add am62a dss common1 reion
V3 : Add Fixes tag to each commit
V4 : Add Reviewed-by tag and AM62A SoC TRM Link
 
Devarsh Thakkar (2):
  dt-bindings: display: ti,am65x-dss: Add support for common1 region
  arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x
SoCs

 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi   | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   | 5 +++--
 4 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.34.1



[PATCH v4 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs

2024-02-15 Thread Devarsh Thakkar
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are
using TI's Keystone display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display 
SubSystem (DSS)")
Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS")
Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node")
Signed-off-by: Devarsh Thakkar 
Reviewed-by: Aradhya Bhatia 
---
V2: Add common1 region for AM62A SoC too
V3: Add Fixes tag
V4: Add Reviewed-by
---
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi  | 5 +++--
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
index fe0cc4a9a501..8cee4d94cdd3 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
@@ -779,9 +779,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&dss_vp1_clk>,
diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
index 972971159a62..f475daea548e 100644
--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
@@ -994,9 +994,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the 
SoC */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&k3_clks 186 0>,
diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
index 07010d31350e..ff857117d719 100644
--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
@@ -991,9 +991,10 @@ dss: dss@4a0 {
  <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
  <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
  <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
- <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
+ <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
+ <0x0 0x04a01000 0x0 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
 
ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 
-- 
2.34.1



[PATCH v3 0/2] Add common1 region for AM62, AM62A & AM65x

2024-02-15 Thread Devarsh Thakkar
This adds DSS common1 region for respective SoCs supporting it.

Changelog:
V2 : Remove do-not-merge tag and add am62a dss common1 reion
V3 : Add Fixes tag to each commit
 
Devarsh Thakkar (2):
  dt-bindings: display: ti,am65x-dss: Add support for common1 region
  arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x
SoCs

 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi   | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   | 5 +++--
 4 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.34.1



[PATCH v3 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region

2024-02-15 Thread Devarsh Thakkar
TI keystone display subsystem present in AM65 and other SoCs such as AM62
support two separate register spaces namely "common" and "common1" which
can be used by two separate hosts to program the display controller as
described in respective Technical Reference Manuals [1].

The common1 register space has similar set of configuration registers as
supported in common register space except the global configuration
registers which are exclusive to common region.

This adds binding for "common1" register region too as supported by the
hardware.

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

Fixes: 2d8730f1021f ("dt-bindings: display: ti,am65x-dss: Add dt-schema yaml 
binding")
Signed-off-by: Devarsh Thakkar 
Acked-by: Conor Dooley 
---
V2: Add Acked-by tag
V3: Add Fixes tag
---
 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index b6767ef0d24d..55e3e490d0e6 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -37,6 +37,7 @@ properties:
   - description: OVR2 overlay manager for vp2
   - description: VP1 video port 1
   - description: VP2 video port 2
+  - description: common1 DSS register area
 
   reg-names:
 items:
@@ -47,6 +48,7 @@ properties:
   - const: ovr2
   - const: vp1
   - const: vp2
+  - const: common1
 
   clocks:
 items:
@@ -147,9 +149,10 @@ examples:
 <0x04a07000 0x1000>, /* ovr1 */
 <0x04a08000 0x1000>, /* ovr2 */
 <0x04a0a000 0x1000>, /* vp1 */
-<0x04a0b000 0x1000>; /* vp2 */
+<0x04a0b000 0x1000>, /* vp2 */
+<0x04a01000 0x1000>; /* common1 */
 reg-names = "common", "vidl1", "vid",
-"ovr1", "ovr2", "vp1", "vp2";
+"ovr1", "ovr2", "vp1", "vp2", "common1";
 ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
 clocks =<&k3_clks 67 1>,
-- 
2.34.1



[PATCH v3 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs

2024-02-15 Thread Devarsh Thakkar
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are
using TI's Keystone display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Fixes: 3618811657b3 ("arm64: dts: ti: k3-am62a-main: Add node for Display 
SubSystem (DSS)")
Fixes: 8ccc1073c7bb ("arm64: dts: ti: k3-am62-main: Add node for DSS")
Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node")
Signed-off-by: Devarsh Thakkar 
---
V2: Add common1 region for AM62A SoC too
V3: Add Fixes tag
---
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi  | 5 +++--
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
index fe0cc4a9a501..8cee4d94cdd3 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
@@ -779,9 +779,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&dss_vp1_clk>,
diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
index 972971159a62..f475daea548e 100644
--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
@@ -994,9 +994,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the 
SoC */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&k3_clks 186 0>,
diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
index 07010d31350e..ff857117d719 100644
--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
@@ -991,9 +991,10 @@ dss: dss@4a0 {
  <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
  <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
  <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
- <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
+ <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
+ <0x0 0x04a01000 0x0 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
 
ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 
-- 
2.34.1



Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-02-14 Thread Devarsh Thakkar
Hi Maxime,

Thanks for the quick reply.

On 13/02/24 19:34, Maxime Ripard wrote:
> Hi Devarsh,
> 
> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
>> Hi Maxime,
>>
>> Thanks a lot for checking on this.
>>
>> On 26/01/24 17:45, Maxime Ripard wrote:
>>> Hi,
>>>
>>> Thanks a lot for working on that.
>>>
>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>>>> Display subsystem present in TI Keystone family of devices supports sharing
>>>> of display between multiple hosts as it provides separate register space
>>>> (common* region) for each host to programming display controller and also a
>>>> unique interrupt line for each host.
>>>>
>>>> This adds support for display sharing, by allowing partitioning of
>>>> resources either at video port level or at video plane level as
>>>> described below :
>>>>
>>>> 1) Linux can own (i.e have write access) completely one or more of video
>>>> ports along with corresponding resources (viz. overlay managers,
>>>> video planes) used by Linux in context of those video ports.
>>>> Even if Linux is owning
>>>> these video ports it can still share this video port with a remote core
>>>> which can own one or more video planes associated with this video port.
>>>>
>>>> 2) Linux owns one or more of the video planes with video port
>>>> (along with corresponding overlay manager) associated with these planes
>>>> being owned and controlled by a remote core. Linux still has read-only
>>>> access to the associated video port and overlay managers so that it can
>>>> parse the settings made by remote core.
>>>
>>> So, just to make sure we're on the same page. 1) means Linux drives the
>>> whole display engine, but can lend planes to the R5? How does that work,
>>> is Linux aware of the workload being there (plane size, format, etc) ?
>>>
>>
>> Well, there is no dynamic procedure being followed for lending. The
>> partitioning scheme is decided and known before hand, and the remote
>> core firmware updated and compiled accordingly, and similarly the
>> device-tree overlay for Linux is also updated with partitioning
>> information before bootup.
>>
>> What would happen here is that Linux will know before-hand this
>> partitioning information via device-tree properties and won't enumerate
>> the plane owned by RTOS, but it will enumerate the rest of the display
>> components and initialize the DSS, after which user can load the DSS
>> firmware on remote core and this firmware will only have control of
>> plane as it was compiled with that configuration.
> 
> Right. If the RTOS is in control of a single plane, how it is expected
> to deal with Linux shutting the CRTC down, or enforcing a configuration
> that isn't compatible with what the RTOS expects (like a plane with a
> higher zpos masking its plane), what is the mechanism to reconcile it?
> 

Just for the note, for this "RTOS control single plane" mode, we don't have a
firmware available to test (right now we are only supporting example for "RTOS
controlling the display mode" as shared here [1]) and hence this is not
validated but the idea was to keep dt-bindings generic enough to support them
in future and that's why I referred to it here.

Coming back to your questions, with the current scheme the Linux (tidss) would
be expected to make sure the CRTC being shared with RTOS is never shutdown and
the RTOS plane should never gets masked.

I think the IPC based scheme would have been mainly needed for the case where
you have a single entity controlling the display for e.g you have a single
display controller register space and a single IRQ but you have multiple
planes and say you want to divide these planes to different host processors.
In that case you want a single entity to act as a main entity and be in
control of DSS and rest of the processors communicate with the "main entity"
to request display resources and plane updates and main entity also programs
dss on their behalf.

But unlike above, TI DSS7 is designed to support static partitioning of
display resources among multiple hosts, where each host can program the
display hardware independently using separate register space and having a
separate irq and without requirement of any communication between the hosts.
Now as this feature is unique to TI DSS7 we want to support this feature in
tidss driver. The DSS resource partitioning feature is described in detail
here [2]

>>> And 2) would mean that the display e

[PATCH v2 2/2] arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x SoCs

2024-02-14 Thread Devarsh Thakkar
This adds common1 register space for AM62x, AM62A and AM65x SoC's which are
using TI's Keystone display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml

Signed-off-by: Devarsh Thakkar 
---
V2: Add common1 region for AM62A SoC too
---
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi  | 5 +++--
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
index fe0cc4a9a501..8cee4d94cdd3 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
@@ -779,9 +779,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&dss_vp1_clk>,
diff --git a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
index 972971159a62..f475daea548e 100644
--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi
@@ -994,9 +994,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Tied OFF in the 
SoC */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&k3_clks 186 0>,
diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
index 07010d31350e..ff857117d719 100644
--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
@@ -991,9 +991,10 @@ dss: dss@4a0 {
  <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
  <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
  <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
- <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
+ <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
+ <0x0 0x04a01000 0x0 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
 
ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 
-- 
2.34.1



[PATCH v2 0/2] Add common1 region for AM62, AM62A & AM65x

2024-02-14 Thread Devarsh Thakkar
This adds DSS common1 region for respective SoCs supporting it.

Devarsh Thakkar (2):
  dt-bindings: display: ti,am65x-dss: Add support for common1 region
  arm64: dts: ti: Add common1 register space for AM62x, AM62A & AM65x
SoCs

 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi   | 5 +++--
 arch/arm64/boot/dts/ti/k3-am62a-main.dtsi  | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   | 5 +++--
 4 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.34.1



[PATCH v2 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region

2024-02-14 Thread Devarsh Thakkar
TI keystone display subsystem present in AM65 and other SoCs such as AM62
support two separate register spaces namely "common" and "common1" which
can be used by two separate hosts to program the display controller as
described in respective Technical Reference Manuals [1].

The common1 register space has similar set of configuration registers as
supported in common register space except the global configuration
registers which are exclusive to common region.

This adds binding for "common1" register region too as supported by the
hardware.

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

AM62A TRM:
https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9.1 DSS Registers)

Signed-off-by: Devarsh Thakkar 
Acked-by: Conor Dooley 
---
V2: Add Acked-by tag
---
 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index b6767ef0d24d..55e3e490d0e6 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -37,6 +37,7 @@ properties:
   - description: OVR2 overlay manager for vp2
   - description: VP1 video port 1
   - description: VP2 video port 2
+  - description: common1 DSS register area
 
   reg-names:
 items:
@@ -47,6 +48,7 @@ properties:
   - const: ovr2
   - const: vp1
   - const: vp2
+  - const: common1
 
   clocks:
 items:
@@ -147,9 +149,10 @@ examples:
 <0x04a07000 0x1000>, /* ovr1 */
 <0x04a08000 0x1000>, /* ovr2 */
 <0x04a0a000 0x1000>, /* vp1 */
-<0x04a0b000 0x1000>; /* vp2 */
+<0x04a0b000 0x1000>, /* vp2 */
+<0x04a01000 0x1000>; /* common1 */
 reg-names = "common", "vidl1", "vid",
-"ovr1", "ovr2", "vp1", "vp2";
+"ovr1", "ovr2", "vp1", "vp2", "common1";
 ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
 clocks =<&k3_clks 67 1>,
-- 
2.34.1



Re: [PATCH 1/2] dt-bindings: display: ti,am65x-dss: Add support for common1 region

2024-02-14 Thread Devarsh Thakkar
Hi Tomi, Vignesh,

On 14/02/24 14:53, Tomi Valkeinen wrote:
> On 14/02/2024 11:10, Tomi Valkeinen wrote:
>> Hi,
>>
>> On 15/01/2024 14:57, Devarsh Thakkar wrote:
>>> TI keystone display subsystem present in AM65 and other SoCs such as AM62
>>> support two separate register spaces namely "common" and "common1" which
>>> can be used by two separate hosts to program the display controller as
>>> described in respective Technical Reference Manuals [1].
>>>
>>> The common1 register space has similar set of configuration registers as
>>> supported in common register space except the global configuration
>>> registers which are exclusive to common region.
>>>
>>> This adds binding for "common1" register region too as supported by the
>>> hardware.
>>>
>>> [1]:
>>> AM62x TRM:
>>> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)
>>>
>>> AM65x TRM:
>>> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)
>>>
>>> Signed-off-by: Devarsh Thakkar 
>>> ---
>>>   .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>> index b6767ef0d24d..55e3e490d0e6 100644
>>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>> @@ -37,6 +37,7 @@ properties:
>>>     - description: OVR2 overlay manager for vp2
>>>     - description: VP1 video port 1
>>>     - description: VP2 video port 2
>>> +  - description: common1 DSS register area
>>>     reg-names:
>>>   items:
>>> @@ -47,6 +48,7 @@ properties:
>>>     - const: ovr2
>>>     - const: vp1
>>>     - const: vp2
>>> +  - const: common1
>>>     clocks:
>>>   items:
>>> @@ -147,9 +149,10 @@ examples:
>>>   <0x04a07000 0x1000>, /* ovr1 */
>>>   <0x04a08000 0x1000>, /* ovr2 */
>>>   <0x04a0a000 0x1000>, /* vp1 */
>>> -    <0x04a0b000 0x1000>; /* vp2 */
>>> +    <0x04a0b000 0x1000>, /* vp2 */
>>> +    <0x04a01000 0x1000>; /* common1 */
>>>   reg-names = "common", "vidl1", "vid",
>>> -    "ovr1", "ovr2", "vp1", "vp2";
>>> +    "ovr1", "ovr2", "vp1", "vp2", "common1";
>>>   ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
>>>   power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
>>>   clocks =    <&k3_clks 67 1>,
>>
>> Looks fine to me, I'll apply to drm-misc-next.
> 
> Hmm, now thinking about this, doesn't this cause dtb checks to start failing,
> as the dtbs are missing one entry? Is it better to merge these kind of changes
> with the dts changes? Or does it matter?
> 

Yes if one get's applied and other doesn't then there will be such issues.
I am sending shortly both the dt-binding and device-tree patches together, as
long as both look fine and ready to be accepted by respective maintainers, I
think both can get merged to respective trees and land in linux-next without
causing any issues.

Regards
Devarsh

>  Tomi
> 


Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-02-08 Thread Devarsh Thakkar
Hi Maxime,

Thanks a lot for checking on this.

On 26/01/24 17:45, Maxime Ripard wrote:
> Hi,
> 
> Thanks a lot for working on that.
> 
> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>> Display subsystem present in TI Keystone family of devices supports sharing
>> of display between multiple hosts as it provides separate register space
>> (common* region) for each host to programming display controller and also a
>> unique interrupt line for each host.
>>
>> This adds support for display sharing, by allowing partitioning of
>> resources either at video port level or at video plane level as
>> described below :
>>
>> 1) Linux can own (i.e have write access) completely one or more of video
>> ports along with corresponding resources (viz. overlay managers,
>> video planes) used by Linux in context of those video ports.
>> Even if Linux is owning
>> these video ports it can still share this video port with a remote core
>> which can own one or more video planes associated with this video port.
>>
>> 2) Linux owns one or more of the video planes with video port
>> (along with corresponding overlay manager) associated with these planes
>> being owned and controlled by a remote core. Linux still has read-only
>> access to the associated video port and overlay managers so that it can
>> parse the settings made by remote core.
> 
> So, just to make sure we're on the same page. 1) means Linux drives the
> whole display engine, but can lend planes to the R5? How does that work,
> is Linux aware of the workload being there (plane size, format, etc) ?
> 

Well, there is no dynamic procedure being followed for lending. The
partitioning scheme is decided and known before hand, and the remote
core firmware updated and compiled accordingly, and similarly the
device-tree overlay for Linux is also updated with partitioning
information before bootup.

What would happen here is that Linux will know before-hand this
partitioning information via device-tree properties and won't enumerate
the plane owned by RTOS, but it will enumerate the rest of the display
components and initialize the DSS, after which user can load the DSS
firmware on remote core and this firmware will only have control of
plane as it was compiled with that configuration.


> And 2) would mean that the display engine is under the R5 control and
> Linux only gets to fill the plane and let the firmware know of what it
> wants?
> 

Here too the partitioning information is pre-decided and remote core
firmware and device-tree overlay for Linux updated accordingly. But in
this case as remote core firmware owns the display (minus the plane
owned by Linux) it is started and initialized during the bootloader
phase itself where it initializes the DSS and starts rendering using the
plane owned by it and Linux just latches to the DSS without
re-initializing it, with write access only to the plane that is owned by
Linux. You can refer [1] for more details on this.


> If so, do we even need the tidss driver in the second case? We could
> just write a fwkms driver of some sorts that could be used by multiple
> implementations of the same "defer to firmware" logic.
> 

This feature of static partitioning of DSS resources is specific to DSS7
hardware (which is controlled by tidss driver) which supports dedicated
register space and interrupt line for each of the hosts [0], so that
multiple hosts can drive the display controller simultaneously as  per
the desired static partitioning of resources, and so I don't think a
separate driver is required here and tidss seems the right place to
support this, where using this device-tree approach different resource
partitioning schemas can be achieved as described here [1]. This was
also aligned with Tomi too where we discussed that tidss is the right
place to support this as we are simply leveraging the DSS hardware
capabilities of static partitioning here.


>> For both the cases, the resources used in context of processing core
>> running Linux along with ownership information are exposed by user as
>> part of device-tree blob and driver uses an updated feature list tailored
>> for this shared mode accordingly. The driver also auto-populates
>> matching overlay managers and output types from shared video
>> port list provided in device-tree blob.
>> In dispc_feature struct remove const access specfier for output_type
>> array as it is required to be updated dynamically in run-time for shared
>> mode.
> 
> I'm also not entirely sure that the device tree is the right path there.
> Surely the firmware capabilities will evolve over time, while the device
> tree won't. Is there some way to make it discoverable at probe time by
&g

Re: [RFC PATCH 3/3] arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing mode

2024-02-08 Thread Devarsh Thakkar
Hi Tomi,

Thanks for the review.

On 23/01/24 13:59, Tomi Valkeinen wrote:
> Hi,
> 
> On 16/01/2024 15:41, Devarsh Thakkar wrote:
>> This overlay needs to be used with display sharing supported device
>> manager firmware only.
>>
>> Remote core running this firmware has write access to "common" register
>> space, VIDL pipeline, OVR1 overlay and VP1 videoport.
>>
>> The processing core running Linux is provided write access to VID
>> pipeline and "common1" register space.
>>
>> The VP1 video port is shared between Linux and remote core with remote
>> core configuring the overlay manager to set Zorder 1 for VID pipeline
>> and Zorder 2 for VIDL pipeline.
>>
>> Add reserved memory region for framebuffer region used by remote core in
>> dss shared mode overlay file so that Linux does not re-use the same
>> while allocating memory.
> 
> I don't understand this one. Why is RAM used by RTOS accessible by Linux
> in the first place?
> 

Well, I think the R5 SPL initializes full DDR before starting firmwares
on remote cores and the regions used by this remote cores be it for IPC
or Code/data are marked as reserved both in Linux as well as U-boot so
that Linux/U-boot does not use it [1].

Same scheme is followed here w.r. t RTOS framebuffer too.

[1] :
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?h=v6.7#n63

Regards
Devarsh

>  Tomi
> 
>> Also add a label for reserved memory region in base device-tree file so
>> that it can be referred back in overlay file.
>>
>> Signed-off-by: Devarsh Thakkar 
>> ---
>>   arch/arm64/boot/dts/ti/Makefile   |  1 +
>>   .../arm64/boot/dts/ti/k3-am62x-sk-common.dtsi |  2 +-
>>   .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso   | 33 +++
>>   3 files changed, 35 insertions(+), 1 deletion(-)
>>   create mode 100644
>> arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
>>
>> diff --git a/arch/arm64/boot/dts/ti/Makefile
>> b/arch/arm64/boot/dts/ti/Makefile
>> index 52c1dc910308..ff832741b367 100644
>> --- a/arch/arm64/boot/dts/ti/Makefile
>> +++ b/arch/arm64/boot/dts/ti/Makefile
>> @@ -35,6 +35,7 @@ dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-ov5640.dtbo
>>   dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-tevi-ov5640.dtbo
>>   dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-imx219.dtbo
>>   dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-hdmi-audio.dtbo
>> +dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-dss-shared-mode.dtbo
>>     # Boards with AM64x SoC
>>   dtb-$(CONFIG_ARCH_K3) += k3-am642-evm.dtb
>> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> index 33768c02d8eb..8b55ca10102f 100644
>> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> @@ -34,7 +34,7 @@ memory@8000 {
>>   reg = <0x 0x8000 0x 0x8000>;
>>   };
>>   -    reserved-memory {
>> +    reserved_memory: reserved-memory {
>>   #address-cells = <2>;
>>   #size-cells = <2>;
>>   ranges;
>> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
>> b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
>> new file mode 100644
>> index ..02153748a5c2
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
>> @@ -0,0 +1,33 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/**
>> + * DT overlay to enable display sharing mode for AM62P DSS0
>> + * This is compatible with custom AM62 Device Manager firmware
>> + *
>> + * Copyright (C) 2023 Texas Instruments Incorporated -
>> http://www.ti.com/
>> + */
>> +
>> +/dts-v1/;
>> +/plugin/;
>> +
>> +#include 
>> +#include 
>> +
>> +&dss0 {
>> +    ti,dss-shared-mode;
>> +    ti,dss-shared-mode-vp = "vp1";
>> +    ti,dss-shared-mode-vp-owned = <0>;
>> +    ti,dss-shared-mode-common = "common1";
>> +    ti,dss-shared-mode-planes = "vid";
>> +    ti,dss-shared-mode-plane-zorder = <0>;
>> +    interrupts = ;
>> +};
>> +
>> +&reserved_memory {
>> +    #address-cells = <2>;
>> +    #size-cells = <2>;
>> +    rtos_framebuffer_memory_region: rtos-framebuffer-memory@9480 {
>> +    compatible = "shared-dma-pool";
>> +    reg = <0x00 0x9480 0x00 0x0800>;
>> +    no-map;
>> +    };
>> +};
> 
> 


Re: [RFC PATCH 1/3] dt-bindings: display: ti,am65x-dss: Add support for display sharing mode

2024-02-08 Thread Devarsh Thakkar
Hi Rob,

Thanks for the review.

On 19/01/24 19:25, Rob Herring wrote:
> On Thu, Jan 18, 2024 at 7:59 AM Devarsh Thakkar  wrote:
>>
>> Hi Rob,
>>
>> Thanks for the quick review.
>>
>> On 18/01/24 01:43, Rob Herring wrote:
>>> On Tue, Jan 16, 2024 at 07:11:40PM +0530, Devarsh Thakkar wrote:
>>>> Add support for using TI Keystone DSS hardware present in display
>>>> sharing mode.
>>>>
>>>> TI Keystone DSS hardware supports partitioning of resources between
>>>> multiple hosts as it provides separate register space and unique
>>>> interrupt line to each host.
>>>>
>>>> The DSS hardware can be used in shared mode in such a way that one or
>>>> more of video planes can be owned by Linux wherease other planes can be
>>>> owned by remote cores.
>>>>
>>>> One or more of the video ports can be dedicated exclusively to a
>>>> processing core, wherease some of the video ports can be shared between
>>>> two hosts too with only one of them having write access.
>>>>
>>>> Signed-off-by: Devarsh Thakkar 
>>>> ---
>>>>  .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++
>>>>  1 file changed, 82 insertions(+)
>>>>
>>>> diff --git 
>>>> a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
>>>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>>> index 55e3e490d0e6..d9bc69fbf1fb 100644
>>>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>>>> @@ -112,6 +112,86 @@ properties:
>>>>Input memory (from main memory to dispc) bandwidth limit in
>>>>bytes per second
>>>>
>>>> +  ti,dss-shared-mode:
>>>> +type: boolean
>>>> +description:
>>>> +  TI DSS7 supports sharing of display between multiple hosts
>>>> +  as it provides separate register space for display configuration and
>>>> +  unique interrupt line to each host.
>>>
>>> If you care about line breaks, you need '|'.
>>>
>>
>> Noted.
>>
>>>> +  One of the host is provided access to the global display
>>>> +  configuration labelled as "common" region of DSS allows that host
>>>> +  exclusive access to global registers of DSS while other host can
>>>> +  configure the display for it's usage using a separate register
>>>> +  space labelled as "common1".
>>>> +  The DSS resources can be partitioned in such a way that one or more
>>>> +  of the video planes are owned by Linux whereas other video planes
>>>
>>> Your h/w can only run Linux?
>>>
>>> What if you want to use this same binding to define the configuration to
>>> the 'remote processor'? You can easily s/Linux/the OS/, but it all
>>> should be reworded to describe things in terms of the local processor.
>>>
>>
>> It can run both Linux and RTOS or for that matter any other OS too. But yes I
>> got your point, will reword accordingly.
>>
>>>> +  can be owned by a remote core.
>>>> +  The video port controlling these planes acts as a shared video port
>>>> +  and it can be configured with write access either by Linux or the
>>>> +  remote core in which case Linux only has read-only access to that
>>>> +  video port.
>>>
>>> What is the purpose of this property when all the other properties are
>>> required?
>>>
>>
>> The ti,dss-shared-mode and below group of properties are optional. But
>> if ti,dss-shared-mode is set then only driver should parse below set of
>> properties.
> 
> Let me re-phrase. Drop this property. It serves no purpose since all
> the properties have to be present anyway.
> 

Noted.

>>>> +  ti,dss-shared-mode-planes:
>>>> +description:
>>>> +  The video layer that is owned by processing core running Linux.
>>>> +  The display driver running from Linux has exclusive write access to
>>>> +  this video layer.
>>>> +$ref: /schemas/types.yaml#/definitions/string
>>>> +enum: [vidl, vid]
>>>> +
>>>> +  ti,dss-shared-mode-vp:
>>>> +description:
>>>> +  The video port that 

Re: [RFC PATCH 1/3] dt-bindings: display: ti,am65x-dss: Add support for display sharing mode

2024-01-18 Thread Devarsh Thakkar
Hi Rob,

Thanks for the quick review.

On 18/01/24 01:43, Rob Herring wrote:
> On Tue, Jan 16, 2024 at 07:11:40PM +0530, Devarsh Thakkar wrote:
>> Add support for using TI Keystone DSS hardware present in display
>> sharing mode.
>>
>> TI Keystone DSS hardware supports partitioning of resources between
>> multiple hosts as it provides separate register space and unique
>> interrupt line to each host.
>>
>> The DSS hardware can be used in shared mode in such a way that one or
>> more of video planes can be owned by Linux wherease other planes can be
>> owned by remote cores.
>>
>> One or more of the video ports can be dedicated exclusively to a
>> processing core, wherease some of the video ports can be shared between
>> two hosts too with only one of them having write access.
>>
>> Signed-off-by: Devarsh Thakkar 
>> ---
>>  .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++
>>  1 file changed, 82 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> index 55e3e490d0e6..d9bc69fbf1fb 100644
>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> @@ -112,6 +112,86 @@ properties:
>>Input memory (from main memory to dispc) bandwidth limit in
>>bytes per second
>>  
>> +  ti,dss-shared-mode:
>> +type: boolean
>> +description:
>> +  TI DSS7 supports sharing of display between multiple hosts
>> +  as it provides separate register space for display configuration and
>> +  unique interrupt line to each host.
> 
> If you care about line breaks, you need '|'. 
> 

Noted.

>> +  One of the host is provided access to the global display
>> +  configuration labelled as "common" region of DSS allows that host
>> +  exclusive access to global registers of DSS while other host can
>> +  configure the display for it's usage using a separate register
>> +  space labelled as "common1".
>> +  The DSS resources can be partitioned in such a way that one or more
>> +  of the video planes are owned by Linux whereas other video planes
> 
> Your h/w can only run Linux?
> 
> What if you want to use this same binding to define the configuration to 
> the 'remote processor'? You can easily s/Linux/the OS/, but it all 
> should be reworded to describe things in terms of the local processor.
> 

It can run both Linux and RTOS or for that matter any other OS too. But yes I
got your point, will reword accordingly.

>> +  can be owned by a remote core.
>> +  The video port controlling these planes acts as a shared video port
>> +  and it can be configured with write access either by Linux or the
>> +  remote core in which case Linux only has read-only access to that
>> +  video port.
> 
> What is the purpose of this property when all the other properties are 
> required?
> 

The ti,dss-shared-mode and below group of properties are optional. But
if ti,dss-shared-mode is set then only driver should parse below set of
properties.

>> +
>> +  ti,dss-shared-mode-planes:
>> +description:
>> +  The video layer that is owned by processing core running Linux.
>> +  The display driver running from Linux has exclusive write access to
>> +  this video layer.
>> +$ref: /schemas/types.yaml#/definitions/string
>> +enum: [vidl, vid]
>> +
>> +  ti,dss-shared-mode-vp:
>> +description:
>> +  The video port that is being used in context of processing core
>> +  running Linux with display susbsytem being used in shared mode.
>> +  This can be owned either by the processing core running Linux in
>> +  which case Linux has the write access and the responsibility to
>> +  configure this video port and the associated overlay manager or
>> +  it can be shared between core running Linux and a remote core
>> +  with remote core provided with write access to this video port and
>> +  associated overlay managers and remote core configures and drives
>> +  this video port also feeding data from one or more of the
>> +  video planes owned by Linux, with Linux only having read-only access
>> +  to this video port and associated overlay managers.
>> +
>> +$ref: /schemas/types.yaml#/definitions/string
>> +enum: [vp1, vp2]
>> +
>> +  ti,dss-shared-mode-common:
>>

[RFC PATCH 3/3] arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing mode

2024-01-16 Thread Devarsh Thakkar
This overlay needs to be used with display sharing supported device
manager firmware only.

Remote core running this firmware has write access to "common" register
space, VIDL pipeline, OVR1 overlay and VP1 videoport.

The processing core running Linux is provided write access to VID
pipeline and "common1" register space.

The VP1 video port is shared between Linux and remote core with remote
core configuring the overlay manager to set Zorder 1 for VID pipeline
and Zorder 2 for VIDL pipeline.

Add reserved memory region for framebuffer region used by remote core in
dss shared mode overlay file so that Linux does not re-use the same
while allocating memory.

Also add a label for reserved memory region in base device-tree file so
that it can be referred back in overlay file.

Signed-off-by: Devarsh Thakkar 
---
 arch/arm64/boot/dts/ti/Makefile   |  1 +
 .../arm64/boot/dts/ti/k3-am62x-sk-common.dtsi |  2 +-
 .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso   | 33 +++
 3 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso

diff --git a/arch/arm64/boot/dts/ti/Makefile b/arch/arm64/boot/dts/ti/Makefile
index 52c1dc910308..ff832741b367 100644
--- a/arch/arm64/boot/dts/ti/Makefile
+++ b/arch/arm64/boot/dts/ti/Makefile
@@ -35,6 +35,7 @@ dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-ov5640.dtbo
 dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-tevi-ov5640.dtbo
 dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-csi2-imx219.dtbo
 dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-hdmi-audio.dtbo
+dtb-$(CONFIG_ARCH_K3) += k3-am62x-sk-dss-shared-mode.dtbo
 
 # Boards with AM64x SoC
 dtb-$(CONFIG_ARCH_K3) += k3-am642-evm.dtb
diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
index 33768c02d8eb..8b55ca10102f 100644
--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
@@ -34,7 +34,7 @@ memory@8000 {
reg = <0x 0x8000 0x 0x8000>;
};
 
-   reserved-memory {
+   reserved_memory: reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso 
b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
new file mode 100644
index ..02153748a5c2
--- /dev/null
+++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+/**
+ * DT overlay to enable display sharing mode for AM62P DSS0
+ * This is compatible with custom AM62 Device Manager firmware
+ *
+ * Copyright (C) 2023 Texas Instruments Incorporated - http://www.ti.com/
+ */
+
+/dts-v1/;
+/plugin/;
+
+#include 
+#include 
+
+&dss0 {
+   ti,dss-shared-mode;
+   ti,dss-shared-mode-vp = "vp1";
+   ti,dss-shared-mode-vp-owned = <0>;
+   ti,dss-shared-mode-common = "common1";
+   ti,dss-shared-mode-planes = "vid";
+   ti,dss-shared-mode-plane-zorder = <0>;
+   interrupts = ;
+};
+
+&reserved_memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   rtos_framebuffer_memory_region: rtos-framebuffer-memory@9480 {
+   compatible = "shared-dma-pool";
+   reg = <0x00 0x9480 0x00 0x0800>;
+   no-map;
+   };
+};
-- 
2.34.1



[RFC PATCH 1/3] dt-bindings: display: ti, am65x-dss: Add support for display sharing mode

2024-01-16 Thread Devarsh Thakkar
Add support for using TI Keystone DSS hardware present in display
sharing mode.

TI Keystone DSS hardware supports partitioning of resources between
multiple hosts as it provides separate register space and unique
interrupt line to each host.

The DSS hardware can be used in shared mode in such a way that one or
more of video planes can be owned by Linux wherease other planes can be
owned by remote cores.

One or more of the video ports can be dedicated exclusively to a
processing core, wherease some of the video ports can be shared between
two hosts too with only one of them having write access.

Signed-off-by: Devarsh Thakkar 
---
 .../bindings/display/ti/ti,am65x-dss.yaml | 82 +++
 1 file changed, 82 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index 55e3e490d0e6..d9bc69fbf1fb 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -112,6 +112,86 @@ properties:
   Input memory (from main memory to dispc) bandwidth limit in
   bytes per second
 
+  ti,dss-shared-mode:
+type: boolean
+description:
+  TI DSS7 supports sharing of display between multiple hosts
+  as it provides separate register space for display configuration and
+  unique interrupt line to each host.
+  One of the host is provided access to the global display
+  configuration labelled as "common" region of DSS allows that host
+  exclusive access to global registers of DSS while other host can
+  configure the display for it's usage using a separate register
+  space labelled as "common1".
+  The DSS resources can be partitioned in such a way that one or more
+  of the video planes are owned by Linux whereas other video planes
+  can be owned by a remote core.
+  The video port controlling these planes acts as a shared video port
+  and it can be configured with write access either by Linux or the
+  remote core in which case Linux only has read-only access to that
+  video port.
+
+  ti,dss-shared-mode-planes:
+description:
+  The video layer that is owned by processing core running Linux.
+  The display driver running from Linux has exclusive write access to
+  this video layer.
+$ref: /schemas/types.yaml#/definitions/string
+enum: [vidl, vid]
+
+  ti,dss-shared-mode-vp:
+description:
+  The video port that is being used in context of processing core
+  running Linux with display susbsytem being used in shared mode.
+  This can be owned either by the processing core running Linux in
+  which case Linux has the write access and the responsibility to
+  configure this video port and the associated overlay manager or
+  it can be shared between core running Linux and a remote core
+  with remote core provided with write access to this video port and
+  associated overlay managers and remote core configures and drives
+  this video port also feeding data from one or more of the
+  video planes owned by Linux, with Linux only having read-only access
+  to this video port and associated overlay managers.
+
+$ref: /schemas/types.yaml#/definitions/string
+enum: [vp1, vp2]
+
+  ti,dss-shared-mode-common:
+description:
+  The DSS register region owned by processing core running Linux.
+$ref: /schemas/types.yaml#/definitions/string
+enum: [common, common1]
+
+  ti,dss-shared-mode-vp-owned:
+description:
+  This tells whether processing core running Linux has write access to
+  the video ports enlisted in ti,dss-shared-mode-vps.
+$ref: /schemas/types.yaml#/definitions/uint32
+enum: [0, 1]
+
+  ti,dss-shared-mode-plane-zorder:
+description:
+  The zorder of the planes owned by Linux.
+  For the scenario where Linux is not having write access to associated
+  video port, this field is just for
+  informational purpose to enumerate the zorder configuration
+  being used by remote core.
+
+$ref: /schemas/types.yaml#/definitions/uint32
+enum: [0, 1]
+
+dependencies:
+  ti,dss-shared-mode: [ 'ti,dss-shared-mode-planes', 'ti,dss-shared-mode-vp',
+'ti,dss-shared-mode-plane-zorder', 
'ti,dss-shared-mode-vp-owned']
+  ti,dss-shared-mode-vp: ['ti,dss-shared-mode', 'ti,dss-shared-mode-planes',
+  'ti,dss-shared-mode-plane-zorder', 
'ti,dss-shared-mode-vp-owned']
+  ti,dss-shared-mode-planes: ['ti,dss-shared-mode', 'ti,dss-shared-mode-vp',
+  'ti,dss-shared-mode-plane-zorder', 
'ti,dss-shared-mode-vp-owned']
+  ti,dss-shared-mode-plane-zorder: ['ti,dss-shared-mode-planes', 
'ti,dss-shared-

[RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-01-16 Thread Devarsh Thakkar
Display subsystem present in TI Keystone family of devices supports sharing
of display between multiple hosts as it provides separate register space
(common* region) for each host to programming display controller and also a
unique interrupt line for each host.

This adds support for display sharing, by allowing partitioning of
resources either at video port level or at video plane level as
described below :

1) Linux can own (i.e have write access) completely one or more of video
ports along with corresponding resources (viz. overlay managers,
video planes) used by Linux in context of those video ports.
Even if Linux is owning
these video ports it can still share this video port with a remote core
which can own one or more video planes associated with this video port.

2) Linux owns one or more of the video planes with video port
(along with corresponding overlay manager) associated with these planes
being owned and controlled by a remote core. Linux still has read-only
access to the associated video port and overlay managers so that it can
parse the settings made by remote core.

For both the cases, the resources used in context of processing core
running Linux along with ownership information are exposed by user as
part of device-tree blob and driver uses an updated feature list tailored
for this shared mode accordingly. The driver also auto-populates
matching overlay managers and output types from shared video
port list provided in device-tree blob.
In dispc_feature struct remove const access specfier for output_type
array as it is required to be updated dynamically in run-time for shared
mode.

For 2) where Linux is only owning a set of video planes with
corresponding video port and overlay manager controlled by a remote
core, separate set of CRTC callbacks are used which just latch on
to the preset mode set by remote core, thus avoiding any reconfiguration
of associated video ports, overlay managers and clocks.
For this case, it is also checked that Linux controlled video planes
don't exceed screen size set by remote core while running the display.
Display clocks and OLDI related fields are also not
populated for this scenario as remote core is owning those resources.

For 1), where Linux owns only a set of video port and associated
planes with rest of resources owned completely by remote cores,
only those set of resources are exposed to Linux and programmed using
traditional CRTC helpers and rest of video ports and associated resources
are removed from feature list accordingly.

Signed-off-by: Devarsh Thakkar 
---
 drivers/gpu/drm/tidss/tidss_crtc.c  | 120 -
 drivers/gpu/drm/tidss/tidss_dispc.c | 254 +---
 drivers/gpu/drm/tidss/tidss_dispc.h |   2 +-
 drivers/gpu/drm/tidss/tidss_drv.c   |  33 ++--
 drivers/gpu/drm/tidss/tidss_drv.h   |   6 +
 5 files changed, 375 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c 
b/drivers/gpu/drm/tidss/tidss_crtc.c
index 5f838980c7a1..f6a877ff4c6c 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.c
+++ b/drivers/gpu/drm/tidss/tidss_crtc.c
@@ -31,13 +31,19 @@ static void tidss_crtc_finish_page_flip(struct tidss_crtc 
*tcrtc)
/*
 * New settings are taken into use at VFP, and GO bit is cleared at
 * the same time. This happens before the vertical blank interrupt.
-* So there is a small change that the driver sets GO bit after VFP, but
+* So there is a small chance that the driver sets GO bit after VFP, but
 * before vblank, and we have to check for that case here.
+*
+* For a video port shared between Linux and remote core but owned by 
remote core,
+* this is not required since Linux just attaches to mode that was 
preset by remote
+* core with which display is being shared.
 */
-   busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
-   if (busy) {
-   spin_unlock_irqrestore(&ddev->event_lock, flags);
-   return;
+   if (!tidss->shared_mode || 
tidss->shared_mode_owned_vps[tcrtc->hw_videoport]) {
+   busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
+   if (busy) {
+   spin_unlock_irqrestore(&ddev->event_lock, flags);
+   return;
+   }
}
 
event = tcrtc->event;
@@ -208,6 +214,44 @@ static void tidss_crtc_atomic_flush(struct drm_crtc *crtc,
spin_unlock_irqrestore(&ddev->event_lock, flags);
 }
 
+static void tidss_shared_vp_crtc_atomic_flush(struct drm_crtc *crtc,
+ struct drm_atomic_state *state)
+{
+   struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
+   struct drm_device *ddev = crtc->dev;
+   unsigned long flags;
+
+   dev_dbg(ddev->dev,
+   "%s: %s enabled %d, needs modeset %d, event %p\n", __func__,
+   crtc->n

[RFC PATCH 0/3] Add display sharing support in tidss

2024-01-16 Thread Devarsh Thakkar
This adds display sharing support in tidss display driver along with an
example overlay devicetree file using which can be used to enable display
sharing on AM62x devices with device manager core i.e. R5F expected to run
a custom firmware which supports corresponding display sharing
configuration.

As resources can be partitioned at different levels a flexible scheme is
followed while designing devicetree bindings and driver changes keeping in
mind possible scenarios in which resources can be partitioned and various
DSS hardware IP's supported across different devices.

A rebased version of this patch has been tested on AM62P SoC which also
supports same DSS IP as AM62x and the patch with display sharing
functionality is already available in vendor-specific kernel source tree
[1] along with documentation which explains the design [2] and DM firmware
support [3].

NOTE1: This is marked as RFC for upstream since AM62P is not yet supported
in upstream tree and for AM62x SoC which is the target SoC for this patch,
the dss sharing functionality is not validated on upstream tree due to
missing OLDI support and display sharing DM firmware support, but we still
wanted to get some feedback on this.

NOTE2: This series depends on :
https://lore.kernel.org/all/20240115125716.560363-1-devar...@ti.com/

[1] :
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=d805270609cfb6b2e67bd2fd5959d71f48393196
 
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=93d751a94cbf9ad07c7f658e78aa510d919d7cd6
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=09.01.00.007&id=665c17837dc8bed27e8d63388f8f7f7a85c0cd94
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.1.y-cicd&id=f8d7f1a9617862af922c6bc10e0765ba4f4857d6
 
[2] :
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/Display/DSS7.html#driver-features
(Display Sharing mode feature description)
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/How_to_Guides/Target/How_to_enable_display_sharing_between_remotecore_and_Linux.html
(How To Guide)
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/system/Demo_User_Guides/Display_Cluster_User_Guide.html
(Display Cluster user guide with demo details)

[3] :
https://git.ti.com/cgit/processor-firmware/ti-linux-firmware/tree/ti-dm/am62pxx/dss_display_share.wkup-r5f0_0.release.strip.out?h=ti-linux-firmware-next

Devarsh Thakkar (3):
  dt-bindings: display: ti,am65x-dss: Add support for display sharing
mode
  drm/tidss: Add support for display sharing
  arm64: dts: ti: k3-am62x: Add overlay to use DSS in display sharing
mode

 .../bindings/display/ti/ti,am65x-dss.yaml |  82 ++
 arch/arm64/boot/dts/ti/Makefile   |   1 +
 .../dts/ti/k3-am62x-sk-dss-shared-mode.dtso   |  23 ++
 drivers/gpu/drm/tidss/tidss_crtc.c| 120 -
 drivers/gpu/drm/tidss/tidss_dispc.c   | 254 --
 drivers/gpu/drm/tidss/tidss_dispc.h   |   2 +-
 drivers/gpu/drm/tidss/tidss_drv.c |  33 ++-
 drivers/gpu/drm/tidss/tidss_drv.h |   6 +
 8 files changed, 481 insertions(+), 40 deletions(-)
 create mode 100644 arch/arm64/boot/dts/ti/k3-am62x-sk-dss-shared-mode.dtso

-- 
2.34.1



Re: [DO NOT MERGE PATCH 2/2] arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs

2024-01-16 Thread Devarsh Thakkar
Hi Conor,

Thanks for the review.

On 15/01/24 21:44, Conor Dooley wrote:
> On Mon, Jan 15, 2024 at 06:27:16PM +0530, Devarsh Thakkar wrote:
>> This adds common1 register space for AM62x and AM65x SoC's which are using
>> TI's Keystone display hardware and supporting it as described in
>> Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml.
>>
>> This region is documented in respective Technical Reference Manuals [1].
>>
>> [1]:
>> AM62x TRM:
>> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)
>>
>> AM65x TRM:
>> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)
>>
>> Signed-off-by: Devarsh Thakkar 
>> ---
> 
> "[DO NOT MERGE PATCH 2/2]" but no rationale here as to why this cannot
> be merged? What's the problem with it?
> 

No problem as such from my point of view, but this is the process I follow
since maintainer trees for device-tree file and bindings are different. I
generally mark a [DO NOT MERGE] tag for device-tree file patches until binding
patch gets merged so that the device-tree patches don't get applied by mistake
if binding patch has some pending comments.

Once binding patch gets merged, I re-send the device-tree file patches again
to respective list.

Regards
Devarsh

> Cheers,
> Conor.
> 
>>  arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++--
>>  arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++--
>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
>> b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
>> index 464b7565d085..298bf8d5de8c 100644
>> --- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
>> +++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
>> @@ -779,9 +779,10 @@ dss: dss@3020 {
>><0x00 0x30207000 0x00 0x1000>, /* ovr1 */
>><0x00 0x30208000 0x00 0x1000>, /* ovr2 */
>><0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
>> -  <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
>> +  <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
>> +  <0x00 0x30201000 0x00 0x1000>; /* common1 */
>>  reg-names = "common", "vidl1", "vid",
>> -"ovr1", "ovr2", "vp1", "vp2";
>> +"ovr1", "ovr2", "vp1", "vp2", "common1";
>>  power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
>>  clocks = <&k3_clks 186 6>,
>>   <&dss_vp1_clk>,
>> diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
>> b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
>> index fcea54465636..5b2d4365b911 100644
>> --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
>> +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
>> @@ -1019,9 +1019,10 @@ dss: dss@4a0 {
>><0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
>><0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
>><0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
>> -  <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
>> +  <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
>> +  <0x0 0x04a01000 0x0 0x1000>; /* common1 */
>>  reg-names = "common", "vidl1", "vid",
>> -"ovr1", "ovr2", "vp1", "vp2";
>> +"ovr1", "ovr2", "vp1", "vp2", "common1";
>>  
>>  ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
>>  
>> -- 
>> 2.34.1
>>


Re: [PATCH 1/2] dt-bindings: display: ti,am65x-dss: Add support for common1 region

2024-01-16 Thread Devarsh Thakkar
Hi Conor,

Thanks for the review.

On 15/01/24 21:47, Conor Dooley wrote:
> On Mon, Jan 15, 2024 at 06:27:15PM +0530, Devarsh Thakkar wrote:
>> TI keystone display subsystem present in AM65 and other SoCs such as AM62
> 
> Do all 3 SoCs supported by this binding (am625 am62a7 am65x) have this
> common1 register? If not, you should limit it the platforms that do have
> it.
> 

Yes all 3 SoCs supported by binding have common1 register space supported.

AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

AM62A TRM:
https://www.ti.com/lit/pdf/spruj16 (Section 14.9.9.1 DSS Registers)

Regards
Devarsh

> Thanks,
> Conor.
> 
>> support two separate register spaces namely "common" and "common1" which
>> can be used by two separate hosts to program the display controller as
>> described in respective Technical Reference Manuals [1].
>>
>> The common1 register space has similar set of configuration registers as
>> supported in common register space except the global configuration
>> registers which are exclusive to common region.
>>
>> This adds binding for "common1" register region too as supported by the
>> hardware.
>>
>> [1]:
>> AM62x TRM:
>> https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)
>>
>> AM65x TRM:
>> https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)
>>
>> Signed-off-by: Devarsh Thakkar 
>> ---
>>  .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
>> b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> index b6767ef0d24d..55e3e490d0e6 100644
>> --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
>> @@ -37,6 +37,7 @@ properties:
>>- description: OVR2 overlay manager for vp2
>>- description: VP1 video port 1
>>- description: VP2 video port 2
>> +  - description: common1 DSS register area
>>  
>>reg-names:
>>  items:
>> @@ -47,6 +48,7 @@ properties:
>>- const: ovr2
>>- const: vp1
>>- const: vp2
>> +  - const: common1
>>  
>>clocks:
>>  items:
>> @@ -147,9 +149,10 @@ examples:
>>  <0x04a07000 0x1000>, /* ovr1 */
>>  <0x04a08000 0x1000>, /* ovr2 */
>>  <0x04a0a000 0x1000>, /* vp1 */
>> -<0x04a0b000 0x1000>; /* vp2 */
>> +<0x04a0b000 0x1000>, /* vp2 */
>> +<0x04a01000 0x1000>; /* common1 */
>>  reg-names = "common", "vidl1", "vid",
>> -"ovr1", "ovr2", "vp1", "vp2";
>> +"ovr1", "ovr2", "vp1", "vp2", "common1";
>>  ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
>>  power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
>>  clocks =<&k3_clks 67 1>,
>> -- 
>> 2.34.1
>>


[PATCH 0/2] Add common1 register space for TI Keystone displays

2024-01-15 Thread Devarsh Thakkar
Add common1 register space for SoC's supporting TI Keystone displays
present in AM65x and AM62x SoCs. This is required to support use-cases
where Linux may want to use common1 region instead of common region with
the latter being controlled by another processing core.

The enumeration of common1 region in device-tree bindings seem to be a
miss as ideally bindings should enumerate all supported register spaces
as done in
Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml which
also uses TI Keystone display subsystem albeit with some more features.

Devarsh Thakkar (2):
  dt-bindings: display: ti,am65x-dss: Add support for common1 region
  arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs

 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi   | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   | 5 +++--
 3 files changed, 11 insertions(+), 6 deletions(-)

-- 
2.34.1



[DO NOT MERGE PATCH 2/2] arm64: dts: ti: Add common1 register space for AM62x and AM65x SoCs

2024-01-15 Thread Devarsh Thakkar
This adds common1 register space for AM62x and AM65x SoC's which are using
TI's Keystone display hardware and supporting it as described in
Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml.

This region is documented in respective Technical Reference Manuals [1].

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

Signed-off-by: Devarsh Thakkar 
---
 arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 5 +++--
 arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
index 464b7565d085..298bf8d5de8c 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-main.dtsi
@@ -779,9 +779,10 @@ dss: dss@3020 {
  <0x00 0x30207000 0x00 0x1000>, /* ovr1 */
  <0x00 0x30208000 0x00 0x1000>, /* ovr2 */
  <0x00 0x3020a000 0x00 0x1000>, /* vp1: Used for OLDI */
- <0x00 0x3020b000 0x00 0x1000>; /* vp2: Used as DPI Out */
+ <0x00 0x3020b000 0x00 0x1000>, /* vp2: Used as DPI Out */
+ <0x00 0x30201000 0x00 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
power-domains = <&k3_pds 186 TI_SCI_PD_EXCLUSIVE>;
clocks = <&k3_clks 186 6>,
 <&dss_vp1_clk>,
diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi 
b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
index fcea54465636..5b2d4365b911 100644
--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi
@@ -1019,9 +1019,10 @@ dss: dss@4a0 {
  <0x0 0x04a07000 0x0 0x1000>, /* ovr1 */
  <0x0 0x04a08000 0x0 0x1000>, /* ovr2 */
  <0x0 0x04a0a000 0x0 0x1000>, /* vp1 */
- <0x0 0x04a0b000 0x0 0x1000>; /* vp2 */
+ <0x0 0x04a0b000 0x0 0x1000>, /* vp2 */
+ <0x0 0x04a01000 0x0 0x1000>; /* common1 */
reg-names = "common", "vidl1", "vid",
-   "ovr1", "ovr2", "vp1", "vp2";
+   "ovr1", "ovr2", "vp1", "vp2", "common1";
 
ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 
-- 
2.34.1



[PATCH 1/2] dt-bindings: display: ti, am65x-dss: Add support for common1 region

2024-01-15 Thread Devarsh Thakkar
TI keystone display subsystem present in AM65 and other SoCs such as AM62
support two separate register spaces namely "common" and "common1" which
can be used by two separate hosts to program the display controller as
described in respective Technical Reference Manuals [1].

The common1 register space has similar set of configuration registers as
supported in common register space except the global configuration
registers which are exclusive to common region.

This adds binding for "common1" register region too as supported by the
hardware.

[1]:
AM62x TRM:
https://www.ti.com/lit/pdf/spruiv7 (Section 14.8.9.1 DSS Registers)

AM65x TRM:
https://www.ti.com/lit/pdf/spruid7 (Section 12.6.5 DSS Registers)

Signed-off-by: Devarsh Thakkar 
---
 .../devicetree/bindings/display/ti/ti,am65x-dss.yaml   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml 
b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index b6767ef0d24d..55e3e490d0e6 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -37,6 +37,7 @@ properties:
   - description: OVR2 overlay manager for vp2
   - description: VP1 video port 1
   - description: VP2 video port 2
+  - description: common1 DSS register area
 
   reg-names:
 items:
@@ -47,6 +48,7 @@ properties:
   - const: ovr2
   - const: vp1
   - const: vp2
+  - const: common1
 
   clocks:
 items:
@@ -147,9 +149,10 @@ examples:
 <0x04a07000 0x1000>, /* ovr1 */
 <0x04a08000 0x1000>, /* ovr2 */
 <0x04a0a000 0x1000>, /* vp1 */
-<0x04a0b000 0x1000>; /* vp2 */
+<0x04a0b000 0x1000>, /* vp2 */
+<0x04a01000 0x1000>; /* common1 */
 reg-names = "common", "vidl1", "vid",
-"ovr1", "ovr2", "vp1", "vp2";
+"ovr1", "ovr2", "vp1", "vp2", "common1";
 ti,am65x-oldi-io-ctrl = <&dss_oldi_io_ctrl>;
 power-domains = <&k3_pds 67 TI_SCI_PD_EXCLUSIVE>;
 clocks =<&k3_clks 67 1>,
-- 
2.34.1



Re: [PATCH] drm/tidss: Power up attached PM domains on probe

2023-10-16 Thread Devarsh Thakkar
Hi Maxime,

On 12/10/23 18:10, Maxime Ripard wrote:
> Hi,
> 
> On Thu, Oct 12, 2023 at 05:10:06PM +0530, Devarsh Thakkar wrote:
>> On 09/10/23 13:20, Devarsh Thakkar wrote:
>>> Some SoC's such as AM62P have dedicated power domains
>>> for OLDI which need to be powered on separetely along
>>> with display controller.
>>>
>>> So during driver probe, power up all attached PM domains
>>> enumerated in devicetree node for DSS.
>>>
>>> This also prepares base to add display support for AM62P.
>>>
>>
>> NAK, for this patch, as discussed with team there are already plans
>> to have separate OLDI bridge driver which should eventually handle
>> the additional power domains.
> 
> Just for the record in case your current plan doesn't work out and we
> need to revisit this patch: my point was that it's something that
> deviates by a margin from what drivers are usually expected to do, so we
> should document why that deviation is there.
> 

Sure, thanks for suggesting, I agree that if going with this logic, we should
definitely put a comment in driver regarding same.

> The patch itself looks reasonable to me otherwise.

Yes, it's just that we are planning a separate driver for OLDI. We will see if
we need similar logic in that driver too.

Regards
Devarsh

> 
> Maxime


Re: [PATCH] drm/tidss: Power up attached PM domains on probe

2023-10-12 Thread Devarsh Thakkar




On 09/10/23 13:20, Devarsh Thakkar wrote:

Some SoC's such as AM62P have dedicated power domains
for OLDI which need to be powered on separetely along
with display controller.

So during driver probe, power up all attached PM domains
enumerated in devicetree node for DSS.

This also prepares base to add display support for AM62P.



NAK, for this patch, as discussed with team there are already plans
to have separate OLDI bridge driver which should eventually handle
the additional power domains.

Sorry for the noise.

Regards
Devarsh

Signed-off-by: Devarsh Thakkar  > ---
  drivers/gpu/drm/tidss/tidss_drv.c | 76 +++
  drivers/gpu/drm/tidss/tidss_drv.h |  5 ++
  2 files changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/tidss/tidss_drv.c 
b/drivers/gpu/drm/tidss/tidss_drv.c
index 4d063eb9cd0b..a703a27d17bf 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -8,6 +8,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

  #include 
@@ -114,6 +115,72 @@ static const struct drm_driver tidss_driver = {
.minor  = 0,
  };
  
+static int tidss_detach_pm_domains(struct tidss_device *tidss)

+{
+   int i;
+
+   if (tidss->num_domains <= 1)
+   return 0;
+
+   for (i = 0; i < tidss->num_domains; i++) {
+   if (tidss->pd_link[i] && !IS_ERR(tidss->pd_link[i]))
+   device_link_del(tidss->pd_link[i]);
+   if (tidss->pd_dev[i] && !IS_ERR(tidss->pd_dev[i]))
+   dev_pm_domain_detach(tidss->pd_dev[i], true);
+   tidss->pd_dev[i] = NULL;
+   tidss->pd_link[i] = NULL;
+   }
+
+   return 0;
+}
+
+static int tidss_attach_pm_domains(struct tidss_device *tidss)
+{
+   struct device *dev = tidss->dev;
+   int i;
+   int ret;
+   struct platform_device *pdev = to_platform_device(dev);
+   struct device_node *np = pdev->dev.of_node;
+
+   tidss->num_domains = of_count_phandle_with_args(np, "power-domains",
+   "#power-domain-cells");
+   if (tidss->num_domains <= 1) {
+   dev_dbg(dev, "One or less power domains, no need to do attach 
domains\n");
+   return 0;
+   }
+
+   tidss->pd_dev = devm_kmalloc_array(dev, tidss->num_domains,
+  sizeof(*tidss->pd_dev), GFP_KERNEL);
+   if (!tidss->pd_dev)
+   return -ENOMEM;
+
+   tidss->pd_link = devm_kmalloc_array(dev, tidss->num_domains,
+   sizeof(*tidss->pd_link), 
GFP_KERNEL);
+   if (!tidss->pd_link)
+   return -ENOMEM;
+
+   for (i = 0; i < tidss->num_domains; i++) {
+   tidss->pd_dev[i] = dev_pm_domain_attach_by_id(dev, i);
+   if (IS_ERR(tidss->pd_dev[i])) {
+   ret = PTR_ERR(tidss->pd_dev[i]);
+   goto fail;
+   }
+
+   tidss->pd_link[i] = device_link_add(dev, tidss->pd_dev[i],
+   DL_FLAG_STATELESS |
+   DL_FLAG_PM_RUNTIME | 
DL_FLAG_RPM_ACTIVE);
+   if (!tidss->pd_link[i]) {
+   ret = -EINVAL;
+   goto fail;
+   }
+   }
+
+   return 0;
+fail:
+   tidss_detach_pm_domains(tidss);
+   return ret;
+}
+
  static int tidss_probe(struct platform_device *pdev)
  {
struct device *dev = &pdev->dev;
@@ -136,6 +203,13 @@ static int tidss_probe(struct platform_device *pdev)
  
  	platform_set_drvdata(pdev, tidss);
  
+	/* powering up associated OLDI domains */

+   ret = tidss_attach_pm_domains(tidss);
+   if (ret < 0) {
+   dev_err(dev, "failed to attach power domains %d\n", ret);
+   return ret;
+   }
+
ret = dispc_init(tidss);
if (ret) {
dev_err(dev, "failed to initialize dispc: %d\n", ret);
@@ -193,6 +267,7 @@ static int tidss_probe(struct platform_device *pdev)
dispc_runtime_suspend(tidss->dispc);
  #endif
pm_runtime_disable(dev);
+   tidss_detach_pm_domains(tidss);
  
  	return ret;

  }
@@ -220,6 +295,7 @@ static void tidss_remove(struct platform_device *pdev)
/* devm allocated dispc goes away with the dev so mark it NULL */
dispc_remove(tidss);
  
+	tidss_detach_pm_domains(tidss);

dev_dbg(dev, "%s done\n", __func__);
  }
  
diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h

index d7f27b0b0315..3c8b37b3aba6 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.h
+++ b/drivers/gpu/drm/tidss/tidss_drv.h
@@ -31,6 +31,11 @@ struct tidss_dev

  1   2   >