[PATCHv4 3/6] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-10-11 Thread Hans Verkuil
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Reviewed-by: Boris Brezillon 
Reviewed-by: Alexandre Courbot 
Signed-off-by: Hans Verkuil 
[Adjusted description]
Signed-off-by: Jernej Skrabec 
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 10 +-
 Documentation/media/videodev2.h.rst.exceptions  |  1 +
 include/uapi/linux/videodev2.h  |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index 57f0066f4cff..f1a504836f31 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -208,7 +208,15 @@ introduced in Linux 3.3. They are, however, mandatory for stateful mem2mem decoders.
been started yet, the driver will return an ``EPERM`` error code. When
the decoder is already running, this command does nothing. No
flags are defined for this command.
-
+* - ``V4L2_DEC_CMD_FLUSH``
+  - 4
+  - Flush any held capture buffers. Only valid for stateless decoders.
+   This command is typically used when the application reached the
+   end of the stream and the last output buffer had the
+   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+   dequeueing the capture buffer containing the last decoded frame.
+   So this command can be used to explicitly flush that final decoded
+   frame. This command does nothing if there are no held capture buffers.
 
 Return Value
 
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index b58e381bdf7b..c23e5ef30c78 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -435,6 +435,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 9f4e66affac4..d969842bbfe2 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1984,6 +1984,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP    (1)
 #define V4L2_DEC_CMD_PAUSE   (2)
 #define V4L2_DEC_CMD_RESUME  (3)
+#define V4L2_DEC_CMD_FLUSH   (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
-- 
2.23.0
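As an illustration of how userspace might issue the new command at end-of-stream: the struct below is a simplified stand-in for the real struct v4l2_decoder_cmd, and make_flush_cmd() is a hypothetical helper, not part of the patch. Real code would include <linux/videodev2.h> and pass the command via ioctl(fd, VIDIOC_DECODER_CMD, &cmd).

```c
#include <string.h>

/* Simplified stand-in for the UAPI definitions; real code includes
 * <linux/videodev2.h>. V4L2_DEC_CMD_FLUSH is (4) per this patch. */
#define V4L2_DEC_CMD_FLUSH (4)

struct dec_cmd_sketch {
	unsigned int cmd;
	unsigned int flags;
};

/* Build a FLUSH command: the patch defines no flags or payload for it,
 * so everything except .cmd stays zero. */
static struct dec_cmd_sketch make_flush_cmd(void)
{
	struct dec_cmd_sketch c;

	memset(&c, 0, sizeof(c));
	c.cmd = V4L2_DEC_CMD_FLUSH;
	return c;
}
```

An application that queued its last output buffer with ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` set would follow up with this command so the held capture buffer carrying the final decoded frame can be dequeued.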



[PATCHv3 3/8] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-10-10 Thread Hans Verkuil
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Reviewed-by: Boris Brezillon 
Reviewed-by: Alexandre Courbot 
Signed-off-by: Hans Verkuil 
[Adjusted description]
Signed-off-by: Jernej Skrabec 
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 10 +-
 Documentation/media/videodev2.h.rst.exceptions  |  1 +
 include/uapi/linux/videodev2.h  |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index 57f0066f4cff..f1a504836f31 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -208,7 +208,15 @@ introduced in Linux 3.3. They are, however, mandatory for stateful mem2mem decoders.
been started yet, the driver will return an ``EPERM`` error code. When
the decoder is already running, this command does nothing. No
flags are defined for this command.
-
+* - ``V4L2_DEC_CMD_FLUSH``
+  - 4
+  - Flush any held capture buffers. Only valid for stateless decoders.
+   This command is typically used when the application reached the
+   end of the stream and the last output buffer had the
+   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+   dequeueing the capture buffer containing the last decoded frame.
+   So this command can be used to explicitly flush that final decoded
+   frame. This command does nothing if there are no held capture buffers.
 
 Return Value
 
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index adeb6b7a15cb..a79028e4d929 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 4fa9f543742d..91a79e16089c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP    (1)
 #define V4L2_DEC_CMD_PAUSE   (2)
 #define V4L2_DEC_CMD_RESUME  (3)
+#define V4L2_DEC_CMD_FLUSH   (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
-- 
2.23.0



Re: [PATCH 1/2] videodev2.h: add macros to print a fourcc

2019-09-16 Thread Hans Verkuil
On 9/16/19 11:26 AM, Sakari Ailus wrote:
> Hi Hans,
> 
> On Mon, Sep 16, 2019 at 11:00:46AM +0200, Hans Verkuil wrote:
>> Add new macros V4L2_FOURCC_CONV and V4L2_FOURCC_ARGS for use
>> in code that prints a fourcc. These macros can be used in both
>> kernel and userspace.
>>
>> Signed-off-by: Hans Verkuil 
>> Suggested-by: Sakari Ailus 
>> ---
>>  include/uapi/linux/videodev2.h | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 530638dffd93..7a34eb93437e 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -82,6 +82,19 @@
>>  ((__u32)(a) | ((__u32)(b) << 8) | ((__u32)(c) << 16) | ((__u32)(d) << 24))
>>  #define v4l2_fourcc_be(a, b, c, d)  (v4l2_fourcc(a, b, c, d) | (1U << 31))
>>  
>> +/*
>> + * Helper macros to print a fourcc in a standard format. E.g.:
>> + *
>> + * printf("fourcc is " V4L2_FOURCC_CONV "\n", V4L2_FOURCC_ARGS(fourcc));
>> + *
>> + * Note that V4L2_FOURCC_ARGS reuses fourcc, so this can't be an
>> + * expression with side-effects.
>> + */
>> +#define V4L2_FOURCC_CONV "%c%c%c%c%s"
>> +#define V4L2_FOURCC_ARGS(fourcc) \
>> +(fourcc) & 0x7f, ((fourcc) >> 8) & 0x7f, ((fourcc) >> 16) & 0x7f, \
>> +((fourcc) >> 24) & 0x7f, ((fourcc) & (1U << 31) ? "-BE" : "")
>> +
>>  /*
>>   *  E N U M S
>>   */
> 
> KernelDoc comments would be nice. Such as in here:
> 
> <https://patchwork.linuxtv.org/patch/48372/>

I was searching for old patches with the string 'fourcc', not '4cc',
so that's why I missed your patch.

I'll respin with that (slightly updated) patch.

Regards,

Hans

> 
> I'm fine with either patch though.
> 



Re: [PATCH 1/2] videodev2.h: add macros to print a fourcc

2019-09-16 Thread Sakari Ailus
Hi Hans,

On Mon, Sep 16, 2019 at 11:00:46AM +0200, Hans Verkuil wrote:
> Add new macros V4L2_FOURCC_CONV and V4L2_FOURCC_ARGS for use
> in code that prints a fourcc. These macros can be used in both
> kernel and userspace.
> 
> Signed-off-by: Hans Verkuil 
> Suggested-by: Sakari Ailus 
> ---
>  include/uapi/linux/videodev2.h | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 530638dffd93..7a34eb93437e 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -82,6 +82,19 @@
>   ((__u32)(a) | ((__u32)(b) << 8) | ((__u32)(c) << 16) | ((__u32)(d) << 24))
>  #define v4l2_fourcc_be(a, b, c, d)   (v4l2_fourcc(a, b, c, d) | (1U << 31))
>  
> +/*
> + * Helper macros to print a fourcc in a standard format. E.g.:
> + *
> + * printf("fourcc is " V4L2_FOURCC_CONV "\n", V4L2_FOURCC_ARGS(fourcc));
> + *
> + * Note that V4L2_FOURCC_ARGS reuses fourcc, so this can't be an
> + * expression with side-effects.
> + */
> +#define V4L2_FOURCC_CONV "%c%c%c%c%s"
> +#define V4L2_FOURCC_ARGS(fourcc) \
> + (fourcc) & 0x7f, ((fourcc) >> 8) & 0x7f, ((fourcc) >> 16) & 0x7f, \
> + ((fourcc) >> 24) & 0x7f, ((fourcc) & (1U << 31) ? "-BE" : "")
> +
>  /*
>   *   E N U M S
>   */

KernelDoc comments would be nice. Such as in here:

<https://patchwork.linuxtv.org/patch/48372/>

I'm fine with either patch though.

-- 
Sakari Ailus


[PATCH 1/2] videodev2.h: add macros to print a fourcc

2019-09-16 Thread Hans Verkuil
Add new macros V4L2_FOURCC_CONV and V4L2_FOURCC_ARGS for use
in code that prints a fourcc. These macros can be used in both
kernel and userspace.

Signed-off-by: Hans Verkuil 
Suggested-by: Sakari Ailus 
---
 include/uapi/linux/videodev2.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 530638dffd93..7a34eb93437e 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -82,6 +82,19 @@
	((__u32)(a) | ((__u32)(b) << 8) | ((__u32)(c) << 16) | ((__u32)(d) << 24))
 #define v4l2_fourcc_be(a, b, c, d) (v4l2_fourcc(a, b, c, d) | (1U << 31))
 
+/*
+ * Helper macros to print a fourcc in a standard format. E.g.:
+ *
+ * printf("fourcc is " V4L2_FOURCC_CONV "\n", V4L2_FOURCC_ARGS(fourcc));
+ *
+ * Note that V4L2_FOURCC_ARGS reuses fourcc, so this can't be an
+ * expression with side-effects.
+ */
+#define V4L2_FOURCC_CONV "%c%c%c%c%s"
+#define V4L2_FOURCC_ARGS(fourcc) \
+   (fourcc) & 0x7f, ((fourcc) >> 8) & 0x7f, ((fourcc) >> 16) & 0x7f, \
+   ((fourcc) >> 24) & 0x7f, ((fourcc) & (1U << 31) ? "-BE" : "")
+
 /*
  * E N U M S
  */
-- 
2.20.1
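A self-contained sketch of the proposed helpers in use. The macro bodies are copied from the patch so the example compiles on its own; fourcc_to_str() is an illustrative wrapper, not part of the patch, and real code would take the macros from <linux/videodev2.h> once merged.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Local copies of the macros from this patch, so the example is
 * self-contained. */
#define v4l2_fourcc(a, b, c, d) \
	((unsigned)(a) | ((unsigned)(b) << 8) | ((unsigned)(c) << 16) | ((unsigned)(d) << 24))
#define v4l2_fourcc_be(a, b, c, d) (v4l2_fourcc(a, b, c, d) | (1U << 31))

#define V4L2_FOURCC_CONV "%c%c%c%c%s"
#define V4L2_FOURCC_ARGS(fourcc) \
	(fourcc) & 0x7f, ((fourcc) >> 8) & 0x7f, ((fourcc) >> 16) & 0x7f, \
	((fourcc) >> 24) & 0x7f, ((fourcc) & (1U << 31) ? "-BE" : "")

/* Render a fourcc in the standard "ABCD" / "ABCD-BE" form. */
static void fourcc_to_str(unsigned fourcc, char *buf, size_t len)
{
	snprintf(buf, len, V4L2_FOURCC_CONV, V4L2_FOURCC_ARGS(fourcc));
}
```

Masking each byte with 0x7f is what makes the big-endian variant print its base characters correctly: the BE marker bit (1U << 31) overlaps the top byte, and the "%s" conversion then appends "-BE" for such formats.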



[RFC,v5, 3/5] media: videodev2.h: Add new boottime timestamp type

2019-09-02 Thread Jungo Lin
Camera AR (Augmented Reality) applications require camera timestamps
to be reported with CLOCK_BOOTTIME so they can be synchronized with
other sensor sources.

The boottime timestamp is identical to the monotonic timestamp,
except that it also includes any time that the system is suspended.

Signed-off-by: Jungo Lin 
---
 Documentation/media/uapi/v4l/buffer.rst | 11 ++-
 include/uapi/linux/videodev2.h  |  2 ++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst 
b/Documentation/media/uapi/v4l/buffer.rst
index 1cbd9cde57f3..9e636f4118f5 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -649,13 +649,22 @@ Buffer Flags
   - 0x00002000
   - The buffer timestamp has been taken from the ``CLOCK_MONOTONIC``
clock. To access the same clock outside V4L2, use
-   :c:func:`clock_gettime`.
+   :c:func:`clock_gettime` using clock ID ``CLOCK_MONOTONIC``.
 * .. _`V4L2-BUF-FLAG-TIMESTAMP-COPY`:
 
   - ``V4L2_BUF_FLAG_TIMESTAMP_COPY``
   - 0x00004000
   - The CAPTURE buffer timestamp has been taken from the corresponding
OUTPUT buffer. This flag applies only to mem2mem devices.
+* .. _`V4L2-BUF-FLAG-TIMESTAMP-BOOTTIME`:
+
+  - ``V4L2_BUF_FLAG_TIMESTAMP_BOOTTIME``
+  - 0x00008000
+  - The buffer timestamp has been taken from the ``CLOCK_BOOTTIME``
+   clock. To access the same clock outside V4L2, use
+   :c:func:`clock_gettime` using clock ID ``CLOCK_BOOTTIME``.
+   Identical to ``CLOCK_MONOTONIC``, except that it also includes any
+   time that the system is suspended.
 * .. _`V4L2-BUF-FLAG-TSTAMP-SRC-MASK`:
 
   - ``V4L2_BUF_FLAG_TSTAMP_SRC_MASK``
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 9d9705ceda76..a4fd271348e7 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1043,6 +1043,8 @@ static inline __u64 v4l2_timeval_to_ns(const struct timeval *tv)
 #define V4L2_BUF_FLAG_TIMESTAMP_UNKNOWN    0x00000000
 #define V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC  0x00002000
 #define V4L2_BUF_FLAG_TIMESTAMP_COPY       0x00004000
+#define V4L2_BUF_FLAG_TIMESTAMP_BOOTTIME   0x00008000
+
 /* Timestamp sources. */
 #define V4L2_BUF_FLAG_TSTAMP_SRC_MASK      0x00070000
 #define V4L2_BUF_FLAG_TSTAMP_SRC_EOF       0x00000000
-- 
2.18.0
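Sketch of the userspace side of this change: picking the right clock from a dequeued buffer's flags. The constant values are mirrored from videodev2.h (including the BOOTTIME bit this patch adds); buf_timestamp_clock() is an illustrative helper, and real code would include <linux/videodev2.h> instead of redefining the flags.

```c
#include <string.h>

/* Constants mirrored from videodev2.h, plus the BOOTTIME bit this
 * patch adds. The mask covers all three timestamp-type bits. */
#define V4L2_BUF_FLAG_TIMESTAMP_MASK       0x0000e000
#define V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC  0x00002000
#define V4L2_BUF_FLAG_TIMESTAMP_COPY       0x00004000
#define V4L2_BUF_FLAG_TIMESTAMP_BOOTTIME   0x00008000

/* Illustrative helper: name the clock an application should pass to
 * clock_gettime() to compare against a buffer's timestamp. */
static const char *buf_timestamp_clock(unsigned int flags)
{
	switch (flags & V4L2_BUF_FLAG_TIMESTAMP_MASK) {
	case V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC:
		return "CLOCK_MONOTONIC";
	case V4L2_BUF_FLAG_TIMESTAMP_BOOTTIME:
		return "CLOCK_BOOTTIME";
	case V4L2_BUF_FLAG_TIMESTAMP_COPY:
		return "copied-from-OUTPUT";
	default:
		return "unknown";
	}
}
```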



Re: [GIT PULL FOR v5.4] Hantro H.264 + finish stateful decoder spec

2019-08-17 Thread Jenkins
From: buil...@linuxtv.org

Pull request: https://patchwork.linuxtv.org/patch/58252/
Build log: https://builder.linuxtv.org/job/patchwork/10603/
Build time: 00:35:01
Link: 
https://lore.kernel.org/linux-media/1be8ac17-349b-ef4d-299d-4f3888949...@xs4all.nl
Summary: 8 patches and/or PDF generation with issues, being 0 at build time

gpg: Signature made Sat 17 Aug 2019 08:26:34 AM UTC
gpg: using RSA key AAA7FFBA4D2D77EF4CAEA1421326E0CD23ABDCE5
gpg: Good signature from "Hans Verkuil " [unknown]
gpg: aka "Hans Verkuil " [full]


Error/warnings:

patches/0001-lib-sort.c-implement-sort-variant-taking-context-arg.patch:30: 
WARNING: Non-standard signature: Requested-by:
patches/0001-lib-sort.c-implement-sort-variant-taking-context-arg.patch:114: 
WARNING: line over 80 characters
patches/0001-lib-sort.c-implement-sort-variant-taking-context-arg.patch:120: 
WARNING: line over 80 characters

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0001-lib-sort.c-implement-sort-variant-taking-context-arg.patch
patches/0002-media-uapi-h264-Rename-pixel-format.patch:50: WARNING: line over 
80 characters
patches/0002-media-uapi-h264-Rename-pixel-format.patch:50: ERROR: trailing 
statements should be on next line
patches/0002-media-uapi-h264-Rename-pixel-format.patch:107: WARNING: line over 
80 characters

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0002-media-uapi-h264-Rename-pixel-format.patch
patches/0003-media-uapi-h264-Add-the-concept-of-decoding-mode.patch:154: 
WARNING: line over 80 characters
patches/0003-media-uapi-h264-Add-the-concept-of-decoding-mode.patch:174: CHECK: 
spaces preferred around that '+' (ctx:VxV)

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0003-media-uapi-h264-Add-the-concept-of-decoding-mode.patch
patches/0004-media-uapi-h264-Add-the-concept-of-start-code.patch:118: WARNING: 
line over 80 characters
patches/0004-media-uapi-h264-Add-the-concept-of-start-code.patch:138: CHECK: 
spaces preferred around that '+' (ctx:VxV)

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0004-media-uapi-h264-Add-the-concept-of-start-code.patch
patches/0006-media-cedrus-Cleanup-control-initialization.patch:109: WARNING: 
line over 80 characters

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0006-media-cedrus-Cleanup-control-initialization.patch
patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch:138: 
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch:536: 
CHECK: Unnecessary parentheses around 'poca < builder->curpoc'
patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch:536: 
CHECK: Unnecessary parentheses around 'pocb < builder->curpoc'
patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch:577: 
CHECK: Unnecessary parentheses around 'poca < builder->curpoc'
patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch:577: 
CHECK: Unnecessary parentheses around 'pocb < builder->curpoc'

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0009-media-hantro-Add-core-bits-to-support-H264-decoding.patch
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:32: 
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:89: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:95: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:107: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:111: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:125: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:141: 
WARNING: line over 80 characters
patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch:142: 
WARNING: line over 80 characters

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0010-media-hantro-Add-support-for-H264-decoding-on-G1.patch
patches/0017-media-docs-rst-Document-memory-to-memory-video-decod.patch:39: 
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?

Error #256 when running ./scripts/checkpatch.pl --terse --mailback --no-summary 
--strict patches/0017-media-docs-rst-Document-memory-to-memory-video-decod.patch



[GIT PULL FOR v5.4] Hantro H.264 + finish stateful decoder spec

2019-08-17 Thread Hans Verkuil
Hi Mauro,

This PR takes Ezequiel's series adding H.264 decoding for hantro
(https://patchwork.linuxtv.org/project/linux-media/list/?series=603).

The first patch (lib/sort.c) is Acked by Andrew Morton and is intended to
go in through the media subsystem since this is the first driver that uses
it.

This series substantially improves the H.264 API. Only H.264 slicing support
still requires some more work.

I double-checked that the H.264 control structures have the same layout between
32 and 64 bit architectures and do not contain any holes.

The second part of this PR is core API improvements to help give more
information about the HW capabilities by adding two new format flags.

This second part consists of patches 1-3 and 5-8 of this series:

https://patchwork.linuxtv.org/project/linux-media/list/?series=588

I dropped patch 4 until I have an Ack from Samsung, and patches 9-12
are not ready yet for merging.

Most importantly, the stateful decoder specification is now merged.

The final patch improves pixfmt-compressed.srt. It's still not perfect
and I plan to make more changes there with references to the various
codec standards, so there will be a follow-up patch, but for now this
is still better than what we had.

Note that the v4l2-compliance test fails with vicodec after this PR
is merged. I have a patch for v4l-utils ready to fix this.

Regards,

Hans


The following changes since commit 31d5d15dfc3418a57cfab419a353d8dc5f5698b5:

  media: MAINTAINERS: Add entry for the ov5670 driver (2019-08-15 08:17:04 
-0300)

are available in the Git repository at:

  git://linuxtv.org/hverkuil/media_tree.git tags/br-v5.4l

for you to fetch changes up to bf7ca7e0046e7f4e246876e9cfab5a65ca1ec72a:

  pixfmt-compressed.rst: improve H264/HEVC/MPEG1+2/VP8+9 documentation 
(2019-08-17 10:24:37 +0200)


Tag branch


Boris Brezillon (3):
  media: uapi: h264: Add the concept of decoding mode
  media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  media: hantro: Move copy_metadata() before doing a decode operation

Ezequiel Garcia (4):
  media: uapi: h264: Rename pixel format
  media: uapi: h264: Add the concept of start code
  media: cedrus: Cleanup control initialization
  media: cedrus: Specify H264 startcode and decoding mode

Hans Verkuil (2):
  videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM
  pixfmt-compressed.rst: improve H264/HEVC/MPEG1+2/VP8+9 documentation

Hertz Wong (3):
  media: hantro: Add core bits to support H264 decoding
  media: hantro: Add support for H264 decoding on G1
  media: hantro: Enable H264 decoding on rk3288

Maxime Jourdan (4):
  videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION
  media: venus: vdec: flag OUTPUT formats with V4L2_FMT_FLAG_DYN_RESOLUTION
  media: mtk-vcodec: flag OUTPUT formats with V4L2_FMT_FLAG_DYN_RESOLUTION
  media: vicodec: set flags for vdec/stateful OUTPUT coded formats

Rasmus Villemoes (1):
  lib/sort.c: implement sort() variant taking context argument

Tomasz Figa (1):
  media: docs-rst: Document memory-to-memory video decoder interface

 Documentation/media/uapi/v4l/dev-decoder.rst| 1101 
+++
 Documentation/media/uapi/v4l/dev-mem2mem.rst|8 +-
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst|   99 -
 Documentation/media/uapi/v4l/pixfmt-compressed.rst  |   47 +-
 Documentation/media/uapi/v4l/pixfmt-v4l2.rst|5 +
 Documentation/media/uapi/v4l/v4l2.rst   |   10 +-
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst |   41 +-
 Documentation/media/uapi/v4l/vidioc-dqevent.rst |   11 +-
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst|   16 +
 Documentation/media/videodev2.h.rst.exceptions  |2 +
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c  |4 +
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h  |1 +
 drivers/media/platform/qcom/venus/core.h|1 +
 drivers/media/platform/qcom/venus/vdec.c|   11 +
 drivers/media/platform/vicodec/vicodec-core.c   |3 +
 drivers/media/v4l2-core/v4l2-ctrls.c|   18 +
 drivers/media/v4l2-core/v4l2-ioctl.c|2 +-
 drivers/staging/media/hantro/Makefile   |2 +
 drivers/staging/media/hantro/hantro.h   |9 +-
 drivers/staging/media/hantro/hantro_drv.c   |   50 ++-
 drivers/staging/media/hantro/hantro_g1_h264_dec.c   |  292 +
 drivers/staging/media/hantro/hantro_h264.c  |  646 
+++
 drivers/staging/media/hantro/hantro_hw.h|   56 +++
 drivers/staging/media/hantro/hantro_v4l2.c  |   10 +
 drivers/staging/media/hantro/rk3288_vpu_hw.c|   21 +-
 drivers/staging/media/sunxi/cedrus/cedrus.c |   63 ++-
 drivers/staging/media/su

[PATCHv3 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-15 Thread Hans Verkuil
From: Maxime Jourdan 

Add an enum_fmt format flag to specifically tag coded formats where
dynamic resolution switching is supported by the device.

This is useful for some codec drivers that can support dynamic
resolution switching for one or more of their listed coded formats. It
allows userspace to know whether it should extract the video parameters
itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
when such changes are detected.

Signed-off-by: Maxime Jourdan 
Reviewed-by: Paul Kocialkowski 
Reviewed-by: Alexandre Courbot 
Acked-by: Tomasz Figa 
Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index ebc05ce74bdf..399ef1062bac 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
between frames/fields. This flag can only be used in combination with
the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
formats only. This flag is valid for stateful decoders only.
+* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
+  - 0x0008
+  - Dynamic resolution switching is supported by the device for this
+   compressed bytestream format (aka coded format). It will notify the user
+   via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
+   parameters are detected. This flag can only be used in combination
+   with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
+   compressed formats only. It also applies only to stateful codecs.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index a0640b6d0f68..adeb6b7a15cb 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
reserved-formats
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
 replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
+replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
 
 # V4L2 timecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 67077d52c59d..530638dffd93 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
 #define V4L2_FMT_FLAG_COMPRESSED   0x0001
 #define V4L2_FMT_FLAG_EMULATED 0x0002
 #define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM0x0004
+#define V4L2_FMT_FLAG_DYN_RESOLUTION   0x0008
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1
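The decision this flag enables for userspace can be sketched as follows. The flag values are mirrored from videodev2.h; relies_on_source_change() is an illustrative helper (real code would include <linux/videodev2.h> and read the flags from struct v4l2_fmtdesc after VIDIOC_ENUM_FMT).

```c
/* Flag values mirrored from videodev2.h. */
#define V4L2_FMT_FLAG_COMPRESSED     0x0001
#define V4L2_FMT_FLAG_DYN_RESOLUTION 0x0008

/* Illustrative decision helper: can the application rely on
 * V4L2_EVENT_SOURCE_CHANGE for this coded format, or must it extract
 * the video parameters from the bytestream itself? DYN_RESOLUTION is
 * only meaningful in combination with COMPRESSED. */
static int relies_on_source_change(unsigned int fmt_flags)
{
	unsigned int need = V4L2_FMT_FLAG_COMPRESSED |
			    V4L2_FMT_FLAG_DYN_RESOLUTION;

	return (fmt_flags & need) == need;
}
```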



[PATCHv3 01/12] videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM

2019-08-15 Thread Hans Verkuil
Add an enum_fmt format flag to specifically tag coded formats where
full bytestream parsing is supported by the device.

Some stateful decoders are capable of fully parsing a bytestream,
but others require that userspace pre-parses the bytestream into
frames or fields (see the corresponding pixelformat descriptions
for details).

If this flag is set, then this pre-parsing step is not required
(but still possible, of course).

Signed-off-by: Hans Verkuil 
Reviewed-by: Paul Kocialkowski 
Reviewed-by: Alexandre Courbot 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 5 +++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 822d6730e7d2..ebc05ce74bdf 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
   - This format is not native to the device but emulated through
software (usually libv4l2), where possible try to use a native
format instead for better performance.
+* - ``V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM``
+  - 0x0004
+  - The hardware decoder for this compressed bytestream format (aka coded
+   format) is capable of parsing a continuous bytestream. Applications do
+   not need to parse the bytestream themselves to find the boundaries
+   between frames/fields. This flag can only be used in combination with
+   the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
+   formats only. This flag is valid for stateful decoders only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index 8e7d3492d248..a0640b6d0f68 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
reserved-formats
 # V4L2 format flags
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
+replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
 
 # V4L2 timecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 2427bc4d8eba..67077d52c59d 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
__u32   reserved[4];
 };
 
-#define V4L2_FMT_FLAG_COMPRESSED 0x0001
-#define V4L2_FMT_FLAG_EMULATED   0x0002
+#define V4L2_FMT_FLAG_COMPRESSED   0x0001
+#define V4L2_FMT_FLAG_EMULATED 0x0002
+#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM0x0004
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1
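The pre-parsing decision described above can be sketched like this. Flag values are mirrored from videodev2.h; needs_preparsing() is an illustrative helper, not part of the patch, and applies to stateful decoders only.

```c
/* Flag values mirrored from videodev2.h. */
#define V4L2_FMT_FLAG_COMPRESSED            0x0001
#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM 0x0004

/* Illustrative helper: must userspace split the bytestream into
 * frames/fields before queueing it to a stateful decoder? */
static int needs_preparsing(unsigned int fmt_flags)
{
	if (!(fmt_flags & V4L2_FMT_FLAG_COMPRESSED))
		return 0;	/* raw format, nothing to parse */

	/* A continuous-bytestream decoder parses for us. */
	return !(fmt_flags & V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM);
}
```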



[PATCHv3 10/12] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-08-15 Thread Hans Verkuil
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 11 ++-
 Documentation/media/videodev2.h.rst.exceptions  |  1 +
 include/uapi/linux/videodev2.h  |  1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index 57f0066f4cff..0bffef6058f7 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -208,7 +208,16 @@ introduced in Linux 3.3. They are, however, mandatory for stateful mem2mem decoders.
been started yet, the driver will return an ``EPERM`` error code. When
the decoder is already running, this command does nothing. No
flags are defined for this command.
-
+* - ``V4L2_DEC_CMD_FLUSH``
+  - 4
+  - Flush any held capture buffers. Only valid for stateless decoders,
+and only if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` was set.
+   This command is typically used when the application reached the
+   end of the stream and the last output buffer had the
+   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+   dequeueing the last capture buffer containing the last decoded frame.
+   So this command can be used to explicitly flush that last decoded
+   frame.
 
 Return Value
 
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index adeb6b7a15cb..a79028e4d929 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 4fa9f543742d..91a79e16089c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP    (1)
 #define V4L2_DEC_CMD_PAUSE   (2)
 #define V4L2_DEC_CMD_RESUME  (3)
+#define V4L2_DEC_CMD_FLUSH   (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
-- 
2.20.1



Re: [PATCHv2 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-15 Thread Hans Verkuil
On 8/14/19 2:53 PM, Paul Kocialkowski wrote:
> Hi,
> 
> On Mon 12 Aug 19, 13:05, Hans Verkuil wrote:
>> From: Maxime Jourdan 
>>
>> Add an enum_fmt format flag to specifically tag coded formats where
>> dynamic resolution switching is supported by the device.
>>
>> This is useful for some codec drivers that can support dynamic
>> resolution switching for one or more of their listed coded formats. It
>> allows userspace to know whether it should extract the video parameters
>> itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
>> when such changes are detected.
> 
> Makes sense and looks good to me:
> Reviewed-by: Paul Kocialkowski 
> 
> The docs aren't saying that this only applies to stateful decoders, but I 
> think
> it is quite clear that this can't apply to stateless decoders.
> 
> Cheers,
> 
> Paul
> 
>> Signed-off-by: Maxime Jourdan 
>> Signed-off-by: Hans Verkuil 
>> [hverkuil-ci...@xs4all.nl: added flag to videodev2.h.rst.exceptions]
>> [hverkuil-ci...@xs4all.nl: updated commit text: 'one or more' instead of 
>> 'all']
>> Acked-by: Tomasz Figa 
>> ---
>>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>>  include/uapi/linux/videodev2.h   | 1 +
>>  3 files changed, 10 insertions(+)
>>
>> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
>> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>> index ebc05ce74bdf..719f1ed64f7d 100644
>> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>> @@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
>>  between frames/fields. This flag can only be used in combination with
>>  the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
>>  formats only. This flag is valid for stateful decoders only.
>> +* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
>> +  - 0x0008
>> +  - Dynamic resolution switching is supported by the device for this
>> +compressed bytestream format (aka coded format). It will notify the user
>> +via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
>> +parameters are detected. This flag can only be used in combination
>> +with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
>> +compressed formats only.

I added "It also only applies to stateful codecs." at the end of this
paragraph.

Regards,

Hans

>>  
>>  
>>  Return Value
>> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
>> b/Documentation/media/videodev2.h.rst.exceptions
>> index a0640b6d0f68..adeb6b7a15cb 100644
>> --- a/Documentation/media/videodev2.h.rst.exceptions
>> +++ b/Documentation/media/videodev2.h.rst.exceptions
>> @@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
>> reserved-formats
>>  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
>>  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
>>  replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
>> +replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
>>  
>>  # V4L2 timecode types
>>  replace define V4L2_TC_TYPE_24FPS timecode-type
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 67077d52c59d..530638dffd93 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
>>  #define V4L2_FMT_FLAG_COMPRESSED            0x0001
>>  #define V4L2_FMT_FLAG_EMULATED              0x0002
>>  #define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM 0x0004
>> +#define V4L2_FMT_FLAG_DYN_RESOLUTION        0x0008
>>  
>>  /* Frame Size and frame rate enumeration */
>>  /*
>> -- 
>> 2.20.1
>>
> 



Re: [PATCHv2 10/12] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-08-15 Thread Hans Verkuil
On 8/15/19 10:12 AM, Alexandre Courbot wrote:
> On Mon, Aug 12, 2019 at 8:07 PM Hans Verkuil  wrote:
>>
>> Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.
>>
>> Signed-off-by: Hans Verkuil 
>> ---
>>  Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 11 ++-
>>  Documentation/media/videodev2.h.rst.exceptions  |  1 +
>>  include/uapi/linux/videodev2.h  |  1 +
>>  3 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
>> b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
>> index 57f0066f4cff..0bffef6058f7 100644
>> --- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
>> +++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
>> @@ -208,7 +208,16 @@ introduced in Linux 3.3. They are, however, mandatory 
>> for stateful mem2mem decod
>> been started yet, the driver will return an ``EPERM`` error code. 
>> When
>> the decoder is already running, this command does nothing. No
>> flags are defined for this command.
>> -
>> +* - ``V4L2_DEC_CMD_FLUSH``
>> +  - 4
>> +  - Flush any held capture buffers. Only valid for stateless decoders,
>> +and only if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` was set.
>> +   This command is typically used when the application reached the
>> +   end of the stream and the last output buffer had the
>> +   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
>> +   dequeueing the last capture buffer containing the last decoded frame.
>> +   So this command can be used to explicitly flush that last decoded
>> +   frame.
> 
> I'm a bit confused here, isn't this command referred to as
> V4L2_DEC_CMD_STOP in the previous patch?

That was a typo in the previous patch. It really is FLUSH.

Regards,

Hans

> 
> 
>>
>>  Return Value
>>  
>> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
>> b/Documentation/media/videodev2.h.rst.exceptions
>> index adeb6b7a15cb..a79028e4d929 100644
>> --- a/Documentation/media/videodev2.h.rst.exceptions
>> +++ b/Documentation/media/videodev2.h.rst.exceptions
>> @@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
>>  replace define V4L2_DEC_CMD_STOP decoder-cmds
>>  replace define V4L2_DEC_CMD_PAUSE decoder-cmds
>>  replace define V4L2_DEC_CMD_RESUME decoder-cmds
>> +replace define V4L2_DEC_CMD_FLUSH decoder-cmds
>>
>>  replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
>>  replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 4fa9f543742d..91a79e16089c 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
>>  #define V4L2_DEC_CMD_STOP        (1)
>>  #define V4L2_DEC_CMD_PAUSE       (2)
>>  #define V4L2_DEC_CMD_RESUME      (3)
>> +#define V4L2_DEC_CMD_FLUSH       (4)
>>
>>  /* Flags for V4L2_DEC_CMD_START */
>>  #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
>> --
>> 2.20.1
>>



Re: [PATCHv2 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-15 Thread Tomasz Figa
On Thu, Aug 15, 2019 at 5:12 PM Alexandre Courbot  wrote:
>
> On Wed, Aug 14, 2019 at 9:53 PM Paul Kocialkowski
>  wrote:
> >
> > Hi,
> >
> > On Mon 12 Aug 19, 13:05, Hans Verkuil wrote:
> > > From: Maxime Jourdan 
> > >
> > > Add an enum_fmt format flag to specifically tag coded formats where
> > > dynamic resolution switching is supported by the device.
> > >
> > > This is useful for some codec drivers that can support dynamic
> > > resolution switching for one or more of their listed coded formats. It
> > > allows userspace to know whether it should extract the video parameters
> > > itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
> > > when such changes are detected.
> >
> > Makes sense and looks good to me:
> > Reviewed-by: Paul Kocialkowski 
> >
> > The docs aren't saying that this only applies to stateful decoders, but I 
> > think
> > it is quite clear that this can't apply to stateless decoders.
>
> Even though this can be inferred from reading the specs, I think it
> would be reasonable to explicitly mention it.
>
> I also wonder, since this flag does not make sense for encoders, maybe
> we can use more precise vocabulary in the patch description and doc?
> I.e. s/codec/decoder.
>
> With that,
> Reviewed-by: Alexandre Courbot 

There is no reason why it couldn't apply to an encoder. I think the
idea is to actually have the encoder advertise the same flag once we
figure out how to implement encoding with resolution changes.

Best regards,
Tomasz


Re: [PATCHv2 01/12] videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM

2019-08-15 Thread Alexandre Courbot
On Thu, Aug 15, 2019 at 5:11 PM Alexandre Courbot  wrote:
>
> On Mon, Aug 12, 2019 at 8:05 PM Hans Verkuil  wrote:
> >
> > Add an enum_fmt format flag to specifically tag coded formats where
> > full bytestream parsing is supported by the device.
> >
> > Some stateful decoders are capable of fully parsing a bytestream,
> > but others require that userspace pre-parses the bytestream into
> > frames or fields (see the corresponding pixelformat descriptions
> > for details).
>
> Reviewed-by: Alexandre Courbot 
>
> This patch does not update the pixelformat descriptions though, are we
> planning on doing this?

I pressed Send too fast; patch 8 takes care of this. Sorry for the noise.

>
>
> >
> > If this flag is set, then this pre-parsing step is not required
> > (but still possible, of course).
> >
> > Signed-off-by: Hans Verkuil 
> > ---
> >  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
> >  Documentation/media/videodev2.h.rst.exceptions   | 1 +
> >  include/uapi/linux/videodev2.h   | 5 +++--
> >  3 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> > b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > index 822d6730e7d2..ebc05ce74bdf 100644
> > --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
> >- This format is not native to the device but emulated through
> > software (usually libv4l2), where possible try to use a native
> > format instead for better performance.
> > +* - ``V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM``
> > +  - 0x0004
> > +  - The hardware decoder for this compressed bytestream format (aka 
> > coded
> > +   format) is capable of parsing a continuous bytestream. Applications 
> > do
> > +   not need to parse the bytestream themselves to find the boundaries
> > +   between frames/fields. This flag can only be used in combination 
> > with
> > +   the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to 
> > compressed
> > +   formats only. This flag is valid for stateful decoders only.
> >
> >
> >  Return Value
> > diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> > b/Documentation/media/videodev2.h.rst.exceptions
> > index 8e7d3492d248..a0640b6d0f68 100644
> > --- a/Documentation/media/videodev2.h.rst.exceptions
> > +++ b/Documentation/media/videodev2.h.rst.exceptions
> > @@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
> > reserved-formats
> >  # V4L2 format flags
> >  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
> >  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
> > +replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
> >
> >  # V4L2 timecode types
> >  replace define V4L2_TC_TYPE_24FPS timecode-type
> > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > index 2427bc4d8eba..67077d52c59d 100644
> > --- a/include/uapi/linux/videodev2.h
> > +++ b/include/uapi/linux/videodev2.h
> > @@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
> > __u32   reserved[4];
> >  };
> >
> > -#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> > -#define V4L2_FMT_FLAG_EMULATED   0x0002
> > +#define V4L2_FMT_FLAG_COMPRESSED   0x0001
> > +#define V4L2_FMT_FLAG_EMULATED 0x0002
> > +#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM 0x0004
> >
> > /* Frame Size and frame rate enumeration */
> >  /*
> > --
> > 2.20.1
> >


Re: [PATCHv2 10/12] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-08-15 Thread Alexandre Courbot
On Mon, Aug 12, 2019 at 8:07 PM Hans Verkuil  wrote:
>
> Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.
>
> Signed-off-by: Hans Verkuil 
> ---
>  Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 11 ++-
>  Documentation/media/videodev2.h.rst.exceptions  |  1 +
>  include/uapi/linux/videodev2.h  |  1 +
>  3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
> b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
> index 57f0066f4cff..0bffef6058f7 100644
> --- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
> @@ -208,7 +208,16 @@ introduced in Linux 3.3. They are, however, mandatory 
> for stateful mem2mem decod
> been started yet, the driver will return an ``EPERM`` error code. When
> the decoder is already running, this command does nothing. No
> flags are defined for this command.
> -
> +* - ``V4L2_DEC_CMD_FLUSH``
> +  - 4
> +  - Flush any held capture buffers. Only valid for stateless decoders,
> +and only if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` was set.
> +   This command is typically used when the application reached the
> +   end of the stream and the last output buffer had the
> +   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
> +   dequeueing the last capture buffer containing the last decoded frame.
> +   So this command can be used to explicitly flush that last decoded
> +   frame.

I'm a bit confused here, isn't this command referred to as
V4L2_DEC_CMD_STOP in the previous patch?


>
>  Return Value
>  
> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> b/Documentation/media/videodev2.h.rst.exceptions
> index adeb6b7a15cb..a79028e4d929 100644
> --- a/Documentation/media/videodev2.h.rst.exceptions
> +++ b/Documentation/media/videodev2.h.rst.exceptions
> @@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
>  replace define V4L2_DEC_CMD_STOP decoder-cmds
>  replace define V4L2_DEC_CMD_PAUSE decoder-cmds
>  replace define V4L2_DEC_CMD_RESUME decoder-cmds
> +replace define V4L2_DEC_CMD_FLUSH decoder-cmds
>
>  replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
>  replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 4fa9f543742d..91a79e16089c 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
>  #define V4L2_DEC_CMD_STOP        (1)
>  #define V4L2_DEC_CMD_PAUSE   (2)
>  #define V4L2_DEC_CMD_RESUME  (3)
> +#define V4L2_DEC_CMD_FLUSH   (4)
>
>  /* Flags for V4L2_DEC_CMD_START */
>  #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
> --
> 2.20.1
>


Re: [PATCHv2 01/12] videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM

2019-08-15 Thread Alexandre Courbot
On Mon, Aug 12, 2019 at 8:05 PM Hans Verkuil  wrote:
>
> Add an enum_fmt format flag to specifically tag coded formats where
> full bytestream parsing is supported by the device.
>
> Some stateful decoders are capable of fully parsing a bytestream,
> but others require that userspace pre-parses the bytestream into
> frames or fields (see the corresponding pixelformat descriptions
> for details).

Reviewed-by: Alexandre Courbot 

This patch does not update the pixelformat descriptions though, are we
planning on doing this?


>
> If this flag is set, then this pre-parsing step is not required
> (but still possible, of course).
>
> Signed-off-by: Hans Verkuil 
> ---
>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>  include/uapi/linux/videodev2.h   | 5 +++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> index 822d6730e7d2..ebc05ce74bdf 100644
> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
>- This format is not native to the device but emulated through
> software (usually libv4l2), where possible try to use a native
> format instead for better performance.
> +* - ``V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM``
> +  - 0x0004
> +  - The hardware decoder for this compressed bytestream format (aka coded
> +   format) is capable of parsing a continuous bytestream. Applications do
> +   not need to parse the bytestream themselves to find the boundaries
> +   between frames/fields. This flag can only be used in combination with
> +   the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to 
> compressed
> +   formats only. This flag is valid for stateful decoders only.
>
>
>  Return Value
> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> b/Documentation/media/videodev2.h.rst.exceptions
> index 8e7d3492d248..a0640b6d0f68 100644
> --- a/Documentation/media/videodev2.h.rst.exceptions
> +++ b/Documentation/media/videodev2.h.rst.exceptions
> @@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
> reserved-formats
>  # V4L2 format flags
>  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
>  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
> +replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
>
>  # V4L2 timecode types
>  replace define V4L2_TC_TYPE_24FPS timecode-type
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 2427bc4d8eba..67077d52c59d 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
> __u32   reserved[4];
>  };
>
> -#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> -#define V4L2_FMT_FLAG_EMULATED   0x0002
> +#define V4L2_FMT_FLAG_COMPRESSED   0x0001
> +#define V4L2_FMT_FLAG_EMULATED 0x0002
> > +#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM 0x0004
>
> /* Frame Size and frame rate enumeration */
>  /*
> --
> 2.20.1
>


Re: [PATCHv2 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-15 Thread Alexandre Courbot
On Wed, Aug 14, 2019 at 9:53 PM Paul Kocialkowski
 wrote:
>
> Hi,
>
> On Mon 12 Aug 19, 13:05, Hans Verkuil wrote:
> > From: Maxime Jourdan 
> >
> > Add an enum_fmt format flag to specifically tag coded formats where
> > dynamic resolution switching is supported by the device.
> >
> > This is useful for some codec drivers that can support dynamic
> > resolution switching for one or more of their listed coded formats. It
> > allows userspace to know whether it should extract the video parameters
> > itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
> > when such changes are detected.
>
> Makes sense and looks good to me:
> Reviewed-by: Paul Kocialkowski 
>
> The docs aren't saying that this only applies to stateful decoders, but I 
> think
> it is quite clear that this can't apply to stateless decoders.

Even though this can be inferred from reading the specs, I think it
would be reasonable to explicitly mention it.

I also wonder, since this flag does not make sense for encoders, maybe
we can use more precise vocabulary in the patch description and doc?
I.e. s/codec/decoder.

With that,
Reviewed-by: Alexandre Courbot 


Re: [PATCHv2 01/12] videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM

2019-08-14 Thread Paul Kocialkowski
Hi,

On Mon 12 Aug 19, 13:05, Hans Verkuil wrote:
> Add an enum_fmt format flag to specifically tag coded formats where
> full bytestream parsing is supported by the device.
> 
> Some stateful decoders are capable of fully parsing a bytestream,
> but others require that userspace pre-parses the bytestream into
> frames or fields (see the corresponding pixelformat descriptions
> for details).
> 
> If this flag is set, then this pre-parsing step is not required
> (but still possible, of course).

Although I wasn't involved with the initial issue, looks good to me!

Reviewed-by: Paul Kocialkowski 

Cheers,

Paul

> Signed-off-by: Hans Verkuil 
> ---
>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>  include/uapi/linux/videodev2.h   | 5 +++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> index 822d6730e7d2..ebc05ce74bdf 100644
> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
>- This format is not native to the device but emulated through
>   software (usually libv4l2), where possible try to use a native
>   format instead for better performance.
> +* - ``V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM``
> +  - 0x0004
> +  - The hardware decoder for this compressed bytestream format (aka coded
> + format) is capable of parsing a continuous bytestream. Applications do
> + not need to parse the bytestream themselves to find the boundaries
> + between frames/fields. This flag can only be used in combination with
> + the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> + formats only. This flag is valid for stateful decoders only.
>  
>  
>  Return Value
> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> b/Documentation/media/videodev2.h.rst.exceptions
> index 8e7d3492d248..a0640b6d0f68 100644
> --- a/Documentation/media/videodev2.h.rst.exceptions
> +++ b/Documentation/media/videodev2.h.rst.exceptions
> @@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
> reserved-formats
>  # V4L2 format flags
>  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
>  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
> +replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
>  
>  # V4L2 timecode types
>  replace define V4L2_TC_TYPE_24FPS timecode-type
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 2427bc4d8eba..67077d52c59d 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
>   __u32   reserved[4];
>  };
>  
> -#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> -#define V4L2_FMT_FLAG_EMULATED   0x0002
> +#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> +#define V4L2_FMT_FLAG_EMULATED   0x0002
> +#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM  0x0004
>  
>   /* Frame Size and frame rate enumeration */
>  /*
> -- 
> 2.20.1
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com




Re: [PATCHv2 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-14 Thread Paul Kocialkowski
Hi,

On Mon 12 Aug 19, 13:05, Hans Verkuil wrote:
> From: Maxime Jourdan 
> 
> Add an enum_fmt format flag to specifically tag coded formats where
> dynamic resolution switching is supported by the device.
> 
> This is useful for some codec drivers that can support dynamic
> resolution switching for one or more of their listed coded formats. It
> allows userspace to know whether it should extract the video parameters
> itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
> when such changes are detected.

Makes sense and looks good to me:
Reviewed-by: Paul Kocialkowski 

The docs aren't saying that this only applies to stateful decoders, but I think
it is quite clear that this can't apply to stateless decoders.

Cheers,

Paul

> Signed-off-by: Maxime Jourdan 
> Signed-off-by: Hans Verkuil 
> [hverkuil-ci...@xs4all.nl: added flag to videodev2.h.rst.exceptions]
> [hverkuil-ci...@xs4all.nl: updated commit text: 'one or more' instead of 
> 'all']
> Acked-by: Tomasz Figa 
> ---
>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>  include/uapi/linux/videodev2.h   | 1 +
>  3 files changed, 10 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> index ebc05ce74bdf..719f1ed64f7d 100644
> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> @@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
>   between frames/fields. This flag can only be used in combination with
>   the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
>   formats only. This flag is valid for stateful decoders only.
> +* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
> +  - 0x0008
> +  - Dynamic resolution switching is supported by the device for this
> + compressed bytestream format (aka coded format). It will notify the user
> + via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
> + parameters are detected. This flag can only be used in combination
> + with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
> + compressed formats only.
>  
>  
>  Return Value
> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> b/Documentation/media/videodev2.h.rst.exceptions
> index a0640b6d0f68..adeb6b7a15cb 100644
> --- a/Documentation/media/videodev2.h.rst.exceptions
> +++ b/Documentation/media/videodev2.h.rst.exceptions
> @@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
> reserved-formats
>  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
>  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
>  replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
> +replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
>  
>  # V4L2 timecode types
>  replace define V4L2_TC_TYPE_24FPS timecode-type
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 67077d52c59d..530638dffd93 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
>  #define V4L2_FMT_FLAG_COMPRESSED 0x0001
>  #define V4L2_FMT_FLAG_EMULATED   0x0002
>  #define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM  0x0004
> +#define V4L2_FMT_FLAG_DYN_RESOLUTION 0x0008
>  
>   /* Frame Size and frame rate enumeration */
>  /*
> -- 
> 2.20.1
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com




[PATCHv2 10/12] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-08-12 Thread Hans Verkuil
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 11 ++-
 Documentation/media/videodev2.h.rst.exceptions  |  1 +
 include/uapi/linux/videodev2.h  |  1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index 57f0066f4cff..0bffef6058f7 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -208,7 +208,16 @@ introduced in Linux 3.3. They are, however, mandatory for 
stateful mem2mem decod
been started yet, the driver will return an ``EPERM`` error code. When
the decoder is already running, this command does nothing. No
flags are defined for this command.
-
+* - ``V4L2_DEC_CMD_FLUSH``
+  - 4
+  - Flush any held capture buffers. Only valid for stateless decoders,
+and only if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` was set.
+   This command is typically used when the application reached the
+   end of the stream and the last output buffer had the
+   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+   dequeueing the last capture buffer containing the last decoded frame.
+   So this command can be used to explicitly flush that last decoded
+   frame.
 
 Return Value
 
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index adeb6b7a15cb..a79028e4d929 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 4fa9f543742d..91a79e16089c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP        (1)
 #define V4L2_DEC_CMD_PAUSE   (2)
 #define V4L2_DEC_CMD_RESUME  (3)
+#define V4L2_DEC_CMD_FLUSH   (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
-- 
2.20.1



[PATCHv2 01/12] videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM

2019-08-12 Thread Hans Verkuil
Add an enum_fmt format flag to specifically tag coded formats where
full bytestream parsing is supported by the device.

Some stateful decoders are capable of fully parsing a bytestream,
but others require that userspace pre-parses the bytestream into
frames or fields (see the corresponding pixelformat descriptions
for details).

If this flag is set, then this pre-parsing step is not required
(but still possible, of course).

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 5 +++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 822d6730e7d2..ebc05ce74bdf 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
   - This format is not native to the device but emulated through
software (usually libv4l2), where possible try to use a native
format instead for better performance.
+* - ``V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM``
+  - 0x0004
+  - The hardware decoder for this compressed bytestream format (aka coded
+   format) is capable of parsing a continuous bytestream. Applications do
+   not need to parse the bytestream themselves to find the boundaries
+   between frames/fields. This flag can only be used in combination with
+   the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
+   formats only. This flag is valid for stateful decoders only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index 8e7d3492d248..a0640b6d0f68 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
reserved-formats
 # V4L2 format flags
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
+replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
 
 # V4L2 timecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 2427bc4d8eba..67077d52c59d 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
__u32   reserved[4];
 };
 
-#define V4L2_FMT_FLAG_COMPRESSED 0x0001
-#define V4L2_FMT_FLAG_EMULATED   0x0002
+#define V4L2_FMT_FLAG_COMPRESSED   0x0001
+#define V4L2_FMT_FLAG_EMULATED 0x0002
+#define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM 0x0004
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1



[PATCHv2 02/12] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-08-12 Thread Hans Verkuil
From: Maxime Jourdan 

Add an enum_fmt format flag to specifically tag coded formats where
dynamic resolution switching is supported by the device.

This is useful for some codec drivers that can support dynamic
resolution switching for one or more of their listed coded formats. It
allows userspace to know whether it should extract the video parameters
itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
when such changes are detected.

Signed-off-by: Maxime Jourdan 
Signed-off-by: Hans Verkuil 
[hverkuil-ci...@xs4all.nl: added flag to videodev2.h.rst.exceptions]
[hverkuil-ci...@xs4all.nl: updated commit text: 'one or more' instead of 'all']
Acked-by: Tomasz Figa 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index ebc05ce74bdf..719f1ed64f7d 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
between frames/fields. This flag can only be used in combination with
the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
formats only. This flag is valid for stateful decoders only.
+* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
+  - 0x0008
+  - Dynamic resolution switching is supported by the device for this
+   compressed bytestream format (aka coded format). It will notify the user
+   via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
+   parameters are detected. This flag can only be used in combination
+   with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
+   compressed formats only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index a0640b6d0f68..adeb6b7a15cb 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
reserved-formats
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
 replace define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM fmtdesc-flags
+replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
 
 # V4L2 timecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 67077d52c59d..530638dffd93 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
 #define V4L2_FMT_FLAG_COMPRESSED   0x0001
 #define V4L2_FMT_FLAG_EMULATED 0x0002
 #define V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM	0x0004
+#define V4L2_FMT_FLAG_DYN_RESOLUTION   0x0008
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1
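As a sketch of what this flag means for an application: after ``VIDIOC_ENUM_FMT`` returns the ``flags`` field of ``struct v4l2_fmtdesc``, userspace can decide whether it has to extract the video parameters itself or can rely on ``V4L2_EVENT_SOURCE_CHANGE``. The fragment below is a minimal illustration of that decision only; the flag values are copied locally from the patch so it builds without kernel headers, and the ioctl call itself is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in videodev2.h by this patch; declared locally
 * so the sketch compiles without the kernel headers. */
#ifndef V4L2_FMT_FLAG_COMPRESSED
#define V4L2_FMT_FLAG_COMPRESSED     0x0001
#endif
#ifndef V4L2_FMT_FLAG_DYN_RESOLUTION
#define V4L2_FMT_FLAG_DYN_RESOLUTION 0x0008
#endif

/* True when userspace must parse the bitstream itself to detect
 * resolution changes: the format is a coded (compressed) format but the
 * device does not advertise dynamic resolution switching. */
static bool must_parse_resolution(uint32_t fmt_flags)
{
	if (!(fmt_flags & V4L2_FMT_FLAG_COMPRESSED))
		return false; /* raw format: resolution is fixed by S_FMT */
	return !(fmt_flags & V4L2_FMT_FLAG_DYN_RESOLUTION);
}
```

In a real application the flags would come from a ``VIDIOC_ENUM_FMT`` loop, and when ``V4L2_FMT_FLAG_DYN_RESOLUTION`` is set the application would subscribe to ``V4L2_EVENT_SOURCE_CHANGE`` via ``VIDIOC_SUBSCRIBE_EVENT`` instead of parsing the stream.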



Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-08-01 Thread Nicolas Dufresne
Le mardi 30 juillet 2019 à 09:21 +0200, Hans Verkuil a écrit :
> On 7/29/19 3:18 PM, Tomasz Figa wrote:
> > On Mon, Jul 29, 2019 at 10:12 PM Paul Kocialkowski
> >  wrote:
> > > Hi,
> > > 
> > > On Sun 28 Jul 19, 23:05, Tomasz Figa wrote:
> > > > On Sat, Jul 27, 2019 at 6:37 PM Paul Kocialkowski
> > > >  wrote:
> > > > > Hi,
> > > > > 
> > > > > On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
> > > > > > Add an enum_fmt format flag to specifically tag coded formats where
> > > > > > full bitstream parsing is supported by the device.
> > > > > > 
> > > > > > Some stateful decoders are capable of fully parsing a bitstream,
> > > > > > but others require that userspace pre-parses the bitstream into
> > > > > > frames or fields (see the corresponding pixelformat descriptions
> > > > > > for details).
> > > > > > 
> > > > > > If this flag is set, then this pre-parsing step is not required
> > > > > > (but still possible, of course).
> > > > > > 
> > > > > > Signed-off-by: Hans Verkuil 
> > > > > > ---
> > > > > >  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
> > > > > >  Documentation/media/videodev2.h.rst.exceptions   | 1 +
> > > > > >  include/uapi/linux/videodev2.h   | 5 +++--
> > > > > >  3 files changed, 12 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> > > > > > b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > > > index 822d6730e7d2..4e24e671f32e 100644
> > > > > > --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > > > +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > > > @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
> > > > > >- This format is not native to the device but emulated through
> > > > > >   software (usually libv4l2), where possible try to use a native
> > > > > >   format instead for better performance.
> > > > > > +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
> > > > > > +  - 0x0004
> > > > > > +  - The hardware decoder for this compressed bitstream format (aka coded
> > > > > > + format) is capable of parsing the bitstream. Applications do not
> > > > > > + need to parse the bitstream themselves to find the boundaries between
> > > > > > + frames/fields. This flag can only be used in combination with the
> > > > > > + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> > > > > > + formats only.
> > > > > 
> > > > > Should this flag be set for stateless codecs as well? It seems a bit over-kill
> > > > > for this case. I am not sure whether "compressed bitstream format" clearly only
> > > > > covers the formats used by stateful decoders and not the ones for stateless
> > > > > decoders.
> > > > 
> > > > I'd suggest using a different name for the flag, because "bitstream
> > > > parser" is actually one of the core differences between stateful and
> > > > stateless. All stateful decoders have bitstream parsers, the only
> > > > difference between the implementations is the unit on which the parser
> > > > operates, i.e. full stream, frame, NALU.
> > > > 
> > > > Perhaps V4L2_FMT_FLAG_CONTINUOUS_BITSTREAM (as opposed to discrete,
> > > > framed/sliced chunks)?
> > > 
> > > Sure, that seems like a more explicit name regarding what it's supposed to
> > > describe in my opinion.
> 
> I like that name. And this flag is valid for stateful decoders only.

Sorry, I'm not against the name change, but it should be
V4L2_FMT_FLAG_HAS_BYTESTREAM_PARSER (BYTE). Parsers don't support
random bit alignment, so I think usage of bitstream would be
misleading. This is playing on words of course, H264 is a bitstream, but
what is passed to the driver is byte aligned.
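The byte-alignment point is easy to illustrate: an Annex-B style parser scans for the three-byte start code ``00 00 01`` at byte granularity, never at arbitrary bit offsets. The sketch below is a generic illustration of that scanning, not V4L2 API code.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the byte offset of the first 00 00 01 start code, or -1 if none.
 * The search advances one whole byte at a time: bytestream parsers never
 * need random bit alignment, which is the argument for "BYTESTREAM". */
static long find_start_code(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i + 2 < len; i++)
		if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
			return (long)i;
	return -1;
}
```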

Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-30 Thread Hans Verkuil
On 7/29/19 3:18 PM, Tomasz Figa wrote:
> On Mon, Jul 29, 2019 at 10:12 PM Paul Kocialkowski
>  wrote:
>>
>> Hi,
>>
>> On Sun 28 Jul 19, 23:05, Tomasz Figa wrote:
>>> On Sat, Jul 27, 2019 at 6:37 PM Paul Kocialkowski
>>>  wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
>>>>> Add an enum_fmt format flag to specifically tag coded formats where
>>>>> full bitstream parsing is supported by the device.
>>>>>
>>>>> Some stateful decoders are capable of fully parsing a bitstream,
>>>>> but others require that userspace pre-parses the bitstream into
>>>>> frames or fields (see the corresponding pixelformat descriptions
>>>>> for details).
>>>>>
>>>>> If this flag is set, then this pre-parsing step is not required
>>>>> (but still possible, of course).
>>>>>
>>>>> Signed-off-by: Hans Verkuil 
>>>>> ---
>>>>>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>>>>>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>>>>>  include/uapi/linux/videodev2.h   | 5 +++--
>>>>>  3 files changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
>>>>> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>>>>> index 822d6730e7d2..4e24e671f32e 100644
>>>>> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>>>>> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
>>>>> @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
>>>>>- This format is not native to the device but emulated through
>>>>>   software (usually libv4l2), where possible try to use a native
>>>>>   format instead for better performance.
>>>>> +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
>>>>> +  - 0x0004
>>>>> +  - The hardware decoder for this compressed bitstream format (aka coded
>>>>> + format) is capable of parsing the bitstream. Applications do not
>>>>> + need to parse the bitstream themselves to find the boundaries between
>>>>> + frames/fields. This flag can only be used in combination with the
>>>>> + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
>>>>> + formats only.
>>>>
>>>> Should this flag be set for stateless codecs as well? It seems a bit over-kill
>>>> for this case. I am not sure whether "compressed bitstream format" clearly only
>>>> covers the formats used by stateful decoders and not the ones for stateless
>>>> decoders.
>>>
>>> I'd suggest using a different name for the flag, because "bitstream
>>> parser" is actually one of the core differences between stateful and
>>> stateless. All stateful decoders have bitstream parsers, the only
>>> difference between the implementations is the unit on which the parser
>>> operates, i.e. full stream, frame, NALU.
>>>
>>> Perhaps V4L2_FMT_FLAG_CONTINUOUS_BITSTREAM (as opposed to discrete,
>>> framed/sliced chunks)?
>>
>> Sure, that seems like a more explicit name regarding what it's supposed to
>> describe in my opinion.

I like that name. And this flag is valid for stateful decoders only.

>>
>>> Regardless of that, it doesn't make sense for a stateless decoder to
>>> set this flag anyway, because the userspace needs to parse the whole
>>> stream anyway and the whole stateless API is based on the assumption
>>> that the userspace splits the bitstream into frames (or slices).
>>
>> Indeed, I agree that it doesn't make sense, but I thought that the name of the
>> flag could be confusing. Since there is no direct equivalency between
>> "stateless" and "doesn't parse the bitstream" (as we've seen with the rockchip
>> decoder needing to parse the slice header on its own), that could have been
>> ambiguous. I think the name you're suggesting mostly solves this concern.
>>
>> I'm still a bit unsure about what "compressed formats" entails or not, so it
>> could be good to explicitly mention that this applies to stateful decoders only
>> (but it's just a suggestion, advanced users of the API will probably find it
>> straightforward).
> 
> My understanding is that a compressed format is any format that
> doesn't have a directly accessible 2D pixel matrix in its memory
> representation, so all the bitstream formats should have it set.

Correct.

Regards,

Hans

> 
> Best regards,
> Tomasz
> 



Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-29 Thread Tomasz Figa
On Mon, Jul 29, 2019 at 10:12 PM Paul Kocialkowski
 wrote:
>
> Hi,
>
> On Sun 28 Jul 19, 23:05, Tomasz Figa wrote:
> > On Sat, Jul 27, 2019 at 6:37 PM Paul Kocialkowski
> >  wrote:
> > >
> > > Hi,
> > >
> > > On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
> > > > Add an enum_fmt format flag to specifically tag coded formats where
> > > > full bitstream parsing is supported by the device.
> > > >
> > > > Some stateful decoders are capable of fully parsing a bitstream,
> > > > but others require that userspace pre-parses the bitstream into
> > > > frames or fields (see the corresponding pixelformat descriptions
> > > > for details).
> > > >
> > > > If this flag is set, then this pre-parsing step is not required
> > > > (but still possible, of course).
> > > >
> > > > Signed-off-by: Hans Verkuil 
> > > > ---
> > > >  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
> > > >  Documentation/media/videodev2.h.rst.exceptions   | 1 +
> > > >  include/uapi/linux/videodev2.h   | 5 +++--
> > > >  3 files changed, 12 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> > > > b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > index 822d6730e7d2..4e24e671f32e 100644
> > > > --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > > @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
> > > >- This format is not native to the device but emulated through
> > > >   software (usually libv4l2), where possible try to use a native
> > > >   format instead for better performance.
> > > > +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
> > > > +  - 0x0004
> > > > +  - The hardware decoder for this compressed bitstream format (aka coded
> > > > + format) is capable of parsing the bitstream. Applications do not
> > > > + need to parse the bitstream themselves to find the boundaries between
> > > > + frames/fields. This flag can only be used in combination with the
> > > > + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> > > > + formats only.
> > >
> > > Should this flag be set for stateless codecs as well? It seems a bit over-kill
> > > for this case. I am not sure whether "compressed bitstream format" clearly only
> > > covers the formats used by stateful decoders and not the ones for stateless
> > > decoders.
> >
> > I'd suggest using a different name for the flag, because "bitstream
> > parser" is actually one of the core differences between stateful and
> > stateless. All stateful decoders have bitstream parsers, the only
> > difference between the implementations is the unit on which the parser
> > operates, i.e. full stream, frame, NALU.
> >
> > Perhaps V4L2_FMT_FLAG_CONTINUOUS_BITSTREAM (as opposed to discrete,
> > framed/sliced chunks)?
>
> Sure, that seems like a more explicit name regarding what it's supposed to
> describe in my opinion.
>
> > Regardless of that, it doesn't make sense for a stateless decoder to
> > set this flag anyway, because the userspace needs to parse the whole
> > stream anyway and the whole stateless API is based on the assumption
> > that the userspace splits the bitstream into frames (or slices).
>
> Indeed, I agree that it doesn't make sense, but I thought that the name of the
> flag could be confusing. Since there is no direct equivalency between
> "stateless" and "doesn't parse the bitstream" (as we've seen with the rockchip
> decoder needing to parse the slice header on its own), that could have been
> ambiguous. I think the name you're suggesting mostly solves this concern.
>
> I'm still a bit unsure about what "compressed formats" entails or not, so it
> could be good to explicitly mention that this applies to stateful decoders only
> (but it's just a suggestion, advanced users of the API will probably find it
> straightforward).

My understanding is that a compressed format is any format that
doesn't have a directly accessible 2D pixel matrix in its memory
representation, so all the bitstream formats should have it set.

Best regards,
Tomasz


Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-29 Thread Paul Kocialkowski
Hi,

On Sun 28 Jul 19, 23:05, Tomasz Figa wrote:
> On Sat, Jul 27, 2019 at 6:37 PM Paul Kocialkowski
>  wrote:
> >
> > Hi,
> >
> > On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
> > > Add an enum_fmt format flag to specifically tag coded formats where
> > > full bitstream parsing is supported by the device.
> > >
> > > Some stateful decoders are capable of fully parsing a bitstream,
> > > but others require that userspace pre-parses the bitstream into
> > > frames or fields (see the corresponding pixelformat descriptions
> > > for details).
> > >
> > > If this flag is set, then this pre-parsing step is not required
> > > (but still possible, of course).
> > >
> > > Signed-off-by: Hans Verkuil 
> > > ---
> > >  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
> > >  Documentation/media/videodev2.h.rst.exceptions   | 1 +
> > >  include/uapi/linux/videodev2.h   | 5 +++--
> > >  3 files changed, 12 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> > > b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > index 822d6730e7d2..4e24e671f32e 100644
> > > --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > > @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
> > >- This format is not native to the device but emulated through
> > >   software (usually libv4l2), where possible try to use a native
> > >   format instead for better performance.
> > > +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
> > > +  - 0x0004
> > > +  - The hardware decoder for this compressed bitstream format (aka coded
> > > + format) is capable of parsing the bitstream. Applications do not
> > > + need to parse the bitstream themselves to find the boundaries between
> > > + frames/fields. This flag can only be used in combination with the
> > > + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> > > + formats only.
> >
> > Should this flag be set for stateless codecs as well? It seems a bit over-kill
> > for this case. I am not sure whether "compressed bitstream format" clearly only
> > covers the formats used by stateful decoders and not the ones for stateless
> > decoders.
> 
> I'd suggest using a different name for the flag, because "bitstream
> parser" is actually one of the core differences between stateful and
> stateless. All stateful decoders have bitstream parsers, the only
> difference between the implementations is the unit on which the parser
> operates, i.e. full stream, frame, NALU.
> 
> Perhaps V4L2_FMT_FLAG_CONTINUOUS_BITSTREAM (as opposed to discrete,
> framed/sliced chunks)?

Sure, that seems like a more explicit name regarding what it's supposed to
describe in my opinion.

> Regardless of that, it doesn't make sense for a stateless decoder to
> set this flag anyway, because the userspace needs to parse the whole
> stream anyway and the whole stateless API is based on the assumption
> that the userspace splits the bitstream into frames (or slices).

Indeed, I agree that it doesn't make sense, but I thought that the name of the
flag could be confusing. Since there is no direct equivalency between
"stateless" and "doesn't parse the bitstream" (as we've seen with the rockchip
decoder needing to parse the slice header on its own), that could have been
ambiguous. I think the name you're suggesting mostly solves this concern.

I'm still a bit unsure about what "compressed formats" entails or not, so it
could be good to explicitly mention that this applies to stateful decoders only
(but it's just a suggestion, advanced users of the API will probably find it
straightforward).

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-28 Thread Tomasz Figa
On Sat, Jul 27, 2019 at 6:37 PM Paul Kocialkowski
 wrote:
>
> Hi,
>
> On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
> > Add an enum_fmt format flag to specifically tag coded formats where
> > full bitstream parsing is supported by the device.
> >
> > Some stateful decoders are capable of fully parsing a bitstream,
> > but others require that userspace pre-parses the bitstream into
> > frames or fields (see the corresponding pixelformat descriptions
> > for details).
> >
> > If this flag is set, then this pre-parsing step is not required
> > (but still possible, of course).
> >
> > Signed-off-by: Hans Verkuil 
> > ---
> >  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 ++++
> >  Documentation/media/videodev2.h.rst.exceptions   | 1 +
> >  include/uapi/linux/videodev2.h   | 5 +++--
> >  3 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> > b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > index 822d6730e7d2..4e24e671f32e 100644
> > --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> > @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
> >- This format is not native to the device but emulated through
> >   software (usually libv4l2), where possible try to use a native
> >   format instead for better performance.
> > +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
> > +  - 0x0004
> > +  - The hardware decoder for this compressed bitstream format (aka coded
> > + format) is capable of parsing the bitstream. Applications do not
> > + need to parse the bitstream themselves to find the boundaries between
> > + frames/fields. This flag can only be used in combination with the
> > + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> > + formats only.
>
> Should this flag be set for stateless codecs as well? It seems a bit over-kill
> for this case. I am not sure whether "compressed bitstream format" clearly only
> covers the formats used by stateful decoders and not the ones for stateless
> decoders.

I'd suggest using a different name for the flag, because "bitstream
parser" is actually one of the core differences between stateful and
stateless. All stateful decoders have bitstream parsers, the only
difference between the implementations is the unit on which the parser
operates, i.e. full stream, frame, NALU.

Perhaps V4L2_FMT_FLAG_CONTINUOUS_BITSTREAM (as opposed to discrete,
framed/sliced chunks)?

Regardless of that, it doesn't make sense for a stateless decoder to
set this flag anyway, because the userspace needs to parse the whole
stream anyway and the whole stateless API is based on the assumption
that the userspace splits the bitstream into frames (or slices).

Best regards,
Tomasz


Re: [PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-27 Thread Paul Kocialkowski
Hi,

On Wed 24 Jul 19, 13:05, Hans Verkuil wrote:
> Add an enum_fmt format flag to specifically tag coded formats where
> full bitstream parsing is supported by the device.
> 
> Some stateful decoders are capable of fully parsing a bitstream,
> but others require that userspace pre-parses the bitstream into
> frames or fields (see the corresponding pixelformat descriptions
> for details).
> 
> If this flag is set, then this pre-parsing step is not required
> (but still possible, of course).
> 
> Signed-off-by: Hans Verkuil 
> ---
>  Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
>  Documentation/media/videodev2.h.rst.exceptions   | 1 +
>  include/uapi/linux/videodev2.h   | 5 +++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
> b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> index 822d6730e7d2..4e24e671f32e 100644
> --- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
> @@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
>- This format is not native to the device but emulated through
>   software (usually libv4l2), where possible try to use a native
>   format instead for better performance.
> +* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
> +  - 0x0004
> +  - The hardware decoder for this compressed bitstream format (aka coded
> + format) is capable of parsing the bitstream. Applications do not
> + need to parse the bitstream themselves to find the boundaries between
> + frames/fields. This flag can only be used in combination with the
> + ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
> + formats only.

Should this flag be set for stateless codecs as well? It seems a bit over-kill
for this case. I am not sure whether "compressed bitstream format" clearly only
covers the formats used by stateful decoders and not the ones for stateless
decoders.

Cheers,

Paul

>  
>  Return Value
> diff --git a/Documentation/media/videodev2.h.rst.exceptions 
> b/Documentation/media/videodev2.h.rst.exceptions
> index 55cbe324b9fc..74fb9f00c12d 100644
> --- a/Documentation/media/videodev2.h.rst.exceptions
> +++ b/Documentation/media/videodev2.h.rst.exceptions
> @@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
> reserved-formats
>  # V4L2 format flags
>  replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
>  replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
> +replace define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER fmtdesc-flags
>  
>  # V4L2 tymecode types
>  replace define V4L2_TC_TYPE_24FPS timecode-type
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 2427bc4d8eba..8c5a28666b16 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
>   __u32   reserved[4];
>  };
>  
> -#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> -#define V4L2_FMT_FLAG_EMULATED   0x0002
> +#define V4L2_FMT_FLAG_COMPRESSED 0x0001
> +#define V4L2_FMT_FLAG_EMULATED   0x0002
> +#define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER   0x0004
>  
>   /* Frame Size and frame rate enumeration */
>  /*
> -- 
> 2.20.1
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


[PATCH 10/14] videodev2.h: add V4L2_DEC_CMD_FLUSH

2019-07-24 Thread Hans Verkuil
Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 11 ++-
 Documentation/media/videodev2.h.rst.exceptions  |  1 +
 include/uapi/linux/videodev2.h  |  1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst 
b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index ccf83b05afa7..1577850ce407 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -200,7 +200,16 @@ introduced in Linux 3.3.
been started yet, the driver will return an ``EPERM`` error code. When
the decoder is already running, this command does nothing. No
flags are defined for this command.
-
+* - ``V4L2_DEC_CMD_FLUSH``
+  - 4
+  - Flush any held capture buffers. Only valid for stateless decoders,
+and only if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` was set.
+   This command is typically used when the application reached the
+   end of the stream and the last output buffer had the
+   ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+   dequeueing the last capture buffer containing the last decoded frame.
+   So this command can be used to explicitly flush that last decoded
+   frame.
 
 Return Value
 
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index b6cb9fa6c8a8..a2cd94430638 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 0682fbc9980d..e22511cfc24a 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP	(1)
 #define V4L2_DEC_CMD_PAUSE   (2)
 #define V4L2_DEC_CMD_RESUME  (3)
+#define V4L2_DEC_CMD_FLUSH   (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO  (1 << 0)
-- 
2.20.1
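For illustration, an application would issue this command by filling a zeroed ``struct v4l2_decoder_cmd`` and passing it to ``ioctl(fd, VIDIOC_DECODER_CMD, ...)``. The sketch below uses a reduced local stand-in for that struct so the fragment builds without kernel headers (the real struct carries a per-command union after ``flags``); it only shows how the command is prepared.

```c
#include <stdint.h>
#include <string.h>

#define V4L2_DEC_CMD_FLUSH (4)

/* Reduced stand-in for struct v4l2_decoder_cmd from videodev2.h, so this
 * sketch compiles standalone; the real struct also has a union of
 * per-command data after the flags field. */
struct dec_cmd {
	uint32_t cmd;
	uint32_t flags;
};

/* Prepare a flush command. No flags are defined for V4L2_DEC_CMD_FLUSH,
 * so everything except cmd stays zero. */
static struct dec_cmd make_flush_cmd(void)
{
	struct dec_cmd dc;

	memset(&dc, 0, sizeof(dc));
	dc.cmd = V4L2_DEC_CMD_FLUSH;
	return dc;
}
```

The application would typically send this at end of stream, after the last output buffer was queued with ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` set, to release the held capture buffer containing the final decoded frame.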



[PATCH 03/14] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-07-24 Thread Hans Verkuil
From: Maxime Jourdan 

Add an enum_fmt format flag to specifically tag coded formats where
dynamic resolution switching is supported by the device.

This is useful for some codec drivers that can support dynamic
resolution switching for one or more of their listed coded formats. It
allows userspace to know whether it should extract the video parameters
itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
when such changes are detected.

Signed-off-by: Maxime Jourdan 
Signed-off-by: Hans Verkuil 
[hverkuil-ci...@xs4all.nl: added flag to videodev2.h.rst.exceptions]
[hverkuil-ci...@xs4all.nl: updated commit text: 'one or more' instead of 'all']
Acked-by: Tomasz Figa 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst 
b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 4e24e671f32e..05454780cb21 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
frames/fields. This flag can only be used in combination with the
``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
formats only.
+* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
+  - 0x0008
+  - Dynamic resolution switching is supported by the device for this
+   compressed bitstream format (aka coded format). It will notify the user
+   via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
+   parameters are detected. This flag can only be used in combination
+   with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
+   compressed formats only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions 
b/Documentation/media/videodev2.h.rst.exceptions
index 74fb9f00c12d..0a9a1b386443 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA 
reserved-formats
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
 replace define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER fmtdesc-flags
+replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
 
 # V4L2 tymecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 8c5a28666b16..ed572b05bd25 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
 #define V4L2_FMT_FLAG_COMPRESSED   0x0001
 #define V4L2_FMT_FLAG_EMULATED 0x0002
 #define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER 0x0004
+#define V4L2_FMT_FLAG_DYN_RESOLUTION   0x0008
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1



[PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-24 Thread Hans Verkuil
Add an enum_fmt format flag to specifically tag coded formats where
full bitstream parsing is supported by the device.

Some stateful decoders are capable of fully parsing a bitstream,
but others require that userspace pre-parses the bitstream into
frames or fields (see the corresponding pixelformat descriptions
for details).

If this flag is set, then this pre-parsing step is not required
(but still possible, of course).

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 



[PATCH 02/14] videodev2.h: add V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER

2019-07-24 Thread Hans Verkuil
Add an enum_fmt format flag to specifically tag coded formats where
full bitstream parsing is supported by the device.

Some stateful decoders are capable of fully parsing a bitstream,
but others require that userspace pre-parses the bitstream into
frames or fields (see the corresponding pixelformat descriptions
for details).

If this flag is set, then this pre-parsing step is not required
(but still possible, of course).

Signed-off-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 ++++++++
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 5 +++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 822d6730e7d2..4e24e671f32e 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -127,6 +127,14 @@ one until ``EINVAL`` is returned.
   - This format is not native to the device but emulated through
software (usually libv4l2), where possible try to use a native
format instead for better performance.
+* - ``V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER``
+  - 0x0004
+  - The hardware decoder for this compressed bitstream format (aka coded
+   format) is capable of parsing the bitstream. Applications do not
+   need to parse the bitstream themselves to find the boundaries between
+   frames/fields. This flag can only be used in combination with the
+   ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
+   formats only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions b/Documentation/media/videodev2.h.rst.exceptions
index 55cbe324b9fc..74fb9f00c12d 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -180,6 +180,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA reserved-formats
 # V4L2 format flags
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
+replace define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER fmtdesc-flags
 
 # V4L2 tymecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 2427bc4d8eba..8c5a28666b16 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -774,8 +774,9 @@ struct v4l2_fmtdesc {
__u32   reserved[4];
 };
 
-#define V4L2_FMT_FLAG_COMPRESSED 0x0001
-#define V4L2_FMT_FLAG_EMULATED   0x0002
+#define V4L2_FMT_FLAG_COMPRESSED   0x0001
+#define V4L2_FMT_FLAG_EMULATED 0x0002
+#define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER 0x0004
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1



[PATCH 03/14] videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION

2019-07-24 Thread Hans Verkuil
From: Maxime Jourdan 

Add an enum_fmt format flag to specifically tag coded formats where
dynamic resolution switching is supported by the device.

This is useful for some codec drivers that can support dynamic
resolution switching for one or more of their listed coded formats. It
allows userspace to know whether it should extract the video parameters
itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE
when such changes are detected.

Signed-off-by: Maxime Jourdan 
Signed-off-by: Hans Verkuil 
[hverkuil-ci...@xs4all.nl: added flag to videodev2.h.rst.exceptions]
[hverkuil-ci...@xs4all.nl: updated commit text: 'one or more' instead of 'all']
Acked-by: Tomasz Figa 
---
 Documentation/media/uapi/v4l/vidioc-enum-fmt.rst | 8 ++++++++
 Documentation/media/videodev2.h.rst.exceptions   | 1 +
 include/uapi/linux/videodev2.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
index 4e24e671f32e..05454780cb21 100644
--- a/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
+++ b/Documentation/media/uapi/v4l/vidioc-enum-fmt.rst
@@ -135,6 +135,14 @@ one until ``EINVAL`` is returned.
frames/fields. This flag can only be used in combination with the
``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to compressed
formats only.
+* - ``V4L2_FMT_FLAG_DYN_RESOLUTION``
+  - 0x0008
+  - Dynamic resolution switching is supported by the device for this
+   compressed bitstream format (aka coded format). It will notify the user
+   via the event ``V4L2_EVENT_SOURCE_CHANGE`` when changes in the video
+   parameters are detected. This flag can only be used in combination
+   with the ``V4L2_FMT_FLAG_COMPRESSED`` flag, since this applies to
+   compressed formats only.
 
 
 Return Value
diff --git a/Documentation/media/videodev2.h.rst.exceptions b/Documentation/media/videodev2.h.rst.exceptions
index 74fb9f00c12d..0a9a1b386443 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -181,6 +181,7 @@ replace define V4L2_PIX_FMT_FLAG_PREMUL_ALPHA reserved-formats
 replace define V4L2_FMT_FLAG_COMPRESSED fmtdesc-flags
 replace define V4L2_FMT_FLAG_EMULATED fmtdesc-flags
 replace define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER fmtdesc-flags
+replace define V4L2_FMT_FLAG_DYN_RESOLUTION fmtdesc-flags
 
 # V4L2 tymecode types
 replace define V4L2_TC_TYPE_24FPS timecode-type
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 8c5a28666b16..ed572b05bd25 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -777,6 +777,7 @@ struct v4l2_fmtdesc {
 #define V4L2_FMT_FLAG_COMPRESSED   0x0001
 #define V4L2_FMT_FLAG_EMULATED 0x0002
 #define V4L2_FMT_FLAG_HAS_BITSTREAM_PARSER 0x0004
+#define V4L2_FMT_FLAG_DYN_RESOLUTION   0x0008
 
/* Frame Size and frame rate enumeration */
 /*
-- 
2.20.1



Re: [PATCH for 5.3] videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use

2019-07-11 Thread Kieran Bingham
Hi Hans,

On 11/07/2019 09:53, Hans Verkuil wrote:
> The V4L2_PIX_FMT_BGRA444 define clashed with the pre-existing
> V4L2_PIX_FMT_SGRBG12 which strangely enough used the same fourcc, even
> though that fourcc made no sense for a Bayer format. In any case, you
> can't have duplicates, so change the fourcc of V4L2_PIX_FMT_BGRA444.

Aha - I started looking at this this morning as well, and I see you've
beaten me to a patch.

That's a good thing anyway, as I was worried about what the actual code
should be instead, but hey I got to spend some time looking at how these
are all laid out.

There's not a lot of choice of letters left, with BA12 and RA12
consumed, so GA12 seems reasonable.

It's a shame the SRGB formats didn't choose an 'S' leading char or such.

> Signed-off-by: Hans Verkuil 
> Cc:   # for v5.2 and up

Reviewed-by: Kieran Bingham 


> ---
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 9d9705ceda76..2427bc4d8eba 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -518,7 +518,13 @@ struct v4l2_pix_format {
>  #define V4L2_PIX_FMT_RGBX444 v4l2_fourcc('R', 'X', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_ABGR444 v4l2_fourcc('A', 'B', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_XBGR444 v4l2_fourcc('X', 'B', '1', '2') /* 16 */
> -#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('B', 'A', '1', '2') /* 16 */
> +
> +/*
> + * Originally this had 'BA12' as fourcc, but this clashed with the older
> + * V4L2_PIX_FMT_SGRBG12 which inexplicably used that same fourcc.
> + * So use 'GA12' instead for V4L2_PIX_FMT_BGRA444.
> + */
> +#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('G', 'A', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_BGRX444 v4l2_fourcc('B', 'X', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_RGB555  v4l2_fourcc('R', 'G', 'B', 'O') /* 16  RGB-5-5-5 */
>  #define V4L2_PIX_FMT_ARGB555 v4l2_fourcc('A', 'R', '1', '5') /* 16  ARGB-1-5-5-5 */
> 



Re: [PATCH for 5.3] videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use

2019-07-11 Thread Laurent Pinchart
Hi Hans,

Thank you for the patch.

On Thu, Jul 11, 2019 at 10:53:25AM +0200, Hans Verkuil wrote:
> The V4L2_PIX_FMT_BGRA444 define clashed with the pre-existing
> V4L2_PIX_FMT_SGRBG12 which strangely enough used the same fourcc, even
> though that fourcc made no sense for a Bayer format. In any case, you
> can't have duplicates, so change the fourcc of V4L2_PIX_FMT_BGRA444.
> 
> Signed-off-by: Hans Verkuil 
> Cc:   # for v5.2 and up

Maybe a Fixes: line ?

Reviewed-by: Laurent Pinchart 

> ---
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 9d9705ceda76..2427bc4d8eba 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -518,7 +518,13 @@ struct v4l2_pix_format {
>  #define V4L2_PIX_FMT_RGBX444 v4l2_fourcc('R', 'X', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_ABGR444 v4l2_fourcc('A', 'B', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_XBGR444 v4l2_fourcc('X', 'B', '1', '2') /* 16 */
> -#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('B', 'A', '1', '2') /* 16 */
> +
> +/*
> + * Originally this had 'BA12' as fourcc, but this clashed with the older
> + * V4L2_PIX_FMT_SGRBG12 which inexplicably used that same fourcc.
> + * So use 'GA12' instead for V4L2_PIX_FMT_BGRA444.
> + */
> +#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('G', 'A', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_BGRX444 v4l2_fourcc('B', 'X', '1', '2') /* 16 */
>  #define V4L2_PIX_FMT_RGB555  v4l2_fourcc('R', 'G', 'B', 'O') /* 16  RGB-5-5-5 */
>  #define V4L2_PIX_FMT_ARGB555 v4l2_fourcc('A', 'R', '1', '5') /* 16  ARGB-1-5-5-5 */

-- 
Regards,

Laurent Pinchart


[PATCH for 5.3] videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use

2019-07-11 Thread Hans Verkuil
The V4L2_PIX_FMT_BGRA444 define clashed with the pre-existing
V4L2_PIX_FMT_SGRBG12 which strangely enough used the same fourcc, even
though that fourcc made no sense for a Bayer format. In any case, you
can't have duplicates, so change the fourcc of V4L2_PIX_FMT_BGRA444.

Signed-off-by: Hans Verkuil 
Cc:   # for v5.2 and up
---
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 9d9705ceda76..2427bc4d8eba 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -518,7 +518,13 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_RGBX444 v4l2_fourcc('R', 'X', '1', '2') /* 16 */
 #define V4L2_PIX_FMT_ABGR444 v4l2_fourcc('A', 'B', '1', '2') /* 16 */
 #define V4L2_PIX_FMT_XBGR444 v4l2_fourcc('X', 'B', '1', '2') /* 16 */
-#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('B', 'A', '1', '2') /* 16 */
+
+/*
+ * Originally this had 'BA12' as fourcc, but this clashed with the older
+ * V4L2_PIX_FMT_SGRBG12 which inexplicably used that same fourcc.
+ * So use 'GA12' instead for V4L2_PIX_FMT_BGRA444.
+ */
+#define V4L2_PIX_FMT_BGRA444 v4l2_fourcc('G', 'A', '1', '2') /* 16 */
 #define V4L2_PIX_FMT_BGRX444 v4l2_fourcc('B', 'X', '1', '2') /* 16 */
 #define V4L2_PIX_FMT_RGB555  v4l2_fourcc('R', 'G', 'B', 'O') /* 16  RGB-5-5-5 */
 #define V4L2_PIX_FMT_ARGB555 v4l2_fourcc('A', 'R', '1', '5') /* 16  ARGB-1-5-5-5 */


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-07 Thread Hans Verkuil
On 6/7/19 8:45 AM, Hans Verkuil wrote:
> On 6/7/19 8:11 AM, Tomasz Figa wrote:
>> On Wed, May 22, 2019 at 7:56 PM Hans Verkuil  
>> wrote:
 I share the same experience. Bitstream buffers are usually so small that
 you can always find a physically contiguous memory region for them and a
 memcpy() will be faster than the overhead of getting an IOMMU involved.
 This obviously depends on the specific hardware, but there's always some
 threshold before which mapping through an IOMMU just doesn't make sense
 from a fragmentation and/or performance point of view.

 I wonder, though, if it's not possible to keep userptr buffers around
 and avoid the constant mapping/unmapping. If we only performed cache
 maintenance on them as necessary, perhaps that could provide a viable,
 maybe even good, zero-copy mechanism.
>>>
>>> The vb2 framework will keep the mapping for a userptr as long as userspace
>>> uses the same userptr for every buffer.
>>>
>>> I.e. the first time a buffer with index I is queued the userptr is mapped.
>>> If that buffer is later dequeued and then requeued again with the same
>>> userptr the vb2 core will reuse the old mapping. Otherwise it will unmap
>>> and map again with the new userptr.
>>
>> That's a good point. I forgot that we've been seeing random memory
>> corruptions (fortunately of the userptr memory only, not random system
>> memory) because of this behavior and carrying a patch in all
>> downstream branches to remove this caching.
>>
>> I can see that we keep references on the pages that corresponded to
>> the user VMA at the time the buffer was queued, but are we guaranteed
>> that the list of pages backing that VMA hasn't changed over time?
> 
> Since you are seeing memory corruptions, the answer to this is perhaps 'no'?
> 
> I think the (quite possibly faulty) reasoning was that while memory is mapped,
> userspace can't do a free()/malloc() pair and end up with the same address.
> 
> I suspect this might be a wrong assumption, and in that case we're better off
> removing this check.
> 
> But I'd like to have some confirmation that it is really wrong.

I did some testing, and indeed, this doesn't work.

A patch fixing this will be posted soon.

Regards,

Hans

> 
> USERPTR isn't used very often, so it wouldn't surprise me if it is buggy.
> 
> Regards,
> 
>   Hans
> 
>>
>>>
>>> The same is done for dmabuf, BTW. So if userspace keeps changing dmabuf
>>> fds for each buffer, then that is not optimal.
>>
>> We could possibly try to search through the other buffers and reuse
>> the mapping if there is a match?
>>
>> Best regards,
>> Tomasz
>>
> 



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-06 Thread Hans Verkuil
On 6/7/19 8:11 AM, Tomasz Figa wrote:
> On Wed, May 22, 2019 at 7:56 PM Hans Verkuil  wrote:
>>> I share the same experience. Bitstream buffers are usually so small that
>>> you can always find a physically contiguous memory region for them and a
>>> memcpy() will be faster than the overhead of getting an IOMMU involved.
>>> This obviously depends on the specific hardware, but there's always some
>>> threshold before which mapping through an IOMMU just doesn't make sense
>>> from a fragmentation and/or performance point of view.
>>>
>>> I wonder, though, if it's not possible to keep userptr buffers around
>>> and avoid the constant mapping/unmapping. If we only performed cache
>>> maintenance on them as necessary, perhaps that could provide a viable,
>>> maybe even good, zero-copy mechanism.
>>
>> The vb2 framework will keep the mapping for a userptr as long as userspace
>> uses the same userptr for every buffer.
>>
>> I.e. the first time a buffer with index I is queued the userptr is mapped.
>> If that buffer is later dequeued and then requeued again with the same
>> userptr the vb2 core will reuse the old mapping. Otherwise it will unmap
>> and map again with the new userptr.
> 
> That's a good point. I forgot that we've been seeing random memory
> corruptions (fortunately of the userptr memory only, not random system
> memory) because of this behavior and carrying a patch in all
> downstream branches to remove this caching.
> 
> I can see that we keep references on the pages that corresponded to
> the user VMA at the time the buffer was queued, but are we guaranteed
> that the list of pages backing that VMA hasn't changed over time?

Since you are seeing memory corruptions, the answer to this is perhaps 'no'?

I think the (quite possibly faulty) reasoning was that while memory is mapped,
userspace can't do a free()/malloc() pair and end up with the same address.

I suspect this might be a wrong assumption, and in that case we're better off
removing this check.

But I'd like to have some confirmation that it is really wrong.

USERPTR isn't used very often, so it wouldn't surprise me if it is buggy.

Regards,

Hans

> 
>>
>> The same is done for dmabuf, BTW. So if userspace keeps changing dmabuf
>> fds for each buffer, then that is not optimal.
> 
> We could possibly try to search through the other buffers and reuse
> the mapping if there is a match?
> 
> Best regards,
> Tomasz
> 



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-06 Thread Tomasz Figa
On Wed, May 22, 2019 at 7:56 PM Hans Verkuil  wrote:
>
> On 5/22/19 12:42 PM, Thierry Reding wrote:
> > On Wed, May 22, 2019 at 10:26:28AM +0200, Paul Kocialkowski wrote:
> >> Hi,
> >>
> >> Le mercredi 22 mai 2019 à 15:48 +0900, Tomasz Figa a écrit :
> >>> On Sat, May 18, 2019 at 11:09 PM Nicolas Dufresne  
> >>> wrote:
> >>>> Le samedi 18 mai 2019 à 12:29 +0200, Paul Kocialkowski a écrit :
> >>>>> Hi,
> >>>>>
> >>>>> Le samedi 18 mai 2019 à 12:04 +0200, Jernej Škrabec a écrit :
> >>>>>> Dne sobota, 18. maj 2019 ob 11:50:37 CEST je Paul Kocialkowski 
> >>>>>> napisal(a):
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On Fri, 2019-05-17 at 16:43 -0400, Nicolas Dufresne wrote:
> >>>>>>>> Le jeudi 16 mai 2019 à 20:45 +0200, Paul Kocialkowski a écrit :
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Le jeudi 16 mai 2019 à 14:24 -0400, Nicolas Dufresne a écrit :
> >>>>>>>>>> Le mercredi 15 mai 2019 à 22:59 +0200, Paul Kocialkowski a écrit :
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> Le mercredi 15 mai 2019 à 14:54 -0400, Nicolas Dufresne a écrit :
> >>>>>>>>>>>> Le mercredi 15 mai 2019 à 19:42 +0200, Paul Kocialkowski a écrit 
> >>>>>>>>>>>> :
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a écrit
> >>>>>> :
> >>>>>>>>>>>>>> Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a
> >>>>>> écrit :
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> With the Rockchip stateless VPU driver in the works, we now
> >>>>>>>>>>>>>>> have a
> >>>>>>>>>>>>>>> better idea of what the situation is like on platforms other
> >>>>>>>>>>>>>>> than
> >>>>>>>>>>>>>>> Allwinner. This email shares my conclusions about the
> >>>>>>>>>>>>>>> situation and how
> >>>>>>>>>>>>>>> we should update the MPEG-2, H.264 and H.265 controls
> >>>>>>>>>>>>>>> accordingly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Per-slice decoding
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> We've discussed this one already[0] and Hans has submitted a
> >>>>>>>>>>>>>>> patch[1]
> >>>>>>>>>>>>>>> to implement the required core bits. When we agree it looks
> >>>>>>>>>>>>>>> good, we
> >>>>>>>>>>>>>>> should lift the restriction that all slices must be
> >>>>>>>>>>>>>>> concatenated and
> >>>>>>>>>>>>>>> have them submitted as individual requests.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> One question is what to do about other controls. I feel like
> >>>>>>>>>>>>>>> it would
> >>>>>>>>>>>>>>> make sense to always pass all the required controls for
> >>>>>>>>>>>>>>> decoding the
> >>>>>>>>>>>>>>> slice, including the ones that don't change across slices.
> >>>>>>>>>>>>>>> But there
> >>>>>>>>>>>>>>> may be no particular advantage to this and only downsides.
> >>>>>>>>>>>>>>> Not doing it
> >>>>>>>>>>>>>>> and relying on the 

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Jonas Karlman
On 2019-06-04 11:38, Boris Brezillon wrote:
> On Tue, 4 Jun 2019 09:15:28 +
> Jonas Karlman  wrote:
>
>> On 2019-06-04 11:06, Thierry Reding wrote:
>>> On Tue, Jun 04, 2019 at 10:49:21AM +0200, Boris Brezillon wrote:  
>>>> On Tue, 4 Jun 2019 10:31:57 +0200
>>>> Thierry Reding  wrote:
>>>>  
>>>>>>>>> - Using flags
>>>>>>>>>
>>>>>>>>> The current MPEG-2 controls have lots of u8 values that can be
>>>>>>>>> represented as flags. Using flags also helps with padding.
>>>>>>>>> It's unlikely that we'll get more than 64 flags, so using a u64 by
>>>>>>>>> default for that sounds fine (we definitely do want to keep some room
>>>>>>>>> available and I don't think using 32 bits as a default is good 
>>>>>>>>> enough).
>>>>>>>>>
>>>>>>>>> I think H.264/HEVC per-control flags should also be moved to u64. 
>>>>>>>>>  
>>>>>>>> There was also some consensus on this, that u64 should be good enough
>>>>>>>> for anything out there, though we obviously don't know what the future
>>>>>>>> will hold, so perhaps adding some way for possible extending this in 
>>>>>>>> the
>>>>>>>> future might be good. I guess we'll get new controls for new codecs
>>>>>>>> anyway, so we can punt on this until then.
>>>>>>>>   
>>>>>>>>> - Clear split of controls and terminology
>>>>>>>>>
>>>>>>>>> Some codecs have explicit NAL units that are good fits to match as
>>>>>>>>> controls: e.g. slice header, pps, sps. I think we should stick to the
>>>>>>>>> bitstream element names for those.
>>>>>>>>>
>>>>>>>>> For H.264, that would suggest the following changes:
>>>>>>>>> - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
>>>>>>>>> - killing v4l2_ctrl_h264_decode_param and having the reference lists
>>>>>>>>> where they belong, which seems to be slice_header;  
>>>>>>> But now here it's being described per slice. When I look at the slice
>>>>>>> header, I only see list of modifications and when I look at userspace,
>>>>>>> that list is simply built from the DPB, the modifications list found in the
>>>>>>> slice header seems to be only used to craft the l0/l1 list.
>>>>>> Yes, I think there was a misunderstanding which was then clarified
>>>>>> (unfortunately it happened on IRC, so we don't have a trace of this
>>>>>> discussion). The reference list should definitely be per-frame, and the
>>>>>> L0/L1 slice reflists are referring to the per-frame reference list (it's
>>>>>> just a sub-set of the per-frame reflist re-ordered differently).
>>>>>> 
>>>>>>> There is one thing that come up though, if we enable per-frame decoding
>>>>>>> on top of per-slice decoder (like Cedrus), won't we force userspace to
>>>>>>> always compute l0/l1 even though the HW might be handling that ?
>>>>>> That's true, the question is, what's the cost of this extra re-ordering? 
>>>>>>
>>>>> I think ultimately userspace is already forced to compute these lists
>>>>> even if some hardware may be able to do it in hardware. There's going to
>>>>> be other hardware that userspace wants to support that can't do it by
>>>>> itself, so userspace has to at least have the code anyway. What it could
>>>>> do on top of that decide not to run that code if it somehow detects that
>>>>> hardware can do it already. On the other hand this means that we have to
>>>>> expose a whole lot of capabilities to userspace and userspace has to go
>>>>> and detect all of them in order to parameterize all of the code.
>>>>>
>>>>> Ultimately I suspect many applications will just choose to pass the data
>>>>> all the time out of simplicity. I mean drivers that don't need it will
>>>>> already ignore it (i.e. they must not break if they get the ex

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Boris Brezillon
On Tue, 4 Jun 2019 09:15:28 +
Jonas Karlman  wrote:

> On 2019-06-04 11:06, Thierry Reding wrote:
> > On Tue, Jun 04, 2019 at 10:49:21AM +0200, Boris Brezillon wrote:  
> >> On Tue, 4 Jun 2019 10:31:57 +0200
> >> Thierry Reding  wrote:
> >>  
> >>>>>>> - Using flags
> >>>>>>>
> >>>>>>> The current MPEG-2 controls have lots of u8 values that can be
> >>>>>>> represented as flags. Using flags also helps with padding.
> >>>>>>> It's unlikely that we'll get more than 64 flags, so using a u64 by
> >>>>>>> default for that sounds fine (we definitely do want to keep some room
> >>>>>>> available and I don't think using 32 bits as a default is good 
> >>>>>>> enough).
> >>>>>>>
> >>>>>>> I think H.264/HEVC per-control flags should also be moved to u64. 
> >>>>>>>  
> >>>>>> There was also some consensus on this, that u64 should be good enough
> >>>>>> for anything out there, though we obviously don't know what the future
> >>>>>> will hold, so perhaps adding some way for possible extending this in 
> >>>>>> the
> >>>>>> future might be good. I guess we'll get new controls for new codecs
> >>>>>> anyway, so we can punt on this until then.
> >>>>>>   
> >>>>>>> - Clear split of controls and terminology
> >>>>>>>
> >>>>>>> Some codecs have explicit NAL units that are good fits to match as
> >>>>>>> controls: e.g. slice header, pps, sps. I think we should stick to the
> >>>>>>> bitstream element names for those.
> >>>>>>>
> >>>>>>> For H.264, that would suggest the following changes:
> >>>>>>> - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
> >>>>>>> - killing v4l2_ctrl_h264_decode_param and having the reference lists
> >>>>>>> where they belong, which seems to be slice_header;  
> >>>>> But now here it's being described per slice. When I look at the slice
> >>>>> header, I only see list of modifications and when I look at userspace,
> >>>>> that list is simply built from the DPB, the modifications list found in the
> >>>>> slice header seems to be only used to craft the l0/l1 list.
> >>>> Yes, I think there was a misunderstanding which was then clarified
> >>>> (unfortunately it happened on IRC, so we don't have a trace of this
> >>>> discussion). The reference list should definitely be per-frame, and the
> >>>> L0/L1 slice reflists are referring to the per-frame reference list (it's
> >>>> just a sub-set of the per-frame reflist re-ordered differently).
> >>>> 
> >>>>> There is one thing that come up though, if we enable per-frame decoding
> >>>>> on top of per-slice decoder (like Cedrus), won't we force userspace to
> >>>>> always compute l0/l1 even though the HW might be handling that ?
> >>>> That's true, the question is, what's the cost of this extra re-ordering? 
> >>>>
> >>> I think ultimately userspace is already forced to compute these lists
> >>> even if some hardware may be able to do it in hardware. There's going to
> >>> be other hardware that userspace wants to support that can't do it by
> >>> itself, so userspace has to at least have the code anyway. What it could
> >>> do on top of that is decide not to run that code if it somehow detects that
> >>> hardware can do it already. On the other hand this means that we have to
> >>> expose a whole lot of capabilities to userspace and userspace has to go
> >>> and detect all of them in order to parameterize all of the code.
> >>>
> >>> Ultimately I suspect many applications will just choose to pass the data
> >>> all the time out of simplicity. I mean drivers that don't need it will
> >>> already ignore it (i.e. they must not break if they get the extra data)
> >>> so other than the potential runtime savings on some hardware, there are
> >>> no advantages.
> >>>
> >>> Given that other APIs don't bother exposing this level of contr

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Paul Kocialkowski
Hi,

On Tue, 2019-06-04 at 09:15 +, Jonas Karlman wrote:
> On 2019-06-04 11:06, Thierry Reding wrote:
> > On Tue, Jun 04, 2019 at 10:49:21AM +0200, Boris Brezillon wrote:
> > > On Tue, 4 Jun 2019 10:31:57 +0200
> > > Thierry Reding  wrote:
> > > 
> > > > > > > > - Using flags
> > > > > > > > 
> > > > > > > > The current MPEG-2 controls have lots of u8 values that can be
> > > > > > > > represented as flags. Using flags also helps with padding.
> > > > > > > > It's unlikely that we'll get more than 64 flags, so using a u64 
> > > > > > > > by
> > > > > > > > default for that sounds fine (we definitely do want to keep 
> > > > > > > > some room
> > > > > > > > available and I don't think using 32 bits as a default is good 
> > > > > > > > enough).
> > > > > > > > 
> > > > > > > > I think H.264/HEVC per-control flags should also be moved to 
> > > > > > > > u64.
> > > > > > > There was also some consensus on this, that u64 should be good
> > > > > > > enough for anything out there, though we obviously don't know what
> > > > > > > the future will hold, so perhaps adding some way of possibly
> > > > > > > extending this in the future might be good. I guess we'll get new
> > > > > > > controls for new codecs anyway, so we can punt on this until then.
> > > > > > > 
> > > > > > > > - Clear split of controls and terminology
> > > > > > > > 
> > > > > > > > Some codecs have explicit NAL units that are good fits to match 
> > > > > > > > as
> > > > > > > > controls: e.g. slice header, pps, sps. I think we should stick 
> > > > > > > > to the
> > > > > > > > bitstream element names for those.
> > > > > > > > 
> > > > > > > > For H.264, that would suggest the following changes:
> > > > > > > > - renaming v4l2_ctrl_h264_decode_param to 
> > > > > > > > v4l2_ctrl_h264_slice_header;
> > > > > > > > - killing v4l2_ctrl_h264_decode_param and having the reference 
> > > > > > > > lists
> > > > > > > > where they belong, which seems to be slice_header;
> > > > > > But now here it's being described per slice. When I look at the
> > > > > > slice header, I only see a list of modifications, and when I look at
> > > > > > userspace, that list is simply built from the DPB; the modification
> > > > > > list found in the slice header seems to be only used to craft the
> > > > > > l0/l1 list.
> > > > > Yes, I think there was a misunderstanding which was then clarified
> > > > > (unfortunately it happened on IRC, so we don't have a trace of this
> > > > > discussion). The reference list should definitely be per-frame, and 
> > > > > the
> > > > > L0/L1 slice reflists are referring to the per-frame reference list 
> > > > > (it's
> > > > > just a sub-set of the per-frame reflist re-ordered differently).
> > > > >   
> > > > > > There is one thing that came up though: if we enable per-frame
> > > > > > decoding on top of a per-slice decoder (like Cedrus), won't we force
> > > > > > userspace to always compute l0/l1 even though the HW might be
> > > > > > handling that?
> > > > > That's true, the question is, what's the cost of this extra 
> > > > > re-ordering?  
> > > > I think ultimately userspace is already forced to compute these lists
> > > > even if some hardware may be able to do it in hardware. There's going to
> > > > be other hardware that userspace wants to support that can't do it by
> > > > itself, so userspace has to at least have the code anyway. What it could
> > > > do on top of that is decide not to run that code if it somehow detects that
> 

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Jonas Karlman
On 2019-06-04 11:06, Thierry Reding wrote:
> On Tue, Jun 04, 2019 at 10:49:21AM +0200, Boris Brezillon wrote:
>> On Tue, 4 Jun 2019 10:31:57 +0200
>> Thierry Reding  wrote:
>>
>>>>>>> - Using flags
>>>>>>>
>>>>>>> The current MPEG-2 controls have lots of u8 values that can be
>>>>>>> represented as flags. Using flags also helps with padding.
>>>>>>> It's unlikely that we'll get more than 64 flags, so using a u64 by
>>>>>>> default for that sounds fine (we definitely do want to keep some room
>>>>>>> available and I don't think using 32 bits as a default is good enough).
>>>>>>>
>>>>>>> I think H.264/HEVC per-control flags should also be moved to u64.
>>>>>> There was also some consensus on this, that u64 should be good enough
>>>>>> for anything out there, though we obviously don't know what the future
>>>>>> will hold, so perhaps adding some way of possibly extending this in the
>>>>>> future might be good. I guess we'll get new controls for new codecs
>>>>>> anyway, so we can punt on this until then.
>>>>>> 
>>>>>>> - Clear split of controls and terminology
>>>>>>>
>>>>>>> Some codecs have explicit NAL units that are good fits to match as
>>>>>>> controls: e.g. slice header, pps, sps. I think we should stick to the
>>>>>>> bitstream element names for those.
>>>>>>>
>>>>>>> For H.264, that would suggest the following changes:
>>>>>>> - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
>>>>>>> - killing v4l2_ctrl_h264_decode_param and having the reference lists
>>>>>>> where they belong, which seems to be slice_header;
>>>>> But now here it's being described per slice. When I look at the slice
>>>>> header, I only see a list of modifications, and when I look at userspace,
>>>>> that list is simply built from the DPB; the modification list found in the
>>>>> slice header seems to be only used to craft the l0/l1 list.
>>>> Yes, I think there was a misunderstanding which was then clarified
>>>> (unfortunately it happened on IRC, so we don't have a trace of this
>>>> discussion). The reference list should definitely be per-frame, and the
>>>> L0/L1 slice reflists are referring to the per-frame reference list (it's
>>>> just a sub-set of the per-frame reflist re-ordered differently).
>>>>   
>>>>> There is one thing that came up though: if we enable per-frame decoding
>>>>> on top of a per-slice decoder (like Cedrus), won't we force userspace to
>>>>> always compute l0/l1 even though the HW might be handling that?
>>>> That's true, the question is, what's the cost of this extra re-ordering?  
>>> I think ultimately userspace is already forced to compute these lists
>>> even if some hardware may be able to do it in hardware. There's going to
>>> be other hardware that userspace wants to support that can't do it by
>>> itself, so userspace has to at least have the code anyway. What it could
>>> do on top of that is decide not to run that code if it somehow detects that
>>> hardware can do it already. On the other hand this means that we have to
>>> expose a whole lot of capabilities to userspace and userspace has to go
>>> and detect all of them in order to parameterize all of the code.
>>>
>>> Ultimately I suspect many applications will just choose to pass the data
>>> all the time out of simplicity. I mean drivers that don't need it will
>>> already ignore it (i.e. they must not break if they get the extra data)
>>> so other than the potential runtime savings on some hardware, there are
>>> no advantages.
>>>
>>> The fact that other APIs don't bother exposing this level of control to
>>> applications makes me think that it's just not worth it from a
>>> performance point of view.
>> That's not exactly what Nicolas proposed. He was suggesting that we
>> build those reflists kernel-side: V4L would provide a helper and
>> drivers that need those lists would use it, others won't. This way we
>> have no useless computation done, and userspace doesn't even have to
>> bother checking the device caps to avoid this extra step.
> Oh yeah, that sounds much better. I suppose one notable difference to
> other APIs is that they have to pass in buffers for all the frames in
> the DPB, so they basically have to build the lists in userspace. Since
> we'll end up looking up the frames in the kernel, it sounds reasonable
> to also build the lists in the kernel.

Userspace must already process the modification list, or it won't have a
correct DPB for the next frame.
If you move this processing into the kernel, you also introduce state into
the stateless driver.

Regards,
Jonas
>
> On that note, it would probably be useful to have some sort of helper
> to get at all the buffers that make up the DPB in the kernel. That's got
> to be something that everybody wants.
>
> Thierry
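
The kernel-side helper Boris describes above could look roughly like this: a
plain-C sketch of building the default RefPicList0 for a P slice from the
per-frame reference set (short-term references by descending PicNum, then
long-term ones by ascending LongTermPicNum). The struct and function names
below are illustrative only, not an actual V4L2 API.

```c
#include <stdlib.h>

/* Illustrative only: not a real V4L2 structure. */
struct ref_pic {
	int pic_num;		/* PicNum (short-term) or LongTermPicNum (long-term) */
	int is_long_term;
};

static int cmp_default_l0(const void *a, const void *b)
{
	const struct ref_pic *x = a, *y = b;

	if (x->is_long_term != y->is_long_term)
		return x->is_long_term - y->is_long_term;	/* short-term first */
	if (x->is_long_term)
		return x->pic_num - y->pic_num;	/* long-term: ascending LongTermPicNum */
	return y->pic_num - x->pic_num;		/* short-term: descending PicNum */
}

/*
 * Build the default RefPicList0 for a P slice from the per-frame reference
 * set, following the H.264 initialization order.
 */
void build_default_l0(struct ref_pic *refs, size_t n)
{
	qsort(refs, n, sizeof(*refs), cmp_default_l0);
}
```

For instance, short-term refs with PicNum 3, 5, 4 plus long-term refs 0 and 1
would come out ordered 5, 4, 3, then long-term 0, 1.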



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Paul Kocialkowski
Hi,

On Tue, 2019-06-04 at 11:05 +0200, Boris Brezillon wrote:
> On Tue, 4 Jun 2019 10:55:03 +0200
> Thierry Reding  wrote:
> 
> > On Mon, Jun 03, 2019 at 02:52:44PM -0400, Nicolas Dufresne wrote:
> > [...]
> > > There is one thing that came up though: if we enable per-frame decoding
> > > on top of a per-slice decoder (like Cedrus), won't we force userspace to
> > > always compute l0/l1 even though the HW might be handling that? Shall
> > > we instead pass the modification list and implement the non-parsing
> > > bits of applying the modifications in the kernel?
> > 
> > Applying the modifications is a standard procedure, right? If it's
> > completely driver-agnostic, it sounds to me like the right place to
> > perform the operation is in userspace.
> 
> Well, the counter argument to that is "drivers know better what's
> needed by the HW", and if we want to avoid doing useless work without
> having complex caps checking done in userspace, doing this task
> kernel-side makes sense.

I believe we should also try and alleviate the pain on the user-space
side by having these decoder-specific details handled by the kernel.

It also brings us closer to bitstream format (where the modifications
are coded) and leaves DPB management to the decoder/driver, which IMO
makes a lot of sense.

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
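
As a rough illustration of the "non-parsing bits" mentioned above, here is a
simplified sketch of applying slice-header reordering commands to an
already-initialized reference list: the k-th command moves the picture with
the given pic_num to index k, shifting the entries in between. Wrap-around
and long-term handling are deliberately omitted, and the function name is
made up for the example.

```c
#include <stddef.h>

/*
 * Simplified sketch of the slice-header reordering pass: each modification
 * command moves one picture to the front portion of the list, shifting the
 * entries in between (roughly the effect of the H.264 modification process;
 * PicNum wrap-around and long-term references are not handled here).
 */
void apply_reflist_modifications(int *list, size_t n,
				 const int *mod_pic_nums, size_t n_mods)
{
	size_t k, j;

	for (k = 0; k < n_mods && k < n; k++) {
		for (j = k; j < n && list[j] != mod_pic_nums[k]; j++)
			;
		if (j == n)
			continue;	/* not found: a real driver would flag an error */
		for (; j > k; j--)
			list[j] = list[j - 1];
		list[k] = mod_pic_nums[k];
	}
}
```

For example, an initial list {5, 4, 3, 2} with commands {3, 5} becomes
{3, 5, 4, 2}.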



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Thierry Reding
On Tue, Jun 04, 2019 at 10:49:21AM +0200, Boris Brezillon wrote:
> On Tue, 4 Jun 2019 10:31:57 +0200
> Thierry Reding  wrote:
> 
> > > > > > - Using flags
> > > > > > 
> > > > > > The current MPEG-2 controls have lots of u8 values that can be
> > > > > > represented as flags. Using flags also helps with padding.
> > > > > > It's unlikely that we'll get more than 64 flags, so using a u64 by
> > > > > > default for that sounds fine (we definitely do want to keep some 
> > > > > > room
> > > > > > available and I don't think using 32 bits as a default is good 
> > > > > > enough).
> > > > > > 
> > > > > > I think H.264/HEVC per-control flags should also be moved to u64.   
> > > > > >  
> > > > > 
> > > > > There was also some consensus on this, that u64 should be good enough
> > > > > for anything out there, though we obviously don't know what the future
> > > > > will hold, so perhaps adding some way of possibly extending this in the
> > > > > future might be good. I guess we'll get new controls for new codecs
> > > > > anyway, so we can punt on this until then.
> > > > > > Some codecs have explicit NAL units that are good fits to match as
> > > > > > controls: e.g. slice header, pps, sps. I think we should stick to 
> > > > > > the
> > > > > > bitstream element names for those.
> > > > > > 
> > > > > > For H.264, that would suggest the following changes:
> > > > > > - renaming v4l2_ctrl_h264_decode_param to 
> > > > > > v4l2_ctrl_h264_slice_header;
> > > > > > - killing v4l2_ctrl_h264_decode_param and having the reference lists
> > > > > > where they belong, which seems to be slice_header;
> > > > 
> > > > But now here it's being described per slice. When I look at the slice
> > > > header, I only see a list of modifications, and when I look at userspace,
> > > > that list is simply built from the DPB; the modification list found in the
> > > > slice header seems to be only used to craft the l0/l1 list.
> > > 
> > > Yes, I think there was a misunderstanding which was then clarified
> > > (unfortunately it happened on IRC, so we don't have a trace of this
> > > discussion). The reference list should definitely be per-frame, and the
> > > L0/L1 slice reflists are referring to the per-frame reference list (it's
> > > just a sub-set of the per-frame reflist re-ordered differently).
> > >   
> > > > 
> > > > There is one thing that came up though: if we enable per-frame decoding
> > > > on top of a per-slice decoder (like Cedrus), won't we force userspace to
> > > > always compute l0/l1 even though the HW might be handling that?
> > > 
> > > That's true, the question is, what's the cost of this extra re-ordering?  
> > 
> > I think ultimately userspace is already forced to compute these lists
> > even if some hardware may be able to do it in hardware. There's going to
> > be other hardware that userspace wants to support that can't do it by
> > itself, so userspace has to at least have the code anyway. What it could
> > do on top of that is decide not to run that code if it somehow detects that
> > hardware can do it already. On the other hand this means that we have to
> > expose a whole lot of capabilities to userspace and userspace has to go
> > and detect all of them in order to parameterize all of the code.
> > 
> > Ultimately I suspect many applications will just choose to pass the data
> > all the time out of simplicity. I mean drivers that don't need it will
> > already ignore it (i.e. they must not break if they get the extra data)
> > so other than the potential runtime savings on some hardware, there are
> > no advantages.
> > 
> > The fact that other APIs don't bother exposing this level of control to
> > applications makes me think that it's just not worth it from a
> > performance point of view.
> 
> That's not exactly what Nicolas proposed. He was suggesting that we
> build those reflists kernel-side: V4L would provide a helper and
> drivers that need those lists would use it, others won't. This way we
> have no useless computation done, and userspace doesn't even have to
> bother checking the device caps to avoid this extra step.

Oh yeah, that sounds much better. I suppose one notable difference to
other APIs is that they have to pass in buffers for all the frames in
the DPB, so they basically have to build the lists in userspace. Since
we'll end up looking up the frames in the kernel, it sounds reasonable
to also build the lists in the kernel.

On that note, it would probably be useful to have some sort of helper
to get at all the buffers that make up the DPB in the kernel. That's got
to be something that everybody wants.

Thierry
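
For the u64-flags point quoted at the top of this message, the layout could be
sketched like this. The flag and struct names below are hypothetical, chosen
only to illustrate packing the former u8 "boolean" fields into one 64-bit
flags word with explicit padding; they are not the real uAPI definitions.

```c
#include <stdint.h>

/* Hypothetical flag names: illustrative of the u64-flags layout, not uAPI. */
#define MPEG2_PIC_FLAG_TOP_FIELD_FIRST	(1ULL << 0)
#define MPEG2_PIC_FLAG_FRAME_PRED_DCT	(1ULL << 1)
#define MPEG2_PIC_FLAG_CONCEALMENT_MV	(1ULL << 2)
#define MPEG2_PIC_FLAG_Q_SCALE_TYPE	(1ULL << 3)

struct mpeg2_picture_params {
	uint64_t flags;			/* replaces several u8 "boolean" fields */
	uint8_t picture_coding_type;
	uint8_t reserved[7];		/* explicit padding, room to extend */
};

static inline int pic_flag_set(const struct mpeg2_picture_params *p,
			       uint64_t flag)
{
	return !!(p->flags & flag);
}
```

With 60 bits still free, there is plenty of headroom before a second flags
word would ever be needed.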




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Boris Brezillon
On Tue, 4 Jun 2019 10:55:03 +0200
Thierry Reding  wrote:

> On Mon, Jun 03, 2019 at 02:52:44PM -0400, Nicolas Dufresne wrote:
> [...]
> > There is one thing that came up though: if we enable per-frame decoding
> > on top of a per-slice decoder (like Cedrus), won't we force userspace to
> > always compute l0/l1 even though the HW might be handling that? Shall
> > we instead pass the modification list and implement the non-parsing
> > bits of applying the modifications in the kernel?
> 
> Applying the modifications is a standard procedure, right? If it's
> completely driver-agnostic, it sounds to me like the right place to
> perform the operation is in userspace.

Well, the counter argument to that is "drivers know better what's
needed by the HW", and if we want to avoid doing useless work without
having complex caps checking done in userspace, doing this task
kernel-side makes sense.


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Thierry Reding
On Mon, Jun 03, 2019 at 02:52:44PM -0400, Nicolas Dufresne wrote:
[...]
> There is one thing that came up though: if we enable per-frame decoding
> on top of a per-slice decoder (like Cedrus), won't we force userspace to
> always compute l0/l1 even though the HW might be handling that? Shall
> we instead pass the modification list and implement the non-parsing
> bits of applying the modifications in the kernel?

Applying the modifications is a standard procedure, right? If it's
completely driver-agnostic, it sounds to me like the right place to
perform the operation is in userspace.

Thierry




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Thierry Reding
On Mon, Jun 03, 2019 at 02:52:44PM -0400, Nicolas Dufresne wrote:
> Le lundi 03 juin 2019 à 13:24 +0200, Thierry Reding a écrit :
> > On Wed, May 15, 2019 at 12:09:45PM +0200, Paul Kocialkowski wrote:
> > > Hi,
> > > 
> > > With the Rockchip stateless VPU driver in the works, we now have a
> > > better idea of what the situation is like on platforms other than
> > > Allwinner. This email shares my conclusions about the situation and how
> > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > 
> > > - Per-slice decoding
> > > 
> > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > to implement the required core bits. When we agree it looks good, we
> > > should lift the restriction that all slices must be concatenated and
> > > have them submitted as individual requests.
> > > 
> > > One question is what to do about other controls. I feel like it would
> > > make sense to always pass all the required controls for decoding the
> > > slice, including the ones that don't change across slices. But there
> > > may be no particular advantage to this and only downsides. Not doing it
> > > and relying on the "control cache" can work, but we need to specify
> > > that only a single stream can be decoded per opened instance of the
> > > v4l2 device. This is the assumption we're going with for handling
> > > multi-slice anyway, so it shouldn't be an issue.
> > > 
> > > - Annex-B formats
> > > 
> > > I don't think we have really reached a conclusion on the pixel formats
> > > we want to expose. The main issue is how to deal with codecs that need
> > > the full slice NALU with start code, where the slice_header is
> > > duplicated in raw bitstream, when others are fine with just the encoded
> > > slice data and the parsed slice header control.
> > > 
> > > My initial thinking was that we'd need 3 formats:
> > > - One that takes only the slice compressed data (without raw slice
> > > header and start code);
> > > - One that takes both the NALU data (including start code, raw header
> > > and compressed data) and slice header controls;
> > > - One that takes the NALU data but no slice header.
> > > 
> > > But I no longer think the latter really makes sense in the context of
> > > stateless video decoding.
> > > 
> > > A side-note: I think we should definitely have data offsets in every
> > > case, so that implementations can just push the whole NALU regardless
> > > of the format if they're lazy.
> > 
> > I started an NVIDIA internal discussion about this to get some thoughts
> > from our local experts and to fill in my gaps in understanding of NVIDIA
> > hardware that we might want to support.
> > 
> > As far as input format goes, there was pretty broad consensus that in
> > order for the ABI to be most broadly useful we need to settle on the
> > lowest common denominator, while drawing some inspiration from existing
> > APIs because they've already gone through a lot of these discussions and
> > came up with standard interfaces to deal with the differences between
> > decoders.
> 
> Note that we are making a statement with the stateless/stateful split.
> The userspace overhead is non-negligible if you start passing all this
> useless data to a stateful HW. As for other implementations, that's what
> we went through in order to reach the state we are at now.
> 
> It's interesting that you had this discussion with NVIDIA specialists;
> that being said, I think it would be better to provide the actual
> data (how the different generations of HW work) before presenting the
> conclusions made by your team. Right now, we have deeply studied the
> Cedrus, Hantro and Rockchip IPs, and that's how we managed to reach this
> low-overhead compromise. What we really want to see is whether there is
> NVIDIA HW that does not fit either of the two interfaces, and why.

Sorry if I was being condescending, that was not my intention. I was
trying to share what I was able to learn in the short time while the
discussion was happening.

If I understand correctly, I think NVIDIA hardware falls in the category
covered by the second interface, that is: NALU data (start code, raw
header, compressed data) and slice header controls.

I'm trying to get some other things out of the way first, but then I
hope to have time to go back to porting the VDE driver to V4L2 so that I
have something more concrete to contribute.

> > In mor
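
Paul's earlier point about carrying data offsets in every case (so lazy
implementations can just push the whole NALU) boils down to locating the
Annex-B start code. A minimal, illustrative scanner, with a made-up function
name:

```c
#include <stddef.h>

/*
 * Sketch: return the offset of the first byte after the Annex-B start code
 * (00 00 01, which also matches the tail of 00 00 00 01), so a driver that
 * wants only the slice data can skip the prefix while userspace pushes the
 * whole NALU and sets the data offset accordingly.  Returns -1 if no start
 * code is found.
 */
long annexb_payload_offset(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 3 <= len; i++) {
		if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
			return (long)(i + 3);
	}
	return -1;
}
```

A four-byte start code (00 00 00 01) thus yields offset 4, a three-byte one
yields offset 3.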

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Boris Brezillon
On Tue, 4 Jun 2019 10:31:57 +0200
Thierry Reding  wrote:

> > > > > - Using flags
> > > > > 
> > > > > The current MPEG-2 controls have lots of u8 values that can be
> > > > > represented as flags. Using flags also helps with padding.
> > > > > It's unlikely that we'll get more than 64 flags, so using a u64 by
> > > > > default for that sounds fine (we definitely do want to keep some room
> > > > > available and I don't think using 32 bits as a default is good 
> > > > > enough).
> > > > > 
> > > > > I think H.264/HEVC per-control flags should also be moved to u64.
> > > > 
> > > > There was also some consensus on this, that u64 should be good enough
> > > > for anything out there, though we obviously don't know what the future
> > > > will hold, so perhaps adding some way of possibly extending this in the
> > > > future might be good. I guess we'll get new controls for new codecs
> > > > anyway, so we can punt on this until then.
> > > > 
> > > > > - Clear split of controls and terminology
> > > > > 
> > > > > Some codecs have explicit NAL units that are good fits to match as
> > > > > controls: e.g. slice header, pps, sps. I think we should stick to the
> > > > > bitstream element names for those.
> > > > > 
> > > > > For H.264, that would suggest the following changes:
> > > > > - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
> > > > > - killing v4l2_ctrl_h264_decode_param and having the reference lists
> > > > > where they belong, which seems to be slice_header;
> > > 
> > > But now here it's being described per slice. When I look at the slice
> > > header, I only see a list of modifications, and when I look at userspace,
> > > that list is simply built from the DPB; the modification list found in the
> > > slice header seems to be only used to craft the l0/l1 list.
> > 
> > Yes, I think there was a misunderstanding which was then clarified
> > (unfortunately it happened on IRC, so we don't have a trace of this
> > discussion). The reference list should definitely be per-frame, and the
> > L0/L1 slice reflists are referring to the per-frame reference list (it's
> > just a sub-set of the per-frame reflist re-ordered differently).
> >   
> > > 
> > > There is one thing that came up though: if we enable per-frame decoding
> > > on top of a per-slice decoder (like Cedrus), won't we force userspace to
> > > always compute l0/l1 even though the HW might be handling that?
> > 
> > That's true, the question is, what's the cost of this extra re-ordering?  
> 
> I think ultimately userspace is already forced to compute these lists
> even if some hardware may be able to do it in hardware. There's going to
> be other hardware that userspace wants to support that can't do it by
> itself, so userspace has to at least have the code anyway. What it could
> do on top of that is decide not to run that code if it somehow detects that
> hardware can do it already. On the other hand this means that we have to
> expose a whole lot of capabilities to userspace and userspace has to go
> and detect all of them in order to parameterize all of the code.
> 
> Ultimately I suspect many applications will just choose to pass the data
> all the time out of simplicity. I mean drivers that don't need it will
> already ignore it (i.e. they must not break if they get the extra data)
> so other than the potential runtime savings on some hardware, there are
> no advantages.
> 
> The fact that other APIs don't bother exposing this level of control to
> applications makes me think that it's just not worth it from a
> performance point of view.

That's not exactly what Nicolas proposed. He was suggesting that we
build those reflists kernel-side: V4L would provide a helper and
drivers that need those lists would use it, others won't. This way we
have no useless computation done, and userspace doesn't even have to
bother checking the device caps to avoid this extra step.



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-04 Thread Thierry Reding
On Mon, Jun 03, 2019 at 09:41:17PM +0200, Boris Brezillon wrote:
> On Mon, 03 Jun 2019 14:52:44 -0400
> Nicolas Dufresne  wrote:
> 
> > > > - Dropping the DPB concept in H.264/H.265
> > > > 
> > > > As far as I could understand, the decoded picture buffer (DPB) is a
> > > > concept that only makes sense relative to a decoder implementation. The
> > > > spec mentions how to manage it with the Hypothetical reference decoder
> > > > (Annex C), but that's about it.
> > > > 
> > > > What's really in the bitstream is the list of modified short-term and
> > > > long-term references, which is enough for every decoder.
> > > > 
> > > > For this reason, I strongly believe we should stop talking about DPB in
> > > > the controls and just pass these lists augmented with relevant
> > > > information for userspace.
> > > > 
> > > > I think it should be up to the driver to maintain a DPB and we could
> > > > have helpers for common cases. For instance, the rockchip decoder needs
> > > > to keep unused entries around[2] and cedrus has the same requirement
> > > > for H.264. However for cedrus/H.265, we don't need to do any book-
> > > > keeping in particular and can manage with the lists from the bitstream
> > > > directly.  
> > > 
> > > There was a bit of concern regarding this. Given that DPB maintenance is
> > > purely a software construct, this doesn't really belong in the kernel. A
> > > DPB will be the same no matter what hardware operates on the bitstream.
> > > Depending on the hardware it may use the DPB differently (or maybe not
> > > at all), but that's beside the point, really. This is pretty much the
> > > same rationale as discussed above for meta data.
> > > 
> > > Again, VAAPI and VDPAU don't require drivers to deal with this. Instead
> > > they just get the final list of reference pictures, ready to use.  
> > 
> > I think we need a bit of clarification from Boris, as what I read here
> > is a bit contradictory (or at least I am a bit confused). When I first
> > read this, I understood that this was just about renaming the dpb as
> > being the reference list and only requiring the active references to be
> > there.
> 
> It's really just about renaming the field, it would contain exactly the
> same data.
> 
> > 
> > So what I'm not clear on is where exactly this "active reference list"
> > comes from. In VAAPI it is described "per-frame".
> 
> That's my understanding as well.
> 
> > 
> > >   
> > > > - Using flags
> > > > 
> > > > The current MPEG-2 controls have lots of u8 values that can be
> > > > represented as flags. Using flags also helps with padding.
> > > > It's unlikely that we'll get more than 64 flags, so using a u64 by
> > > > default for that sounds fine (we definitely do want to keep some room
> > > > available and I don't think using 32 bits as a default is good enough).
> > > > 
> > > > I think H.264/HEVC per-control flags should also be moved to u64.  
> > > 
> > > There was also some consensus on this, that u64 should be good enough
> > > for anything out there, though we obviously don't know what the future
> > > will hold, so perhaps adding some way of possibly extending this in the
> > > future might be good. I guess we'll get new controls for new codecs
> > > anyway, so we can punt on this until then.
> > >   
> > > > - Clear split of controls and terminology
> > > > 
> > > > Some codecs have explicit NAL units that are good fits to match as
> > > > controls: e.g. slice header, pps, sps. I think we should stick to the
> > > > bitstream element names for those.
> > > > 
> > > > For H.264, that would suggest the following changes:
> > > > - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
> > > > - killing v4l2_ctrl_h264_decode_param and having the reference lists
> > > > where they belong, which seems to be slice_header;  
> > 
> > But now here it's being described per slice. When I look at the slice
> > header, I only see a list of modifications, and when I look at userspace,
> > that list is simply built from the DPB; the modification list found in the
> > slice header seems to be only used to craft the l0/l1 list.
> 
> Yes, I
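
As a rough idea of the driver-side book-keeping being discussed (e.g. the
rockchip requirement to keep unused entries around), here is a hypothetical
slot-allocation helper. The structure and eviction policy are illustrative
only, not taken from any real driver.

```c
#define DPB_SIZE 16

/* Illustrative driver-side book-keeping, not a real V4L2 helper. */
struct dpb_entry {
	int valid;		/* slot holds a decoded picture */
	int referenced;		/* still usable as a reference */
	int frame_num;
};

/*
 * Pick a slot for the next decoded picture: prefer an empty slot, otherwise
 * evict the oldest picture that is no longer referenced (so "unused" entries
 * stay around as long as possible, as some HW needs).  Returns -1 if every
 * slot is still referenced.
 */
int dpb_pick_slot(const struct dpb_entry dpb[DPB_SIZE])
{
	int oldest = -1;
	int i;

	for (i = 0; i < DPB_SIZE; i++) {
		if (!dpb[i].valid)
			return i;
		if (!dpb[i].referenced &&
		    (oldest < 0 || dpb[i].frame_num < dpb[oldest].frame_num))
			oldest = i;
	}
	return oldest;
}
```

A driver keeping such a table needs no input from userspace beyond the
reference lists themselves, which is the point being debated above.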

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-03 Thread Boris Brezillon
On Mon, 03 Jun 2019 14:52:44 -0400
Nicolas Dufresne  wrote:

> > > - Dropping the DPB concept in H.264/H.265
> > > 
> > > As far as I could understand, the decoded picture buffer (DPB) is a
> > > concept that only makes sense relative to a decoder implementation. The
> > > spec mentions how to manage it with the Hypothetical reference decoder
> > > (Annex C), but that's about it.
> > > 
> > > What's really in the bitstream is the list of modified short-term and
> > > long-term references, which is enough for every decoder.
> > > 
> > > For this reason, I strongly believe we should stop talking about DPB in
> > > the controls and just pass these lists augmented with relevant
> > > information for userspace.
> > > 
> > > I think it should be up to the driver to maintain a DPB and we could
> > > have helpers for common cases. For instance, the rockchip decoder needs
> > > to keep unused entries around[2] and cedrus has the same requirement
> > > for H.264. However for cedrus/H.265, we don't need to do any book-
> > > keeping in particular and can manage with the lists from the bitstream
> > > directly.  
> > 
> > There was a bit of concern regarding this. Given that DPB maintenance is
> > purely a software construct, this doesn't really belong in the kernel. A
> > DPB will be the same no matter what hardware operates on the bitstream.
> > Depending on the hardware it may use the DPB differently (or maybe not
> > at all), but that's beside the point, really. This is pretty much the
> > same rationale as discussed above for meta data.
> > 
> > Again, VAAPI and VDPAU don't require drivers to deal with this. Instead
> > they just get the final list of reference pictures, ready to use.  
> 
> I think we need a bit of clarification from Boris, as what I read here
> is a bit contradictory (or at least I am a bit confused). When I first
> read this, I understood that this was just about renaming the dpb as
> being the references list and only require the active references to be
> there.

It's really just about renaming the field, it would contain exactly the
same data.

> 
> So what I'm not clear on is where exactly this "active reference list"
> comes from. In VAAPI it is described "per-frame".

That's my understanding as well.

> 
> >   
> > > - Using flags
> > > 
> > > The current MPEG-2 controls have lots of u8 values that can be
> > > represented as flags. Using flags also helps with padding.
> > > It's unlikely that we'll get more than 64 flags, so using a u64 by
> > > default for that sounds fine (we definitely do want to keep some room
> > > available and I don't think using 32 bits as a default is good enough).
> > > 
> > > I think H.264/HEVC per-control flags should also be moved to u64.  
> > 
> > There was also some consensus on this, that u64 should be good enough
> > for anything out there, though we obviously don't know what the future
> > will hold, so perhaps adding some way to possibly extend this in the
> > future might be good. I guess we'll get new controls for new codecs
> > anyway, so we can punt on this until then.
> >   
> > > - Clear split of controls and terminology
> > > 
> > > Some codecs have explicit NAL units that are good fits to match as
> > > controls: e.g. slice header, pps, sps. I think we should stick to the
> > > bitstream element names for those.
> > > 
> > > For H.264, that would suggest the following changes:
> > > - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
> > > - killing v4l2_ctrl_h264_decode_param and having the reference lists
> > > where they belong, which seems to be slice_header;  
> 
> But now here it's being described per slice. When I look at the slice
> header, I only see a list of modifications, and when I look at userspace,
> that list is simply built from the DPB; the modification list found in
> the slice header seems to be used only to craft the l0/l1 lists.

Yes, I think there was a misunderstanding which was then clarified
(unfortunately it happened on IRC, so we don't have a trace of this
discussion). The reference list should definitely be per-frame, and the
L0/L1 slice reflists are referring to the per-frame reference list (it's
just a sub-set of the per-frame reflist re-ordered differently).

> 
> There is one thing that came up though: if we enable per-frame decoding
> on top of a per-slice decoder (like Cedrus), won't we force userspace to
> always compute l0/l1 even though the HW might be handling that?

That's true, the question is, what's the cost of this extra re-ordering?

> Shall
> we instead pass the modification list and implement the non-parsing
> bits of applying the modifications in the kernel?

I'd be fine with that option too.


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-03 Thread Nicolas Dufresne
Le lundi 03 juin 2019 à 13:24 +0200, Thierry Reding a écrit :
> On Wed, May 15, 2019 at 12:09:45PM +0200, Paul Kocialkowski wrote:
> > Hi,
> > 
> > With the Rockchip stateless VPU driver in the works, we now have a
> > better idea of what the situation is like on platforms other than
> > Allwinner. This email shares my conclusions about the situation and how
> > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > 
> > - Per-slice decoding
> > 
> > We've discussed this one already[0] and Hans has submitted a patch[1]
> > to implement the required core bits. When we agree it looks good, we
> > should lift the restriction that all slices must be concatenated and
> > have them submitted as individual requests.
> > 
> > One question is what to do about other controls. I feel like it would
> > make sense to always pass all the required controls for decoding the
> > slice, including the ones that don't change across slices. But there
> > may be no particular advantage to this and only downsides. Not doing it
> > and relying on the "control cache" can work, but we need to specify
> > that only a single stream can be decoded per opened instance of the
> > v4l2 device. This is the assumption we're going with for handling
> > multi-slice anyway, so it shouldn't be an issue.
> > 
> > - Annex-B formats
> > 
> > I don't think we have really reached a conclusion on the pixel formats
> > we want to expose. The main issue is how to deal with codecs that need
> > the full slice NALU with start code, where the slice_header is
> > duplicated in raw bitstream, when others are fine with just the encoded
> > slice data and the parsed slice header control.
> > 
> > My initial thinking was that we'd need 3 formats:
> > - One that takes only the slice compressed data (without raw slice
> > header and start code);
> > - One that takes both the NALU data (including start code, raw header
> > and compressed data) and slice header controls;
> > - One that takes the NALU data but no slice header.
> > 
> > But I no longer think the latter really makes sense in the context of
> > stateless video decoding.
> > 
> > A side-note: I think we should definitely have data offsets in every
> > case, so that implementations can just push the whole NALU regardless
> > of the format if they're lazy.
> 
> I started an NVIDIA internal discussion about this to get some thoughts
> from our local experts and to fill in my gaps in understanding of NVIDIA
> hardware that we might want to support.
> 
> As far as input format goes, there was pretty broad consensus that in
> order for the ABI to be most broadly useful we need to settle on the
> lowest common denominator, while drawing some inspiration from existing
> APIs because they've already gone through a lot of these discussions and
> came up with standard interfaces to deal with the differences between
> decoders.

Note that we are making a statement with the stateless/stateful split.
The userspace overhead is non-negligible if you start passing all this
useless data to a stateful HW. As for other implementations, that's what
we went through in order to reach the state we are at now.

It's interesting that you had this discussion with NVIDIA specialists;
that being said, I think it would be better to provide the actual data
(how different generations of HW work) before providing conclusions
made by your team. Right now, we have deeply studied the Cedrus, Hantro
and Rockchip IPs, and that's how we managed to reach this low-overhead
compromise. What we really want to see is whether there exists NVIDIA
HW that does not fit either of the two interfaces, and why.

> 
> In more concrete terms this means that we'll want to provide as much
> data to the kernel as possible. On one hand that means that we need to
> do all of the header parsing etc. in userspace and pass it to the kernel
> to support hardware that can't parse this data by itself. At the same
> time we want to provide the full bitstream to the kernel to make sure
> that hardware that does some (or all) of the parsing itself has access
> to this. We absolutely want to avoid having to reconstruct some of the
> bitstream that userspace may not have passed in order to optimize for
> some usecases.

Passing the entire bitstream without reconstruction is near impossible
for a VDPAU or VAAPI driver. Even for FFmpeg, it makes everything much
more complex. I think at some point we need to draw a line on what this
new API should cover.

An example here, we have decided to support a new format H264_SLICE,
and this format has been defined as "slice

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-06-03 Thread Thierry Reding
On Wed, May 15, 2019 at 12:09:45PM +0200, Paul Kocialkowski wrote:
> Hi,
> 
> With the Rockchip stateless VPU driver in the works, we now have a
> better idea of what the situation is like on platforms other than
> Allwinner. This email shares my conclusions about the situation and how
> we should update the MPEG-2, H.264 and H.265 controls accordingly.
> 
> - Per-slice decoding
> 
> We've discussed this one already[0] and Hans has submitted a patch[1]
> to implement the required core bits. When we agree it looks good, we
> should lift the restriction that all slices must be concatenated and
> have them submitted as individual requests.
> 
> One question is what to do about other controls. I feel like it would
> make sense to always pass all the required controls for decoding the
> slice, including the ones that don't change across slices. But there
> may be no particular advantage to this and only downsides. Not doing it
> and relying on the "control cache" can work, but we need to specify
> that only a single stream can be decoded per opened instance of the
> v4l2 device. This is the assumption we're going with for handling
> multi-slice anyway, so it shouldn't be an issue.
> 
> - Annex-B formats
> 
> I don't think we have really reached a conclusion on the pixel formats
> we want to expose. The main issue is how to deal with codecs that need
> the full slice NALU with start code, where the slice_header is
> duplicated in raw bitstream, when others are fine with just the encoded
> slice data and the parsed slice header control.
> 
> My initial thinking was that we'd need 3 formats:
> - One that takes only the slice compressed data (without raw slice
> header and start code);
> - One that takes both the NALU data (including start code, raw header
> and compressed data) and slice header controls;
> - One that takes the NALU data but no slice header.
> 
> But I no longer think the latter really makes sense in the context of
> stateless video decoding.
> 
> A side-note: I think we should definitely have data offsets in every
> case, so that implementations can just push the whole NALU regardless
> of the format if they're lazy.

I started an NVIDIA internal discussion about this to get some thoughts
from our local experts and to fill in my gaps in understanding of NVIDIA
hardware that we might want to support.

As far as input format goes, there was pretty broad consensus that in
order for the ABI to be most broadly useful we need to settle on the
lowest common denominator, while drawing some inspiration from existing
APIs because they've already gone through a lot of these discussions and
came up with standard interfaces to deal with the differences between
decoders.

In more concrete terms this means that we'll want to provide as much
data to the kernel as possible. On one hand that means that we need to
do all of the header parsing etc. in userspace and pass it to the kernel
to support hardware that can't parse this data by itself. At the same
time we want to provide the full bitstream to the kernel to make sure
that hardware that does some (or all) of the parsing itself has access
to this. We absolutely want to avoid having to reconstruct some of the
bitstream that userspace may not have passed in order to optimize for
some usecases.

Also, all bitstream parsing should be done in userspace; we don't want
to require the kernel to have to deal with this. There's nothing in the
bitstream that would be hardware-specific, so it can all be done
perfectly fine in userspace.

As for an interface on what to pass along, most people suggested that we
pass both the raw bitstream along with a descriptor of what's contained
in that bitstream. That descriptor would contain the number of slices
contained in the bitstream chunk as well as per-slice data (such as the
offset in the bitstream chunk for that slice and the number/ID of the
slice). This is in addition to the extra meta data that we already pass
for the codecs (PPS, SPS, ...). The slice information would allow
drivers to point the hardware directly at the slice data if that's all
it needs, but it can also be used for error concealment if corrupted
slices are encountered. This would obviously require that controls can
be passed on a per-buffer basis. I'm not sure if that's possible since
the request API was already introduced to allow controls to be passed in
a more fine-grained manner than setting them globally. I'm not sure how
to pass per-buffer data in a nice way otherwise, but perhaps this is not
even a real problem?

The above is pretty close to the Mesa pipe_video infrastructure as well
as VAAPI and VDPAU, so I would expect most userspace to be able to deal
well with such an ABI.

Userspace applications would decide what the app

[PATCH v3 09/10] media: aspeed: use different delays for triggering VE H/W reset

2019-05-31 Thread Jae Hyun Yoo
If a watchdog timeout is detected while doing mode detection, it is
better to trigger a video engine hardware reset immediately, so this
commit fixes the code for that case. In all other cases, it will
trigger the video engine hardware reset after RESOLUTION_CHANGE_DELAY.

Signed-off-by: Jae Hyun Yoo 
Reviewed-by: Eddie James 
---
v2 -> v3:
 None.

v1 -> v2:
 New.

 drivers/media/platform/aspeed-video.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/aspeed-video.c b/drivers/media/platform/aspeed-video.c
index d6708ddb0391..ba093096a5a7 100644
--- a/drivers/media/platform/aspeed-video.c
+++ b/drivers/media/platform/aspeed-video.c
@@ -522,7 +522,7 @@ static void aspeed_video_bufs_done(struct aspeed_video *video,
spin_unlock_irqrestore(&video->lock, flags);
 }
 
-static void aspeed_video_irq_res_change(struct aspeed_video *video)
+static void aspeed_video_irq_res_change(struct aspeed_video *video, ulong delay)
 {
dev_dbg(video->dev, "Resolution changed; resetting\n");
 
@@ -532,7 +532,7 @@ static void aspeed_video_irq_res_change(struct aspeed_video *video)
aspeed_video_off(video);
aspeed_video_bufs_done(video, VB2_BUF_STATE_ERROR);
 
-   schedule_delayed_work(&video->res_work, RESOLUTION_CHANGE_DELAY);
+   schedule_delayed_work(&video->res_work, delay);
 }
 
 static irqreturn_t aspeed_video_irq(int irq, void *arg)
@@ -545,7 +545,7 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)
 * re-initialize
 */
if (sts & VE_INTERRUPT_MODE_DETECT_WD) {
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video, 0);
return IRQ_HANDLED;
}
 
@@ -563,7 +563,8 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)
 * Signal acquired while NOT doing resolution
 * detection; reset the engine and re-initialize
 */
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video,
+   RESOLUTION_CHANGE_DELAY);
return IRQ_HANDLED;
}
}
-- 
2.21.0



Re: [PATCH v2 09/11] media: aspeed: use different delays for triggering VE H/W reset

2019-05-29 Thread Eddie James



On 5/24/19 6:17 PM, Jae Hyun Yoo wrote:

If a watchdog timeout is detected while doing mode detection, it is
better to trigger a video engine hardware reset immediately, so this
commit fixes the code for that case. In all other cases, it will
trigger the video engine hardware reset after RESOLUTION_CHANGE_DELAY.



Reviewed-by: Eddie James 




Signed-off-by: Jae Hyun Yoo 
---
v1 -> v2:
  New.

  drivers/media/platform/aspeed-video.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/aspeed-video.c b/drivers/media/platform/aspeed-video.c
index 4647ed2e9e63..67f476bf0a03 100644
--- a/drivers/media/platform/aspeed-video.c
+++ b/drivers/media/platform/aspeed-video.c
@@ -522,7 +522,7 @@ static void aspeed_video_bufs_done(struct aspeed_video *video,
spin_unlock_irqrestore(&video->lock, flags);
  }
  
-static void aspeed_video_irq_res_change(struct aspeed_video *video)
+static void aspeed_video_irq_res_change(struct aspeed_video *video, ulong delay)
  {
dev_dbg(video->dev, "Resolution changed; resetting\n");
  
@@ -532,7 +532,7 @@ static void aspeed_video_irq_res_change(struct aspeed_video *video)

aspeed_video_off(video);
aspeed_video_bufs_done(video, VB2_BUF_STATE_ERROR);
  
-	schedule_delayed_work(&video->res_work, RESOLUTION_CHANGE_DELAY);
+   schedule_delayed_work(&video->res_work, delay);
  }
  
  static irqreturn_t aspeed_video_irq(int irq, void *arg)

@@ -545,7 +545,7 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)
 * re-initialize
 */
if (sts & VE_INTERRUPT_MODE_DETECT_WD) {
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video, 0);
return IRQ_HANDLED;
}
  
@@ -563,7 +563,8 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)

 * Signal acquired while NOT doing resolution
 * detection; reset the engine and re-initialize
 */
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video,
+   RESOLUTION_CHANGE_DELAY);
return IRQ_HANDLED;
}
}




Re: [GIT PULL FOR v5.3] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-29 Thread Hans Verkuil
On 5/29/19 1:29 PM, Hans Verkuil wrote:
> On 5/29/19 1:07 PM, Mauro Carvalho Chehab wrote:
>> Em Tue, 28 May 2019 20:10:27 +0200
>> Hans Verkuil  escreveu:
>>
>>> The following changes since commit 9bec226d8c79fcbc95817b082557f72a79d182f5:
>>>
>>>   media: v4l2-pci-skeleton.c: fix doc warning (2019-05-28 13:14:28 -0400)
>>>
>>> are available in the Git repository at:
>>>
>>>   git://linuxtv.org/hverkuil/media_tree.git tags/br-v5.3f
>>>
>>> for you to fetch changes up to 75e52767fa3d58a783dd6862a7fb686e5de67fc6:
>>>
>>>   allegro: add SPS/PPS nal unit writer (2019-05-28 20:00:26 +0200)
>>>
>>> 
>>> Tag branch
>>>
>>> 
>>> Hans Verkuil (1):
>>>   videobuf2-v4l2: set last_buffer_dequeued in dqbuf
>>>
>>> Michael Tretter (4):
>>>   media: dt-bindings: media: document allegro-dvt bindings
>>>   media: dt-bindings: media: Add vendor prefix for allegro
>>>   allegro: add Allegro DVT video IP core driver
>>>   allegro: add SPS/PPS nal unit writer
>>
>> As this is staging, merged, but there's something wrong with DT here:
>>
>> WARNING: DT compatible string vendor "allegro" appears un-documented -- check ./Documentation/devicetree/bindings/vendor-prefixes.yaml
>> #3013: FILE: drivers/staging/media/allegro-dvt/allegro-core.c:3013:
>> +{ .compatible = "allegro,al5e-1.1" },
>>
>> Please send a followup patch addressing it.
> 
> Huh? Something went wrong: this is the patch that is in my for-v5.3f branch:
> 
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=for-v5.3f&id=ae4e36dd1945380ccd97090d2099f67be9a976d8
> 
> That's not what you tried to merge.

Never mind.

The cause is that the change to vendor-prefixes.yaml was done after the change
to allegro.txt. Those two patches should have been swapped.

So nothing is wrong.

Regards,

Hans


Re: [GIT PULL FOR v5.3] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-29 Thread Hans Verkuil
On 5/29/19 1:07 PM, Mauro Carvalho Chehab wrote:
> Em Tue, 28 May 2019 20:10:27 +0200
> Hans Verkuil  escreveu:
> 
>> The following changes since commit 9bec226d8c79fcbc95817b082557f72a79d182f5:
>>
>>   media: v4l2-pci-skeleton.c: fix doc warning (2019-05-28 13:14:28 -0400)
>>
>> are available in the Git repository at:
>>
>>   git://linuxtv.org/hverkuil/media_tree.git tags/br-v5.3f
>>
>> for you to fetch changes up to 75e52767fa3d58a783dd6862a7fb686e5de67fc6:
>>
>>   allegro: add SPS/PPS nal unit writer (2019-05-28 20:00:26 +0200)
>>
>> 
>> Tag branch
>>
>> 
>> Hans Verkuil (1):
>>   videobuf2-v4l2: set last_buffer_dequeued in dqbuf
>>
>> Michael Tretter (4):
>>   media: dt-bindings: media: document allegro-dvt bindings
>>   media: dt-bindings: media: Add vendor prefix for allegro
>>   allegro: add Allegro DVT video IP core driver
>>   allegro: add SPS/PPS nal unit writer
> 
> As this is staging, merged, but there's something wrong with DT here:
> 
> WARNING: DT compatible string vendor "allegro" appears un-documented -- check ./Documentation/devicetree/bindings/vendor-prefixes.yaml
> #3013: FILE: drivers/staging/media/allegro-dvt/allegro-core.c:3013:
> + { .compatible = "allegro,al5e-1.1" },
> 
> Please send a followup patch addressing it.

Huh? Something went wrong: this is the patch that is in my for-v5.3f branch:

https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=for-v5.3f&id=ae4e36dd1945380ccd97090d2099f67be9a976d8

That's not what you tried to merge.

Regards,

Hans


Re: [GIT PULL FOR v5.3] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-29 Thread Mauro Carvalho Chehab
Em Tue, 28 May 2019 20:10:27 +0200
Hans Verkuil  escreveu:

> The following changes since commit 9bec226d8c79fcbc95817b082557f72a79d182f5:
> 
>   media: v4l2-pci-skeleton.c: fix doc warning (2019-05-28 13:14:28 -0400)
> 
> are available in the Git repository at:
> 
>   git://linuxtv.org/hverkuil/media_tree.git tags/br-v5.3f
> 
> for you to fetch changes up to 75e52767fa3d58a783dd6862a7fb686e5de67fc6:
> 
>   allegro: add SPS/PPS nal unit writer (2019-05-28 20:00:26 +0200)
> 
> 
> Tag branch
> 
> 
> Hans Verkuil (1):
>   videobuf2-v4l2: set last_buffer_dequeued in dqbuf
> 
> Michael Tretter (4):
>   media: dt-bindings: media: document allegro-dvt bindings
>   media: dt-bindings: media: Add vendor prefix for allegro
>   allegro: add Allegro DVT video IP core driver
>   allegro: add SPS/PPS nal unit writer

As this is staging, merged, but there's something wrong with DT here:

WARNING: DT compatible string vendor "allegro" appears un-documented -- check ./Documentation/devicetree/bindings/vendor-prefixes.yaml
#3013: FILE: drivers/staging/media/allegro-dvt/allegro-core.c:3013:
+   { .compatible = "allegro,al5e-1.1" },

Please send a followup patch addressing it.

Thanks,
Mauro


[GIT PULL FOR v5.3] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Hans Verkuil
The following changes since commit 9bec226d8c79fcbc95817b082557f72a79d182f5:

  media: v4l2-pci-skeleton.c: fix doc warning (2019-05-28 13:14:28 -0400)

are available in the Git repository at:

  git://linuxtv.org/hverkuil/media_tree.git tags/br-v5.3f

for you to fetch changes up to 75e52767fa3d58a783dd6862a7fb686e5de67fc6:

  allegro: add SPS/PPS nal unit writer (2019-05-28 20:00:26 +0200)


Tag branch


Hans Verkuil (1):
  videobuf2-v4l2: set last_buffer_dequeued in dqbuf

Michael Tretter (4):
  media: dt-bindings: media: document allegro-dvt bindings
  media: dt-bindings: media: Add vendor prefix for allegro
  allegro: add Allegro DVT video IP core driver
  allegro: add SPS/PPS nal unit writer

 Documentation/devicetree/bindings/media/allegro.txt|   43 +
 Documentation/devicetree/bindings/vendor-prefixes.yaml |2 +
 MAINTAINERS|7 +
 drivers/media/common/videobuf2/videobuf2-v4l2.c|   10 +-
 drivers/staging/media/Kconfig  |2 +
 drivers/staging/media/Makefile |1 +
 drivers/staging/media/allegro-dvt/Kconfig  |   16 +
 drivers/staging/media/allegro-dvt/Makefile |6 +
 drivers/staging/media/allegro-dvt/TODO |4 +
 drivers/staging/media/allegro-dvt/allegro-core.c   | 3032 +++
 drivers/staging/media/allegro-dvt/nal-h264.c   | 1001 +
 drivers/staging/media/allegro-dvt/nal-h264.h   |  208 +
 12 files changed, 4327 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/media/allegro.txt
 create mode 100644 drivers/staging/media/allegro-dvt/Kconfig
 create mode 100644 drivers/staging/media/allegro-dvt/Makefile
 create mode 100644 drivers/staging/media/allegro-dvt/TODO
 create mode 100644 drivers/staging/media/allegro-dvt/allegro-core.c
 create mode 100644 drivers/staging/media/allegro-dvt/nal-h264.c
 create mode 100644 drivers/staging/media/allegro-dvt/nal-h264.h


[PATCH v8 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Michael Tretter
This is v8 of the Allegro DVT H.264 encoder driver found in the EV
family of the Xilinx ZynqMP platform.

v7 caused the following smatch warnings. I fixed these warnings and added
smatch to my list of tools to run before submitting patches.

drivers/staging/media/allegro-dvt/allegro-core.c:1849:36: warning: constant 0x is so big it is unsigned long
drivers/staging/media/allegro-dvt/nal-h264.c:751: warning: Function parameter or member 'dev' not described in 'nal_h264_write_sps'
drivers/staging/media/allegro-dvt/nal-h264.c:792: warning: Function parameter or member 'dev' not described in 'nal_h264_read_sps'
drivers/staging/media/allegro-dvt/nal-h264.c:842: warning: Function parameter or member 'dev' not described in 'nal_h264_write_pps'
drivers/staging/media/allegro-dvt/nal-h264.c:884: warning: Function parameter or member 'dev' not described in 'nal_h264_read_pps'
drivers/staging/media/allegro-dvt/nal-h264.c:926: warning: Function parameter or member 'dev' not described in 'nal_h264_write_filler'
drivers/staging/media/allegro-dvt/nal-h264.c:969: warning: Function parameter or member 'dev' not described in 'nal_h264_read_filler'

This is the v4l2-compliance test result for v8:

v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits

Compliance test for allegro device /dev/video2:

Driver Info:
Driver name  : allegro
Card type: Allegro DVT Video Encoder
Bus info : platform:a0009000.video-codec
Driver version   : 5.2.0
Capabilities : 0x84208000
Video Memory-to-Memory
Streaming
Extended Pix Format
Device Capabilities
Device Caps  : 0x04208000
Video Memory-to-Memory
Streaming
Extended Pix Format
Detected Stateful Encoder

Required ioctls:
test VIDIOC_QUERYCAP: OK

Allow for multiple opens:
test second /dev/video2 open: OK
test VIDIOC_QUERYCAP: OK
test VIDIOC_G/S_PRIORITY: OK
test for unlimited opens: OK

Debug ioctls:
test VIDIOC_DBG_G/S_REGISTER: OK
test VIDIOC_LOG_STATUS: OK (Not Supported)

Input ioctls:
test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
test VIDIOC_ENUMAUDIO: OK (Not Supported)
test VIDIOC_G/S/ENUMINPUT: OK (Not Supported)
test VIDIOC_G/S_AUDIO: OK (Not Supported)
Inputs: 0 Audio Inputs: 0 Tuners: 0

Output ioctls:
test VIDIOC_G/S_MODULATOR: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_ENUMAUDOUT: OK (Not Supported)
test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
test VIDIOC_G/S_AUDOUT: OK (Not Supported)
Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
test VIDIOC_G/S_EDID: OK (Not Supported)

Control ioctls:
test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK
test VIDIOC_QUERYCTRL: OK
test VIDIOC_G/S_CTRL: OK
test VIDIOC_G/S/TRY_EXT_CTRLS: OK
test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK
test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
Standard Controls: 10 Private Controls: 0

Format ioctls:
test VIDIOC_ENUM_FMT/FRAMESIZES/FRAMEINTERVALS: OK
test VIDIOC_G/S_PARM: OK (Not Supported)
test VIDIOC_G_FBUF: OK (Not Supported)
test VIDIOC_G_FMT: OK
test VIDIOC_TRY_FMT: OK
test VIDIOC_S_FMT: OK
test VIDIOC_G_SLICED_VBI_CAP: OK (Not Supported)
test Cropping: OK (Not Supported)
test Composing: OK (Not Supported)
test Scaling: OK

Codec ioctls:
test VIDIOC_(TRY_)ENCODER_CMD: OK
test VIDIOC_G_ENC_INDEX: OK (Not Supported)
test VIDIOC_(TRY_)DECODER_CMD: OK (Not Supported)

Buffer ioctls:
test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
test VIDIOC_EXPBUF: OK
test Requests: OK (Not Supported)

Test input 0:

Streaming ioctls:
test read/write: OK (Not Supported)
test blocking wait: OK
Video Capture: Captured 60 buffers
test MMAP (select): OK
Video Capture: Captured 60 buffers
test MMAP (epoll): OK
test USERPTR (select): OK (Not Supported)
test DMABUF: Cannot test, specify --expbuf-device

Total for allegro device /dev/video2: 49, Succeeded: 49, Failed: 0, Warnings: 0

Michael

v7 -> v8:
- fix smatch warning about type of 0x
- fix smatch warning about missing document

[PATCH v6 07/16] rockchip/vpu: Rename rockchip_vpu_common.h into rockchip_vpu_v4l2.h

2019-05-28 Thread Ezequiel Garcia
From: Boris Brezillon 

We're about to add prototypes for the vb2/v4l2 helpers shared by the
encoder/decoder logic in this file, so let's pick a name that reflects
that (rockchip_vpu_common.h was a bit too generic).

Suggested-by: Ezequiel Garcia 
Signed-off-by: Boris Brezillon 
---
Changes from v3:
* None

Changes from v2:
* New patch

 drivers/staging/media/rockchip/vpu/rk3288_vpu_hw_jpeg_enc.c | 2 +-
 drivers/staging/media/rockchip/vpu/rk3399_vpu_hw_jpeg_enc.c | 2 +-
 drivers/staging/media/rockchip/vpu/rockchip_vpu_drv.c   | 2 +-
 drivers/staging/media/rockchip/vpu/rockchip_vpu_enc.c   | 2 +-
 .../vpu/{rockchip_vpu_common.h => rockchip_vpu_v4l2.h}  | 6 +++---
 5 files changed, 7 insertions(+), 7 deletions(-)
 rename drivers/staging/media/rockchip/vpu/{rockchip_vpu_common.h => rockchip_vpu_v4l2.h} (88%)

diff --git a/drivers/staging/media/rockchip/vpu/rk3288_vpu_hw_jpeg_enc.c b/drivers/staging/media/rockchip/vpu/rk3288_vpu_hw_jpeg_enc.c
index 791353ae01e7..68176e91330a 100644
--- a/drivers/staging/media/rockchip/vpu/rk3288_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/rockchip/vpu/rk3288_vpu_hw_jpeg_enc.c
@@ -9,7 +9,7 @@
 #include 
 #include "rockchip_vpu_jpeg.h"
 #include "rockchip_vpu.h"
-#include "rockchip_vpu_common.h"
+#include "rockchip_vpu_v4l2.h"
 #include "rockchip_vpu_hw.h"
 #include "rk3288_vpu_regs.h"
 
diff --git a/drivers/staging/media/rockchip/vpu/rk3399_vpu_hw_jpeg_enc.c b/drivers/staging/media/rockchip/vpu/rk3399_vpu_hw_jpeg_enc.c
index 74823d25cd8d..460edc5ebe4d 100644
--- a/drivers/staging/media/rockchip/vpu/rk3399_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/rockchip/vpu/rk3399_vpu_hw_jpeg_enc.c
@@ -27,7 +27,7 @@
 #include 
 #include "rockchip_vpu_jpeg.h"
 #include "rockchip_vpu.h"
-#include "rockchip_vpu_common.h"
+#include "rockchip_vpu_v4l2.h"
 #include "rockchip_vpu_hw.h"
 #include "rk3399_vpu_regs.h"
 
diff --git a/drivers/staging/media/rockchip/vpu/rockchip_vpu_drv.c b/drivers/staging/media/rockchip/vpu/rockchip_vpu_drv.c
index f47fbd0f9545..59b72245fb07 100644
--- a/drivers/staging/media/rockchip/vpu/rockchip_vpu_drv.c
+++ b/drivers/staging/media/rockchip/vpu/rockchip_vpu_drv.c
@@ -24,7 +24,7 @@
 #include 
 #include 
 
-#include "rockchip_vpu_common.h"
+#include "rockchip_vpu_v4l2.h"
 #include "rockchip_vpu.h"
 #include "rockchip_vpu_hw.h"
 
diff --git a/drivers/staging/media/rockchip/vpu/rockchip_vpu_enc.c b/drivers/staging/media/rockchip/vpu/rockchip_vpu_enc.c
index 4512e94c3f32..d2b4225516b5 100644
--- a/drivers/staging/media/rockchip/vpu/rockchip_vpu_enc.c
+++ b/drivers/staging/media/rockchip/vpu/rockchip_vpu_enc.c
@@ -28,7 +28,7 @@
 
 #include "rockchip_vpu.h"
 #include "rockchip_vpu_hw.h"
-#include "rockchip_vpu_common.h"
+#include "rockchip_vpu_v4l2.h"
 
 static const struct rockchip_vpu_fmt *
 rockchip_vpu_find_format(struct rockchip_vpu_ctx *ctx, u32 fourcc)
diff --git a/drivers/staging/media/rockchip/vpu/rockchip_vpu_common.h b/drivers/staging/media/rockchip/vpu/rockchip_vpu_v4l2.h
similarity index 88%
rename from drivers/staging/media/rockchip/vpu/rockchip_vpu_common.h
rename to drivers/staging/media/rockchip/vpu/rockchip_vpu_v4l2.h
index ca77668d9579..50ad40dfb4f4 100644
--- a/drivers/staging/media/rockchip/vpu/rockchip_vpu_common.h
+++ b/drivers/staging/media/rockchip/vpu/rockchip_vpu_v4l2.h
@@ -13,8 +13,8 @@
  * Copyright (C) 2011 Samsung Electronics Co., Ltd.
  */
 
-#ifndef ROCKCHIP_VPU_COMMON_H_
-#define ROCKCHIP_VPU_COMMON_H_
+#ifndef ROCKCHIP_VPU_V4L2_H_
+#define ROCKCHIP_VPU_V4L2_H_
 
 #include "rockchip_vpu.h"
 
@@ -26,4 +26,4 @@ void rockchip_vpu_enc_reset_src_fmt(struct rockchip_vpu_dev *vpu,
 void rockchip_vpu_enc_reset_dst_fmt(struct rockchip_vpu_dev *vpu,
struct rockchip_vpu_ctx *ctx);
 
-#endif /* ROCKCHIP_VPU_COMMON_H_ */
+#endif /* ROCKCHIP_VPU_V4L2_H_ */
-- 
2.20.1



Re: [PATCH v7 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Hans Verkuil
On 5/28/19 5:00 PM, Michael Tretter wrote:
> On Tue, 28 May 2019 15:54:58 +0200, Hans Verkuil wrote:
>> Hi Michael,
>>
>> On 5/28/19 3:09 PM, Michael Tretter wrote:
>>> This is v7 of the Allegro DVT H.264 encoder driver found in the EV
>>> family of the Xilinx ZynqMP platform.
>>>
>>> I moved the driver back to staging, because the v4l2 stateful encoder spec 
>>> is
>>> not finished, yet. Once the spec is finished, this driver shall be tested
>>> against the final v4l2-compliance and moved to mainline again.
>>>
>>> Further, I converted the allegro vendor prefix to the new json format in
>>> vendor-prefixes.yaml.
>>>
>>> The observed occasional failures in v4l2-compliance in v6 of this series
>>> turned out to be caused by a race condition with v4l2_m2m_poll(). I will 
>>> send
>>> patches to fix this issue as a separate series.  
>>
>> I'm getting these smatch warnings:
>>
>> drivers/staging/media/allegro-dvt/allegro-core.c:1849:36: warning: constant 
>> 0x is so big it is unsigned long
> 
> The constant is used to calculate an offset, which is used by the
> hardware as offset for addresses in mailbox messages. The hardware
> expects a 64 bit value, but the driver calculates the value using a
> dma_addr_t, which is fine for 64 bit systems (e.g. ZynqMP), but is a
> problem on 32 bit systems.
> 
> I am currently working on improving the handling of frame addresses and
> make it fit for using the PL-RAM (in the FPGA) instead of the normal
> system RAM (PS-RAM). I would fix the warning with that patch set, if
> it is OK.

Sorry, no. I don't want new drivers creating new warnings. It's OK to
do a quick workaround and fix it properly later, though.

Regards,

Hans

> 
>> drivers/staging/media/allegro-dvt/nal-h264.c:751: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_write_sps'
>> drivers/staging/media/allegro-dvt/nal-h264.c:792: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_read_sps'
>> drivers/staging/media/allegro-dvt/nal-h264.c:842: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_write_pps'
>> drivers/staging/media/allegro-dvt/nal-h264.c:884: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_read_pps'
>> drivers/staging/media/allegro-dvt/nal-h264.c:926: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_write_filler'
>> drivers/staging/media/allegro-dvt/nal-h264.c:969: warning: Function 
>> parameter or member 'dev' not described in 'nal_h264_read_filler'
> 
> I didn't describe the "struct device *dev" parameter, because it really
> doesn't add any value.
> 
> Michael
> 
>>
>> Can you take a look? The nal-h264.c warnings look trivial to fix, the
>> allegro-core.c warnings look more interesting.
>>
>> Regards,
>>
>>  Hans
>>



Re: [PATCH v7 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Michael Tretter
On Tue, 28 May 2019 15:54:58 +0200, Hans Verkuil wrote:
> Hi Michael,
> 
> On 5/28/19 3:09 PM, Michael Tretter wrote:
> > This is v7 of the Allegro DVT H.264 encoder driver found in the EV
> > family of the Xilinx ZynqMP platform.
> > 
> > I moved the driver back to staging, because the v4l2 stateful encoder spec 
> > is
> > not finished, yet. Once the spec is finished, this driver shall be tested
> > against the final v4l2-compliance and moved to mainline again.
> > 
> > Further, I converted the allegro vendor prefix to the new json format in
> > vendor-prefixes.yaml.
> > 
> > The observed occasional failures in v4l2-compliance in v6 of this series
> > turned out to be caused by a race condition with v4l2_m2m_poll(). I will 
> > send
> > patches to fix this issue as a separate series.  
> 
> I'm getting these smatch warnings:
> 
> drivers/staging/media/allegro-dvt/allegro-core.c:1849:36: warning: constant 
> 0x is so big it is unsigned long

The constant is used to calculate an offset, which is used by the
hardware as offset for addresses in mailbox messages. The hardware
expects a 64 bit value, but the driver calculates the value using a
dma_addr_t, which is fine for 64 bit systems (e.g. ZynqMP), but is a
problem on 32 bit systems.

I am currently working on improving the handling of frame addresses and
make it fit for using the PL-RAM (in the FPGA) instead of the normal
system RAM (PS-RAM). I would fix the warning with that patch set, if
it is OK.

> drivers/staging/media/allegro-dvt/nal-h264.c:751: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_write_sps'
> drivers/staging/media/allegro-dvt/nal-h264.c:792: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_read_sps'
> drivers/staging/media/allegro-dvt/nal-h264.c:842: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_write_pps'
> drivers/staging/media/allegro-dvt/nal-h264.c:884: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_read_pps'
> drivers/staging/media/allegro-dvt/nal-h264.c:926: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_write_filler'
> drivers/staging/media/allegro-dvt/nal-h264.c:969: warning: Function parameter 
> or member 'dev' not described in 'nal_h264_read_filler'

I didn't describe the "struct device *dev" parameter, because it really
doesn't add any value.

Michael

> 
> Can you take a look? The nal-h264.c warnings look trivial to fix, the
> allegro-core.c warnings look more interesting.
> 
> Regards,
> 
>   Hans
> 


Re: [PATCH v7 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Hans Verkuil
Hi Michael,

On 5/28/19 3:09 PM, Michael Tretter wrote:
> This is v7 of the Allegro DVT H.264 encoder driver found in the EV
> family of the Xilinx ZynqMP platform.
> 
> I moved the driver back to staging, because the v4l2 stateful encoder spec is
> not finished, yet. Once the spec is finished, this driver shall be tested
> against the final v4l2-compliance and moved to mainline again.
> 
> Further, I converted the allegro vendor prefix to the new json format in
> vendor-prefixes.yaml.
> 
> The observed occasional failures in v4l2-compliance in v6 of this series
> turned out to be caused by a race condition with v4l2_m2m_poll(). I will send
> patches to fix this issue as a separate series.

I'm getting these smatch warnings:

drivers/staging/media/allegro-dvt/allegro-core.c:1849:36: warning: constant 
0x is so big it is unsigned long
drivers/staging/media/allegro-dvt/nal-h264.c:751: warning: Function parameter 
or member 'dev' not described in 'nal_h264_write_sps'
drivers/staging/media/allegro-dvt/nal-h264.c:792: warning: Function parameter 
or member 'dev' not described in 'nal_h264_read_sps'
drivers/staging/media/allegro-dvt/nal-h264.c:842: warning: Function parameter 
or member 'dev' not described in 'nal_h264_write_pps'
drivers/staging/media/allegro-dvt/nal-h264.c:884: warning: Function parameter 
or member 'dev' not described in 'nal_h264_read_pps'
drivers/staging/media/allegro-dvt/nal-h264.c:926: warning: Function parameter 
or member 'dev' not described in 'nal_h264_write_filler'
drivers/staging/media/allegro-dvt/nal-h264.c:969: warning: Function parameter 
or member 'dev' not described in 'nal_h264_read_filler'

Can you take a look? The nal-h264.c warnings look trivial to fix, the
allegro-core.c warnings look more interesting.

Regards,

Hans


[PATCH v7 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-28 Thread Michael Tretter
This is v7 of the Allegro DVT H.264 encoder driver found in the EV
family of the Xilinx ZynqMP platform.

I moved the driver back to staging, because the v4l2 stateful encoder spec is
not finished, yet. Once the spec is finished, this driver shall be tested
against the final v4l2-compliance and moved to mainline again.

Further, I converted the allegro vendor prefix to the new json format in
vendor-prefixes.yaml.

The observed occasional failures in v4l2-compliance in v6 of this series
turned out to be caused by a race condition with v4l2_m2m_poll(). I will send
patches to fix this issue as a separate series.

This is the v4l2-compliance test result using the vicodec branch:

v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits

Compliance test for allegro device /dev/video2:

Driver Info:
Driver name  : allegro
Card type: Allegro DVT Video Encoder
Bus info : platform:a0009000.video-codec
Driver version   : 5.2.0
Capabilities : 0x84208000
Video Memory-to-Memory
Streaming
Extended Pix Format
Device Capabilities
Device Caps  : 0x04208000
Video Memory-to-Memory
Streaming
Extended Pix Format
Detected Stateful Encoder

Required ioctls:
test VIDIOC_QUERYCAP: OK

Allow for multiple opens:
test second /dev/video2 open: OK
test VIDIOC_QUERYCAP: OK
test VIDIOC_G/S_PRIORITY: OK
test for unlimited opens: OK

Debug ioctls:
test VIDIOC_DBG_G/S_REGISTER: OK
test VIDIOC_LOG_STATUS: OK (Not Supported)

Input ioctls:
test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
test VIDIOC_ENUMAUDIO: OK (Not Supported)
test VIDIOC_G/S/ENUMINPUT: OK (Not Supported)
test VIDIOC_G/S_AUDIO: OK (Not Supported)
Inputs: 0 Audio Inputs: 0 Tuners: 0

Output ioctls:
test VIDIOC_G/S_MODULATOR: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_ENUMAUDOUT: OK (Not Supported)
test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
test VIDIOC_G/S_AUDOUT: OK (Not Supported)
Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
test VIDIOC_G/S_EDID: OK (Not Supported)

Control ioctls:
test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK
test VIDIOC_QUERYCTRL: OK
test VIDIOC_G/S_CTRL: OK
test VIDIOC_G/S/TRY_EXT_CTRLS: OK
test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK
test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
Standard Controls: 10 Private Controls: 0

Format ioctls:
test VIDIOC_ENUM_FMT/FRAMESIZES/FRAMEINTERVALS: OK
test VIDIOC_G/S_PARM: OK (Not Supported)
test VIDIOC_G_FBUF: OK (Not Supported)
test VIDIOC_G_FMT: OK
test VIDIOC_TRY_FMT: OK
test VIDIOC_S_FMT: OK
test VIDIOC_G_SLICED_VBI_CAP: OK (Not Supported)
test Cropping: OK (Not Supported)
test Composing: OK (Not Supported)
test Scaling: OK

Codec ioctls:
test VIDIOC_(TRY_)ENCODER_CMD: OK
test VIDIOC_G_ENC_INDEX: OK (Not Supported)
test VIDIOC_(TRY_)DECODER_CMD: OK (Not Supported)

Buffer ioctls:
test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
test VIDIOC_EXPBUF: OK
test Requests: OK (Not Supported)

Test input 0:

Streaming ioctls:
test read/write: OK (Not Supported)
test blocking wait: OK
Video Capture: Captured 60 buffers
test MMAP (select): OK
Video Capture: Captured 60 buffers
test MMAP (epoll): OK
test USERPTR (select): OK (Not Supported)
test DMABUF: Cannot test, specify --expbuf-device

Total for allegro device /dev/video2: 49, Succeeded: 49, Failed: 0, Warnings: 0

Michael

v6 -> v7:
- move driver back into staging
- convert to json format for vendor-prefixes.yaml
- remove unused allegro_state_get_name()

v5 -> v6:
- drop selection api and document visual size
- drop references to the video decoder
- fix sparse warnings regarding non-static functions
- fix return type of rbsp_read_bit

v4 -> v5:
- add patch for allegro vendor prefix
- move driver out of staging
- implement draining with CMD_STOP and CMD_START
- rewrite NAL unit RBSP generator

v3 -> v4:
- fix checkpatch and compiler warnings
- use v4l2_m2m_buf_copy_metadata to copy buffer metadata
- resolve FIXME regarding channel creation and streamon
- resolve various TODOs
- add mailbox format to firmware info
- add suballocator_size to firmware info
- use struct_size

Re: [PATCH v6 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-27 Thread Hans Verkuil
On 5/27/19 3:45 PM, Michael Tretter wrote:
> On Wed, 22 May 2019 15:49:45 +0200, Michael Tretter wrote:
>> On Wed, 22 May 2019 14:04:23 +0200, Hans Verkuil wrote:
>>> On 5/13/19 7:21 PM, Michael Tretter wrote:  
>>>> This is v6 of the Allegro DVT H.264 encoder driver found in the EV
>>>> family of the Xilinx ZynqMP platform.
>>>>
>>>> Only minor changes this time. I dropped the implementation of the
>>>> selection api, removed all references mentioning the decoder, and fixed
>>>> a few issues reported by sparse and smatch.
>>>>
>>>> The v4l2-compliance result using the current vicodec branch is
>>>>
>>>> v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits
>>>>
>>>> Compliance test for allegro device /dev/video4:  
>> [...]
>>>> I observed that the "MMAP (select)" test occasionally fails, because the
>>>> test did not receive an V4L2_EVENT_EOS when dequeuing a buffer with
>>>> V4L2_BUF_FLAG_LAST being set. The driver always queues the event before
>>>> returning the last buffer and the "MMAP (epoll)" does not fail. Thus, I
>>>> decided to send the series anyway.
>>>
>>> Where exactly does v4l2-compliance fail? This is weird, and I believe
>>> this warrants a bit more debugging. I recommend adding a debug
>>> statement in allegro_channel_buf_done() to see when a buffer is marked
>>> LAST.  
>>
>> v4l2-compliance fails in line 1074
>>
>>  fail: v4l2-test-buffers.cpp(1074): !got_eos && !got_source_change
>>
>> The corresponding code in v4l2-compliance is
>>
>>  if (buf.g_flags() & V4L2_BUF_FLAG_LAST) {
>>  fail_on_test(buf.dqbuf(node) != EPIPE);
>>> fail_on_test(!got_eos && !got_source_change);  
>>  if (!count)
>>  break;
>>  fail_on_test(node->streamoff(m2m_q.g_type()));
>>  m2m_q.munmap_bufs(node);
>>
>> When the test fails, the select/epoll_wait returns with readable data,
>> but without readable events on the last buffer. If the test is
>> successful, data and events are available. This looks like a race
>> between the event and the LAST buffer and if the LAST buffer comes
>> first, the test fails.
>>
>> As said, the driver always queues the EOS event before calling
>> v4l2_m2m_buf_done() on the LAST buffer. Right now, I don't understand
>> how this can happen, but I will continue debugging.
> 
> There is a race between v4l2_m2m_poll() and allegro_channel_finish_frame().
> 
> v4l2_m2m_poll() first calls v4l2_event_pending() to check if events are
> available and afterwards checks if there are buffers on src_q and
> dst_q. If allegro_channel_finish_frame() queues the V4L2_EVENT_EOS
> after v4l2_event_pending() but before the checks on the queues,
> v4l2_m2m_poll() sets EPOLLIN | EPOLLRDNORM for the LAST buffer, but does
> not set EPOLLPRI, because it missed V4L2_EVENT_EOS.
> 
> As a fix, the driver must hold the m2m_ctx->q_lock mutex while calling
> v4l2_event_queue_fh() for V4L2_EVENT_EOS to ensure that the event is
> not queued during v4l2_m2m_poll() after the v4l2_event_pending() check.

Nice analysis!

I think this can be fixed fairly simply: just call v4l2_event_pending as
the last thing in v4l2_m2m_poll() and in vb2_poll().

That will ensure that no events are missed by poll.

> 
> I'm not completely sure, but it seems to me that other v4l2 mem2mem
> drivers have the same issue.

Most likely, yes.

The good news is that this is not a driver bug, so I'll make a pull request
for this series.

It would be great if you can make two patches (one for vb2_poll, one for
v4l2_m2m_poll) that changes this behavior.

You can test it with your driver to verify that this indeed fixes the problem.

Regards,

Hans

> 
> Michael
> 
>>
>>>
>>> These tests really should not fail, and it is a strong indication of a
>>> bug somewhere.
>>>
>>> I don't want to merge a driver that has a FAIL in v4l2-compliance without
>>> at the very least understanding why that happens. Ignoring it defeats the
>>> purpose of v4l2-compliance.  
>>
>> Totally agreed.
>>
>> Michael
>>
>>>
>>> Regards,
>>>
>>> Hans
>>>   
>>



Re: [PATCH v6 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-27 Thread Michael Tretter
On Wed, 22 May 2019 15:49:45 +0200, Michael Tretter wrote:
> On Wed, 22 May 2019 14:04:23 +0200, Hans Verkuil wrote:
> > On 5/13/19 7:21 PM, Michael Tretter wrote:  
> > > This is v6 of the Allegro DVT H.264 encoder driver found in the EV
> > > family of the Xilinx ZynqMP platform.
> > > 
> > > Only minor changes this time. I dropped the implementation of the
> > > selection api, removed all references mentioning the decoder, and fixed
> > > a few issues reported by sparse and smatch.
> > > 
> > > The v4l2-compliance result using the current vicodec branch is
> > > 
> > > v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits
> > > 
> > > Compliance test for allegro device /dev/video4:  
> [...]
> > > I observed that the "MMAP (select)" test occasionally fails, because the
> > > test did not receive an V4L2_EVENT_EOS when dequeuing a buffer with
> > > V4L2_BUF_FLAG_LAST being set. The driver always queues the event before
> > > returning the last buffer and the "MMAP (epoll)" does not fail. Thus, I
> > > decided to send the series anyway.
> > 
> > Where exactly does v4l2-compliance fail? This is weird, and I believe
> > this warrants a bit more debugging. I recommend adding a debug
> > statement in allegro_channel_buf_done() to see when a buffer is marked
> > LAST.  
> 
> v4l2-compliance fails in line 1074
> 
>   fail: v4l2-test-buffers.cpp(1074): !got_eos && !got_source_change
> 
> The corresponding code in v4l2-compliance is
> 
>   if (buf.g_flags() & V4L2_BUF_FLAG_LAST) {
>   fail_on_test(buf.dqbuf(node) != EPIPE);
> > fail_on_test(!got_eos && !got_source_change);  
>   if (!count)
>   break;
>   fail_on_test(node->streamoff(m2m_q.g_type()));
>   m2m_q.munmap_bufs(node);
> 
> When the test fails, the select/epoll_wait returns with readable data,
> but without readable events on the last buffer. If the test is
> successful, data and events are available. This looks like a race
> between the event and the LAST buffer and if the LAST buffer comes
> first, the test fails.
> 
> As said, the driver always queues the EOS event before calling
> v4l2_m2m_buf_done() on the LAST buffer. Right now, I don't understand
> how this can happen, but I will continue debugging.

There is a race between v4l2_m2m_poll() and allegro_channel_finish_frame().

v4l2_m2m_poll() first calls v4l2_event_pending() to check if events are
available and afterwards checks if there are buffers on src_q and
dst_q. If allegro_channel_finish_frame() queues the V4L2_EVENT_EOS
after v4l2_event_pending() but before the checks on the queues,
v4l2_m2m_poll() sets EPOLLIN | EPOLLRDNORM for the LAST buffer, but does
not set EPOLLPRI, because it missed V4L2_EVENT_EOS.

As a fix, the driver must hold the m2m_ctx->q_lock mutex while calling
v4l2_event_queue_fh() for V4L2_EVENT_EOS to ensure that the event is
not queued during v4l2_m2m_poll() after the v4l2_event_pending() check.

I'm not completely sure, but it seems to me that other v4l2 mem2mem
drivers have the same issue.

Michael

> 
> > 
> > These tests really should not fail, and it is a strong indication of a
> > bug somewhere.
> > 
> > I don't want to merge a driver that has a FAIL in v4l2-compliance without
> > at the very least understanding why that happens. Ignoring it defeats the
> > purpose of v4l2-compliance.  
> 
> Totally agreed.
> 
> Michael
> 
> > 
> > Regards,
> > 
> > Hans
> >   
> 


[PATCH v2 09/11] media: aspeed: use different delays for triggering VE H/W reset

2019-05-24 Thread Jae Hyun Yoo
In case a watchdog timeout is detected while doing mode detection,
it's better to trigger the video engine hardware reset immediately,
so this commit fixes the code for that case. In all other cases, it
will trigger the video engine hardware reset after
RESOLUTION_CHANGE_DELAY.

Signed-off-by: Jae Hyun Yoo 
---
v1 -> v2:
 New.

 drivers/media/platform/aspeed-video.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/aspeed-video.c 
b/drivers/media/platform/aspeed-video.c
index 4647ed2e9e63..67f476bf0a03 100644
--- a/drivers/media/platform/aspeed-video.c
+++ b/drivers/media/platform/aspeed-video.c
@@ -522,7 +522,7 @@ static void aspeed_video_bufs_done(struct aspeed_video 
*video,
spin_unlock_irqrestore(&video->lock, flags);
 }
 
-static void aspeed_video_irq_res_change(struct aspeed_video *video)
+static void aspeed_video_irq_res_change(struct aspeed_video *video, ulong 
delay)
 {
dev_dbg(video->dev, "Resolution changed; resetting\n");
 
@@ -532,7 +532,7 @@ static void aspeed_video_irq_res_change(struct aspeed_video 
*video)
aspeed_video_off(video);
aspeed_video_bufs_done(video, VB2_BUF_STATE_ERROR);
 
-   schedule_delayed_work(&video->res_work, RESOLUTION_CHANGE_DELAY);
+   schedule_delayed_work(&video->res_work, delay);
 }
 
 static irqreturn_t aspeed_video_irq(int irq, void *arg)
@@ -545,7 +545,7 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)
 * re-initialize
 */
if (sts & VE_INTERRUPT_MODE_DETECT_WD) {
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video, 0);
return IRQ_HANDLED;
}
 
@@ -563,7 +563,8 @@ static irqreturn_t aspeed_video_irq(int irq, void *arg)
 * Signal acquired while NOT doing resolution
 * detection; reset the engine and re-initialize
 */
-   aspeed_video_irq_res_change(video);
+   aspeed_video_irq_res_change(video,
+   RESOLUTION_CHANGE_DELAY);
return IRQ_HANDLED;
}
}
-- 
2.21.0



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-23 Thread Jonas Karlman
On 2019-05-15 12:09, Paul Kocialkowski wrote:
> Hi,
>
> With the Rockchip stateless VPU driver in the works, we now have a
> better idea of what the situation is like on platforms other than
> Allwinner. This email shares my conclusions about the situation and how
> we should update the MPEG-2, H.264 and H.265 controls accordingly.
>
> [...]
>
> - Clear split of controls and terminology
>
> Some codecs have explicit NAL units that are good fits to match as
> controls: e.g. slice header, pps, sps. I think we should stick to the
> bitstream element names for those.
>
> For H.264, that would suggest the following changes:
> - renaming v4l2_ctrl_h264_decode_param to v4l2_ctrl_h264_slice_header;
> - killing v4l2_ctrl_h264_decode_param and having the reference lists
> where they belong, which seems to be slice_header;

I have two more changes and/or clarifications that are needed for
v4l2_ctrl_h264_scaling_matrix:
the expected order of scaling_list elements needs to be defined and documented.

In the cedrus driver, the expected order of elements is the one after the
inverse scanning process has been applied.
This is the order the hardware expects and what both ffmpeg uses internally
and vaapi expects,
which allows for a simple memcpy/sram write in both userspace and driver.

The rockchip vpu h264 driver from chromeos was expecting elements in scaling
list order and would apply
the inverse zig-zag scan in the driver. Side note: it would also wrongly apply
the zig-zag scan instead of the field scan on field-coded content.

I propose a clarification that the scaling list element order should be the
one after the inverse scanning process has been applied,
i.e. the order that cedrus, rockchip and vaapi expect.

Secondly, the six scaling_list_8x8 lists currently use "ffmpeg
order", where Intra Y is in [0] and Inter Y in [3].
Table 7-2 in the H.264 specification lists them in the following order
(indices 6-11): Intra Y, Inter Y, Intra Cb, Inter Cb, Intra Cr and Inter Cr.
The 8x8 Cb/Cr lists should only be needed for 4:4:4 content.

Rockchip was expecting Intra/Inter Y to be in [0] and [1]; cedrus uses lists
[0] and [3].
VA-API only seems to support Intra/Inter Y; the ffmpeg vaapi hwaccel copies
[0] and [3] into vaapi [0] and [1].

I propose a clarification that the 8x8 scaling lists use the same order as
they are listed in Table 7-2,
and that the cedrus driver is changed to use 8x8 lists from [0] and [1]
instead of [0] and [3].

Regards,
Jonas

> I'm up for preparing and submitting these control changes and updating
> cedrus if they seem agreeable.
>
> What do you think?
>
> Cheers,
>
> Paul


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Nicolas Dufresne
Le mercredi 22 mai 2019 à 12:08 +0200, Thierry Reding a écrit :
> >   3. Does the HW do support single interrupt per frame (RK3288 as an 
> > example does not, but RK3399 do)
> 
> Yeah, we definitely do get a single interrupt at the end of a frame, or
> when an error occurs. Looking a bit at the register documentation it
> looks like this can be more fine-grained. We can for example get an
> interrupt at the end of a slice or a row of macro blocks.

This last one is really fancy. I've been working on some HW where they
do synchronization between decoder and encoder so they process data
with one macro-block distance. I know chips&media have a similar feature,
and now Tegra too; it would be nice to find some convergence on this in the
future.




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Nicolas Dufresne
In theory, the pipeline needs to be
> > reconfigured with each slice.
> > 
> > What we have been doing in Cedrus is to currently gather all the slices
> > and use the last slice's specific configuration for the pipeline, which
> > sort of works, but is very likely not a good idea.
> 
> To be honest, my testing has been very minimal, so it's quite possible
> that I've always only run into examples with either only a single slice
> or multiple slices with the same configuration. Or perhaps with
> differing configurations but non-significant (or non-noticable)
> differences.
> 
> > You mentionned that the Tegra VPU currentyl always operates in frame
> > mode (even when the stream actually has multiple slices, which I assume
> > are gathered at some point). I wonder how it goes about configuring
> > different slice parameters (which are specific to each slice, not
> > frame) for the different slices.
> 
> That's part of the beauty of the frame-level decoding mode (I think
> that's call SXE-P). The syntax engine has access to the complete
> bitstream and can parse all the information that it needs. There's some
> data that we pass into the decoder from the SPS and PPS, but other than
> that the VDE will do everything by itself.
> 
> > I believe we should at least always expose per-slice granularity in the
> > pixel format and requests. Maybe we could have a way to allow multiple
> > slices to be gathered in the source buffer and have a control slice
> > array for each request. In that case, we'd have a single request queued
> > for the series of slices, with a bit offset in each control to the
> > matching slice.
> > 
> > Then we could specify that such slices must be appended in a way that
> > suits most decoders that would have to operate per-frame (so we need to
> > figure this out) and worst case, we'll always have offsets in the
> > controls if we need to setup a bounce buffer in the driver because
> > things are not laid out the way we specified.
> > 
> > Then we introduce a specific cap to indicate which mode is supported
> > (per-slice and/or per-frame) and adapt our ffmpeg reference to be able
> > to operate in both modes.
> > 
> > That adds some complexity for userspace, but I don't think we can avoid
> > it at this point and it feels better than having two different pixel
> > formats (which would probably be even more complex to manage for
> > userspace).
> > 
> > What do you think?
> 
> I'm not sure I understand why this would be simpler than exposing two
> different pixel formats. It sounds like essentially the same thing, just
> with a different method.
> 
> One advantage I see with your approach is that it more formally defines
> how slices are passed. This might be a good thing to do anyway. I'm not
> sure if software stacks provide that information anyway. If they do this
> would be trivial to achieve. If they don't this could be an extra burden
> on userspace for decoder that don't need it.

Just to feed the discussion, in GStreamer it would be exposed like this
(except that this is full bitstream, not just slices):

/* FULL Frame */
video/x-h264,stream-format=byte-stream,alignment=au

/* One of more NAL per memory buffer */
video/x-h264,stream-format=byte-stream,alignment=nal

"stream-format=byte-stream" means with start-code, where you could AVC
or AVC3 bitstream too. We do that, so you have a common format, with
variant. I'm worried having too many formats will not scale in the long
term, that's all, I still think this solution works too. But note that
we already have _H264 and _H264_NOSC format. And then, how do you call
a stream that only has slice nals, but all all slice of a frame per
buffer ...

p.s. In Tegra OMX, there is a control to pick between AU/NAL, so I'm
pretty sure the HW support both ways.

> 
> Would it perhaps be possible to make this slice meta data optional? For
> example, could we just provide an H.264 slice pixel format and then let
> userspace fill in buffers in whatever way they want, provided that they
> follow some rules (must be annex B or something else, concatenated
> slices, ...) and then if there's an extra control specifying the offsets
> of individual slices drivers can use that, if not they just pass the
> bitstream buffer to the hardware if frame-level decoding is supported
> and let the hardware do its thing?
> 
> Hardware that has requirements different from that could require the
> meta data to be present and fail otherwise.
> 
> On the other hand, userspace would have to be prepared to deal with this
> type of hardware anyway, so it basically needs to

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Nicolas Dufresne
Le mercredi 22 mai 2019 à 11:29 +0200, Paul Kocialkowski a écrit :
> Le mercredi 22 mai 2019 à 10:32 +0200, Thierry Reding a écrit :
> > On Wed, May 22, 2019 at 09:29:24AM +0200, Boris Brezillon wrote:
> > > On Wed, 22 May 2019 15:39:37 +0900
> > > Tomasz Figa  wrote:
> > > 
> > > > > It would be premature to state that we are excluding. We are just
> > > > > trying to find one format to get things upstream, and make sure we 
> > > > > have
> > > > > a plan how to extend it. Trying to support everything on the first try
> > > > > is not going to work so well.
> > > > > 
> > > > > What is interesting to provide is how does you IP achieve multi-slice
> > > > > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > > > > Typical questions are:
> > > > > 
> > > > >   1. Do all slices have to be contiguous in memory
> > > > >   2. If 1., do you place start-code, AVC header or pass a seperate 
> > > > > index to let the HW locate the start of each NAL ?
> > > > >   3. Does the HW do support single interrupt per frame (RK3288 as an 
> > > > > example does not, but RK3399 do)  
> > > > 
> > > > AFAICT, the bit about RK3288 isn't true. At least in our downstream
> > > > driver that was created mostly by RK themselves, we've been assuming
> > > > that the interrupt is for the complete frame, without any problems.
> > > 
> > > I confirm that's what happens when all slices forming a frame are packed
> > > in a single output buffer: you only get one interrupt at the end of the
> > > decoding process (in that case, when the frame is decoded). Of course,
> > > if you split things up and do per-slice decoding instead (one slice per
> > > buffer) you get an interrupt per slice, though I didn't manage to make
> > > that work.
> > > I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
> > > fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
> > > interrupt on the second slice, which makes me think some states are
> > > reset between the 2 operations leading the engine to think that the
> > > second slice is part of a new frame.
> > 
> > That sounds a lot like how this works on Tegra. My understanding is that
> > for slice decoding you'd also get an interrupt every time a full slice
> > has been decoded perhaps coupled with another "frame done" interrupt
> > when the full frame has been decoded after the last slice.
> > 
> > In frame-level decode mode you don't get interrupts in between and
> > instead only get the "frame done" interrupt. Unless something went wrong
> > during decoding, in which case you also get an interrupt but with error
> > flags and status registers that help determine what exactly happened.
> > 
> > > Anyway, it doesn't sound like a crazy idea to support both per-slice
> > > and per-frame decoding and maybe have a way to expose what a
> > > specific codec can do (through an extra cap mechanism).
> > 
> > Yeah, I think it makes sense to support both for devices that can do
> > both. From what Nicolas said it may make sense for an application to
> > want to do slice-level decoding if receiving a stream from the network
> > and frame-level decoding if playing back from a local file. If a driver
> > supports both, the application could detect that and choose the
> > appropriate format.
> > 
> > It sounds to me like using different input formats for that would be a
> > very natural way to describe it. Applications can already detect the set
> > of supported input formats and set the format when they allocate buffers
> > so that should work very nicely.
> 
> Pixel formats are indeed the natural way to go about this, but I have
> some reservations in this case. Slices are the natural unit of video
> streams, just like frames are to display hardware. Part of the pipeline
> configuration is slice-specific, so in theory, the pipeline needs to be
> reconfigured with each slice.
> 
> What we have been doing in Cedrus is currently to gather all the slices
> and use the last slice's specific configuration for the pipeline, which
> sort of works, but is very likely not a good idea.
> 
> You mentioned that the Tegra VPU currently always operates in frame
> mode (even when the stream actually has multiple slices, which I assume
> are gathered at some point). I wonder how it goes about configuring
> different slice parameters (which are specific to each slice, not the
> frame) for the different slices.

A per-frame codec won't ask for the l0/l1 lists, which are slice
specific. This is the case for the RK3288: we don't pass that
information. Instead we build a list from the DPB entries; this is the
list before applying the modifications found in the slice header. The
HW will do the rest.
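
The initial-list construction described here could be sketched as follows. This is a minimal illustration, not driver code: for P slices, H.264 (clause 8.2.4.2) orders short-term references by descending frame_num_wrap, followed by long-term references by ascending long_term_frame_idx; the struct and function names are made up for the example.

```c
#include <stdlib.h>

/* Illustrative sketch: build the *initial* P-slice l0 list from DPB
 * entries, i.e. the list before any ref_pic_list_modification from the
 * slice header is applied.  Short-term references come first, ordered
 * by descending frame_num_wrap, then long-term references ordered by
 * ascending long_term_frame_idx.
 */
struct dpb_entry {
	int frame_num_wrap;      /* meaningful for short-term refs */
	int long_term_frame_idx; /* meaningful for long-term refs */
	int is_long_term;
};

static int cmp_l0(const void *a, const void *b)
{
	const struct dpb_entry *x = a, *y = b;

	if (x->is_long_term != y->is_long_term)
		return x->is_long_term - y->is_long_term; /* short-term first */
	if (x->is_long_term)
		return x->long_term_frame_idx - y->long_term_frame_idx; /* ascending */
	return y->frame_num_wrap - x->frame_num_wrap; /* descending */
}

void build_p_l0(struct dpb_entry *dpb, size_t n)
{
	qsort(dpb, n, sizeof(*dpb), cmp_l0);
}
```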

> 
> I believe we should at least always expose per-slice granularity in the
> pixel format and requests. Maybe we could have a way to allow multiple
> slices to be gathered in the source buffer and have a control slice
> array for each request. In that case, we'd have a single request queued
> for the series of slices, with a bit offset in each control to the
> matching slice.

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Nicolas Dufresne
Le mercredi 22 mai 2019 à 10:20 +0200, Boris Brezillon a écrit :
> On Wed, 22 May 2019 09:29:24 +0200
> Boris Brezillon  wrote:
> 
> > On Wed, 22 May 2019 15:39:37 +0900
> > Tomasz Figa  wrote:
> > 
> > > > It would be premature to state that we are excluding anything. We are
> > > > just trying to find one format to get things upstream, and make sure
> > > > we have a plan for how to extend it. Trying to support everything on
> > > > the first try is not going to work so well.
> > > > 
> > > > What is interesting to provide is how your IP achieves multi-slice
> > > > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > > > Typical questions are:
> > > > 
> > > >   1. Do all slices have to be contiguous in memory?
> > > >   2. If 1., do you place a start code or AVC header, or pass a
> > > > separate index to let the HW locate the start of each NAL?
> > > >   3. Does the HW support a single interrupt per frame (the RK3288,
> > > > for example, does not, but the RK3399 does)?
> > > 
> > > AFAICT, the bit about RK3288 isn't true. At least in our downstream
> > > driver that was created mostly by RK themselves, we've been assuming
> > > that the interrupt is for the complete frame, without any problems.  
> > 
> > I confirm that's what happens when all slices forming a frame are packed
> > in a single output buffer: you only get one interrupt at the end of the
> > decoding process (in that case, when the frame is decoded). Of course,
> > if you split things up and do per-slice decoding instead (one slice per
> > buffer) you get an interrupt per slice, though I didn't manage to make
> > that work.
> > I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
> > fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
> > interrupt on the second slice, which makes me think some states are
> > reset between the 2 operations leading the engine to think that the
> > second slice is part of a new frame.
> > 
> > Anyway, it doesn't sound like a crazy idea to support both per-slice
> > and per-frame decoding and maybe have a way to expose what a
> > specific codec can do (through an extra cap mechanism).
> > The other option would be to support only per-slice decoding with a
> > mandatory START_FRAME/END_FRAME sequence to let drivers for HW that
> > only support per-frame decoding know when they should trigger the
> > decoding operation.
> 
> Just to clarify, we can use Hans' V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> work to identify start/end frame boundaries, the only problem I see is
> that users are not required to clear the flag on the last slice of a
> frame, so there's no way for the driver to know when it should trigger
> the decode-frame operation. I guess we could trigger this decode
> operation when v4l2_m2m_release_capture_buf() returns true, but I
> wonder if it's not too late to do that.

If the flag is gone, you can schedule immediately; otherwise you'll know
by the timestamp change on the following slice.
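
The rule described here could be sketched as a small driver-side predicate. This is a hedged illustration: the `V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF` value matches the one in `videodev2.h`, but the struct and function are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the frame-boundary decision: a frame can be scheduled for
 * decoding either when the queued slice does NOT carry
 * V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF (userspace cleared it on the last
 * slice), or when the next queued slice carries a different timestamp,
 * which implies a new frame has started.
 */
#define V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF 0x00000200

struct slice_buf {
	uint32_t flags;
	uint64_t timestamp;
};

/* Should the frame containing 'prev' be decoded now?  'next' may be
 * NULL if no further slice has been queued yet. */
bool frame_complete(const struct slice_buf *prev, const struct slice_buf *next)
{
	if (!(prev->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF))
		return true;	/* flag gone: schedule immediately */
	if (next && next->timestamp != prev->timestamp)
		return true;	/* timestamp changed: new frame started */
	return false;		/* keep holding */
}
```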

> 
> > The downside is that it implies having a bounce
> > buffer where the driver can pack slices to be decoded on the END_FRAME
> > event.
> > 
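
The bounce-buffer idea quoted above might look roughly like this; all names and sizes are illustrative, and the hardware programming step is stubbed out.

```c
#include <string.h>
#include <stddef.h>

/* Sketch: a driver for per-frame-only hardware appends each submitted
 * slice into a private bounce buffer, and only programs the decode
 * operation once the frame is known to be complete (the END_FRAME
 * event in the proposal above).
 */
#define BOUNCE_SIZE 4096

struct bounce {
	unsigned char data[BOUNCE_SIZE];
	size_t used;
};

/* Append one slice; returns 0 on success, -1 if it doesn't fit. */
int bounce_add_slice(struct bounce *b, const unsigned char *slice, size_t len)
{
	if (b->used + len > BOUNCE_SIZE)
		return -1;
	memcpy(b->data + b->used, slice, len);
	b->used += len;
	return 0;
}

/* On the end-of-frame event: hand the packed slices to the hardware
 * (stubbed out here) and reset for the next frame.  Returns the number
 * of bytes that were packed. */
size_t bounce_flush(struct bounce *b)
{
	size_t total = b->used;

	/* program_hw(b->data, b->used); -- hypothetical */
	b->used = 0;
	return total;
}
```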




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Nicolas Dufresne
Le mercredi 22 mai 2019 à 15:01 +0900, Tomasz Figa a écrit :
> On Tue, May 21, 2019 at 8:45 PM Paul Kocialkowski
>  wrote:
> > Hi,
> > 
> > On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> > >  wrote:
> > > > Hi,
> > > > 
> > > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a écrit :
> > > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a écrit :
> > > > > > Hi,
> > > > > > 
> > > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > > better idea of what the situation is like on platforms other than
> > > > > > Allwinner. This email shares my conclusions about the situation and 
> > > > > > how
> > > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > > > 
> > > > > > - Per-slice decoding
> > > > > > 
> > > > > > We've discussed this one already[0] and Hans has submitted a 
> > > > > > patch[1]
> > > > > > to implement the required core bits. When we agree it looks good, we
> > > > > > should lift the restriction that all slices must be concatenated and
> > > > > > have them submitted as individual requests.
> > > > > > 
> > > > > > One question is what to do about other controls. I feel like it 
> > > > > > would
> > > > > > make sense to always pass all the required controls for decoding the
> > > > > > slice, including the ones that don't change across slices. But there
> > > > > > may be no particular advantage to this and only downsides. Not 
> > > > > > doing it
> > > > > > and relying on the "control cache" can work, but we need to specify
> > > > > > that only a single stream can be decoded per opened instance of the
> > > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > > multi-slice anyway, so it shouldn't be an issue.
> > > > > 
> > > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > > should be responsible for doing time-division multiplexing across
> > > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > > userspace would require some sort of daemon to work properly across
> > > > > processes. I also think the kernel is a better place for doing resource
> > > > > access scheduling in general.
> > > > 
> > > > I agree with that yes. We always have a single m2m context and specific
> > > > controls per opened device so keeping cached values works out well.
> > > > 
> > > > So maybe we shall explicitly require that the request with the first
> > > > slice for a frame also contains the per-frame controls.
> > > > 
> > > 
> > > Agreed.
> > > 
> > > One more argument not to allow such multiplexing is that despite the
> 
> ^^ Here I meant the "userspace multiplexing".

Thanks, I was confused for a moment (especially since the browser is
your use case).

> 
> > > API being called "stateless", there is actually some state saved
> > > between frames, e.g. the Rockchip decoder writes some intermediate
> > > data to some local buffers which need to be given to the decoder to
> > > decode the next frame. Actually, on Rockchip there is even a
> > > requirement to keep the reference list entries in the same order
> > > between frames.
> > 
> > Well, what I'm suggesting is to have one stream per m2m context, but it
> > should certainly be possible to have multiple m2m contexts (multiple
> > userspace open calls) that decode different streams concurrently.
> > 
> > Is that really going to be a problem for Rockchip? If so, then the
> > driver should probably enforce allowing a single userspace open and m2m
> > context at a time.
> 
> No, that's not what I meant. Obviously the driver can switch between
> different sets of private buffers when scheduling different contexts,
> as long as the userspace doesn't attempt to do any multiplexing
> itself.
> 
> Best regards,
> Tomasz




Re: [PATCH v6 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-22 Thread Michael Tretter
On Wed, 22 May 2019 14:04:23 +0200, Hans Verkuil wrote:
> On 5/13/19 7:21 PM, Michael Tretter wrote:
> > This is v6 of the Allegro DVT H.264 encoder driver found in the EV
> > family of the Xilinx ZynqMP platform.
> > 
> > Only minor changes this time. I dropped the implementation of the
> > selection api, removed all references mentioning the decoder, and fixed
> > a few issues reported by sparse and smatch.
> > 
> > The v4l2-compliance result using the current vicodec branch is
> > 
> > v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits
> > 
> > Compliance test for allegro device /dev/video4:
[...]
> > I observed that the "MMAP (select)" test occasionally fails, because the
> > test did not receive an V4L2_EVENT_EOS when dequeuing a buffer with
> > V4L2_BUF_FLAG_LAST being set. The driver always queues the event before
> > returning the last buffer and the "MMAP (epoll)" does not fail. Thus, I
> > decided to send the series anyway.  
> 
> Where exactly does v4l2-compliance fail? This is weird, and I believe
> this warrants a bit more debugging. I recommend adding a debug
> statement in allegro_channel_buf_done() to see when a buffer is marked
> LAST.

v4l2-compliance fails in line 1074

fail: v4l2-test-buffers.cpp(1074): !got_eos && !got_source_change

The corresponding code in v4l2-compliance is

if (buf.g_flags() & V4L2_BUF_FLAG_LAST) {
fail_on_test(buf.dqbuf(node) != EPIPE);
>   fail_on_test(!got_eos && !got_source_change);
if (!count)
break;
fail_on_test(node->streamoff(m2m_q.g_type()));
m2m_q.munmap_bufs(node);

When the test fails, the select/epoll_wait returns with readable data,
but without readable events on the last buffer. If the test is
successful, data and events are available. This looks like a race
between the event and the LAST buffer and if the LAST buffer comes
first, the test fails.

As said, the driver always queues the EOS event before calling
v4l2_m2m_buf_done() on the LAST buffer. Right now, I don't understand
how this can happen, but I will continue debugging.

> 
> These tests really should not fail, and it is a strong indication of a
> bug somewhere.
> 
> I don't want to merge a driver that has a FAIL in v4l2-compliance without
> at the very least understanding why that happens. Ignoring it defeats the
> purpose of v4l2-compliance.

Totally agreed.

Michael

> 
> Regards,
> 
>   Hans
> 


Re: [PATCH v6 0/5] Add ZynqMP VCU/Allegro DVT H.264 encoder driver

2019-05-22 Thread Hans Verkuil
On 5/13/19 7:21 PM, Michael Tretter wrote:
> This is v6 of the Allegro DVT H.264 encoder driver found in the EV
> family of the Xilinx ZynqMP platform.
> 
> Only minor changes this time. I dropped the implementation of the
> selection api, removed all references mentioning the decoder, and fixed
> a few issues reported by sparse and smatch.
> 
> The v4l2-compliance result using the current vicodec branch is
> 
> v4l2-compliance SHA: c2ad13e4b7aef9ae160303189c67a91e1775f025, 64 bits
> 
> Compliance test for allegro device /dev/video4:
> 
> Driver Info:
>   Driver name  : allegro
>   Card type: Allegro DVT Video Encoder
>   Bus info : platform:a0009000.video-codec
>   Driver version   : 5.1.0
>   Capabilities : 0x84208000
>   Video Memory-to-Memory
>   Streaming
>   Extended Pix Format
>   Device Capabilities
>   Device Caps  : 0x04208000
>   Video Memory-to-Memory
>   Streaming
>   Extended Pix Format
>   Detected Stateful Encoder
> 
> Required ioctls:
>   test VIDIOC_QUERYCAP: OK
> 
> Allow for multiple opens:
>   test second /dev/video4 open: OK
>   test VIDIOC_QUERYCAP: OK
>   test VIDIOC_G/S_PRIORITY: OK
>   test for unlimited opens: OK
> 
> Debug ioctls:
>   test VIDIOC_DBG_G/S_REGISTER: OK
>   test VIDIOC_LOG_STATUS: OK (Not Supported)
> 
> Input ioctls:
>   test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
>   test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
>   test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
>   test VIDIOC_ENUMAUDIO: OK (Not Supported)
>   test VIDIOC_G/S/ENUMINPUT: OK (Not Supported)
>   test VIDIOC_G/S_AUDIO: OK (Not Supported)
>   Inputs: 0 Audio Inputs: 0 Tuners: 0
> 
> Output ioctls:
>   test VIDIOC_G/S_MODULATOR: OK (Not Supported)
>   test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
>   test VIDIOC_ENUMAUDOUT: OK (Not Supported)
>   test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
>   test VIDIOC_G/S_AUDOUT: OK (Not Supported)
>   Outputs: 0 Audio Outputs: 0 Modulators: 0
> 
> Input/Output configuration ioctls:
>   test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
>   test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
>   test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
>   test VIDIOC_G/S_EDID: OK (Not Supported)
> 
> Control ioctls:
>   test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK
>   test VIDIOC_QUERYCTRL: OK
>   test VIDIOC_G/S_CTRL: OK
>   test VIDIOC_G/S/TRY_EXT_CTRLS: OK
>   test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK
>   test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
>   Standard Controls: 10 Private Controls: 0
> 
> Format ioctls:
>   test VIDIOC_ENUM_FMT/FRAMESIZES/FRAMEINTERVALS: OK
>   test VIDIOC_G/S_PARM: OK (Not Supported)
>   test VIDIOC_G_FBUF: OK (Not Supported)
>   test VIDIOC_G_FMT: OK
>   test VIDIOC_TRY_FMT: OK
>   test VIDIOC_S_FMT: OK
>   test VIDIOC_G_SLICED_VBI_CAP: OK (Not Supported)
>   test Cropping: OK (Not Supported)
>   test Composing: OK (Not Supported)
>   test Scaling: OK
> 
> Codec ioctls:
>   test VIDIOC_(TRY_)ENCODER_CMD: OK
>   test VIDIOC_G_ENC_INDEX: OK (Not Supported)
>   test VIDIOC_(TRY_)DECODER_CMD: OK (Not Supported)
> 
> Buffer ioctls:
>   test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
>   test VIDIOC_EXPBUF: OK
>   test Requests: OK (Not Supported)
> 
> Test input 0:
> 
> Streaming ioctls:
>   test read/write: OK (Not Supported)
>   test blocking wait: OK
>   Video Capture: Captured 60 buffers
>   test MMAP (select): OK
>   Video Capture: Captured 60 buffers
>   test MMAP (epoll): OK
>   test USERPTR (select): OK (Not Supported)
>   test DMABUF: Cannot test, specify --expbuf-device
> 
> Total for allegro device /dev/video4: 49, Succeeded: 49, Failed: 0, Warnings: 0
> 
> I observed that the "MMAP (select)" test occasionally fails, because the
> test did not receive an V4L2_EVENT_EOS when dequeuing a buffer with
> V4L2_BUF_FLAG_LAST being set. The driver always queues the event before
> returning the last buffer and the "MMAP (epoll)" does not fail. Thus, I
> decided to send the series anyway.

Where exactly does v4l2-compliance fail? This is weird, and I believe
this warrants a bit more debugging. I recommend adding a debug
statement in allegro_channel_buf_done() to see when a buffer is marked
LAST.

These tests really should not fail, and it is a strong indication of a
bug somewhere.

I don't want to merge a driver that has a FAIL in v4l2-compliance without
at the very least understanding why that happens. Ignoring it defeats the
purpose of v4l2-compliance.

Regards,

Hans


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Thierry Reding
perhaps with differing configurations but non-significant (or
non-noticeable) differences.

> You mentioned that the Tegra VPU currently always operates in frame
> mode (even when the stream actually has multiple slices, which I assume
> are gathered at some point). I wonder how it goes about configuring
> different slice parameters (which are specific to each slice, not the
> frame) for the different slices.

That's part of the beauty of the frame-level decoding mode (I think
it's called SXE-P). The syntax engine has access to the complete
bitstream and can parse all the information that it needs. There's some
data that we pass into the decoder from the SPS and PPS, but other than
that the VDE will do everything by itself.

> I believe we should at least always expose per-slice granularity in the
> pixel format and requests. Maybe we could have a way to allow multiple
> slices to be gathered in the source buffer and have a control slice
> array for each request. In that case, we'd have a single request queued
> for the series of slices, with a bit offset in each control to the
> matching slice.
> 
> Then we could specify that such slices must be appended in a way that
> suits most decoders that would have to operate per-frame (so we need to
> figure this out) and worst case, we'll always have offsets in the
> controls if we need to setup a bounce buffer in the driver because
> things are not laid out the way we specified.
> 
> Then we introduce a specific cap to indicate which mode is supported
> (per-slice and/or per-frame) and adapt our ffmpeg reference to be able
> to operate in both modes.
> 
> That adds some complexity for userspace, but I don't think we can avoid
> it at this point and it feels better than having two different pixel
> formats (which would probably be even more complex to manage for
> userspace).
> 
> What do you think?

I'm not sure I understand why this would be simpler than exposing two
different pixel formats. It sounds like essentially the same thing, just
with a different method.

One advantage I see with your approach is that it more formally defines
how slices are passed. This might be a good thing to do anyway. I'm not
sure whether software stacks provide that information; if they do, this
would be trivial to achieve. If they don't, this could be an extra
burden on userspace for decoders that don't need it.

Would it perhaps be possible to make this slice meta data optional? For
example, could we just provide an H.264 slice pixel format and then let
userspace fill in buffers in whatever way they want, provided that they
follow some rules (must be annex B or something else, concatenated
slices, ...) and then if there's an extra control specifying the offsets
of individual slices drivers can use that, if not they just pass the
bitstream buffer to the hardware if frame-level decoding is supported
and let the hardware do its thing?
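
If userspace concatenates Annex-B slices in one buffer without providing per-slice offsets, a driver (or the bounce-buffer logic) could recover them by scanning for start codes. This is purely illustrative, not an existing kernel helper:

```c
#include <stddef.h>

/* Find the byte offset of each Annex-B NAL unit in 'buf' by scanning
 * for 00 00 01 (and 00 00 00 01) start codes.  Fills 'offs' with the
 * offset of each start code and returns the number of NAL units found,
 * up to 'max'.
 */
size_t find_nal_offsets(const unsigned char *buf, size_t len,
			size_t *offs, size_t max)
{
	size_t i = 0, n = 0;

	while (i + 3 <= len && n < max) {
		if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
			/* fold in the leading zero of a 4-byte start code */
			offs[n++] = (i > 0 && buf[i - 1] == 0) ? i - 1 : i;
			i += 3;
		} else {
			i++;
		}
	}
	return n;
}
```

With offsets in hand, hardware that needs per-slice submission can be fed slice by slice, while frame-level hardware can simply consume the whole buffer.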

Hardware that has requirements different from that could require the
meta data to be present and fail otherwise.

On the other hand, userspace would have to be prepared to deal with this
type of hardware anyway, so it basically needs to provide the meta data
in any case. Perhaps the meta data could be optional if a buffer
contains a single slice.

One other thing that occurred to me is that the meta data could perhaps
contain a more elaborate description of the data in the slice. But that
has the problem that it can't be detected upfront, so userspace can't
discover whether the decoder can handle that data until an error is
returned from the decoder upon receiving the meta data.

To answer your question: I don't feel strongly one way or the other. The
above is really just discussing the specifics of how the data is passed,
but we don't really know what exactly the data is that we need to pass.

> > > The other option would be to support only per-slice decoding with a
> > > mandatory START_FRAME/END_FRAME sequence to let drivers for HW that
> > > only support per-frame decoding know when they should trigger the
> > > decoding operation. The downside is that it implies having a bounce
> > > buffer where the driver can pack slices to be decoded on the END_FRAME
> > > event.
> > 
> > I vaguely remember that that's what the video codec abstraction does in
> > Mesa/Gallium. 
> 
> Well, if it's exposed through VDPAU or VAAPI, the interface already
> operates per-slice and it would certainly not be a big issue to change
> that.

The video pipe callbacks can implement a ->decode_bitstream() callback
that gets a number of buffer/size pairs along with a picture description
(which corresponds roughly to the SPS/PPS). The buffer/size pairs are
exactly what's passed in from VDPAU or VAAPI. It looks like VDPAU c

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Thierry Reding
On Tue, May 21, 2019 at 12:23:46PM -0400, Nicolas Dufresne wrote:
> Le mardi 21 mai 2019 à 17:43 +0200, Thierry Reding a écrit :
> > On Wed, May 15, 2019 at 07:42:50PM +0200, Paul Kocialkowski wrote:
> > > Hi,
> > > 
> > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a écrit :
> > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a écrit :
> > > > > Hi,
> > > > > 
> > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > better idea of what the situation is like on platforms other than
> > > > > Allwinner. This email shares my conclusions about the situation and 
> > > > > how
> > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > > 
> > > > > - Per-slice decoding
> > > > > 
> > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > to implement the required core bits. When we agree it looks good, we
> > > > > should lift the restriction that all slices must be concatenated and
> > > > > have them submitted as individual requests.
> > > > > 
> > > > > One question is what to do about other controls. I feel like it would
> > > > > make sense to always pass all the required controls for decoding the
> > > > > slice, including the ones that don't change across slices. But there
> > > > > may be no particular advantage to this and only downsides. Not doing 
> > > > > it
> > > > > and relying on the "control cache" can work, but we need to specify
> > > > > that only a single stream can be decoded per opened instance of the
> > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > multi-slice anyway, so it shouldn't be an issue.
> > > > 
> > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > should be responsible for doing time-division multiplexing across
> > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > userspace would require some sort of daemon to work properly across
> > > > processes. I also think the kernel is a better place for doing resource
> > > > access scheduling in general.
> > > 
> > > I agree with that yes. We always have a single m2m context and specific
> > > controls per opened device so keeping cached values works out well.
> > > 
> > > So maybe we shall explicitly require that the request with the first
> > > slice for a frame also contains the per-frame controls.
> > > 
> > > > > - Annex-B formats
> > > > > 
> > > > > I don't think we have really reached a conclusion on the pixel formats
> > > > > we want to expose. The main issue is how to deal with codecs that need
> > > > > the full slice NALU with start code, where the slice_header is
> > > > > duplicated in raw bitstream, when others are fine with just the 
> > > > > encoded
> > > > > slice data and the parsed slice header control.
> > > > > 
> > > > > My initial thinking was that we'd need 3 formats:
> > > > > - One that only takes only the slice compressed data (without raw 
> > > > > slice
> > > > > header and start code);
> > > > > - One that takes both the NALU data (including start code, raw header
> > > > > and compressed data) and slice header controls;
> > > > > - One that takes the NALU data but no slice header.
> > > > > 
> > > > > But I no longer think the latter really makes sense in the context of
> > > > > stateless video decoding.
> > > > > 
> > > > > A side-note: I think we should definitely have data offsets in every
> > > > > case, so that implementations can just push the whole NALU regardless
> > > > > of the format if they're lazy.
> > > > 
> > > > I realize that I didn't share our latest research on the subject. So a
> > > > slice in the original bitstream is formed of the following blocks
> > > > (simplified):
> > > > 
> > > >   [nal_header][nal_type][slice_header][slice]
> > > 
> > > Thanks for the details!
> > > 
> > > > nal_header:
> > > > This one is a h

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Paul Kocialkowski
Le mercredi 22 mai 2019 à 10:32 +0200, Thierry Reding a écrit :
> On Wed, May 22, 2019 at 09:29:24AM +0200, Boris Brezillon wrote:
> > On Wed, 22 May 2019 15:39:37 +0900
> > Tomasz Figa  wrote:
> > 
> > > > It would be premature to state that we are excluding. We are just
> > > > trying to find one format to get things upstream, and make sure we have
> > > > a plan how to extend it. Trying to support everything on the first try
> > > > is not going to work so well.
> > > > 
> > > > What is interesting to provide is how your IP achieves multi-slice
> > > > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > > > Typical questions are:
> > > > 
> > > >   1. Do all slices have to be contiguous in memory?
> > > >   2. If 1., do you place a start-code, AVC header, or pass a separate 
> > > > index to let the HW locate the start of each NAL?
> > > >   3. Does the HW support a single interrupt per frame (RK3288 as an 
> > > > example does not, but RK3399 does)
> > > 
> > > AFAICT, the bit about RK3288 isn't true. At least in our downstream
> > > driver that was created mostly by RK themselves, we've been assuming
> > > that the interrupt is for the complete frame, without any problems.
> > 
> > I confirm that's what happens when all slices forming a frame are packed
> > in a single output buffer: you only get one interrupt at the end of the
> > decoding process (in that case, when the frame is decoded). Of course,
> > if you split things up and do per-slice decoding instead (one slice per
> > buffer) you get an interrupt per slice, though I didn't manage to make
> > that work.
> > I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
> > fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
> > interrupt on the second slice, which makes me think some states are
> > reset between the 2 operations leading the engine to think that the
> > second slice is part of a new frame.
> 
> That sounds a lot like how this works on Tegra. My understanding is that
> for slice decoding you'd also get an interrupt every time a full slice
> has been decoded perhaps coupled with another "frame done" interrupt
> when the full frame has been decoded after the last slice.
> 
> In frame-level decode mode you don't get interrupts in between and
> instead only get the "frame done" interrupt. Unless something went wrong
> during decoding, in which case you also get an interrupt but with error
> flags and status registers that help determine what exactly happened.
> 
> > Anyway, it doesn't sound like a crazy idea to support both per-slice
> > and per-frame decoding and maybe have a way to expose what a
> > specific codec can do (through an extra cap mechanism).
> 
> Yeah, I think it makes sense to support both for devices that can do
> both. From what Nicolas said it may make sense for an application to
> want to do slice-level decoding if receiving a stream from the network
> and frame-level decoding if playing back from a local file. If a driver
> supports both, the application could detect that and choose the
> appropriate format.
> 
> It sounds to me like using different input formats for that would be a
> very natural way to describe it. Applications can already detect the set
> of supported input formats and set the format when they allocate buffers
> so that should work very nicely.

Pixel formats are indeed the natural way to go about this, but I have
some reservations in this case. Slices are the natural unit of video
streams, just like frames are to display hardware. Part of the pipeline
configuration is slice-specific, so in theory, the pipeline needs to be
reconfigured with each slice.

What we currently do in Cedrus is gather all the slices and use the
last slice's specific configuration for the pipeline, which sort of
works, but is very likely not a good idea.

You mentioned that the Tegra VPU currently always operates in frame
mode (even when the stream actually has multiple slices, which I assume
are gathered at some point). I wonder how it goes about configuring
different slice parameters (which are specific to each slice, not
frame) for the different slices. 

I believe we should at least always expose per-slice granularity in the
pixel format and requests. Maybe we could have a way to allow multiple
slices to be gathered in the source buffer and have a control slice
array for each request. In that case, we'd have a single request queued
for the series of slices, with a bit offset in each control to the
matching slice.
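The slice-array idea above (a single request for a series of slices, with each
control entry carrying a bit offset into the shared source buffer) could be
sketched like this. The struct and field names are invented for illustration
and are not actual uAPI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slice entry of a slice-params array control: each
 * slice is appended to a single OUTPUT buffer and located by a bit
 * offset, as suggested above. Illustrative names, not actual uAPI. */
struct slice_entry {
	uint32_t bit_offset; /* offset of the slice in the buffer, in bits */
	uint32_t bit_size;   /* size of the slice data, in bits */
};

/* Byte-aligned start of a slice inside the gathered source buffer. */
static const uint8_t *slice_start(const uint8_t *buf,
				  const struct slice_entry *e)
{
	return buf + e->bit_offset / 8;
}

/* Byte-aligned offset where the next slice may be appended, so a
 * driver needing a bounce buffer can still walk the layout. */
static uint32_t next_append_offset_bits(const struct slice_entry *e)
{
	uint32_t end = e->bit_offset + e->bit_size;

	return (end + 7) & ~7u; /* round up to a byte boundary */
}
```

With offsets always present, a driver that needs a different layout can
repack the slices itself, which is the "worst case" fallback mentioned below.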

Then we could specify that such slices must be appended in a way that
suits most decoders that would have to operate per-frame (so we need to
figure this out) and worst case, we'll always have offsets in the
controls if we need to setup a bounce buffer in the driver because
things are not laid out the way we specified.

Then we introduce a specific cap to indicate which mode is supported
(per-slice and/or per-frame) and adapt our ffmp

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Thierry Reding
On Wed, May 22, 2019 at 09:29:24AM +0200, Boris Brezillon wrote:
> On Wed, 22 May 2019 15:39:37 +0900
> Tomasz Figa  wrote:
> 
> > > It would be premature to state that we are excluding. We are just
> > > trying to find one format to get things upstream, and make sure we have
> > > a plan how to extend it. Trying to support everything on the first try
> > > is not going to work so well.
> > >
> > > What is interesting to provide is how your IP achieves multi-slice
> > > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > > Typical questions are:
> > >
> > >   1. Do all slices have to be contiguous in memory?
> > >   2. If 1., do you place a start-code, AVC header, or pass a separate index 
> > > to let the HW locate the start of each NAL?
> > >   3. Does the HW support a single interrupt per frame (RK3288 as an 
> > > example does not, but RK3399 does)
> > 
> > AFAICT, the bit about RK3288 isn't true. At least in our downstream
> > driver that was created mostly by RK themselves, we've been assuming
> > that the interrupt is for the complete frame, without any problems.
> 
> I confirm that's what happens when all slices forming a frame are packed
> in a single output buffer: you only get one interrupt at the end of the
> decoding process (in that case, when the frame is decoded). Of course,
> if you split things up and do per-slice decoding instead (one slice per
> buffer) you get an interrupt per slice, though I didn't manage to make
> that work.
> I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
> fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
> interrupt on the second slice, which makes me think some states are
> reset between the 2 operations leading the engine to think that the
> second slice is part of a new frame.

That sounds a lot like how this works on Tegra. My understanding is that
for slice decoding you'd also get an interrupt every time a full slice
has been decoded perhaps coupled with another "frame done" interrupt
when the full frame has been decoded after the last slice.

In frame-level decode mode you don't get interrupts in between and
instead only get the "frame done" interrupt. Unless something went wrong
during decoding, in which case you also get an interrupt but with error
flags and status registers that help determine what exactly happened.

> Anyway, it doesn't sound like a crazy idea to support both per-slice
> and per-frame decoding and maybe have a way to expose what a
> specific codec can do (through an extra cap mechanism).

Yeah, I think it makes sense to support both for devices that can do
both. From what Nicolas said it may make sense for an application to
want to do slice-level decoding if receiving a stream from the network
and frame-level decoding if playing back from a local file. If a driver
supports both, the application could detect that and choose the
appropriate format.

It sounds to me like using different input formats for that would be a
very natural way to describe it. Applications can already detect the set
of supported input formats and set the format when they allocate buffers
so that should work very nicely.

> The other option would be to support only per-slice decoding with a
> mandatory START_FRAME/END_FRAME sequence to let drivers for HW that
> only support per-frame decoding know when they should trigger the
> decoding operation. The downside is that it implies having a bounce
> buffer where the driver can pack slices to be decoded on the END_FRAME
> event.

I vaguely remember that that's what the video codec abstraction does in
Mesa/Gallium. I'm not very familiar with V4L2, but this seems like it
could be problematic to integrate with the way that V4L2 works in
general. Perhaps sending a special buffer (0 length or whatever) to mark
the end of a frame would work. But this is probably something that
others have already thought about, since slice-level decoding is what
most people are using, hence there must already be a way for userspace
to somehow synchronize input vs. output buffers. Or does this currently
just work by queueing bitstream buffers as fast as possible and then
dequeueing frame buffers as they become available?
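On the last question: the OUTPUT (bitstream) and CAPTURE (frame) queues do run
independently, and the usual way the two sides are associated is that the
driver copies the bitstream buffer's timestamp onto the capture buffer holding
the resulting frame. A minimal sketch of that matching on the userspace side,
with struct names invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Each dequeued CAPTURE buffer carries the timestamp that was set on
 * the OUTPUT buffer(s) it was decoded from (timestamp-copy behavior).
 * Illustrative struct, not actual uAPI. */
struct done_frame {
	uint64_t timestamp; /* copied from the bitstream buffer */
	uint32_t index;     /* capture buffer index */
};

/* Find which decoded frame corresponds to a submitted bitstream buffer. */
static int match_frame(const struct done_frame *frames, int n, uint64_t ts)
{
	for (int i = 0; i < n; i++)
		if (frames[i].timestamp == ts)
			return (int)frames[i].index;
	return -1; /* not decoded yet */
}
```

So userspace can indeed queue bitstream buffers as fast as it likes and sort
out the correspondence when frames come back.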

Thierry




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Paul Kocialkowski
Hi,

Le mercredi 22 mai 2019 à 15:48 +0900, Tomasz Figa a écrit :
> On Sat, May 18, 2019 at 11:09 PM Nicolas Dufresne  
> wrote:
> > Le samedi 18 mai 2019 à 12:29 +0200, Paul Kocialkowski a écrit :
> > > Hi,
> > > 
> > > Le samedi 18 mai 2019 à 12:04 +0200, Jernej Škrabec a écrit :
> > > > Dne sobota, 18. maj 2019 ob 11:50:37 CEST je Paul Kocialkowski 
> > > > napisal(a):
> > > > > Hi,
> > > > > 
> > > > > On Fri, 2019-05-17 at 16:43 -0400, Nicolas Dufresne wrote:
> > > > > > Le jeudi 16 mai 2019 à 20:45 +0200, Paul Kocialkowski a écrit :
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Le jeudi 16 mai 2019 à 14:24 -0400, Nicolas Dufresne a écrit :
> > > > > > > > Le mercredi 15 mai 2019 à 22:59 +0200, Paul Kocialkowski a 
> > > > > > > > écrit :
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > Le mercredi 15 mai 2019 à 14:54 -0400, Nicolas Dufresne a 
> > > > > > > > > écrit :
> > > > > > > > > > Le mercredi 15 mai 2019 à 19:42 +0200, Paul Kocialkowski a 
> > > > > > > > > > écrit :
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a 
> > > > > > > > > > > écrit
> > > > :
> > > > > > > > > > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul 
> > > > > > > > > > > > Kocialkowski a
> > > > écrit :
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > With the Rockchip stateless VPU driver in the works, 
> > > > > > > > > > > > > we now
> > > > > > > > > > > > > have a
> > > > > > > > > > > > > better idea of what the situation is like on 
> > > > > > > > > > > > > platforms other
> > > > > > > > > > > > > than
> > > > > > > > > > > > > Allwinner. This email shares my conclusions about the
> > > > > > > > > > > > > situation and how
> > > > > > > > > > > > > we should update the MPEG-2, H.264 and H.265 controls
> > > > > > > > > > > > > accordingly.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > - Per-slice decoding
> > > > > > > > > > > > > 
> > > > > > > > > > > > > We've discussed this one already[0] and Hans has 
> > > > > > > > > > > > > submitted a
> > > > > > > > > > > > > patch[1]
> > > > > > > > > > > > > to implement the required core bits. When we agree it 
> > > > > > > > > > > > > looks
> > > > > > > > > > > > > good, we
> > > > > > > > > > > > > should lift the restriction that all slices must be
> > > > > > > > > > > > > concatenated and
> > > > > > > > > > > > > have them submitted as individual requests.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > One question is what to do about other controls. I 
> > > > > > > > > > > > > feel like
> > > > > > > > > > > > > it would
> > > > > > > > > > > > > make sense to always pass all the required controls 
> > > > > > > > > > > > > for
> > > > > > > > > > > > > decoding the
> > > > > > > > > > > > > slice, including the ones that don't change across 
> > > > > > > > > > > > > slices.
> > > > > > > > > > > > > But there

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Boris Brezillon
On Wed, 22 May 2019 09:29:24 +0200
Boris Brezillon  wrote:

> On Wed, 22 May 2019 15:39:37 +0900
> Tomasz Figa  wrote:
> 
> > > It would be premature to state that we are excluding. We are just
> > > trying to find one format to get things upstream, and make sure we have
> > > a plan how to extend it. Trying to support everything on the first try
> > > is not going to work so well.
> > >
> > > What is interesting to provide is how your IP achieves multi-slice
> > > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > > Typical questions are:
> > >
> > >   1. Do all slices have to be contiguous in memory?
> > >   2. If 1., do you place a start-code, AVC header, or pass a separate index 
> > > to let the HW locate the start of each NAL?
> > >   3. Does the HW support a single interrupt per frame (RK3288 as an 
> > > example does not, but RK3399 does)
> > 
> > AFAICT, the bit about RK3288 isn't true. At least in our downstream
> > driver that was created mostly by RK themselves, we've been assuming
> > that the interrupt is for the complete frame, without any problems.  
> 
> I confirm that's what happens when all slices forming a frame are packed
> in a single output buffer: you only get one interrupt at the end of the
> decoding process (in that case, when the frame is decoded). Of course,
> if you split things up and do per-slice decoding instead (one slice per
> buffer) you get an interrupt per slice, though I didn't manage to make
> that work.
> I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
> fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
> interrupt on the second slice, which makes me think some states are
> reset between the 2 operations leading the engine to think that the
> second slice is part of a new frame.
> 
> Anyway, it doesn't sound like a crazy idea to support both per-slice
> and per-frame decoding and maybe have a way to expose what a
> specific codec can do (through an extra cap mechanism).
> The other option would be to support only per-slice decoding with a
> mandatory START_FRAME/END_FRAME sequence to let drivers for HW that
> only support per-frame decoding know when they should trigger the
> decoding operation.

Just to clarify, we can use Hans' V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
work to identify start/end-of-frame boundaries; the only problem I see is
that users are not required to clear the flag on the last slice of a
frame, so there's no way for the driver to know when it should trigger
the decode-frame operation. I guess we could trigger this decode
operation when v4l2_m2m_release_capture_buf() returns true, but I
wonder if it's not too late to do that.

> The downside is that it implies having a bounce
> buffer where the driver can pack slices to be decoded on the END_FRAME
> event.
> 
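One way out of the "users are not required to clear the flag on the last
slice" problem is an explicit flush command; V4L2_DEC_CMD_FLUSH was later
added for exactly this case. The trigger decision could then be sketched as
below. This is a sketch of the rule, not driver code; the flag and command
values are redefined locally so the snippet is self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in the uAPI at the time of writing, redefined here so the
 * sketch stands alone. */
#define V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF 0x00000200
#define V4L2_DEC_CMD_FLUSH                 4

/* A frame decode is due either when a slice is queued without the hold
 * flag (frame boundary), or when userspace issues V4L2_DEC_CMD_FLUSH to
 * kick out a held capture buffer, e.g. after the last slice of the
 * stream. */
static bool frame_decode_due(uint32_t out_buf_flags, bool flush_requested)
{
	if (flush_requested)
		return true;
	return !(out_buf_flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF);
}
```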



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Thierry Reding
On Tue, May 21, 2019 at 12:07:47PM -0400, Nicolas Dufresne wrote:
> Le mardi 21 mai 2019 à 17:09 +0200, Thierry Reding a écrit :
> > On Tue, May 21, 2019 at 01:44:50PM +0200, Paul Kocialkowski wrote:
> > > Hi,
> > > 
> > > On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > > > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> > > >  wrote:
> > > > > Hi,
> > > > > 
> > > > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a écrit :
> > > > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a écrit :
> > > > > > > Hi,
> > > > > > > 
> > > > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > > > better idea of what the situation is like on platforms other than
> > > > > > > Allwinner. This email shares my conclusions about the situation 
> > > > > > > and how
> > > > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > > > > 
> > > > > > > - Per-slice decoding
> > > > > > > 
> > > > > > > We've discussed this one already[0] and Hans has submitted a 
> > > > > > > patch[1]
> > > > > > > to implement the required core bits. When we agree it looks good, 
> > > > > > > we
> > > > > > > should lift the restriction that all slices must be concatenated 
> > > > > > > and
> > > > > > > have them submitted as individual requests.
> > > > > > > 
> > > > > > > One question is what to do about other controls. I feel like it 
> > > > > > > would
> > > > > > > make sense to always pass all the required controls for decoding 
> > > > > > > the
> > > > > > > slice, including the ones that don't change across slices. But 
> > > > > > > there
> > > > > > > may be no particular advantage to this and only downsides. Not 
> > > > > > > doing it
> > > > > > > and relying on the "control cache" can work, but we need to 
> > > > > > > specify
> > > > > > > that only a single stream can be decoded per opened instance of 
> > > > > > > the
> > > > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > > > multi-slice anyway, so it shouldn't be an issue.
> > > > > > 
> > > > > > My opinion on this is that the m2m instance is a state, and the 
> > > > > > driver
> > > > > > should be responsible for doing time-division multiplexing across
> > > > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > > > userspace would require some sort of daemon to work properly across
> > > > > > processes. I also think the kernel is a better place for doing 
> > > > > > resource
> > > > > > access scheduling in general.
> > > > > 
> > > > > I agree with that yes. We always have a single m2m context and 
> > > > > specific
> > > > > controls per opened device so keeping cached values works out well.
> > > > > 
> > > > > So maybe we shall explicitly require that the request with the first
> > > > > slice for a frame also contains the per-frame controls.
> > > > > 
> > > > 
> > > > Agreed.
> > > > 
> > > > One more argument not to allow such multiplexing is that despite the
> > > > API being called "stateless", there is actually some state saved
> > > > between frames, e.g. the Rockchip decoder writes some intermediate
> > > > data to some local buffers which need to be given to the decoder to
> > > > decode the next frame. Actually, on Rockchip there is even a
> > > > requirement to keep the reference list entries in the same order
> > > > between frames.
> > > 
> > > Well, what I'm suggesting is to have one stream per m2m context, but it
> > > should certainly be possible to have multiple m2m contexts (multiple
> > > userspace open calls) that decode different streams concurrently.
> > > 
> > > Is that really going to be a problem for Rockchi

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-22 Thread Boris Brezillon
On Wed, 22 May 2019 15:39:37 +0900
Tomasz Figa  wrote:

> > It would be premature to state that we are excluding. We are just
> > trying to find one format to get things upstream, and make sure we have
> > a plan how to extend it. Trying to support everything on the first try
> > is not going to work so well.
> >
> > What is interesting to provide is how your IP achieves multi-slice
> > decoding per frame. That's what we are studying on the RK/Hantro chip.
> > Typical questions are:
> >
> >   1. Do all slices have to be contiguous in memory?
> >   2. If 1., do you place a start-code, AVC header, or pass a separate index to 
> > let the HW locate the start of each NAL?
> >   3. Does the HW support a single interrupt per frame (RK3288 as an 
> > example does not, but RK3399 does)
> 
> AFAICT, the bit about RK3288 isn't true. At least in our downstream
> driver that was created mostly by RK themselves, we've been assuming
> that the interrupt is for the complete frame, without any problems.

I confirm that's what happens when all slices forming a frame are packed
in a single output buffer: you only get one interrupt at the end of the
decoding process (in that case, when the frame is decoded). Of course,
if you split things up and do per-slice decoding instead (one slice per
buffer) you get an interrupt per slice, though I didn't manage to make
that work.
I get a DEC_BUFFER interrupt (AKA, "buffer is empty but frame is not
fully decoded") on the first slice and an ASO (Arbitrary Slice Ordering)
interrupt on the second slice, which makes me think some states are
reset between the two operations, leading the engine to think that the
second slice is part of a new frame.

Anyway, it doesn't sound like a crazy idea to support both per-slice
and per-frame decoding and maybe have a way to expose what a
specific codec can do (through an extra cap mechanism).
The other option would be to support only per-slice decoding with a
mandatory START_FRAME/END_FRAME sequence to let drivers for HW that
only support per-frame decoding know when they should trigger the
decoding operation. The downside is that it implies having a bounce
buffer where the driver can pack slices to be decoded on the END_FRAME
event.
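The bounce-buffer option described above could look roughly like this; the
buffer size and the choice of an Annex-B start code per slice are illustrative
assumptions, not a statement about any particular driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A driver that can only decode whole frames packs each submitted slice
 * NALU (prefixed with an Annex-B start code) into one contiguous buffer,
 * then fires the hardware on the END_FRAME / last-slice event. */
struct bounce_buf {
	uint8_t data[4096];
	size_t  used;
};

static const uint8_t annexb_start[3] = { 0x00, 0x00, 0x01 };

/* Append one slice; returns the byte offset it was placed at, or -1 if
 * the bounce buffer is full. */
static long pack_slice(struct bounce_buf *b, const uint8_t *slice, size_t len)
{
	long off;

	if (b->used + sizeof(annexb_start) + len > sizeof(b->data))
		return -1;
	off = (long)b->used;
	memcpy(b->data + b->used, annexb_start, sizeof(annexb_start));
	b->used += sizeof(annexb_start);
	memcpy(b->data + b->used, slice, len);
	b->used += len;
	return off;
}
```

The extra memcpy per slice is the downside Boris mentions: it trades memory
bandwidth for compatibility with per-frame-only hardware.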



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Tomasz Figa
On Sat, May 18, 2019 at 11:09 PM Nicolas Dufresne  wrote:
>
> Le samedi 18 mai 2019 à 12:29 +0200, Paul Kocialkowski a écrit :
> > Hi,
> >
> > Le samedi 18 mai 2019 à 12:04 +0200, Jernej Škrabec a écrit :
> > > Dne sobota, 18. maj 2019 ob 11:50:37 CEST je Paul Kocialkowski napisal(a):
> > > > Hi,
> > > >
> > > > On Fri, 2019-05-17 at 16:43 -0400, Nicolas Dufresne wrote:
> > > > > Le jeudi 16 mai 2019 à 20:45 +0200, Paul Kocialkowski a écrit :
> > > > > > Hi,
> > > > > >
> > > > > > Le jeudi 16 mai 2019 à 14:24 -0400, Nicolas Dufresne a écrit :
> > > > > > > Le mercredi 15 mai 2019 à 22:59 +0200, Paul Kocialkowski a écrit :
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Le mercredi 15 mai 2019 à 14:54 -0400, Nicolas Dufresne a écrit 
> > > > > > > > :
> > > > > > > > > Le mercredi 15 mai 2019 à 19:42 +0200, Paul Kocialkowski a 
> > > > > > > > > écrit :
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a 
> > > > > > > > > > écrit
> > > :
> > > > > > > > > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a
> > > écrit :
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > With the Rockchip stateless VPU driver in the works, we 
> > > > > > > > > > > > now
> > > > > > > > > > > > have a
> > > > > > > > > > > > better idea of what the situation is like on platforms 
> > > > > > > > > > > > other
> > > > > > > > > > > > than
> > > > > > > > > > > > Allwinner. This email shares my conclusions about the
> > > > > > > > > > > > situation and how
> > > > > > > > > > > > we should update the MPEG-2, H.264 and H.265 controls
> > > > > > > > > > > > accordingly.
> > > > > > > > > > > >
> > > > > > > > > > > > - Per-slice decoding
> > > > > > > > > > > >
> > > > > > > > > > > > We've discussed this one already[0] and Hans has 
> > > > > > > > > > > > submitted a
> > > > > > > > > > > > patch[1]
> > > > > > > > > > > > to implement the required core bits. When we agree it 
> > > > > > > > > > > > looks
> > > > > > > > > > > > good, we
> > > > > > > > > > > > should lift the restriction that all slices must be
> > > > > > > > > > > > concatenated and
> > > > > > > > > > > > have them submitted as individual requests.
> > > > > > > > > > > >
> > > > > > > > > > > > One question is what to do about other controls. I feel 
> > > > > > > > > > > > like
> > > > > > > > > > > > it would
> > > > > > > > > > > > make sense to always pass all the required controls for
> > > > > > > > > > > > decoding the
> > > > > > > > > > > > slice, including the ones that don't change across 
> > > > > > > > > > > > slices.
> > > > > > > > > > > > But there
> > > > > > > > > > > > may be no particular advantage to this and only 
> > > > > > > > > > > > downsides.
> > > > > > > > > > > > Not doing it
> > > > > > > > > > > > and relying on the "control cache" can work, but we 
> > > > > > > > > > > > need to
> > > > > > > > > > > > specify
> > > > > > > > > > > > that only a single stream can be decoded per opened 
> >

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Tomasz Figa
On Wed, May 22, 2019 at 1:23 AM Nicolas Dufresne  wrote:
>
> Le mardi 21 mai 2019 à 17:43 +0200, Thierry Reding a écrit :
> > On Wed, May 15, 2019 at 07:42:50PM +0200, Paul Kocialkowski wrote:
> > > Hi,
> > >
> > > Le mercredi 15 mai 2019 à 10:42 -0400, Nicolas Dufresne a écrit :
> > > > Le mercredi 15 mai 2019 à 12:09 +0200, Paul Kocialkowski a écrit :
> > > > > Hi,
> > > > >
> > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > better idea of what the situation is like on platforms other than
> > > > > Allwinner. This email shares my conclusions about the situation and 
> > > > > how
> > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > >
> > > > > - Per-slice decoding
> > > > >
> > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > to implement the required core bits. When we agree it looks good, we
> > > > > should lift the restriction that all slices must be concatenated and
> > > > > have them submitted as individual requests.
> > > > >
> > > > > One question is what to do about other controls. I feel like it would
> > > > > make sense to always pass all the required controls for decoding the
> > > > > slice, including the ones that don't change across slices. But there
> > > > > may be no particular advantage to this and only downsides. Not doing 
> > > > > it
> > > > > and relying on the "control cache" can work, but we need to specify
> > > > > that only a single stream can be decoded per opened instance of the
> > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > multi-slice anyway, so it shouldn't be an issue.
> > > >
> > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > should be responsible for doing time-division multiplexing across
> > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > userspace would require some sort of daemon to work properly across
> > > > processes. I also think the kernel is a better place for doing resource
> > > > access scheduling in general.
> > >
> > > I agree with that yes. We always have a single m2m context and specific
> > > controls per opened device so keeping cached values works out well.
> > >
> > > So maybe we shall explicitly require that the request with the first
> > > slice for a frame also contains the per-frame controls.
> > >
> > > > > - Annex-B formats
> > > > >
> > > > > I don't think we have really reached a conclusion on the pixel formats
> > > > > we want to expose. The main issue is how to deal with codecs that need
> > > > > the full slice NALU with start code, where the slice_header is
> > > > > duplicated in raw bitstream, when others are fine with just the 
> > > > > encoded
> > > > > slice data and the parsed slice header control.
> > > > >
> > > > > My initial thinking was that we'd need 3 formats:
> > > > > - One that only takes only the slice compressed data (without raw 
> > > > > slice
> > > > > header and start code);
> > > > > - One that takes both the NALU data (including start code, raw header
> > > > > and compressed data) and slice header controls;
> > > > > - One that takes the NALU data but no slice header.
> > > > >
> > > > > But I no longer think the latter really makes sense in the context of
> > > > > stateless video decoding.
> > > > >
> > > > > A side-note: I think we should definitely have data offsets in every
> > > > > case, so that implementations can just push the whole NALU regardless
> > > > > of the format if they're lazy.
> > > >
> > > > I realize that I didn't share our latest research on the subject. So a
> > > > slice in the original bitstream is formed of the following blocks
> > > > (simplified):
> > > >
> > > >   [nal_header][nal_type][slice_header][slice]
> > >
> > > Thanks for the details!
> > >
> > > > nal_header:
> > > > This one is a header used to locate t

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Tomasz Figa
On Tue, May 21, 2019 at 8:45 PM Paul Kocialkowski
 wrote:
>
> Hi,
>
> On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> >  wrote:
> > > Hi,
> > >
> > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > > Hi,
> > > > >
> > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > better idea of what the situation is like on platforms other than
> > > > > Allwinner. This email shares my conclusions about the situation and how
> > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > >
> > > > > - Per-slice decoding
> > > > >
> > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > to implement the required core bits. When we agree it looks good, we
> > > > > should lift the restriction that all slices must be concatenated and
> > > > > have them submitted as individual requests.
> > > > >
> > > > > One question is what to do about other controls. I feel like it would
> > > > > make sense to always pass all the required controls for decoding the
> > > > > slice, including the ones that don't change across slices. But there
> > > > > may be no particular advantage to this and only downsides. Not doing it
> > > > > and relying on the "control cache" can work, but we need to specify
> > > > > that only a single stream can be decoded per opened instance of the
> > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > multi-slice anyway, so it shouldn't be an issue.
> > > >
> > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > should be responsible for doing time-division multiplexing across
> > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > userspace would require some sort of daemon to work properly across
> > > > processes. I also think the kernel is a better place for doing resource
> > > > access scheduling in general.
> > >
> > > I agree with that yes. We always have a single m2m context and specific
> > > controls per opened device so keeping cached values works out well.
> > >
> > > So maybe we shall explicitly require that the request with the first
> > > slice for a frame also contains the per-frame controls.
> > >
> >
> > Agreed.
> >
> > One more argument not to allow such multiplexing is that despite the

^^ Here I meant the "userspace multiplexing".

> > API being called "stateless", there is actually some state saved
> > between frames, e.g. the Rockchip decoder writes some intermediate
> > data to some local buffers which need to be given to the decoder to
> > decode the next frame. Actually, on Rockchip there is even a
> > requirement to keep the reference list entries in the same order
> > between frames.
>
> Well, what I'm suggesting is to have one stream per m2m context, but it
> should certainly be possible to have multiple m2m contexts (multiple
> userspace open calls) that decode different streams concurrently.
>
> Is that really going to be a problem for Rockchip? If so, then the
> driver should probably enforce allowing a single userspace open and m2m
> context at a time.

No, that's not what I meant. Obviously the driver can switch between
different sets of private buffers when scheduling different contexts,
as long as the userspace doesn't attempt to do any multiplexing
itself.

Best regards,
Tomasz


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Nicolas Dufresne
On Tuesday, 21 May 2019 at 17:43 +0200, Thierry Reding wrote:
> On Wed, May 15, 2019 at 07:42:50PM +0200, Paul Kocialkowski wrote:
> > Hi,
> > 
> > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > Hi,
> > > > 
> > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > better idea of what the situation is like on platforms other than
> > > > Allwinner. This email shares my conclusions about the situation and how
> > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > 
> > > > - Per-slice decoding
> > > > 
> > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > to implement the required core bits. When we agree it looks good, we
> > > > should lift the restriction that all slices must be concatenated and
> > > > have them submitted as individual requests.
> > > > 
> > > > One question is what to do about other controls. I feel like it would
> > > > make sense to always pass all the required controls for decoding the
> > > > slice, including the ones that don't change across slices. But there
> > > > may be no particular advantage to this and only downsides. Not doing it
> > > > and relying on the "control cache" can work, but we need to specify
> > > > that only a single stream can be decoded per opened instance of the
> > > > v4l2 device. This is the assumption we're going with for handling
> > > > multi-slice anyway, so it shouldn't be an issue.
> > > 
> > > My opinion on this is that the m2m instance is a state, and the driver
> > > should be responsible for doing time-division multiplexing across
> > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > userspace would require some sort of daemon to work properly across
> > > processes. I also think the kernel is a better place for doing resource
> > > access scheduling in general.
> > 
> > I agree with that yes. We always have a single m2m context and specific
> > controls per opened device so keeping cached values works out well.
> > 
> > So maybe we shall explicitly require that the request with the first
> > slice for a frame also contains the per-frame controls.
> > 
> > > > - Annex-B formats
> > > > 
> > > > I don't think we have really reached a conclusion on the pixel formats
> > > > we want to expose. The main issue is how to deal with codecs that need
> > > > the full slice NALU with start code, where the slice_header is
> > > > duplicated in raw bitstream, when others are fine with just the encoded
> > > > slice data and the parsed slice header control.
> > > > 
> > > > My initial thinking was that we'd need 3 formats:
> > > > - One that takes only the slice compressed data (without raw slice
> > > > header and start code);
> > > > - One that takes both the NALU data (including start code, raw header
> > > > and compressed data) and slice header controls;
> > > > - One that takes the NALU data but no slice header.
> > > > 
> > > > But I no longer think the latter really makes sense in the context of
> > > > stateless video decoding.
> > > > 
> > > > A side-note: I think we should definitely have data offsets in every
> > > > case, so that implementations can just push the whole NALU regardless
> > > > of the format if they're lazy.
> > > 
> > > I realize that I didn't share our latest research on the subject. So a
> > > slice in the original bitstream is formed of the following blocks
> > > (simplified):
> > > 
> > >   [nal_header][nal_type][slice_header][slice]
> > 
> > Thanks for the details!
> > 
> > > nal_header:
> > > This one is a header used to locate the start and the end of a
> > > NAL. There are two standard forms: the ANNEX B / start code, a sequence
> > > of 3 bytes 0x00 0x00 0x01. You'll often see 4 bytes; the first byte
> > > would be a leading 0 from the previous NAL's padding, but this is also
> > > a totally valid start code. The second form is the AVC form, notably used
> > > in the ISOMP4 container. It is simply the size of the NAL. You must keep
> > > your buffer aligned to NALs in this case as you cannot scan from a
> > > random location.

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Nicolas Dufresne
On Tuesday, 21 May 2019 at 17:09 +0200, Thierry Reding wrote:
> On Tue, May 21, 2019 at 01:44:50PM +0200, Paul Kocialkowski wrote:
> > Hi,
> > 
> > On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> > >  wrote:
> > > > Hi,
> > > > 
> > > > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > > better idea of what the situation is like on platforms other than
> > > > > > Allwinner. This email shares my conclusions about the situation and how
> > > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > > > 
> > > > > > - Per-slice decoding
> > > > > > 
> > > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > > to implement the required core bits. When we agree it looks good, we
> > > > > > should lift the restriction that all slices must be concatenated and
> > > > > > have them submitted as individual requests.
> > > > > > 
> > > > > > One question is what to do about other controls. I feel like it would
> > > > > > make sense to always pass all the required controls for decoding the
> > > > > > slice, including the ones that don't change across slices. But there
> > > > > > may be no particular advantage to this and only downsides. Not doing it
> > > > > > and relying on the "control cache" can work, but we need to specify
> > > > > > that only a single stream can be decoded per opened instance of the
> > > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > > multi-slice anyway, so it shouldn't be an issue.
> > > > > 
> > > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > > should be responsible for doing time-division multiplexing across
> > > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > > userspace would require some sort of daemon to work properly across
> > > > > processes. I also think the kernel is a better place for doing resource
> > > > > access scheduling in general.
> > > > 
> > > > I agree with that yes. We always have a single m2m context and specific
> > > > controls per opened device so keeping cached values works out well.
> > > > 
> > > > So maybe we shall explicitly require that the request with the first
> > > > slice for a frame also contains the per-frame controls.
> > > > 
> > > 
> > > Agreed.
> > > 
> > > One more argument not to allow such multiplexing is that despite the
> > > API being called "stateless", there is actually some state saved
> > > between frames, e.g. the Rockchip decoder writes some intermediate
> > > data to some local buffers which need to be given to the decoder to
> > > decode the next frame. Actually, on Rockchip there is even a
> > > requirement to keep the reference list entries in the same order
> > > between frames.
> > 
> > Well, what I'm suggesting is to have one stream per m2m context, but it
> > should certainly be possible to have multiple m2m contexts (multiple
> > userspace open calls) that decode different streams concurrently.
> > 
> > Is that really going to be a problem for Rockchip? If so, then the
> > driver should probably enforce allowing a single userspace open and m2m
> > context at a time.
> 
> If you have hardware storing data necessary to the decoding process in
> buffers local to the decoder you'd have to have some sort of context
> switch operation that backs up the data in those buffers before you
> switch to a different context and restore those buffers when you switch
> back. We have similar hardware on Tegra, though I'm not exactly familiar
> with the details of what is saved and how essential it is. My
> understanding is that those internal buffers can be copied to external
> RAM or vice versa, but I suspect that this isn't going to be very
> efficient. It may very well be that restricting to a single userspace
> open is the most sensible option.

That would be by far the worst for a browser use case, where an ad
might have stolen the single instance you have available in HW. It's
normal that context switching will have some impact on performance, but
in general, most of the time, the other instances will be left idle by
userspace. If there are no context switches, there should be no (or
very little) overhead. Of course, it should not be a hard requirement
to get a driver in the kernel, I'm not saying that.

p.s. For the IMX8M/Hantro G1 they specifically say that the single-core
decoder can handle up to 8 1080p60 streams at the same time. But there
are some buffers written back by the IP for every slice (at the end
of the decoded reference frames).

> 
> Thierry




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Thierry Reding
On Wed, May 15, 2019 at 07:42:50PM +0200, Paul Kocialkowski wrote:
> Hi,
> 
> > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > Hi,
> > > 
> > > With the Rockchip stateless VPU driver in the works, we now have a
> > > better idea of what the situation is like on platforms other than
> > > Allwinner. This email shares my conclusions about the situation and how
> > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > 
> > > - Per-slice decoding
> > > 
> > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > to implement the required core bits. When we agree it looks good, we
> > > should lift the restriction that all slices must be concatenated and
> > > have them submitted as individual requests.
> > > 
> > > One question is what to do about other controls. I feel like it would
> > > make sense to always pass all the required controls for decoding the
> > > slice, including the ones that don't change across slices. But there
> > > may be no particular advantage to this and only downsides. Not doing it
> > > and relying on the "control cache" can work, but we need to specify
> > > that only a single stream can be decoded per opened instance of the
> > > v4l2 device. This is the assumption we're going with for handling
> > > multi-slice anyway, so it shouldn't be an issue.
> > 
> > My opinion on this is that the m2m instance is a state, and the driver
> > should be responsible for doing time-division multiplexing across
> > multiple m2m instance jobs. Doing the time-division multiplexing in
> > userspace would require some sort of daemon to work properly across
> > processes. I also think the kernel is a better place for doing resource
> > access scheduling in general.
> 
> I agree with that yes. We always have a single m2m context and specific
> controls per opened device so keeping cached values works out well.
> 
> So maybe we shall explicitly require that the request with the first
> slice for a frame also contains the per-frame controls.
> 
> > > - Annex-B formats
> > > 
> > > I don't think we have really reached a conclusion on the pixel formats
> > > we want to expose. The main issue is how to deal with codecs that need
> > > the full slice NALU with start code, where the slice_header is
> > > duplicated in raw bitstream, when others are fine with just the encoded
> > > slice data and the parsed slice header control.
> > > 
> > > My initial thinking was that we'd need 3 formats:
> > > - One that takes only the slice compressed data (without raw slice
> > > header and start code);
> > > - One that takes both the NALU data (including start code, raw header
> > > and compressed data) and slice header controls;
> > > - One that takes the NALU data but no slice header.
> > > 
> > > But I no longer think the latter really makes sense in the context of
> > > stateless video decoding.
> > > 
> > > A side-note: I think we should definitely have data offsets in every
> > > case, so that implementations can just push the whole NALU regardless
> > > of the format if they're lazy.
> > 
> > I realize that I didn't share our latest research on the subject. So a
> > slice in the original bitstream is formed of the following blocks
> > (simplified):
> > 
> >   [nal_header][nal_type][slice_header][slice]
> 
> Thanks for the details!
> 
> > nal_header:
> > This one is a header used to locate the start and the end of a
> > NAL. There are two standard forms: the ANNEX B / start code, a sequence
> > of 3 bytes 0x00 0x00 0x01. You'll often see 4 bytes; the first byte
> > would be a leading 0 from the previous NAL's padding, but this is also
> > a totally valid start code. The second form is the AVC form, notably used
> > in the ISOMP4 container. It is simply the size of the NAL. You must keep
> > your buffer aligned to NALs in this case as you cannot scan from a random
> > location.
> > 
> > nal_type:
> > It's a bit more than just the type, but it contains at least the
> > information of the NAL type. This has a different size on H.264 and HEVC,
> > but its size is in bytes.
> > 
> > slice_header:
> > This contains per slice parameters, like the modification lists to
> > apply on the references. This one has a siz
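
[Editor's aside, not part of the thread: the two NAL framing forms described above can be sketched in C. The function names are made up, and a 4-byte AVC length prefix is assumed; real containers can also use 1- or 2-byte prefixes.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Annex B: scan for the 00 00 01 start code; any extra leading zeros
 * (the common 4-byte form) are naturally skipped by the scan.
 * Returns the offset of the first byte after the start code, or -1. */
static ptrdiff_t annexb_next_nal(const uint8_t *buf, size_t len, size_t from)
{
	size_t i;

	for (i = from; i + 2 < len; i++)
		if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
			return (ptrdiff_t)(i + 3);
	return -1;
}

/* AVC form: each NAL is prefixed with its size (4-byte big-endian
 * assumed here).  Reads the size at *off and advances *off past the
 * prefix; the caller must already be NAL-aligned. */
static uint32_t avc_nal_size(const uint8_t *buf, size_t *off)
{
	const uint8_t *p = buf + *off;

	*off += 4;
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

This matches the caveat above: with Annex B you can resynchronize by scanning from anywhere, while with the AVC form you must stay NAL-aligned because there is nothing to scan for.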

Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Thierry Reding
On Tue, May 21, 2019 at 01:44:50PM +0200, Paul Kocialkowski wrote:
> Hi,
> 
> On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> > On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
> >  wrote:
> > > Hi,
> > > 
> > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > > Hi,
> > > > > 
> > > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > > better idea of what the situation is like on platforms other than
> > > > > Allwinner. This email shares my conclusions about the situation and how
> > > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > > 
> > > > > - Per-slice decoding
> > > > > 
> > > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > > to implement the required core bits. When we agree it looks good, we
> > > > > should lift the restriction that all slices must be concatenated and
> > > > > have them submitted as individual requests.
> > > > > 
> > > > > One question is what to do about other controls. I feel like it would
> > > > > make sense to always pass all the required controls for decoding the
> > > > > slice, including the ones that don't change across slices. But there
> > > > > may be no particular advantage to this and only downsides. Not doing it
> > > > > and relying on the "control cache" can work, but we need to specify
> > > > > that only a single stream can be decoded per opened instance of the
> > > > > v4l2 device. This is the assumption we're going with for handling
> > > > > multi-slice anyway, so it shouldn't be an issue.
> > > > 
> > > > My opinion on this is that the m2m instance is a state, and the driver
> > > > should be responsible for doing time-division multiplexing across
> > > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > > userspace would require some sort of daemon to work properly across
> > > > processes. I also think the kernel is a better place for doing resource
> > > > access scheduling in general.
> > > 
> > > I agree with that yes. We always have a single m2m context and specific
> > > controls per opened device so keeping cached values works out well.
> > > 
> > > So maybe we shall explicitly require that the request with the first
> > > slice for a frame also contains the per-frame controls.
> > > 
> > 
> > Agreed.
> > 
> > One more argument not to allow such multiplexing is that despite the
> > API being called "stateless", there is actually some state saved
> > between frames, e.g. the Rockchip decoder writes some intermediate
> > data to some local buffers which need to be given to the decoder to
> > decode the next frame. Actually, on Rockchip there is even a
> > requirement to keep the reference list entries in the same order
> > between frames.
> 
> Well, what I'm suggesting is to have one stream per m2m context, but it
> should certainly be possible to have multiple m2m contexts (multiple
> userspace open calls) that decode different streams concurrently.
> 
> Is that really going to be a problem for Rockchip? If so, then the
> driver should probably enforce allowing a single userspace open and m2m
> context at a time.

If you have hardware storing data necessary to the decoding process in
buffers local to the decoder you'd have to have some sort of context
switch operation that backs up the data in those buffers before you
switch to a different context and restore those buffers when you switch
back. We have similar hardware on Tegra, though I'm not exactly familiar
with the details of what is saved and how essential it is. My
understanding is that those internal buffers can be copied to external
RAM or vice versa, but I suspect that this isn't going to be very
efficient. It may very well be that restricting to a single userspace
open is the most sensible option.

Thierry




Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Paul Kocialkowski
Hi,

On Tue, 2019-05-21 at 19:27 +0900, Tomasz Figa wrote:
> On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
>  wrote:
> > Hi,
> > 
> > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > Hi,
> > > > 
> > > > With the Rockchip stateless VPU driver in the works, we now have a
> > > > better idea of what the situation is like on platforms other than
> > > > Allwinner. This email shares my conclusions about the situation and how
> > > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > > > 
> > > > - Per-slice decoding
> > > > 
> > > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > > to implement the required core bits. When we agree it looks good, we
> > > > should lift the restriction that all slices must be concatenated and
> > > > have them submitted as individual requests.
> > > > 
> > > > One question is what to do about other controls. I feel like it would
> > > > make sense to always pass all the required controls for decoding the
> > > > slice, including the ones that don't change across slices. But there
> > > > may be no particular advantage to this and only downsides. Not doing it
> > > > and relying on the "control cache" can work, but we need to specify
> > > > that only a single stream can be decoded per opened instance of the
> > > > v4l2 device. This is the assumption we're going with for handling
> > > > multi-slice anyway, so it shouldn't be an issue.
> > > 
> > > My opinion on this is that the m2m instance is a state, and the driver
> > > should be responsible for doing time-division multiplexing across
> > > multiple m2m instance jobs. Doing the time-division multiplexing in
> > > userspace would require some sort of daemon to work properly across
> > > processes. I also think the kernel is a better place for doing resource
> > > access scheduling in general.
> > 
> > I agree with that yes. We always have a single m2m context and specific
> > controls per opened device so keeping cached values works out well.
> > 
> > So maybe we shall explicitly require that the request with the first
> > slice for a frame also contains the per-frame controls.
> > 
> 
> Agreed.
> 
> One more argument not to allow such multiplexing is that despite the
> API being called "stateless", there is actually some state saved
> between frames, e.g. the Rockchip decoder writes some intermediate
> data to some local buffers which need to be given to the decoder to
> decode the next frame. Actually, on Rockchip there is even a
> requirement to keep the reference list entries in the same order
> between frames.

Well, what I'm suggesting is to have one stream per m2m context, but it
should certainly be possible to have multiple m2m contexts (multiple
userspace open calls) that decode different streams concurrently.

Is that really going to be a problem for Rockchip? If so, then the
driver should probably enforce allowing a single userspace open and m2m
context at a time.

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com



Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-21 Thread Tomasz Figa
On Thu, May 16, 2019 at 2:43 AM Paul Kocialkowski
 wrote:
>
> Hi,
>
> > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > Hi,
> > >
> > > With the Rockchip stateless VPU driver in the works, we now have a
> > > better idea of what the situation is like on platforms other than
> > > Allwinner. This email shares my conclusions about the situation and how
> > > we should update the MPEG-2, H.264 and H.265 controls accordingly.
> > >
> > > - Per-slice decoding
> > >
> > > We've discussed this one already[0] and Hans has submitted a patch[1]
> > > to implement the required core bits. When we agree it looks good, we
> > > should lift the restriction that all slices must be concatenated and
> > > have them submitted as individual requests.
> > >
> > > One question is what to do about other controls. I feel like it would
> > > make sense to always pass all the required controls for decoding the
> > > slice, including the ones that don't change across slices. But there
> > > may be no particular advantage to this and only downsides. Not doing it
> > > and relying on the "control cache" can work, but we need to specify
> > > that only a single stream can be decoded per opened instance of the
> > > v4l2 device. This is the assumption we're going with for handling
> > > multi-slice anyway, so it shouldn't be an issue.
> >
> > My opinion on this is that the m2m instance is a state, and the driver
> > should be responsible for doing time-division multiplexing across
> > multiple m2m instance jobs. Doing the time-division multiplexing in
> > userspace would require some sort of daemon to work properly across
> > processes. I also think the kernel is a better place for doing resource
> > access scheduling in general.
>
> I agree with that yes. We always have a single m2m context and specific
> controls per opened device so keeping cached values works out well.
>
> So maybe we shall explicitly require that the request with the first
> slice for a frame also contains the per-frame controls.
>

Agreed.

One more argument not to allow such multiplexing is that despite the
API being called "stateless", there is actually some state saved
between frames, e.g. the Rockchip decoder writes some intermediate
data to some local buffers which need to be given to the decoder to
decode the next frame. Actually, on Rockchip there is even a
requirement to keep the reference list entries in the same order
between frames.

Best regards,
Tomasz


Re: Proposed updates and guidelines for MPEG-2, H.264 and H.265 stateless support

2019-05-18 Thread Nicolas Dufresne
On Saturday, 18 May 2019 at 12:29 +0200, Paul Kocialkowski wrote:
> Hi,
> 
> > On Saturday, 18 May 2019 at 12:04 +0200, Jernej Škrabec wrote:
> > > On Saturday, 18 May 2019 at 11:50:37 CEST, Paul Kocialkowski wrote:
> > > Hi,
> > > 
> > > On Fri, 2019-05-17 at 16:43 -0400, Nicolas Dufresne wrote:
> > > > On Thursday, 16 May 2019 at 20:45 +0200, Paul Kocialkowski wrote:
> > > > > Hi,
> > > > > 
> > > > > > On Thursday, 16 May 2019 at 14:24 -0400, Nicolas Dufresne wrote:
> > > > > > > On Wednesday, 15 May 2019 at 22:59 +0200, Paul Kocialkowski wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > > On Wednesday, 15 May 2019 at 14:54 -0400, Nicolas Dufresne wrote:
> > > > > > > > > On Wednesday, 15 May 2019 at 19:42 +0200, Paul Kocialkowski wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > > On Wednesday, 15 May 2019 at 10:42 -0400, Nicolas Dufresne wrote:
> > > > > > > > > > > On Wednesday, 15 May 2019 at 12:09 +0200, Paul Kocialkowski wrote:
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > With the Rockchip stateless VPU driver in the works, we now
> > > > > > > > > > > have a better idea of what the situation is like on platforms
> > > > > > > > > > > other than Allwinner. This email shares my conclusions about
> > > > > > > > > > > the situation and how we should update the MPEG-2, H.264 and
> > > > > > > > > > > H.265 controls accordingly.
> > > > > > > > > > >
> > > > > > > > > > > - Per-slice decoding
> > > > > > > > > > >
> > > > > > > > > > > We've discussed this one already[0] and Hans has submitted a
> > > > > > > > > > > patch[1] to implement the required core bits. When we agree it
> > > > > > > > > > > looks good, we should lift the restriction that all slices must
> > > > > > > > > > > be concatenated and have them submitted as individual requests.
> > > > > > > > > > >
> > > > > > > > > > > One question is what to do about other controls. I feel like it
> > > > > > > > > > > would make sense to always pass all the required controls for
> > > > > > > > > > > decoding the slice, including the ones that don't change across
> > > > > > > > > > > slices. But there may be no particular advantage to this and
> > > > > > > > > > > only downsides. Not doing it and relying on the "control cache"
> > > > > > > > > > > can work, but we need to specify that only a single stream can
> > > > > > > > > > > be decoded per opened instance of the v4l2 device. This is the
> > > > > > > > > > > assumption we're going with for handling multi-slice anyway, so
> > > > > > > > > > > it shouldn't be an issue.
> > > > > > > > > > 
> > > > > > > > > > My opinion on this is that the 
