Re: [FFmpeg-devel] [WIP] False positives on Coverity

2024-05-14 Thread Tomas Härdin
Formal methods would be better than the heuristics Coverity uses. At
the moment such methods are still too expensive for general use except
for the most safety-critical applications (aerospace etc.). But perhaps
in time the tooling and SMT solvers will improve sufficiently to make
them commonplace.

For FFmpeg, the Eva and WP plugins for Frama-C would be relevant.
I've been toying with the idea on and off.

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 1/2] lavc/speedhqdec: Add AV_CODEC_CAP_SLICE_THREADS

2024-05-14 Thread Tomas Härdin
I forgot to mention that it is possible to go even further in parallelizing
the decoder by separating VLC decode from IDCT. This would hurt serial
performance, however, since all coefficients would likely not fit in the
innermost cache. For 4k yuv444p this is 51 MiB. Even 1080p yuv420p is
still 6 MiB. For comparison, at the moment only a single DCT block is
kept, which is just 128 bytes.

Another possibility could be to have two threads per slice,
interleaving VLC decode and IDCT, using double buffering on more
modestly sized buffers: two 16-line buffers per pair of threads, or 768
KiB for 4k, 3 MiB across all four thread pairs. But this would require
mutexes between pairs of threads, which execute2() isn't designed for.
A final possibility is to do just enough upfront work to find the bit
boundary between 16-line blocks within each slice. That is, doing VLC
decode but not writing coefficients anywhere. Then each group of 16
lines could be made into its own job since we know the exact start and
end bit for all of them. This would degrade serial performance and
performance with 4/8 threads, but would probably be a win for 4k and up
with large numbers of threads.

/Tomas


Re: [FFmpeg-devel] [PATCH v2 1/1] libavdevice/decklink: extend available actions on signal loss

2024-05-14 Thread Michael Riedl
>
>> Deprecate the option 'draw_bars' in favor of the new option 
>> 'signal_loss_action',
>> which controls the behavior when the input signal is not available
>> (including the behavior previously available through draw_bars).
>> The default behavior remains unchanged to be backwards compatible.
>> The new option is more flexible for extending now and in the future.
>>
>> The new value 'repeat' repeats the last video frame.
>> This is useful for very short dropouts and was not available before.
>
> As far as I see, you are overriding frameBytes for a repeated frame, that 
> seems wrong. pkt.data (frameBytes) must be associated with the videoFrame 
> which is passed to av_buffer_create() later on.
>
> Every AVFrame returned by the decklink device has an AVBuffer set up which
> keeps a reference to the original DeckLink frame. This allows the use of the 
> DeckLink frame's raw buffer directly. But you cannot use the raw buffer of 
> another DeckLink frame for which the AVBuffer of the AVFrame does not keep a 
> reference.

Thank you for your feedback!

I took another look at the code and revisited the DeckLink documentation to 
ensure my understanding was correct. It seems that frameBytes is a pointer to 
the buffer of an IDeckLinkVideoFrame, and it remains valid as long as the 
videoFrame is not released. To handle this, I add a reference to the DeckLink 
videoFrame to keep it valid and then release it (decreasing the reference 
counter) when it's no longer needed. Updating frameBytes multiple times should 
be okay since it just points to the raw frame buffer.

I ran some tests using Valgrind with and without the repeat option. I found a 
memory leak introduced by the patch because the destructor of 
decklink_input_callback is never called, which leaves one last video frame 
unreleased.
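
The ownership rule described above can be sketched in plain C with a mock reference-counted frame. The MockFrame and LastFrame types and their helper functions are hypothetical stand-ins for the AddRef()/Release() pair on a DeckLink frame; they are not the DeckLink SDK API. The sketch only illustrates that the raw buffer stays valid while at least one reference is held, and that the last kept frame must be released on teardown to avoid the kind of leak mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted frame; not a DeckLink type. */
typedef struct MockFrame {
    int            refs;   /* number of outstanding references */
    unsigned char *bytes;  /* raw buffer, valid while refs > 0 */
} MockFrame;

static MockFrame *frame_create(size_t size)
{
    MockFrame *f = malloc(sizeof(*f));
    if (!f)
        return NULL;
    f->bytes = calloc(1, size);
    if (!f->bytes) {
        free(f);
        return NULL;
    }
    f->refs = 1; /* the creator holds the first reference */
    return f;
}

static int frame_add_ref(MockFrame *f)
{
    assert(f && f->refs > 0);
    return ++f->refs;
}

/* Drops one reference and frees the frame when none remain.
 * Returns the number of references left. */
static int frame_release(MockFrame *f)
{
    int remaining;

    assert(f && f->refs > 0);
    remaining = --f->refs;
    if (remaining == 0) {
        free(f->bytes);
        free(f);
    }
    return remaining;
}

/* Holder for the most recent good frame. */
typedef struct LastFrame {
    MockFrame *frame;
} LastFrame;

/* Remember `incoming` as the newest frame and let go of the old one. */
static void last_frame_update(LastFrame *lf, MockFrame *incoming)
{
    frame_add_ref(incoming);      /* take our own reference first */
    if (lf->frame)
        frame_release(lf->frame); /* drop the previously kept frame */
    lf->frame = incoming;
}

/* Teardown: without this the final kept frame would leak. */
static void last_frame_clear(LastFrame *lf)
{
    if (lf->frame) {
        frame_release(lf->frame);
        lf->frame = NULL;
    }
}
```

Taking the new reference before releasing the old one keeps the sketch safe even if the same frame is passed twice in a row.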

I'll work on fixing this memory leak and send an update soon. If you have any 
more comments or concerns, please let me know.

Best regards,
Michael Riedl



[FFmpeg-devel] [PATCH] lavc/speedhqenc: Require width to be a multiple of 16

2024-05-14 Thread Tomas Härdin
Stopgap solution. I don't know enough about mpegvideo_enc to provide a
proper implementation, nor do I have access to NDI hardware to feel
comfortable with it. This patch at least prevents us from outputting
files that we know are broken.

/Tomas
From 9dd76f9ec153c3d10374a2a4a74348dc39458c07 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= 
Date: Tue, 14 May 2024 13:03:22 +0200
Subject: [PATCH] lavc/speedhqenc: Require width to be a multiple of 16

---
 libavcodec/speedhqenc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libavcodec/speedhqenc.c b/libavcodec/speedhqenc.c
index 5b4ff4c139..39ed244bca 100644
--- a/libavcodec/speedhqenc.c
+++ b/libavcodec/speedhqenc.c
@@ -104,6 +104,12 @@ av_cold int ff_speedhq_encode_init(MpegEncContext *s)
 return AVERROR(EINVAL);
 }
 
+// border is not implemented correctly at the moment, see ticket #10078
+if (s->width % 16) {
+av_log(s, AV_LOG_ERROR, "width must be a multiple of 16\n");
+return AVERROR_PATCHWELCOME;
+}
+
 s->min_qcoeff = -2048;
 s->max_qcoeff = 2047;
 
-- 
2.39.2



[FFmpeg-devel] [PATCH v3 0/1] Updated decklink patch

2024-05-14 Thread Michael Riedl
Hi,

This patch adds the possibility to specify an action to be taken when a signal 
loss is detected on a decklink device.
Version 3 of this patch fixes a memory leak present in v2. Thanks to Marton 
Balint for reviewing the patch.

Kind regards,
Michael


Michael Riedl (1):
  libavdevice/decklink: extend available actions on signal loss

 doc/indevs.texi | 16 
 libavdevice/decklink_common.h   |  1 +
 libavdevice/decklink_common_c.h |  7 +++
 libavdevice/decklink_dec.cpp| 24 +++-
 libavdevice/decklink_dec_c.c|  6 +-
 5 files changed, 52 insertions(+), 2 deletions(-)

--
2.43.0



[FFmpeg-devel] [PATCH v3 1/1] libavdevice/decklink: extend available actions on signal loss

2024-05-14 Thread Michael Riedl
Deprecate the option 'draw_bars' in favor of the new option 
'signal_loss_action',
which controls the behavior when the input signal is not available
(including the behavior previously available through draw_bars).
The default behavior remains unchanged to be backwards compatible.
The new option is more flexible for extending now and in the future.

The new value 'repeat' repeats the last video frame.
This is useful for very short dropouts and was not available before.

Signed-off-by: Michael Riedl 
---
 doc/indevs.texi | 16 
 libavdevice/decklink_common.h   |  1 +
 libavdevice/decklink_common_c.h |  7 +++
 libavdevice/decklink_dec.cpp| 24 +++-
 libavdevice/decklink_dec_c.c|  6 +-
 5 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/doc/indevs.texi b/doc/indevs.texi
index 734fc657523..cdf44a66382 100644
--- a/doc/indevs.texi
+++ b/doc/indevs.texi
@@ -396,6 +396,22 @@ Defaults to @samp{audio}.
 @item draw_bars
 If set to @samp{true}, color bars are drawn in the event of a signal loss.
 Defaults to @samp{true}.
+This option is deprecated, please use the @code{signal_loss_action} option.
+
+@item signal_loss_action
+Sets the action to take in the event of a signal loss. Accepts one of the
+following values:
+
+@table @option
+@item 1, none
+Do nothing on signal loss. This usually results in black frames.
+@item 2, bars
+Draw color bars on signal loss. Only supported for 8-bit input signals.
+@item 3, repeat
+Repeat the last video frame on signal loss.
+@end table
+
+Defaults to @samp{bars}.
 
 @item queue_size
 Sets maximum input buffer size in bytes. If the buffering reaches this value,
diff --git a/libavdevice/decklink_common.h b/libavdevice/decklink_common.h
index c54a635876c..6b32dc2d09c 100644
--- a/libavdevice/decklink_common.h
+++ b/libavdevice/decklink_common.h
@@ -147,6 +147,7 @@ struct decklink_ctx {
 DecklinkPtsSource video_pts_source;
 int draw_bars;
 BMDPixelFormat raw_format;
+DecklinkSignalLossAction signal_loss_action;
 
 int frames_preroll;
 int frames_buffer;
diff --git a/libavdevice/decklink_common_c.h b/libavdevice/decklink_common_c.h
index 9c55d891494..53d2c583e7e 100644
--- a/libavdevice/decklink_common_c.h
+++ b/libavdevice/decklink_common_c.h
@@ -37,6 +37,12 @@ typedef enum DecklinkPtsSource {
 PTS_SRC_NB
 } DecklinkPtsSource;
 
+typedef enum DecklinkSignalLossAction {
+SIGNAL_LOSS_NONE= 1,
+SIGNAL_LOSS_REPEAT  = 2,
+SIGNAL_LOSS_BARS= 3
+} DecklinkSignalLossAction;
+
 struct decklink_cctx {
 const AVClass *cclass;
 
@@ -68,6 +74,7 @@ struct decklink_cctx {
 int64_t timestamp_align;
 int timing_offset;
 int wait_for_tc;
+DecklinkSignalLossAction signal_loss_action;
 };
 
 #endif /* AVDEVICE_DECKLINK_COMMON_C_H */
diff --git a/libavdevice/decklink_dec.cpp b/libavdevice/decklink_dec.cpp
index 671573853ba..e10fd5d6569 100644
--- a/libavdevice/decklink_dec.cpp
+++ b/libavdevice/decklink_dec.cpp
@@ -593,6 +593,7 @@ private:
 int no_video;
 int64_t initial_video_pts;
 int64_t initial_audio_pts;
+IDeckLinkVideoInputFrame* last_video_frame;
 };
 
 decklink_input_callback::decklink_input_callback(AVFormatContext *_avctx) : 
_refs(1)
@@ -602,10 +603,13 @@ 
decklink_input_callback::decklink_input_callback(AVFormatContext *_avctx) : _ref
 ctx = (struct decklink_ctx *)cctx->ctx;
 no_video = 0;
 initial_audio_pts = initial_video_pts = AV_NOPTS_VALUE;
+last_video_frame = nullptr;
 }
 
 decklink_input_callback::~decklink_input_callback()
 {
+if (last_video_frame)
+last_video_frame->Release();
 }
 
 ULONG decklink_input_callback::AddRef(void)
@@ -773,7 +777,7 @@ HRESULT decklink_input_callback::VideoInputFrameArrived(
   ctx->video_st->time_base.den);
 
 if (videoFrame->GetFlags() & bmdFrameHasNoInputSource) {
-if (ctx->draw_bars && videoFrame->GetPixelFormat() == 
bmdFormat8BitYUV) {
+if (ctx->signal_loss_action == SIGNAL_LOSS_BARS && 
videoFrame->GetPixelFormat() == bmdFormat8BitYUV) {
 unsigned bars[8] = {
 0xEA80EA80, 0xD292D210, 0xA910A9A5, 0x90229035,
 0x6ADD6ACA, 0x51EF515A, 0x286D28EF, 0x10801080 };
@@ -785,6 +789,8 @@ HRESULT decklink_input_callback::VideoInputFrameArrived(
 for (int x = 0; x < width; x += 2)
 *p++ = bars[(x * 8) / width];
 }
+} else if (ctx->signal_loss_action == SIGNAL_LOSS_REPEAT) {
+last_video_frame->GetBytes(&frameBytes);
 }
 
 if (!no_video) {
@@ -793,6 +799,12 @@ HRESULT decklink_input_callback::VideoInputFrameArrived(
 }
 no_video = 1;
 } else {
+if (ctx->signal_loss_action == SIGNAL_LOSS_REPEAT) {
+if (last_video_frame)
+last_video_frame->Release();
+

Re: [FFmpeg-devel] [PATCH v3 4/4] tests/checkasm/vvc_alf: add check_alf_classify

2024-05-14 Thread Nuo Mi
On Mon, May 13, 2024 at 8:32 PM  wrote:

> From: Wu Jianhua 
>
> Performance Test (fps):
> clip                                       before  after  delta
> Tango2_3840x2160_60_10_420_27_LD.266       56      115    105.36%
> RitualDance_1920x1080_60_10_420_32_LD.266  272     481    76.83%
> RitualDance_1920x1080_60_10_420_37_RA.266  303     426    40.59%
>
Applied.
Thank you.

>
> Signed-off-by: Wu Jianhua 
> ---
>  tests/checkasm/vvc_alf.c | 47 
>  1 file changed, 47 insertions(+)
>
> diff --git a/tests/checkasm/vvc_alf.c b/tests/checkasm/vvc_alf.c
> index 10469e1528..9526260598 100644
> --- a/tests/checkasm/vvc_alf.c
> +++ b/tests/checkasm/vvc_alf.c
> @@ -121,6 +121,47 @@ static void check_alf_filter(VVCDSPContext *c, const
> int bit_depth)
>  }
>  }
>
> +static void check_alf_classify(VVCDSPContext *c, const int bit_depth)
> +{
> +LOCAL_ALIGNED_32(int, class_idx0, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(int, transpose_idx0, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(int, class_idx1, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(int, transpose_idx1, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(uint8_t, src0, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(uint8_t, src1, [SRC_BUF_SIZE]);
> +LOCAL_ALIGNED_32(int32_t, alf_gradient_tmp, [ALF_GRADIENT_SIZE *
> ALF_GRADIENT_SIZE * ALF_NUM_DIR]);
> +
> +ptrdiff_t stride = SRC_PIXEL_STRIDE * SIZEOF_PIXEL;
> +int offset = (3 * SRC_PIXEL_STRIDE + 3) * SIZEOF_PIXEL;
> +
> +declare_func_emms(AV_CPU_FLAG_AVX2, void, int *class_idx, int
> *transpose_idx,
> +const uint8_t *src, ptrdiff_t src_stride, int width, int height,
> int vb_pos, int *gradient_tmp);
> +
> +randomize_buffers(src0, src1, SRC_BUF_SIZE);
> +
> +for (int h = 4; h <= MAX_CTU_SIZE; h += 4) {
> +for (int w = 4; w <= MAX_CTU_SIZE; w += 4) {
> +const int id_size = w * h / ALF_BLOCK_SIZE / ALF_BLOCK_SIZE *
> sizeof(int);
> +const int vb_pos  = MAX_CTU_SIZE - ALF_BLOCK_SIZE;
> +if (check_func(c->alf.classify, "vvc_alf_classify_%dx%d_%d",
> w, h, bit_depth)) {
> +memset(class_idx0, 0, id_size);
> +memset(class_idx1, 0, id_size);
> +memset(transpose_idx0, 0, id_size);
> +memset(transpose_idx1, 0, id_size);
> +call_ref(class_idx0, transpose_idx0, src0 + offset,
> stride, w, h, vb_pos, alf_gradient_tmp);
> +
> +call_new(class_idx1, transpose_idx1, src1 + offset,
> stride, w, h, vb_pos, alf_gradient_tmp);
> +
> +if (memcmp(class_idx0, class_idx1, id_size))
> +fail();
> +if (memcmp(transpose_idx0, transpose_idx1, id_size))
> +fail();
> +bench_new(class_idx1, transpose_idx1, src1 + offset,
> stride, w, h, vb_pos, alf_gradient_tmp);
> +}
> +}
> +}
> +}
> +
>  void checkasm_check_vvc_alf(void)
>  {
>  int bit_depth;
> @@ -130,4 +171,10 @@ void checkasm_check_vvc_alf(void)
>  check_alf_filter(&h, bit_depth);
>  }
>  report("alf_filter");
> +
> +for (bit_depth = 8; bit_depth <= 12; bit_depth += 2) {
> +ff_vvc_dsp_init(&h, bit_depth);
> +check_alf_classify(&h, bit_depth);
> +}
> +report("alf_classify");
>  }
> --
> 2.44.0.windows.1
>


Re: [FFmpeg-devel] [WIP] False positives on Coverity

2024-05-14 Thread Rémi Denis-Courmont


Le 14 mai 2024 10:37:20 GMT+03:00, "Tomas Härdin"  a écrit :
>Formal methods would be better than the heuristics coverity uses.

That sounds like wishful thinking, or at least a distant pipe dream. Let's stick 
to what is possible and realistic today, please.

And I don't think that it would be reasonable to require that every FFmpeg 
developer be able to update the hypothetical formal proofs whenever they make a 
code change.


[FFmpeg-devel] [PATCH] avformat/pcmdec: add pts and dts calculation for pcmdec

2024-05-14 Thread Shiqi Zhu
Signed-off-by: Shiqi Zhu 
---
 libavformat/pcmdec.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/libavformat/pcmdec.c b/libavformat/pcmdec.c
index 2f6508b75a..d879aefaad 100644
--- a/libavformat/pcmdec.c
+++ b/libavformat/pcmdec.c
@@ -36,6 +36,7 @@ typedef struct PCMAudioDemuxerContext {
 AVClass *class;
 int sample_rate;
 AVChannelLayout ch_layout;
+int64_t nb_samples;
 } PCMAudioDemuxerContext;
 
 static int pcm_read_header(AVFormatContext *s)
@@ -46,6 +47,7 @@ static int pcm_read_header(AVFormatContext *s)
 uint8_t *mime_type = NULL;
 int ret;
 
+s1->nb_samples = 0;
 st = avformat_new_stream(s, NULL);
 if (!st)
 return AVERROR(ENOMEM);
@@ -104,6 +106,37 @@ static int pcm_read_header(AVFormatContext *s)
 return 0;
 }
 
+static int pcm_dec_read_packet(AVFormatContext *s, AVPacket *pkt)
+{
+PCMAudioDemuxerContext *s1 = s->priv_data;
+AVCodecParameters *par = s->streams[0]->codecpar;
+int ret;
+
+ret = ff_pcm_read_packet(s, pkt);
+if (ret < 0)
+return ret;
+
+pkt->time_base = s->streams[0]->time_base;
+pkt->dts = pkt->pts = s1->nb_samples;
+s1->nb_samples += pkt->size / par->block_align;
+
+return ret;
+}
+
+static int pcm_dec_read_seek(AVFormatContext *s,
+ int stream_index, int64_t timestamp, int flags)
+{
+PCMAudioDemuxerContext *s1 = s->priv_data;
+int ret;
+
+ret = ff_pcm_read_seek(s, stream_index, timestamp, flags);
+if (ret < 0)
+return ret;
+
+s1->nb_samples = ffstream(s->streams[0])->cur_dts;
+return ret;
+}
+
 static const AVOption pcm_options[] = {
 { "sample_rate", "", offsetof(PCMAudioDemuxerContext, sample_rate), 
AV_OPT_TYPE_INT, {.i64 = 44100}, 0, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },
 { "ch_layout",   "", offsetof(PCMAudioDemuxerContext, ch_layout),   
AV_OPT_TYPE_CHLAYOUT, {.str = "mono"}, 0, 0, AV_OPT_FLAG_DECODING_PARAM },
@@ -126,8 +159,8 @@ const FFInputFormat ff_pcm_ ## name_ ## _demuxer = {
\
 .p.priv_class   = &pcm_demuxer_class,   \
 .priv_data_size = sizeof(PCMAudioDemuxerContext),   \
 .read_header= pcm_read_header,  \
-.read_packet= ff_pcm_read_packet,   \
-.read_seek  = ff_pcm_read_seek, \
+.read_packet= pcm_dec_read_packet,  \
+.read_seek  = pcm_dec_read_seek,\
 .raw_codec_id   = codec,\
 __VA_ARGS__ \
 };
-- 
2.34.1



Re: [FFmpeg-devel] [PATCH] avformat/pcmdec: add pts and dts calculation for pcmdec

2024-05-14 Thread Andreas Rheinhardt
Shiqi Zhu:
> Signed-off-by: Shiqi Zhu 
> ---
>  libavformat/pcmdec.c | 37 +++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/libavformat/pcmdec.c b/libavformat/pcmdec.c
> index 2f6508b75a..d879aefaad 100644
> --- a/libavformat/pcmdec.c
> +++ b/libavformat/pcmdec.c
> @@ -36,6 +36,7 @@ typedef struct PCMAudioDemuxerContext {
>  AVClass *class;
>  int sample_rate;
>  AVChannelLayout ch_layout;
> +int64_t nb_samples;
>  } PCMAudioDemuxerContext;
>  
>  static int pcm_read_header(AVFormatContext *s)
> @@ -46,6 +47,7 @@ static int pcm_read_header(AVFormatContext *s)
>  uint8_t *mime_type = NULL;
>  int ret;
>  
> +s1->nb_samples = 0;
>  st = avformat_new_stream(s, NULL);
>  if (!st)
>  return AVERROR(ENOMEM);
> @@ -104,6 +106,37 @@ static int pcm_read_header(AVFormatContext *s)
>  return 0;
>  }
>  
> +static int pcm_dec_read_packet(AVFormatContext *s, AVPacket *pkt)
> +{
> +PCMAudioDemuxerContext *s1 = s->priv_data;
> +AVCodecParameters *par = s->streams[0]->codecpar;
> +int ret;
> +
> +ret = ff_pcm_read_packet(s, pkt);
> +if (ret < 0)
> +return ret;
> +
> +pkt->time_base = s->streams[0]->time_base;
> +pkt->dts = pkt->pts = s1->nb_samples;
> +s1->nb_samples += pkt->size / par->block_align;
> +
> +return ret;
> +}
> +
> +static int pcm_dec_read_seek(AVFormatContext *s,
> + int stream_index, int64_t timestamp, int flags)
> +{
> +PCMAudioDemuxerContext *s1 = s->priv_data;
> +int ret;
> +
> +ret = ff_pcm_read_seek(s, stream_index, timestamp, flags);
> +if (ret < 0)
> +return ret;
> +
> +s1->nb_samples = ffstream(s->streams[0])->cur_dts;
> +return ret;
> +}
> +
>  static const AVOption pcm_options[] = {
>  { "sample_rate", "", offsetof(PCMAudioDemuxerContext, sample_rate), 
> AV_OPT_TYPE_INT, {.i64 = 44100}, 0, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },
>  { "ch_layout",   "", offsetof(PCMAudioDemuxerContext, ch_layout),   
> AV_OPT_TYPE_CHLAYOUT, {.str = "mono"}, 0, 0, AV_OPT_FLAG_DECODING_PARAM },
> @@ -126,8 +159,8 @@ const FFInputFormat ff_pcm_ ## name_ ## _demuxer = {  
>   \
>  .p.priv_class   = &pcm_demuxer_class,   \
>  .priv_data_size = sizeof(PCMAudioDemuxerContext),   \
>  .read_header= pcm_read_header,  \
> -.read_packet= ff_pcm_read_packet,   \
> -.read_seek  = ff_pcm_read_seek, \
> +.read_packet= pcm_dec_read_packet,  \
> +.read_seek  = pcm_dec_read_seek,\
>  .raw_codec_id   = codec,\
>  __VA_ARGS__ \
>  };

A quick test shows that PTS and DTS are already set generically for pcm
formats (unless the AVFMT_FLAG_NOFILLIN flag is set). If they are not set in
your use case, then you should provide details about this (preferably by
opening a ticket on trac).
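
For context, the bookkeeping the patch performs is simple: a packet of `size` bytes holds `size / block_align` samples, and the timestamp of a packet is the number of samples delivered before it. The standalone sketch below illustrates only that arithmetic, assuming a time base of 1/sample_rate so that sample counts can be used as timestamps directly. The PcmClock type and both functions are hypothetical and do not exist in libavformat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical running sample counter; not a libavformat type. */
typedef struct PcmClock {
    int64_t nb_samples;  /* samples delivered so far */
    int     block_align; /* bytes per sample frame, all channels */
} PcmClock;

/* Returns the timestamp for a packet of pkt_size bytes and advances
 * the counter by the number of whole sample frames it contains. */
static int64_t pcm_clock_next_pts(PcmClock *c, int pkt_size)
{
    int64_t pts;

    assert(c->block_align > 0);
    pts = c->nb_samples;
    c->nb_samples += pkt_size / c->block_align;
    return pts;
}

/* After a seek, restart counting from the sample position landed on. */
static void pcm_clock_seek(PcmClock *c, int64_t sample_pos)
{
    c->nb_samples = sample_pos;
}
```

Note that the integer division drops any trailing partial sample frame, which is also what `pkt->size / par->block_align` does in the patch.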

- Andreas



[FFmpeg-devel] [PATCH v3 0/2] Add support for H266/VVC encoding

2024-05-14 Thread Christian Bartnik
This patchset is based on the latest patchset from Thomas Siedel
(thomas...@spin-digital.com).
Since almost all changes from that patchset except libvvenc and libvvdec have
been merged, this patchset only implements the libvvenc and libvvdec wrappers.
As ffmpeg already has its own vvc decoder, feel free to cherry-pick libvvenc
only.
The libvvdec patch has been cleaned up by removing the extradata parsing files
and using existing code from cbs_h266.

The libvvenc patch has only been cleaned up, with the following changes:
- add defaults struct vvenc_defaults
- fix: init qp value (typo)
- cleanup init function (move code into sub functions)
- cleanup verbosity
- add check for payload allocation
- vvenc-params return error for invalid options or values
- add support for capped CRF mode (QP + subj.Optimization + max. bitrate) if 
vvenc version >= 1.11.0
- add vvenc documentation in doc/encoders.texi

Christian Bartnik (2):
  avcodec: add external enc libvvenc for H266/VVC
  avcodec: add external dec libvvdec for H266/VVC

 configure  |   9 +
 doc/encoders.texi  |  65 +
 libavcodec/Makefile|   2 +
 libavcodec/allcodecs.c |   2 +
 libavcodec/libvvdec.c  | 617 +
 libavcodec/libvvenc.c  | 566 +
 6 files changed, 1261 insertions(+)
 create mode 100644 libavcodec/libvvdec.c
 create mode 100644 libavcodec/libvvenc.c

--
2.34.1


[FFmpeg-devel] [PATCH v3 1/2] avcodec: add external enc libvvenc for H266/VVC

2024-05-14 Thread Christian Bartnik
From: Thomas Siedel 

Add external encoder VVenC for H266/VVC encoding.
Register new encoder libvvenc.
Add libvvenc to wrap the vvenc interface.
libvvenc implements encoder option: preset,qp,period,subjopt,
vvenc-params,levelidc,tier.
Enable encoder by adding --enable-libvvenc in configure step.

Co-authored-by: Christian Bartnik chris1031...@gmail.com
Signed-off-by: Christian Bartnik 
---
 configure  |   4 +
 doc/encoders.texi  |  65 +
 libavcodec/Makefile|   1 +
 libavcodec/allcodecs.c |   1 +
 libavcodec/libvvenc.c  | 566 +
 5 files changed, 637 insertions(+)
 create mode 100644 libavcodec/libvvenc.c

diff --git a/configure b/configure
index a909b0689c..5d9a14821b 100755
--- a/configure
+++ b/configure
@@ -293,6 +293,7 @@ External library support:
   --enable-libvorbis   enable Vorbis en/decoding via libvorbis,
native implementation exists [no]
   --enable-libvpx  enable VP8 and VP9 de/encoding via libvpx [no]
+  --enable-libvvencenable H.266/VVC encoding via vvenc [no]
   --enable-libwebp enable WebP encoding via libwebp [no]
   --enable-libx264 enable H.264 encoding via x264 [no]
   --enable-libx265 enable HEVC encoding via x265 [no]
@@ -1966,6 +1967,7 @@ EXTERNAL_LIBRARY_LIST="
 libvmaf
 libvorbis
 libvpx
+libvvenc
 libwebp
 libxevd
 libxeve
@@ -3558,6 +3560,7 @@ libvpx_vp8_decoder_deps="libvpx"
 libvpx_vp8_encoder_deps="libvpx"
 libvpx_vp9_decoder_deps="libvpx"
 libvpx_vp9_encoder_deps="libvpx"
+libvvenc_encoder_deps="libvvenc"
 libwebp_encoder_deps="libwebp"
 libwebp_anim_encoder_deps="libwebp"
 libx262_encoder_deps="libx262"
@@ -7025,6 +7028,7 @@ enabled libvpx&& {
 die "libvpx enabled but no supported decoders found"
 fi
 }
+enabled libvvenc  && require_pkg_config libvvenc "libvvenc >= 1.6.1" 
"vvenc/vvenc.h" vvenc_get_version

 enabled libwebp   && {
 enabled libwebp_encoder  && require_pkg_config libwebp "libwebp >= 
0.2.0" webp/encode.h WebPGetEncoderVersion
diff --git a/doc/encoders.texi b/doc/encoders.texi
index c82f316f94..92aab17c49 100644
--- a/doc/encoders.texi
+++ b/doc/encoders.texi
@@ -2378,6 +2378,71 @@ Indicates frame duration
 For more information about libvpx see:
 @url{http://www.webmproject.org/}
 
+@section libvvenc
+
+VVenC H.266/VVC encoder wrapper.
+
+This encoder requires the presence of the libvvenc headers and library
+during configuration. You need to explicitly configure the build with
+@option{--enable-libvvenc}.
+
+The VVenC project website is at
+@url{https://github.com/fraunhoferhhi/vvenc}.
+
+@subsection Supported Pixel Formats
+
+VVenC supports only 10-bit color spaces as input. But the internal (encoded)
+bit depth can be set to 8-bit or 10-bit at runtime.
+
+@subsection Options
+
+@table @option
+@item b
+Sets target video bitrate.
+
+@item g
+Set the GOP size. Currently support for g=1 (Intra only) or default.
+
+@item preset
+Set the VVenC preset.
+
+@item levelidc
+Set level idc.
+
+@item tier
+Set vvc tier.
+
+@item qp
+Set constant quantization parameter.
+
+@item subopt @var{boolean}
+Set subjective (perceptually motivated) optimization. Default is 1 (on).
+
+@item bitdepth8 @var{boolean}
+Set 8bit coding mode instead of using 10bit. Default is 0 (off).
+
+@item period
+set (intra) refresh period in seconds.
+
+@item vvenc-params
+Set vvenc options using a list of @var{key}=@var{value} couples separated
+by ":". See @command{vvencapp --fullhelp} or @command{vvencFFapp --fullhelp} 
for a list of options.
+
+For example, the options might be provided as:
+
+@example
+intraperiod=64:decodingrefreshtype=idr:poc0idr=1:internalbitdepth=8
+@end example
+
+For example the encoding options for 2-pass encoding might be provided with 
@option{-vvenc-params}:
+
+@example
+ffmpeg -i input -c:v libvvenc -b 1M -vvenc-params 
passes=2:pass=1:rcstatsfile=stats.json output.mp4
+ffmpeg -i input -c:v libvvenc -b 1M -vvenc-params 
passes=2:pass=2:rcstatsfile=stats.json output.mp4
+@end example
+
+@end table
+
 @section libwebp
 
 libwebp WebP Image encoder wrapper
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 2443d2c6fd..5d7349090e 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1153,6 +1153,7 @@ OBJS-$(CONFIG_LIBVPX_VP8_DECODER) += libvpxdec.o
 OBJS-$(CONFIG_LIBVPX_VP8_ENCODER) += libvpxenc.o
 OBJS-$(CONFIG_LIBVPX_VP9_DECODER) += libvpxdec.o
 OBJS-$(CONFIG_LIBVPX_VP9_ENCODER) += libvpxenc.o
+OBJS-$(CONFIG_LIBVVENC_ENCODER)   += libvvenc.o
 OBJS-$(CONFIG_LIBWEBP_ENCODER)+= libwebpenc_common.o libwebpenc.o
 OBJS-$(CONFIG_LIBWEBP_ANIM_ENCODER)   += libwebpenc_common.o 
libwebpenc_animencoder.o
 OBJS-$(CONFIG_LIBX262_ENCODER)+= libx264.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index b102a8069e..59d36dbd56 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcod

[FFmpeg-devel] [PATCH v3 2/2] avcodec: add external dec libvvdec for H266/VVC

2024-05-14 Thread Christian Bartnik
From: Thomas Siedel 

Add external decoder VVdeC for H266/VVC decoding.
Register new decoder libvvdec.
Add libvvdec to wrap the vvdec interface.
Enable decoder by adding --enable-libvvdec in configure step.

Co-authored-by: Christian Bartnik chris1031...@gmail.com
Signed-off-by: Christian Bartnik 
---
 configure  |   5 +
 libavcodec/Makefile|   1 +
 libavcodec/allcodecs.c |   1 +
 libavcodec/libvvdec.c  | 617 +
 4 files changed, 624 insertions(+)
 create mode 100644 libavcodec/libvvdec.c

diff --git a/configure b/configure
index 5d9a14821b..a5df482215 100755
--- a/configure
+++ b/configure
@@ -294,6 +294,7 @@ External library support:
native implementation exists [no]
   --enable-libvpx  enable VP8 and VP9 de/encoding via libvpx [no]
   --enable-libvvencenable H.266/VVC encoding via vvenc [no]
+  --enable-libvvdecenable H.266/VVC decoding via vvdec [no]
   --enable-libwebp enable WebP encoding via libwebp [no]
   --enable-libx264 enable H.264 encoding via x264 [no]
   --enable-libx265 enable HEVC encoding via x265 [no]
@@ -1968,6 +1969,7 @@ EXTERNAL_LIBRARY_LIST="
 libvorbis
 libvpx
 libvvenc
+libvvdec
 libwebp
 libxevd
 libxeve
@@ -3561,6 +3563,8 @@ libvpx_vp8_encoder_deps="libvpx"
 libvpx_vp9_decoder_deps="libvpx"
 libvpx_vp9_encoder_deps="libvpx"
 libvvenc_encoder_deps="libvvenc"
+libvvdec_decoder_deps="libvvdec"
+libvvdec_decoder_select="vvc_mp4toannexb_bsf"
 libwebp_encoder_deps="libwebp"
 libwebp_anim_encoder_deps="libwebp"
 libx262_encoder_deps="libx262"
@@ -7029,6 +7033,7 @@ enabled libvpx&& {
 fi
 }
 enabled libvvenc  && require_pkg_config libvvenc "libvvenc >= 1.6.1" 
"vvenc/vvenc.h" vvenc_get_version
+enabled libvvdec  && require_pkg_config libvvdec "libvvdec >= 1.6.0" 
"vvdec/vvdec.h" vvdec_get_version
 
 enabled libwebp   && {
 enabled libwebp_encoder  && require_pkg_config libwebp "libwebp >= 
0.2.0" webp/encode.h WebPGetEncoderVersion
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 5d7349090e..318b22a1fa 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1154,6 +1154,7 @@ OBJS-$(CONFIG_LIBVPX_VP8_ENCODER) += libvpxenc.o
 OBJS-$(CONFIG_LIBVPX_VP9_DECODER) += libvpxdec.o
 OBJS-$(CONFIG_LIBVPX_VP9_ENCODER) += libvpxenc.o
 OBJS-$(CONFIG_LIBVVENC_ENCODER)   += libvvenc.o
+OBJS-$(CONFIG_LIBVVDEC_DECODER)   += libvvdec.o
 OBJS-$(CONFIG_LIBWEBP_ENCODER)+= libwebpenc_common.o libwebpenc.o
 OBJS-$(CONFIG_LIBWEBP_ANIM_ENCODER)   += libwebpenc_common.o 
libwebpenc_animencoder.o
 OBJS-$(CONFIG_LIBX262_ENCODER)+= libx264.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 59d36dbd56..4120681d17 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -801,6 +801,7 @@ extern const FFCodec ff_libvpx_vp8_decoder;
 extern FFCodec ff_libvpx_vp9_encoder;
 extern const FFCodec ff_libvpx_vp9_decoder;
 extern const FFCodec ff_libvvenc_encoder;
+extern const FFCodec ff_libvvdec_decoder;
 /* preferred over libwebp */
 extern const FFCodec ff_libwebp_anim_encoder;
 extern const FFCodec ff_libwebp_encoder;
diff --git a/libavcodec/libvvdec.c b/libavcodec/libvvdec.c
new file mode 100644
index 00..7f94a81b37
--- /dev/null
+++ b/libavcodec/libvvdec.c
@@ -0,0 +1,617 @@
+/*
+ * H.266 decoding using the VVdeC library
+ *
+ * Copyright (C) 2022, Thomas Siedel
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config_components.h"
+
+#include 
+
+#include "libavutil/common.h"
+#include "libavutil/avutil.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/opt.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/frame.h"
+#include "libavutil/mastering_display_metadata.h"
+#include "libavutil/log.h"
+
+#include "avcodec.h"
+#include "codec_internal.h"
+#include "decode.h"
+#include "internal.h"
+#include "profiles.h"
+
+#include "cbs_h266.h"
+
+typedef struct VVdeCContext {
+AVClass  *av_class;
+vvdecDecoder *vvdecDec;
+vvdecParams  vvdecParams;
+bool bFlush;
+AVBufferPool *pools[3]; /** Pools for each data plane. */
+int   

Re: [FFmpeg-devel] [PATCH 2/2] checkasm/h264dsp: support checking more idct depths

2024-05-14 Thread J. Dekker

Devin Heitmueller  writes:

> On Wed, Apr 24, 2024 at 10:10 AM J. Dekker  wrote:
>>
>> Signed-off-by: J. Dekker 
>> ---
>>  tests/checkasm/h264dsp.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>> index 0f484e3f43..5cb646ae49 100644
>> --- a/tests/checkasm/h264dsp.c
>> +++ b/tests/checkasm/h264dsp.c
>> @@ -173,6 +173,7 @@ static void dct8x8(int16_t *coef, int bit_depth)
>>
>>  static void check_idct(void)
>>  {
>> +static const int depths[5] = { 8, 9, 10, 12, 14 };
>>  LOCAL_ALIGNED_16(uint8_t, src,  [8 * 8 * 2]);
>>  LOCAL_ALIGNED_16(uint8_t, dst,  [8 * 8 * 2]);
>>  LOCAL_ALIGNED_16(uint8_t, dst0, [8 * 8 * 2]);
>> @@ -181,10 +182,11 @@ static void check_idct(void)
>>  LOCAL_ALIGNED_16(int16_t, subcoef0, [8 * 8 * 2]);
>>  LOCAL_ALIGNED_16(int16_t, subcoef1, [8 * 8 * 2]);
>>  H264DSPContext h;
>> -int bit_depth, sz, align, dc;
>> +int bit_depth, sz, align, dc, i;
>>  declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *dst, int16_t *block, int stride);
>>
>> -for (bit_depth = 8; bit_depth <= 10; bit_depth++) {
>> +for (i = 0; i < 5; i++) {
>> +bit_depth = depths[i];
>
> Perhaps this should use FF_ARRAY_ELEMS(depths) rather than a hard-coded "5"?

Thanks for the suggestion, pushed with this change.
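
For readers unfamiliar with the idiom, here is a minimal standalone sketch of deriving the loop bound from the array itself. ARRAY_ELEMS is a local stand-in for FF_ARRAY_ELEMS (assumed here to be the usual sizeof-based element count), and count_depths() is a hypothetical helper that is not part of checkasm.

```c
#include <stddef.h>

/* Local stand-in for FF_ARRAY_ELEMS; assumed to be the usual sizeof idiom. */
#define ARRAY_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Bit depths taken from the patch under review. */
static const int depths[] = { 8, 9, 10, 12, 14 };

/* Hypothetical helper: visits every depth, stores the last one seen and
 * returns how many entries were visited. */
static size_t count_depths(int *last_depth)
{
    size_t n = 0;

    /* The bound follows the array, so adding a depth needs no other change. */
    for (size_t i = 0; i < ARRAY_ELEMS(depths); i++) {
        *last_depth = depths[i];
        n++;
    }
    return n;
}
```

With this form, appending a new depth to the table cannot leave a stale hard-coded count behind.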

-- 
jd
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v3 2/2] avcodec: add external dec libvvdec for H266/VVC

2024-05-14 Thread Lynne via ffmpeg-devel

On 14/05/2024 17:09, Christian Bartnik wrote:

From: Thomas Siedel 

Add external decoder VVdeC for H266/VVC decoding.
Register new decoder libvvdec.
Add libvvdec to wrap the vvdec interface.
Enable decoder by adding --enable-libvvdec in configure step.

Co-authored-by: Christian Bartnik chris1031...@gmail.com
Signed-off-by: Christian Bartnik 
---
  configure  |   5 +
  libavcodec/Makefile|   1 +
  libavcodec/allcodecs.c |   1 +
  libavcodec/libvvdec.c  | 617 +
  4 files changed, 624 insertions(+)
  create mode 100644 libavcodec/libvvdec.c


I would prefer to have this one skipped, as initially suggested.




Re: [FFmpeg-devel] [PATCH v3 1/2] avcodec: add external enc libvvenc for H266/VVC

2024-05-14 Thread Andreas Rheinhardt
Christian Bartnik:
> From: Thomas Siedel 
> 
> Add external encoder VVenC for H266/VVC encoding.
> Register new encoder libvvenc.
> Add libvvenc to wrap the vvenc interface.
> libvvenc implements encoder option: preset,qp,period,subjopt,
> vvenc-params,levelidc,tier.
> Enable encoder by adding --enable-libvvenc in configure step.
> 
> Co-authored-by: Christian Bartnik chris1031...@gmail.com
> Signed-off-by: Christian Bartnik 
> ---
>  configure  |   4 +
>  doc/encoders.texi  |  65 +
>  libavcodec/Makefile|   1 +
>  libavcodec/allcodecs.c |   1 +
>  libavcodec/libvvenc.c  | 566 +
>  5 files changed, 637 insertions(+)
>  create mode 100644 libavcodec/libvvenc.c
> 
> diff --git a/configure b/configure
> index a909b0689c..5d9a14821b 100755
> --- a/configure
> +++ b/configure
> @@ -293,6 +293,7 @@ External library support:
>--enable-libvorbis   enable Vorbis en/decoding via libvorbis,
> native implementation exists [no]
>--enable-libvpx  enable VP8 and VP9 de/encoding via libvpx [no]
> +  --enable-libvvencenable H.266/VVC encoding via vvenc [no]
>--enable-libwebp enable WebP encoding via libwebp [no]
>--enable-libx264 enable H.264 encoding via x264 [no]
>--enable-libx265 enable HEVC encoding via x265 [no]
> @@ -1966,6 +1967,7 @@ EXTERNAL_LIBRARY_LIST="
>  libvmaf
>  libvorbis
>  libvpx
> +libvvenc
>  libwebp
>  libxevd
>  libxeve
> @@ -3558,6 +3560,7 @@ libvpx_vp8_decoder_deps="libvpx"
>  libvpx_vp8_encoder_deps="libvpx"
>  libvpx_vp9_decoder_deps="libvpx"
>  libvpx_vp9_encoder_deps="libvpx"
> +libvvenc_encoder_deps="libvvenc"
>  libwebp_encoder_deps="libwebp"
>  libwebp_anim_encoder_deps="libwebp"
>  libx262_encoder_deps="libx262"
> @@ -7025,6 +7028,7 @@ enabled libvpx&& {
>  die "libvpx enabled but no supported decoders found"
>  fi
>  }
> +enabled libvvenc  && require_pkg_config libvvenc "libvvenc >= 1.6.1" "vvenc/vvenc.h" vvenc_get_version
> 
>  enabled libwebp   && {
>  enabled libwebp_encoder  && require_pkg_config libwebp "libwebp >= 0.2.0" webp/encode.h WebPGetEncoderVersion
> diff --git a/doc/encoders.texi b/doc/encoders.texi
> index c82f316f94..92aab17c49 100644
> --- a/doc/encoders.texi
> +++ b/doc/encoders.texi
> @@ -2378,6 +2378,71 @@ Indicates frame duration
>  For more information about libvpx see:
>  @url{http://www.webmproject.org/}
>  
> +@section libvvenc
> +
> +VVenC H.266/VVC encoder wrapper.
> +
> +This encoder requires the presence of the libvvenc headers and library
> +during configuration. You need to explicitly configure the build with
> +@option{--enable-libvvenc}.
> +
> +The VVenC project website is at
> +@url{https://github.com/fraunhoferhhi/vvenc}.
> +
> +@subsection Supported Pixel Formats
> +
> +VVenC supports only 10-bit color spaces as input. But the internal (encoded)
> +bit depth can be set to 8-bit or 10-bit at runtime.
> +
> +@subsection Options
> +
> +@table @option
> +@item b
> +Sets target video bitrate.
> +
> +@item g
> +Set the GOP size. Currently support for g=1 (Intra only) or default.
> +
> +@item preset
> +Set the VVenC preset.
> +
> +@item levelidc
> +Set level idc.
> +
> +@item tier
> +Set vvc tier.
> +
> +@item qp
> +Set constant quantization parameter.
> +
> +@item subopt @var{boolean}
> +Set subjective (perceptually motivated) optimization. Default is 1 (on).
> +
> +@item bitdepth8 @var{boolean}
> +Set 8bit coding mode instead of using 10bit. Default is 0 (off).
> +
> +@item period
> +set (intra) refresh period in seconds.
> +
> +@item vvenc-params
> +Set vvenc options using a list of @var{key}=@var{value} couples separated
> +by ":". See @command{vvencapp --fullhelp} or @command{vvencFFapp --fullhelp} for a list of options.
> +
> +For example, the options might be provided as:
> +
> +@example
> +intraperiod=64:decodingrefreshtype=idr:poc0idr=1:internalbitdepth=8
> +@end example
> +
> +For example the encoding options for 2-pass encoding might be provided with @option{-vvenc-params}:
> +
> +@example
> +ffmpeg -i input -c:v libvvenc -b 1M -vvenc-params passes=2:pass=1:rcstatsfile=stats.json output.mp4
> +ffmpeg -i input -c:v libvvenc -b 1M -vvenc-params passes=2:pass=2:rcstatsfile=stats.json output.mp4
> +@end example
> +
> +@end table
> +
>  @section libwebp
>  
>  libwebp WebP Image encoder wrapper
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index 2443d2c6fd..5d7349090e 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -1153,6 +1153,7 @@ OBJS-$(CONFIG_LIBVPX_VP8_DECODER) += libvpxdec.o
>  OBJS-$(CONFIG_LIBVPX_VP8_ENCODER) += libvpxenc.o
>  OBJS-$(CONFIG_LIBVPX_VP9_DECODER) += libvpxdec.o
>  OBJS-$(CONFIG_LIBVPX_VP9_ENCODER) += libvpxenc.o
> +OBJS-$(CONFIG_LIBVVENC_ENCODER)   += libvvenc.o
>  OBJS-$(CONFIG_LIBWEBP_ENCODER)+= libweb

Re: [FFmpeg-devel] [PATCH] lavu/riscv: fallback to raw hwprobe() system call

2024-05-14 Thread Rémi Denis-Courmont
On Saturday, 11 May 2024, 17:41:52 EEST, Rémi Denis-Courmont wrote:
> Not all C run-times support this, and even then, it will be a while
> before distributions provide recent enough versions thereof.

Merged. The FATE box should now have working hardware detection.

-- 
レミ・デニ-クールモン
http://www.remlab.net/





Re: [FFmpeg-devel] [PATCH v3 1/9] lavc/vp9dsp: R-V ipred vert

2024-05-14 Thread Rémi Denis-Courmont
On Tuesday, 14 May 2024, 7:45:29 EEST, flow gg wrote:
> I am locally using:
> if (bpp == 8 && (flags & AV_CPU_FLAG_RVI) && (flags &
> AV_CPU_FLAG_RVB_ADDR)) {

There is no point testing the I flag if you test any other flag. The I flag is 
always set (since we don't, and probably never will, support RV32E) and only 
exists for the benefit of checkasm.
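
A small sketch of the point about the redundant test, under stated assumptions: the flag names and bit values below are hypothetical placeholders (not FFmpeg's AV_CPU_FLAG_* constants), and the base "I" flag is assumed to be set on every supported CPU, as described above.

```c
/* Hypothetical flag bits for illustration; not FFmpeg's actual values. */
enum {
    CPU_FLAG_I      = 1 << 0, /* base integer ISA, assumed always set */
    CPU_FLAG_B_ADDR = 1 << 1, /* an optional extension */
};

/* Redundant form: tests the base flag together with an extension flag. */
static int use_asm_redundant(int flags, int bpp)
{
    return bpp == 8 && (flags & CPU_FLAG_I) && (flags & CPU_FLAG_B_ADDR);
}

/* Simplified form: the extension flag alone decides, because the base
 * flag is assumed to be present whenever any other flag is. */
static int use_asm_simple(int flags, int bpp)
{
    return bpp == 8 && (flags & CPU_FLAG_B_ADDR);
}
```

Under that assumption both functions agree for every flag set a real CPU can report, so the shorter condition loses nothing.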

> this performs better on k230/banana_f3 than C.

It also performs better than C on SiFive U74, even though that design has 
very slow unaligned access (emulated in SBI). Of course, it could just be 
that checkasm only tests aligned accesses and unaligned accesses are legal, 
hence my earlier question.

-- 
レミ・デニ-クールモン
http://www.remlab.net/





Re: [FFmpeg-devel] [PATCH v3 2/9] lavc/vp9dsp: R-V mc copy

2024-05-14 Thread Rémi Denis-Courmont
On Tuesday, 14 May 2024, 7:44:55 EEST, flow gg wrote:
> I am locally using:
> if (bpp == 8 && (flags & AV_CPU_FLAG_RVI)) {
> this performs better on k230/banana_f3 than C.
> For email, refer to [FFmpeg-devel] [PATCH 2/2] lavc/vp8dsp: restrict RVI
> optimisations and change it to
> if (bpp == 8 && (flags & AV_CPU_FLAG_RV_MISALIGNED)) {
> So no output, but I think the same modification should be made here?

I just can't get any benchmarks out of checkasm. Even if I comment out the 
MISALIGNED flag check, this is not reporting anything. I tested with only patch 
1/9 and 2/9, not the following. I don't know why.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/





[FFmpeg-devel] [PATCH] web: add a news entry about STF sponsorship

2024-05-14 Thread Thilo Borgmann via ffmpeg-devel
---
This text including the link is also meant to be published via our social media.

 src/index | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/index b/src/index
index d035ffa..83cc9bf 100644
--- a/src/index
+++ b/src/index
@@ -35,6 +35,14 @@
 News
   
 
+  May 13th, 2024, Sovereign Tech Fund
+  
+  The FFMPEG community is excited to announce that it has received
+  funding from the https://www.sovereigntechfund.de/";>Sovereign Tech Fund
+  for the maintenance of the FFMPEG project. This funding will help ensure that FFMPEG
+  continues bringing video efficiently and securely to billions worldwide everyday.
+  
+
   April 5th, 2024, FFmpeg 7.0 "Dijkstra"
   
   A new major release, FFmpeg 7.0 
"Dijkstra",
-- 
2.39.3 (Apple Git-146)



Re: [FFmpeg-devel] [PATCH] web: add a news entry about STF sponsorship

2024-05-14 Thread J. Dekker


Thilo Borgmann via ffmpeg-devel  writes:

> ---
> This text including the link is also meant to be published via our social media.
>
>  src/index | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/index b/src/index
> index d035ffa..83cc9bf 100644
> --- a/src/index
> +++ b/src/index
> @@ -35,6 +35,14 @@
>  News
>
>  
> +  May 13th, 2024, Sovereign Tech Fund
> +  
> +  The FFMPEG community is excited to announce that it has received
> +  funding from the https://www.sovereigntechfund.de/";>Sovereign Tech Fund
> +  for the maintenance of the FFMPEG project. This funding will help ensure that FFMPEG
> +  continues bringing video efficiently and securely to billions worldwide everyday.
> +  
> +
>April 5th, 2024, FFmpeg 7.0 "Dijkstra"
>
>A new major release, FFmpeg 7.0 
> "Dijkstra",

Spelling of FFmpeg should be fixed, and needs a unique header id. Also
FFmpeg does more than 'bring video'.

-- 
jd


Re: [FFmpeg-devel] [PATCH] web: add a news entry about STF sponsorship

2024-05-14 Thread Thilo Borgmann via ffmpeg-devel

On 14.05.24 at 19:14, J. Dekker wrote:
>
> Thilo Borgmann via ffmpeg-devel  writes:
>
>> ---
>> This text including the link is also meant to be published via our social media.
>>
>>  src/index | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/src/index b/src/index
>> index d035ffa..83cc9bf 100644
>> --- a/src/index
>> +++ b/src/index
>> @@ -35,6 +35,14 @@
>>  News
>>
>>  
>> +  May 13th, 2024, Sovereign Tech Fund
>> +  
>> +  The FFMPEG community is excited to announce that it has received
>> +  funding from the https://www.sovereigntechfund.de/";>Sovereign Tech Fund
>> +  for the maintenance of the FFMPEG project. This funding will help ensure that FFMPEG
>> +  continues bringing video efficiently and securely to billions worldwide everyday.
>> +  
>> +
>>    April 5th, 2024, FFmpeg 7.0 "Dijkstra"
>>
>>    A new major release, FFmpeg 7.0 "Dijkstra",
>
> Spelling of FFmpeg should be fixed, and needs a unique header id. Also
> FFmpeg does more than 'bring video'.

Oh, missed the ID and spelling.
Propose something to extend 'bring video'.
 Thanks,
Thilo



Re: [FFmpeg-devel] [PATCH v3 2/9] lavc/vp9dsp: R-V mc copy

2024-05-14 Thread flow gg
Using this will give output `if (bpp == 8 && (flags & AV_CPU_FLAG_RVI)) {`
Did you comment out the MISALIGNED flag check but not add RVI, resulting in
no output?

Rémi Denis-Courmont wrote on Wed, May 15, 2024 at 01:02:

> On Tuesday, 14 May 2024, 7:44:55 EEST, flow gg wrote:
> > I am locally using:
> > if (bpp == 8 && (flags & AV_CPU_FLAG_RVI)) {
> > this performs better on k230/banana_f3 than C.
> > For email, refer to [FFmpeg-devel] [PATCH 2/2] lavc/vp8dsp: restrict RVI
> > optimisations and change it to
> > if (bpp == 8 && (flags & AV_CPU_FLAG_RV_MISALIGNED)) {
> > So no output, but I think the same modification should be made here?
>
> I just can't get any benchmarks out of checkasm. Even if I comment out the
> MISALIGNED flag check, this is not reporting anything. I tested with only
> patch 1/9 and 2/9, not the following. I don't know why.
>
> --
> 雷米‧德尼-库尔蒙
> http://www.remlab.net/
>
>
>


Re: [FFmpeg-devel] [PATCH v3 1/9] lavc/vp9dsp: R-V ipred vert

2024-05-14 Thread flow gg
Okay, learned it

Rémi Denis-Courmont wrote on Wed, May 15, 2024 at 01:00:

> On Tuesday, 14 May 2024, 7:45:29 EEST, flow gg wrote:
> > I am locally using:
> > if (bpp == 8 && (flags & AV_CPU_FLAG_RVI) && (flags &
> > AV_CPU_FLAG_RVB_ADDR)) {
>
> There is no point testing the I flag if you test any other flag. The I
> flag is always set (since we don't, and probably never will, support
> RV32E) and only exists for the benefit of checkasm.
>
> > this performs better on k230/banana_f3 than C.
>
> It also performs better than C on SiFive U74, even though that design has
> very slow unaligned access (emulated in SBI). Of course, it could just be
> that checkasm only tests aligned accesses and unaligned accesses are
> legal, hence my earlier question.
>
> --
> レミ・デニ-クールモン
> http://www.remlab.net/
>
>
>


[FFmpeg-devel] [PATCH] aacdec: move from scalefactor ranged arrays to flat arrays

2024-05-14 Thread Lynne via ffmpeg-devel
AAC uses an unconventional system to send scalefactors
(the volume+quantization value for each band).
Each window is split into either 1 or 8 blocks (long vs short),
and transformed separately from one another, with the coefficients
for each being also completely independent. The scalefactors
slightly increase from 64 (long) to 128 (short) to accommodate
better per-block-per-band volume for each window.

To reduce overhead, the codec signals scalefactor sizes in an obtuse way,
where each group's scalefactor types are sent via a variable length decoding,
with a range.
But our decoder was written in a way where those ranges were carried through
the entire decoder, and to actually read them you had to use the range.

Instead of having a dedicated array with a range for each scalefactor,
just let the decoder directly index each scalefactor.

This also switches the form of quantized scalefactors to the format
the spec uses, where for intensity stereo and regular, scalefactors
are stored in a scalefactor - 100 form, rather than as-is.

USAC gets rid of the complex scalefactor handling. This commit allows
code to be shared between both.
---
Opus avoids this by simply interleaving all energy values and their
coefficients for each individual window, which also saves a lot of bits
due to the energy values being similar window to window.
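
The flat layout described in the commit message can be sketched in isolation as follows. This is a simplified illustration, not the decoder's code: MiniChannel, MAX_ENTRIES, flat_index() and fill_section() are hypothetical names, and only the indexing expression g*max_sfb + sfb is taken from the patch.

```c
#include <stdint.h>

#define MAX_ENTRIES 128 /* illustrative capacity, not taken from the patch */

/* Hypothetical miniature of the per-channel state. */
typedef struct MiniChannel {
    int     num_window_groups;
    int     max_sfb;
    uint8_t band_type[MAX_ENTRIES];
    int     sfo[MAX_ENTRIES];
} MiniChannel;

/* Flat index of scalefactor band `sfb` in window group `g`. */
static int flat_index(const MiniChannel *ch, int g, int sfb)
{
    return g * ch->max_sfb + sfb;
}

/* Expand one section [start, end) of group g into per-band entries,
 * so later stages can read any band directly without a run-end array. */
static void fill_section(MiniChannel *ch, int g, int start, int end,
                         uint8_t type)
{
    for (int k = start; k < end; k++)
        ch->band_type[flat_index(ch, g, k)] = type;
}
```

Once every band carries its own entry, a consumer only needs (g, sfb) to find both the band type and its scalefactor, which is what makes the run-end bookkeeping unnecessary.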

 libavcodec/aac/aacdec.c  | 100 ---
 libavcodec/aac/aacdec.h  |   6 +-
 libavcodec/aac/aacdec_dsp_template.c |  95 ++---
 3 files changed, 84 insertions(+), 117 deletions(-)

diff --git a/libavcodec/aac/aacdec.c b/libavcodec/aac/aacdec.c
index 7457fe6c97..35722f9b9b 100644
--- a/libavcodec/aac/aacdec.c
+++ b/libavcodec/aac/aacdec.c
@@ -1412,13 +1412,13 @@ fail:
  *
  * @return  Returns error status. 0 - OK, !0 - error
  */
-static int decode_band_types(AACDecContext *ac, enum BandType band_type[120],
- int band_type_run_end[120], GetBitContext *gb,
- IndividualChannelStream *ics)
+static int decode_band_types(AACDecContext *ac, SingleChannelElement *sce,
+ GetBitContext *gb)
 {
-int g, idx = 0;
+IndividualChannelStream *ics = &sce->ics;
 const int bits = (ics->window_sequence[0] == EIGHT_SHORT_SEQUENCE) ? 3 : 5;
-for (g = 0; g < ics->num_window_groups; g++) {
+
+for (int g = 0; g < ics->num_window_groups; g++) {
 int k = 0;
 while (k < ics->max_sfb) {
 uint8_t sect_end = k;
@@ -1442,10 +1442,8 @@ static int decode_band_types(AACDecContext *ac, enum BandType band_type[120],
 return AVERROR_INVALIDDATA;
 }
 } while (sect_len_incr == (1 << bits) - 1);
-for (; k < sect_end; k++) {
-band_type[idx]   = sect_band_type;
-band_type_run_end[idx++] = sect_end;
-}
+for (; k < sect_end; k++)
+sce->band_type[g*ics->max_sfb + k] = sect_band_type;
 }
 }
 return 0;
@@ -1461,69 +1459,59 @@ static int decode_band_types(AACDecContext *ac, enum BandType band_type[120],
  *
  * @return  Returns error status. 0 - OK, !0 - error
  */
-static int decode_scalefactors(AACDecContext *ac, int sfo[120],
-   GetBitContext *gb,
-   unsigned int global_gain,
-   IndividualChannelStream *ics,
-   enum BandType band_type[120],
-   int band_type_run_end[120])
+static int decode_scalefactors(AACDecContext *ac, SingleChannelElement *sce,
+   GetBitContext *gb, unsigned int global_gain)
 {
-int g, i, idx = 0;
+IndividualChannelStream *ics = &sce->ics;
 int offset[3] = { global_gain, global_gain - NOISE_OFFSET, 0 };
 int clipped_offset;
 int noise_flag = 1;
-for (g = 0; g < ics->num_window_groups; g++) {
-for (i = 0; i < ics->max_sfb;) {
-int run_end = band_type_run_end[idx];
-switch (band_type[idx]) {
+
+for (int g = 0; g < ics->num_window_groups; g++) {
+for (int sfb = 0; sfb < ics->max_sfb; sfb++) {
+switch (sce->band_type[g*ics->max_sfb + sfb]) {
 case ZERO_BT:
-for (; i < run_end; i++, idx++)
-sfo[idx] = 0;
+sce->sfo[g*ics->max_sfb + sfb] = 0;
 break;
 case INTENSITY_BT: /* fallthrough */
 case INTENSITY_BT2:
-for (; i < run_end; i++, idx++) {
-offset[2] += get_vlc2(gb, ff_vlc_scalefactors, 7, 3) - SCALE_DIFF_ZERO;
-clipped_offset = av_clip(offset[2], -155, 100);
-if (offset[2] != clipped_offset) {
-avpriv_request_sample(ac->avctx,
-  "If you heard an audible 
artifact, there may

Re: [FFmpeg-devel] [PATCH v3 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread Rémi Denis-Courmont
On Monday, 13 May 2024, 19:59:21 EEST, u...@foxmail.com wrote:
> From: sunyuechi 
> 
> C908:
> vp9_tm_4x4_8bpp_c: 116.5
> vp9_tm_4x4_8bpp_rvv_i32: 43.5
> vp9_tm_8x8_8bpp_c: 416.2
> vp9_tm_8x8_8bpp_rvv_i32: 86.0
> vp9_tm_16x16_8bpp_c: 1665.5
> vp9_tm_16x16_8bpp_rvv_i32: 187.2
> vp9_tm_32x32_8bpp_c: 6974.2
> vp9_tm_32x32_8bpp_rvv_i32: 625.7
> ---
>  libavcodec/riscv/vp9_intra_rvv.S | 141 +++
>  libavcodec/riscv/vp9dsp.h|   8 ++
>  libavcodec/riscv/vp9dsp_init.c   |   4 +
>  3 files changed, 153 insertions(+)
> 
> diff --git a/libavcodec/riscv/vp9_intra_rvv.S
> b/libavcodec/riscv/vp9_intra_rvv.S index ca156d65cd..7e1046bc13 100644
> --- a/libavcodec/riscv/vp9_intra_rvv.S
> +++ b/libavcodec/riscv/vp9_intra_rvv.S
> @@ -173,3 +173,144 @@ func ff_h_8x8_rvv, zve32x
> 
>  ret
>  endfunc
> +
> +.macro tm_sum dst, top, offset
> +lbu  t3, \offset(a2)
> +sub  t3, t3, a4
> +vadd.vx  \dst, \top, t3

The macro saves some copycat code, but it seems to prevent good scheduling. 
Consuming t3 right after loading it is not ideal.

> +.endm
> +
> +func ff_tm_32x32_rvv, zve32x
> +lbu  a4, -1(a3)
> +li   t5, 32
> +
> +.macro tm_sum32 n1,n2,n3,n4,n5,n6,n7,n8
> +vsetvli  zero, t5, e16, m4, ta, ma

AFAICT, you do not need to reset the vector configuration every time.

> +vle8.v   v8, (a3)
> +vzext.vf2v28, v8
> +
> +tm_sum   v0, v28, \n1
> +tm_sum   v4, v28, \n2
> +tm_sum   v8, v28, \n3
> +tm_sum   v12, v28, \n4
> +tm_sum   v16, v28, \n5
> +tm_sum   v20, v28, \n6
> +tm_sum   v24, v28, \n7
> +tm_sum   v28, v28, \n8
> +
> +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, m2, ta, ma
> +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +.endm
> +
> +tm_sum32 31, 30, 29, 28, 27, 26, 25, 24
> +tm_sum32 23, 22, 21, 20, 19, 18, 17, 16
> +tm_sum32 15, 14, 13, 12, 11, 10, 9, 8
> +tm_sum32 7, 6, 5, 4, 3, 2, 1, 0
> +
> +ret
> +endfunc
> +
> +func ff_tm_16x16_rvv, zve32x
> +vsetivli  zero, 16, e16, m2, ta, ma
> +vle8.vv8, (a3)
> +vzext.vf2 v30, v8
> +lbu   a4, -1(a3)
> +
> +tm_sum   v0, v30, 15
> +tm_sum   v2, v30, 14
> +tm_sum   v4, v30, 13
> +tm_sum   v6, v30, 12
> +tm_sum   v8, v30, 11
> +tm_sum   v10, v30, 10
> +tm_sum   v12, v30, 9
> +tm_sum   v14, v30, 8
> +tm_sum   v16, v30, 7
> +tm_sum   v18, v30, 6
> +tm_sum   v20, v30, 5
> +tm_sum   v22, v30, 4
> +tm_sum   v24, v30, 3
> +tm_sum   v26, v30, 2
> +tm_sum   v28, v30, 1
> +tm_sum   v30, v30, 0
> +
> +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, m1, ta, ma
> +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +vnclipu.wi   v30, v30, 0
> +vse8.v   v30, (a0)
> +
> +ret
> +endfunc
> +
> +func ff_tm_8x8_rvv, zve32x
> +vsetivli zero, 8, e16, m1, ta, ma
> +vle8.v   v8, (a3)
> +vzext.vf2v28, v8
> +lbu  a4, -1(a3)
> +
> +tm_sum   v16, v28, 7
> +tm_sum   v17, v28, 6
> +tm_sum   v18, v28, 5
> +tm_sum   v19, v28, 4
> +tm_sum   v20, v28, 3
> +tm_sum   v21, v28, 2
> +tm_sum   v22, v28, 1
> +tm_sum   v23, v28, 0
> +
> +.irp n 16, 17, 18, 19, 20, 21, 22, 23
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, mf2, ta, ma
> +.irp n 16, 17, 18, 19, 20, 21, 22
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +vnclipu.wi   v24, v23, 0
> +vse8.v   v24, (a0)
> +
> +ret
> +endfunc
> +
> +func ff_tm_4x4_rvv, zve32x
> +vsetivli zero, 4, e16, mf2, ta, ma
> +vle8.v   v8, (a3)
> +vzext.vf2v28, v8
> +lbu  a4, -1(a3)
> +
> +tm_sum   v16, v28, 3
> +tm_sum   v17, v28, 2
> +tm_sum   v18, v28, 1
> +tm_sum   v19, v28, 0
> +
> +.irp n 16, 17, 18, 19
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, z

Re: [FFmpeg-devel] [PATCH v3 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread flow gg
Why is it unnecessary to reset the vector configuration every time? I think
it is necessary to reset e16/e8 each time.

Rémi Denis-Courmont wrote on Wed, May 15, 2024 at 01:46:

> On Monday, 13 May 2024, 19:59:21 EEST, u...@foxmail.com wrote:
> > From: sunyuechi 
> >
> > C908:
> > vp9_tm_4x4_8bpp_c: 116.5
> > vp9_tm_4x4_8bpp_rvv_i32: 43.5
> > vp9_tm_8x8_8bpp_c: 416.2
> > vp9_tm_8x8_8bpp_rvv_i32: 86.0
> > vp9_tm_16x16_8bpp_c: 1665.5
> > vp9_tm_16x16_8bpp_rvv_i32: 187.2
> > vp9_tm_32x32_8bpp_c: 6974.2
> > vp9_tm_32x32_8bpp_rvv_i32: 625.7
> > ---
> >  libavcodec/riscv/vp9_intra_rvv.S | 141 +++
> >  libavcodec/riscv/vp9dsp.h|   8 ++
> >  libavcodec/riscv/vp9dsp_init.c   |   4 +
> >  3 files changed, 153 insertions(+)
> >
> > diff --git a/libavcodec/riscv/vp9_intra_rvv.S
> > b/libavcodec/riscv/vp9_intra_rvv.S index ca156d65cd..7e1046bc13 100644
> > --- a/libavcodec/riscv/vp9_intra_rvv.S
> > +++ b/libavcodec/riscv/vp9_intra_rvv.S
> > @@ -173,3 +173,144 @@ func ff_h_8x8_rvv, zve32x
> >
> >  ret
> >  endfunc
> > +
> > +.macro tm_sum dst, top, offset
> > +lbu  t3, \offset(a2)
> > +sub  t3, t3, a4
> > +vadd.vx  \dst, \top, t3
>
> The macro saves some copycat code, but it seems to prevent good
> scheduling. Consuming t3 right after loading it is not ideal.
>
> > +.endm
> > +
> > +func ff_tm_32x32_rvv, zve32x
> > +lbu  a4, -1(a3)
> > +li   t5, 32
> > +
> > +.macro tm_sum32 n1,n2,n3,n4,n5,n6,n7,n8
> > +vsetvli  zero, t5, e16, m4, ta, ma
>
> AFAICT, you do not need to reset the vector configuration every time.
>
> > +vle8.v   v8, (a3)
> > +vzext.vf2v28, v8
> > +
> > +tm_sum   v0, v28, \n1
> > +tm_sum   v4, v28, \n2
> > +tm_sum   v8, v28, \n3
> > +tm_sum   v12, v28, \n4
> > +tm_sum   v16, v28, \n5
> > +tm_sum   v20, v28, \n6
> > +tm_sum   v24, v28, \n7
> > +tm_sum   v28, v28, \n8
> > +
> > +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> > +vmax.vx  v\n, v\n, zero
> > +.endr
> > +
> > +vsetvli  zero, zero, e8, m2, ta, ma
> > +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> > +vnclipu.wi   v\n, v\n, 0
> > +vse8.v   v\n, (a0)
> > +add  a0, a0, a1
> > +.endr
> > +.endm
> > +
> > +tm_sum32 31, 30, 29, 28, 27, 26, 25, 24
> > +tm_sum32 23, 22, 21, 20, 19, 18, 17, 16
> > +tm_sum32 15, 14, 13, 12, 11, 10, 9, 8
> > +tm_sum32 7, 6, 5, 4, 3, 2, 1, 0
> > +
> > +ret
> > +endfunc
> > +
> > +func ff_tm_16x16_rvv, zve32x
> > +vsetivli  zero, 16, e16, m2, ta, ma
> > +vle8.vv8, (a3)
> > +vzext.vf2 v30, v8
> > +lbu   a4, -1(a3)
> > +
> > +tm_sum   v0, v30, 15
> > +tm_sum   v2, v30, 14
> > +tm_sum   v4, v30, 13
> > +tm_sum   v6, v30, 12
> > +tm_sum   v8, v30, 11
> > +tm_sum   v10, v30, 10
> > +tm_sum   v12, v30, 9
> > +tm_sum   v14, v30, 8
> > +tm_sum   v16, v30, 7
> > +tm_sum   v18, v30, 6
> > +tm_sum   v20, v30, 5
> > +tm_sum   v22, v30, 4
> > +tm_sum   v24, v30, 3
> > +tm_sum   v26, v30, 2
> > +tm_sum   v28, v30, 1
> > +tm_sum   v30, v30, 0
> > +
> > +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
> > +vmax.vx  v\n, v\n, zero
> > +.endr
> > +
> > +vsetvli  zero, zero, e8, m1, ta, ma
> > +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
> > +vnclipu.wi   v\n, v\n, 0
> > +vse8.v   v\n, (a0)
> > +add  a0, a0, a1
> > +.endr
> > +vnclipu.wi   v30, v30, 0
> > +vse8.v   v30, (a0)
> > +
> > +ret
> > +endfunc
> > +
> > +func ff_tm_8x8_rvv, zve32x
> > +vsetivli zero, 8, e16, m1, ta, ma
> > +vle8.v   v8, (a3)
> > +vzext.vf2v28, v8
> > +lbu  a4, -1(a3)
> > +
> > +tm_sum   v16, v28, 7
> > +tm_sum   v17, v28, 6
> > +tm_sum   v18, v28, 5
> > +tm_sum   v19, v28, 4
> > +tm_sum   v20, v28, 3
> > +tm_sum   v21, v28, 2
> > +tm_sum   v22, v28, 1
> > +tm_sum   v23, v28, 0
> > +
> > +.irp n 16, 17, 18, 19, 20, 21, 22, 23
> > +vmax.vx  v\n, v\n, zero
> > +.endr
> > +
> > +vsetvli  zero, zero, e8, mf2, ta, ma
> > +.irp n 16, 17, 18, 19, 20, 21, 22
> > +vnclipu.wi   v\n, v\n, 0
> > +vse8.v   v\n, (a0)
> > +add  a0, a0, a1
> > +.endr
> > +vnclipu.wi   v24, v23, 0
> > +vse8.v   v24, (a0)
> > +
> > +ret
> > +endfunc
> > +

Re: [FFmpeg-devel] [PATCH v3 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread Rémi Denis-Courmont
On Tuesday, 14 May 2024, 20:57:17 EEST, flow gg wrote:
> Why is it unnecessary to reset the vector configuration every time? I think
> it is necessary to reset e16/e8 each time.

I misread the placement of .endm

OTOH, it seems that you could just write the tm_sum32 with a single parameter, 
as the other ones are just relative by constant +/-1.
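
To make the "single parameter" remark concrete, here is a scalar C model of what the vector code appears to compute, written from a reading of the assembly in this thread rather than from the reference C implementation. The names clip_u8() and tm_pred() are hypothetical; the argument order follows the prototypes in the patch (dst, stride, l, a).

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp to the 8-bit pixel range; models the vmax.vx + vnclipu.wi pair. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of the TM predictor as read from the assembly: row y uses
 * left[size - 1 - y], so one starting offset (size - 1) determines all
 * the others by steps of -1. */
static void tm_pred(uint8_t *dst, ptrdiff_t stride, const uint8_t *left,
                    const uint8_t *top, int size)
{
    int tl = top[-1];                          /* lbu a4, -1(a3) */

    for (int y = 0; y < size; y++) {
        int delta = left[size - 1 - y] - tl;   /* lbu + sub */

        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(top[x] + delta);  /* vadd.vx, then clip */
        dst += stride;
    }
}
```

Because the left-edge offsets simply count down from size - 1, a macro only needs the first offset of each group; the remaining ones can be derived inside it.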

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/





[FFmpeg-devel] [PATCH v3 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread uk7b
From: sunyuechi 

C908:
vp9_tm_4x4_8bpp_c: 116.5
vp9_tm_4x4_8bpp_rvv_i32: 43.5
vp9_tm_8x8_8bpp_c: 416.2
vp9_tm_8x8_8bpp_rvv_i32: 86.0
vp9_tm_16x16_8bpp_c: 1665.5
vp9_tm_16x16_8bpp_rvv_i32: 187.2
vp9_tm_32x32_8bpp_c: 6974.2
vp9_tm_32x32_8bpp_rvv_i32: 625.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 123 +++
 libavcodec/riscv/vp9dsp.h|   8 ++
 libavcodec/riscv/vp9dsp_init.c   |   4 +
 3 files changed, 135 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index ca156d65cd..5fb546c12d 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -173,3 +173,126 @@ func ff_h_8x8_rvv, zve32x
 
 ret
 endfunc
+
+.macro tm_sum4 dst1, dst2, dst3, dst4, top, n1
+lbu  t1, \n1(a2)
+lbu  t2, (\n1-1)(a2)
+lbu  t3, (\n1-2)(a2)
+lbu  t4, (\n1-3)(a2)
+sub  t1, t1, a4
+sub  t2, t2, a4
+sub  t3, t3, a4
+sub  t4, t4, a4
+vadd.vx  \dst1, \top, t1
+vadd.vx  \dst2, \top, t2
+vadd.vx  \dst3, \top, t3
+vadd.vx  \dst4, \top, t4
+.endm
+
+func ff_tm_32x32_rvv, zve32x
+lbu  a4, -1(a3)
+li   t5, 32
+
+.macro tm_sum32 offset
+vsetvli  zero, t5, e16, m4, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2v28, v8
+
+tm_sum4  v0, v4, v8, v12, v28, \offset
+tm_sum4  v16, v20, v24, v28, v28, (\offset-4)
+
+.irp n 0, 4, 8, 12, 16, 20, 24, 28
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, m2, ta, ma
+.irp n 0, 4, 8, 12, 16, 20, 24, 28
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+.endm
+
+tm_sum32 31
+tm_sum32 23
+tm_sum32 15
+tm_sum32 7
+
+ret
+endfunc
+
+func ff_tm_16x16_rvv, zve32x
+vsetivli  zero, 16, e16, m2, ta, ma
+vle8.vv8, (a3)
+vzext.vf2 v30, v8
+lbu   a4, -1(a3)
+
+tm_sum4   v0, v2, v4, v6, v30, 15
+tm_sum4   v8, v10, v12, v14, v30, 11
+tm_sum4   v16, v18, v20, v22, v30, 7
+tm_sum4   v24, v26, v28, v30, v30, 3
+
+.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, m1, ta, ma
+.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v30, v30, 0
+vse8.v   v30, (a0)
+
+ret
+endfunc
+
+func ff_tm_8x8_rvv, zve32x
+vsetivli zero, 8, e16, m1, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2 v28, v8
+lbu  a4, -1(a3)
+
+tm_sum4  v16, v17, v18, v19, v28, 7
+tm_sum4  v20, v21, v22, v23, v28, 3
+
+.irp n 16, 17, 18, 19, 20, 21, 22, 23
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, mf2, ta, ma
+.irp n 16, 17, 18, 19, 20, 21, 22
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v24, v23, 0
+vse8.v   v24, (a0)
+
+ret
+endfunc
+
+func ff_tm_4x4_rvv, zve32x
+vsetivli zero, 4, e16, mf2, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2 v28, v8
+lbu  a4, -1(a3)
+
+tm_sum4  v16, v17, v18, v19, v28, 3
+
+.irp n 16, 17, 18, 19
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, mf4, ta, ma
+.irp n 16, 17, 18
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v24, v19, 0
+vse8.v   v24, (a0)
+
+ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 0ad961c7e0..79330b4968 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -72,6 +72,14 @@ void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
 const uint8_t *a);
 void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
   const uint8_t *a);
+void ff_tm_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+ const uint8_t *a);
+void ff_tm_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+ const uint8_t *a);
+void ff_tm_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+   const uint8_t *a);
+void ff_tm_4x4_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+   const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_
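
For readers following the assembly in this patch, a scalar sketch of what the TM intra predictor computes may help. This is an illustration, not the FFmpeg C reference verbatim: the names clip_u8 and tm_pred_ref and the size parameter are assumptions, while the argument order and the bottom-to-top reading of the left edge are inferred from the register usage in the patch (a2 = l, a3 = a, a4 = a[-1]).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical scalar model of the 8 bpp TM predictor:
 *   dst[y][x] = clip(top[x] + left[size - 1 - y] - top[-1])
 * The left edge is read from its last element down to its first, which
 * mirrors the descending offsets used by the tm_sum4 macro.
 */
static void tm_pred_ref(uint8_t *dst, ptrdiff_t stride, const uint8_t *left,
                        const uint8_t *top, int size)
{
    int tl = top[-1]; /* top-left neighbour, stored just before the top row */

    assert(size == 4 || size == 8 || size == 16 || size == 32);
    for (int y = 0; y < size; y++) {
        int l_m_tl = left[size - 1 - y] - tl;

        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(top[x] + l_m_tl);
        dst += stride;
    }
}
```

The vector code does the same work one row at a time: the subtraction happens in scalar registers, vadd.vx adds the widened top row, and the vmax.vx/vnclipu.wi pair performs the clamp.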

Re: [FFmpeg-devel] [PATCH v3 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread flow gg
> The macro saves some copycat code, but it seems to prevent good scheduling.
> Consuming t3 right after loading it is not ideal.

> OTOH, it seems that you could just write the tm_sum32 with a single parameter,
> as the other ones are just relative by constant +/-1.

Okay, updated it in the reply.

On Wed, 15 May 2024 at 02:08, Rémi Denis-Courmont wrote:

> On Tuesday, 14 May 2024, 20:57:17 EEST, flow gg wrote:
> > Why is it unnecessary to reset the vector configuration every time? I think
> > it is necessary to reset e16/e8 each time.
>
> I misread the placement of .endm
>
> OTOH, it seems that you could just write the tm_sum32 with a single
> parameter,
> as the other ones are just relative by constant +/-1.
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/


[FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: optimise RVV vector type for lpc16

2024-05-14 Thread Rémi Denis-Courmont
This calculates the optimal vector type value at run-time based on the
hardware vector length and the FLAC LPC prediction order. In this
particular case, the additional computation is easily amortised over
the loop iterations:

T-Head C908:          C  V before  V after
flac_lpc_16_13: 14180.2   11229.0   7338.5
flac_lpc_16_16: 16833.2   11091.0   7248.5
flac_lpc_16_29: 28817.2   11455.7  10506.5
flac_lpc_16_32: 31059.7   10368.5  11305.2

With 128-bit vectors, improvements are expected for the first two
test cases only. For the other two, there is overhead but below noise.
Improvements should be better observable with prediction order of 8
and less, or on hardware with larger vector sizes.

The same optimisation strategy should be applicable to LPC32
(and work-in-progress LPC33), but is left as a future exercise.
---
 libavcodec/riscv/flacdsp_init.c |  2 +-
 libavcodec/riscv/flacdsp_rvv.S  | 10 --
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c
index 77ffd09244..097f938f04 100644
--- a/libavcodec/riscv/flacdsp_init.c
+++ b/libavcodec/riscv/flacdsp_init.c
@@ -71,7 +71,7 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
 if ((flags & AV_CPU_FLAG_RVV_I32) && (flags & AV_CPU_FLAG_RVB_ADDR)) {
 int vlenb = ff_get_rv_vlenb();
 
-if (vlenb >= 16)
+if ((flags & AV_CPU_FLAG_RVB_BASIC) && vlenb >= 16)
 c->lpc16 = ff_flac_lpc16_rvv;
 
 c->wasted32 = ff_flac_wasted32_rvv;
diff --git a/libavcodec/riscv/flacdsp_rvv.S b/libavcodec/riscv/flacdsp_rvv.S
index 8b9c626198..42cece9786 100644
--- a/libavcodec/riscv/flacdsp_rvv.S
+++ b/libavcodec/riscv/flacdsp_rvv.S
@@ -20,8 +20,14 @@
 
 #include "libavutil/riscv/asm.S"
 
-func ff_flac_lpc16_rvv, zve32x
-vsetvli zero, a2, e32, m8, ta, ma
+func ff_flac_lpc16_rvv, zve32x, zbb
+csrr    t0, vlenb
+addi    t2, a2, -1
+clz     t0, t0
+clz     t2, t2
+addi    t0, t0, VTYPE_E32 | VTYPE_M8 | VTYPE_TA | VTYPE_MA
+sub     t0, t0, t2 // t0 += log2(next_power_of_two(len) / vlenb) - 1
+vsetvl  zero, a2, t0
 vle32.v v8, (a1)
 sub a4, a4, a2
 vle32.v v16, (a0)
-- 
2.43.0
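
The one-line comment on the `sub` instruction compresses a small derivation. The following spells it out under two assumptions that the patch does not state: XLEN is 64, and vlenb is a power of two, written V = 2^v. Here n is the prediction order passed in a2, and 3 is the vlmul encoding of m8 in the base value.

```latex
\begin{align*}
\operatorname{clz}(V)     &= 63 - v, \\
\operatorname{clz}(n - 1) &= 64 - \lceil \log_2 n \rceil \qquad (n \ge 1), \\
\mathrm{vlmul}            &= 3 + \operatorname{clz}(V) - \operatorname{clz}(n - 1)
                           = \lceil \log_2 n \rceil + 2 - v .
\end{align*}
```

This matches the requirement that n 32-bit elements fit in one register group: 4n <= LMUL * V gives log2(LMUL) >= ceil(log2 n) + 2 - v. With V = 16 (128-bit vectors), orders 13 and 16 yield vlmul = 2 (m4), while orders 29 and 32 yield vlmul = 3 (m8, unchanged), which is consistent with the benchmark improving only in the first two rows.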



[FFmpeg-devel] [PATCH 1/2] lavu/riscv: assembler macros for VTYPE fields

2024-05-14 Thread Rémi Denis-Courmont
---
 libavutil/riscv/asm.S | 48 +--
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
index 14be5055f5..ecf3081e61 100644
--- a/libavutil/riscv/asm.S
+++ b/libavutil/riscv/asm.S
@@ -96,20 +96,38 @@
 .endm
 #endif
 
+#define VTYPE_E8   000
+#define VTYPE_E16  010
+#define VTYPE_E32  020
+#define VTYPE_E64  030
+
+#define VTYPE_MF8   05
+#define VTYPE_MF4   06
+#define VTYPE_MF2   07
+#define VTYPE_M1    00
+#define VTYPE_M2    01
+#define VTYPE_M4    02
+#define VTYPE_M8    03
+
+#define VTYPE_TU  0000
+#define VTYPE_TA  0100
+#define VTYPE_MU  0000
+#define VTYPE_MA  0200
+
 /* Convenience macro to load a Vector type (vtype) as immediate */
 .macro  lvtypei rd, e, m=m1, tp=tu, mp=mu
 
 .ifc \e,e8
-.equ ei, 0
+.equ ei, VTYPE_E8
 .else
 .ifc \e,e16
-.equ ei, 8
+.equ ei, VTYPE_E16
 .else
 .ifc \e,e32
-.equ ei, 16
+.equ ei, VTYPE_E32
 .else
 .ifc \e,e64
-.equ ei, 24
+.equ ei, VTYPE_E64
 .else
 .error "Unknown element type"
 .endif
@@ -118,25 +136,25 @@
 .endif
 
 .ifc \m,m1
-.equ mi, 0
+.equ mi, VTYPE_M1
 .else
 .ifc \m,m2
-.equ mi, 1
+.equ mi, VTYPE_M2
 .else
 .ifc \m,m4
-.equ mi, 2
+.equ mi, VTYPE_M4
 .else
 .ifc \m,m8
-.equ mi, 3
+.equ mi, VTYPE_M8
 .else
 .ifc \m,mf8
-.equ mi, 5
+.equ mi, VTYPE_MF8
 .else
 .ifc \m,mf4
-.equ mi, 6
+.equ mi, VTYPE_MF4
 .else
 .ifc \m,mf2
-.equ mi, 7
+.equ mi, VTYPE_MF2
 .else
 .error "Unknown multiplier"
 .equ mi, 3
@@ -149,20 +167,20 @@
 .endif
 
 .ifc \tp,tu
-.equ tpi, 0
+.equ tpi, VTYPE_TU
 .else
 .ifc \tp,ta
-.equ tpi, 64
+.equ tpi, VTYPE_TA
 .else
 .error "Unknown tail policy"
 .endif
 .endif
 
 .ifc \mp,mu
-.equ mpi, 0
+.equ mpi, VTYPE_MU
 .else
 .ifc \mp,ma
-.equ mpi, 128
+.equ mpi, VTYPE_MA
 .else
 .error "Unknown mask policy"
 .endif
-- 
2.43.0
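
The constants in this patch are written in octal because the vtype fields fall on 3-bit boundaries: vlmul in bits 2:0, vsew in bits 5:3, then vta in bit 6 and vma in bit 7. A small C sketch of that layout follows; the enum repeats a subset of the macro values, and the vtype_* helper names are hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Same octal values as the VTYPE_* macros in the patch (only a subset). */
enum {
    VTYPE_E8  = 000,  VTYPE_E16 = 010, VTYPE_E32 = 020, VTYPE_E64 = 030,
    VTYPE_M1  = 00,   VTYPE_M8  = 03,  VTYPE_MF2 = 07,
    VTYPE_TA  = 0100, VTYPE_MA  = 0200,
};

/* Hypothetical helpers that split a vtype value back into its fields. */
static unsigned vtype_vlmul(unsigned vtype) { return vtype & 7; }        /* bits 2:0 */
static unsigned vtype_vsew(unsigned vtype)  { return (vtype >> 3) & 7; } /* bits 5:3 */
static unsigned vtype_vta(unsigned vtype)   { return (vtype >> 6) & 1; } /* bit 6 */
static unsigned vtype_vma(unsigned vtype)   { return (vtype >> 7) & 1; } /* bit 7 */

/* Element width in bits encoded by the vsew field: 8 << vsew. */
static unsigned vtype_sew_bits(unsigned vtype)
{
    assert(vtype_vsew(vtype) <= 3); /* e8 .. e64 */
    return 8u << vtype_vsew(vtype);
}
```

With this layout, each octal digit of a combined value reads as one field, for example 0323 is ta/ma, e32, m8.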



Re: [FFmpeg-devel] [PATCH v2 1/1] libavdevice/decklink: extend available actions on signal loss

2024-05-14 Thread Marton Balint

On Tue, 14 May 2024, Michael Riedl wrote:

>>> Deprecate the option 'draw_bars' in favor of the new option
>>> 'signal_loss_action', which controls the behavior when the input signal
>>> is not available (including the behavior previously available through
>>> draw_bars). The default behavior remains unchanged to be backwards
>>> compatible. The new option is more flexible for extending now and in
>>> the future.
>>>
>>> The new value 'repeat' repeats the last video frame.
>>> This is useful for very short dropouts and was not available before.
>>
>> As far as I see, you are overriding frameBytes for a repeated frame, that
>> seems wrong. pkt.data (frameBytes) must be associated with the videoFrame
>> which is passed to av_buffer_create() later on.
>>
>> Every AVFrame returned by the decklink device has an AVBuffer set up which
>> keeps a reference to the original DeckLink frame. This allows the use of
>> the DeckLink frame's raw buffer directly. But you cannot use the raw
>> buffer of another DeckLink frame for which the AVBuffer of the AVFrame
>> does not keep a reference.
>
> Thank you for your feedback!
>
> I took another look at the code and revisited the DeckLink documentation
> to ensure my understanding was correct. It seems that frameBytes is a
> pointer to the buffer of an IDeckLinkVideoFrame, and it remains valid as
> long as the videoFrame is not released.

That is just it. You are releasing the repeated frame as soon as a valid
frame comes in. The AVPacket data you previously returned will still point
to the now released frameBytes. As I wrote above, the decklink frame
corresponding to the returned frameBytes must be released in the
destructor of the AVPacket buffer.


Regards,
Marton
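
The ownership rule discussed in this thread can be modelled without any DeckLink or FFmpeg headers. The sketch below is a toy under stated assumptions: DeviceFrame, PacketBuffer and all function names are hypothetical stand-ins (for a reference-counted capture frame and for a buffer with a free callback), not the real APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted capture frame. */
typedef struct DeviceFrame {
    int refs;
    int released;            /* set once the last reference is dropped */
    unsigned char bytes[16];
} DeviceFrame;

static void device_frame_ref(DeviceFrame *f) { f->refs++; }

static void device_frame_unref(DeviceFrame *f)
{
    assert(f->refs > 0);
    if (--f->refs == 0)
        f->released = 1;     /* a real frame would free its storage here */
}

/* Hypothetical stand-in for a packet buffer with a destructor callback. */
typedef struct PacketBuffer {
    unsigned char *data;
    void (*free_cb)(void *opaque);
    void *opaque;
} PacketBuffer;

static void release_frame_cb(void *opaque)
{
    device_frame_unref(opaque);
}

/* Wrap the frame's own bytes; the buffer holds one reference until freed. */
static PacketBuffer packet_wrap_frame(DeviceFrame *f)
{
    PacketBuffer b = { f->bytes, release_frame_cb, f };

    device_frame_ref(f);
    return b;
}

static void packet_buffer_free(PacketBuffer *b)
{
    if (b->free_cb)
        b->free_cb(b->opaque);
    b->data    = NULL;
    b->free_cb = NULL;
}
```

The point of the model is that the bytes handed out in a packet stay valid only because the packet's destructor owns a reference to the very frame they belong to; pointing the packet at another frame's bytes breaks that link.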


Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: optimise RVV vector type for lpc16

2024-05-14 Thread Rémi Denis-Courmont
On Tuesday, 14 May 2024, 22:35:57 EEST, Rémi Denis-Courmont wrote:
> This calculates the optimal vector type value at run-time based on the
> hardware vector length and the FLAC LPC prediction order. In this
> particular case, the additional computation is easily amortised over
> the loop iterations:
> 
> T-Head C908:          C  V before  V after
> flac_lpc_16_13: 14180.2   11229.0   7338.5
> flac_lpc_16_16: 16833.2   11091.0   7248.5
> flac_lpc_16_29: 28817.2   11455.7  10506.5
> flac_lpc_16_32: 31059.7   10368.5  11305.2
> 
> With 128-bit vectors, improvements are expected for the first two
> test cases only. For the other two, there is overhead but below noise.
> Improvements should be better observable with prediction order of 8
> and less, or on hardware with larger vector sizes.
> 
> The same optimisation strategy should be applicable to LPC32
> (and work-in-progress LPC33), but is left as a future exercise.
> ---
>  libavcodec/riscv/flacdsp_init.c |  2 +-
>  libavcodec/riscv/flacdsp_rvv.S  | 10 --
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c
> index 77ffd09244..097f938f04 100644
> --- a/libavcodec/riscv/flacdsp_init.c
> +++ b/libavcodec/riscv/flacdsp_init.c
> @@ -71,7 +71,7 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
>  if ((flags & AV_CPU_FLAG_RVV_I32) && (flags & AV_CPU_FLAG_RVB_ADDR)) {
>  int vlenb = ff_get_rv_vlenb();
> 
> -if (vlenb >= 16)
> +if ((flags & AV_CPU_FLAG_RVB_BASIC) && vlenb >= 16)
>  c->lpc16 = ff_flac_lpc16_rvv;
> 
>  c->wasted32 = ff_flac_wasted32_rvv;
> diff --git a/libavcodec/riscv/flacdsp_rvv.S b/libavcodec/riscv/flacdsp_rvv.S
> index 8b9c626198..42cece9786 100644
> --- a/libavcodec/riscv/flacdsp_rvv.S
> +++ b/libavcodec/riscv/flacdsp_rvv.S
> @@ -20,8 +20,14 @@
> 
>  #include "libavutil/riscv/asm.S"
> 
> -func ff_flac_lpc16_rvv, zve32x
> -vsetvli zero, a2, e32, m8, ta, ma
> +func ff_flac_lpc16_rvv, zve32x, zbb
> +csrr    t0, vlenb
> +addi    t2, a2, -1
> +clz     t0, t0
> +clz     t2, t2
> +addi    t0, t0, VTYPE_E32 | VTYPE_M8 | VTYPE_TA | VTYPE_MA
> +sub     t0, t0, t2 // t0 += log2(next_power_of_two(len) / vlenb) - 1

OK, so checkasm cannot catch this since we do not test those cases,
but I suspect that this might crash due to an illegal vector configuration if:
- pred_order <= 2 with 128-bit vectors,
- pred_order <= 4 with 256-bit vectors,
- and so on.

This needs a little bit more work.

> +vsetvl  zero, a2, t0
>  vle32.v v8, (a1)
>  sub a4, a4, a2
>  vle32.v v16, (a0)
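
The failure cases listed above can be reproduced with a scalar model of the address arithmetic. The sketch assumes RV64 (so clz works on 64 bits and returns 64 for zero) and a power-of-two vlenb; the function names are hypothetical. It only shows that the computed value stops encoding e32 for small orders; whether the hardware then traps or silently uses another configuration is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 64-bit value; returns 64 for zero. */
static int clz64(uint64_t x)
{
    int n = 0;

    if (x == 0)
        return 64;
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* E32 | M8 | TA | MA, using the octal field values of the VTYPE_* macros. */
#define VTYPE_BASE (020 | 03 | 0100 | 0200)

/* Model of the value the patch leaves in t0 before vsetvl (RV64 assumed). */
static int lpc16_vtype_model(uint64_t vlenb, uint64_t len)
{
    assert(len >= 1 && vlenb >= 16 && (vlenb & (vlenb - 1)) == 0);
    return clz64(vlenb) + VTYPE_BASE - clz64(len - 1);
}

/* The result is only meaningful while the element-width field still says e32. */
static int lpc16_vtype_keeps_e32(uint64_t vlenb, uint64_t len)
{
    return ((lpc16_vtype_model(vlenb, len) >> 3) & 7) == 2;
}
```

For the benchmarked orders the model gives m4 (order 13 or 16) and m8 (order 29 or 32) with 128-bit vectors. Once the subtraction would need a fractional multiplier, it borrows out of the 3-bit vlmul field into vsew, which is where the thresholds of 2 and 4 come from.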


-- 
Rémi Denis-Courmont
http://www.remlab.net/





Re: [FFmpeg-devel] [PATCH 09/10] tests/checkasm/vf_blend: Add missing function parameter

2024-05-14 Thread Marton Balint




On Mon, 13 May 2024, Andreas Rheinhardt wrote:


Forgotten in 5b8faaad6c71bbb90951ca1642391e11cf6f5f91.

Signed-off-by: Andreas Rheinhardt 
---
tests/checkasm/vf_blend.c | 8 
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/checkasm/vf_blend.c b/tests/checkasm/vf_blend.c
index b5a96ee4bc..5ebfc11fed 100644
--- a/tests/checkasm/vf_blend.c
+++ b/tests/checkasm/vf_blend.c
@@ -68,7 +68,7 @@
 const uint8_t *bottom, ptrdiff_t bottom_linesize, \
 uint8_t *dst, ptrdiff_t dst_linesize, \
 ptrdiff_t width, ptrdiff_t height, \
- struct FilterParams *param, double *values); \
+ struct FilterParams *param, double *values, int starty); \


You are going to need to adjust the parameters after the vf_blend patch I 
just pushed.


Thanks,
Marton



w = WIDTH / depth; \
\
for (i = 0; i < BUF_UNITS - 1; i++) { \
@@ -76,14 +76,14 @@
int dst_offset = i * SIZE_PER_UNIT; /* dst must be aligned */ \
randomize_buffers(); \
call_ref(top1 + src_offset, w, bot1 + src_offset, w, \
- dst1 + dst_offset, w, w, HEIGHT, &param, NULL); \
+ dst1 + dst_offset, w, w, HEIGHT, &param, NULL, 0); \
call_new(top2 + src_offset, w, bot2 + src_offset, w, \
- dst2 + dst_offset, w, w, HEIGHT, &param, NULL); \
+ dst2 + dst_offset, w, w, HEIGHT, &param, NULL, 0); \
if (memcmp(top1, top2, BUF_SIZE) || memcmp(bot1, bot2, BUF_SIZE) || memcmp(dst1, dst2, BUF_SIZE)) \
fail(); \
} \
bench_new(top2, w / 4, bot2, w / 4, dst2, w / 4, \
-  w / 4, HEIGHT / 4, &param, NULL); \
+  w / 4, HEIGHT / 4, &param, NULL, 0); \
} while (0)

void checkasm_check_blend(void)
--
2.40.1



Re: [FFmpeg-devel] [PATCH v8 01/15] avcodec/vaapi_encode: introduce a base layer for vaapi encode

2024-05-14 Thread Mark Thompson
On 18/04/2024 09:58, tong1.wu-at-intel@ffmpeg.org wrote:
> From: Tong Wu 
> 
> Since VAAPI and future D3D12VA implementation may share some common 
> parameters,
> a base layer encode context is introduced as vaapi context's base.
> 
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/hw_base_encode.h | 52 +
>  libavcodec/vaapi_encode.h   | 36 -
>  2 files changed, 57 insertions(+), 31 deletions(-)
>  create mode 100644 libavcodec/hw_base_encode.h
> 
> diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
> new file mode 100644
> index 00..3d1974bba3
> --- /dev/null
> +++ b/libavcodec/hw_base_encode.h
> @@ -0,0 +1,52 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
> USA
> + */
> +
> +#ifndef AVCODEC_HW_BASE_ENCODE_H
> +#define AVCODEC_HW_BASE_ENCODE_H
> +
> +#define MAX_DPB_SIZE 16
> +#define MAX_PICTURE_REFERENCES 2
> +#define MAX_REORDER_DELAY 16
> +#define MAX_ASYNC_DEPTH 64
> +#define MAX_REFERENCE_LIST_NUM 2

Is there a reason to change these from enum to defines?  I'm not seeing 
anywhere they should be visible to the preprocessor, and this means they are 
normally invisible to a debugger.

> +
> +enum {
> +PICTURE_TYPE_IDR = 0,
> +PICTURE_TYPE_I   = 1,
> +PICTURE_TYPE_P   = 2,
> +PICTURE_TYPE_B   = 3,
> +};
> +
> +enum {
> +// Codec supports controlling the subdivision of pictures into slices.
> +FLAG_SLICE_CONTROL = 1 << 0,
> +// Codec only supports constant quality (no rate control).
> +FLAG_CONSTANT_QUALITY_ONLY = 1 << 1,
> +// Codec is intra-only.
> +FLAG_INTRA_ONLY= 1 << 2,
> +// Codec supports B-pictures.
> +FLAG_B_PICTURES= 1 << 3,
> +// Codec supports referencing B-pictures.
> +FLAG_B_PICTURE_REFERENCES  = 1 << 4,
> +// Codec supports non-IDR key pictures (that is, key pictures do
> +// not necessarily empty the DPB).
> +FLAG_NON_IDR_KEY_PICTURES  = 1 << 5,
> +};
> +
> +#endif /* AVCODEC_HW_BASE_ENCODE_H */
> ...

Would it make more sense to put the HWBaseEncodeContext in this patch as well?  
(With just the AVClass member.)

Thanks,

- Mark
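
To make the enum-versus-define question concrete, here is a minimal sketch of the limits written as enumerators. The values are the ones quoted in the patch; ExamplePicture and example_dpb_capacity are hypothetical and exist only to show the constants used as array bounds.

```c
#include <assert.h>

/* Limits expressed as enumerators: they are real identifiers of type int,
 * usable in array bounds, and typically recorded in debug information,
 * whereas macros are gone after preprocessing unless extra debug options
 * are used. */
enum {
    MAX_DPB_SIZE           = 16,
    MAX_PICTURE_REFERENCES = 2,
    MAX_REORDER_DELAY      = 16,
    MAX_ASYNC_DEPTH        = 64,
    MAX_REFERENCE_LIST_NUM = 2,
};

/* Hypothetical structure, only to show the constants used as array sizes. */
typedef struct ExamplePicture {
    int dpb[MAX_DPB_SIZE];
    int refs[MAX_REFERENCE_LIST_NUM][MAX_PICTURE_REFERENCES];
} ExamplePicture;

static int example_dpb_capacity(void)
{
    ExamplePicture pic;

    return (int)(sizeof(pic.dpb) / sizeof(pic.dpb[0]));
}
```

Nothing here needs the preprocessor, which is the situation the review comment describes: unless a value must be tested in an #if, the enum form loses nothing.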


[FFmpeg-devel] [PATCH] avformat/mov: avoid seeking back to 0 on HEVC open GOP files

2024-05-14 Thread llyyr.public
ab77b878f1 attempted to fix the issue of broken packets being sent to
the decoder by implementing logic that kept attempting to PTS-step
backwards until it reached a valid point, however applying this
heuristic meant that in files that had no valid points (such as HEVC
videos shot on iPhones), we'd seek back to sample 0 on every seek
attempt. This meant that files that were previously seekable, albeit
with some skipped frames, were not seekable at all now.

Relax this heuristic a bit by giving up on seeking to a valid point if
we've tried a different sample and we still don't have a valid point to
seek to. This may cause some frames to be skipped on seeking, but it's
better than not being able to seek at all in such files.

Fixes: ab77b878f1 ("avformat/mov: fix seeking with HEVC open GOP files")
Fixes: #10585
---
 libavformat/mov.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index b3fa748f27e8..6174a04c3169 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -10133,7 +10133,7 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
 {
 MOVStreamContext *sc = st->priv_data;
 FFStream *const sti = ffstream(st);
-int sample, time_sample, ret;
+int sample, time_sample, ret, next_ts, requested_sample;
 unsigned int i;
 
 // Here we consider timestamp to be PTS, hence try to offset it so that we
@@ -10154,7 +10154,17 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
 
 if (!sample || can_seek_to_key_sample(st, sample, timestamp))
 break;
-timestamp -= FFMAX(sc->min_sample_duration, 1);
+
+next_ts = timestamp - FFMAX(sc->min_sample_duration, 1);
+requested_sample = av_index_search_timestamp(st, next_ts, flags);
+
+// If we've reached a different sample trying to find a good pts to
+// seek to, give up searching because we'll end up seeking back to
+// sample 0 on every seek.
+if (!can_seek_to_key_sample(st, requested_sample, next_ts) && sample != requested_sample)
+break;
+
+timestamp = next_ts;
 }
 
 mov_current_sample_set(sc, sample);

base-commit: b0093ab8a3d34bf2fefd6665464cc343a9ef0d53
-- 
2.45.0
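
The difference between the old and the relaxed heuristic is easier to see on a toy index. The sketch below is a simplified model, not the demuxer code: ToySample, toy_search and the two toy_seek_* functions are hypothetical, the search returns the last entry at or before the timestamp, and a per-sample flag stands in for can_seek_to_key_sample(), which in the real code also depends on the timestamp.

```c
#include <assert.h>
#include <stdint.h>

/* Toy index: one entry per key sample, sorted by timestamp. */
typedef struct ToySample {
    int64_t pts;
    int     seekable;   /* stand-in for can_seek_to_key_sample() */
} ToySample;

/* Last sample whose pts is <= ts, or 0 if there is none. */
static int toy_search(const ToySample *idx, int n, int64_t ts)
{
    int best = 0;

    for (int i = 0; i < n; i++)
        if (idx[i].pts <= ts)
            best = i;
    return best;
}

/* Old heuristic: keep stepping back until a seekable sample or sample 0. */
static int toy_seek_old(const ToySample *idx, int n, int64_t ts, int64_t step)
{
    for (;;) {
        int sample = toy_search(idx, n, ts);

        if (!sample || idx[sample].seekable)
            return sample;
        ts -= step;
    }
}

/* Relaxed heuristic: give up once stepping back lands on a different
 * sample that is not seekable either. */
static int toy_seek_new(const ToySample *idx, int n, int64_t ts, int64_t step)
{
    for (;;) {
        int sample = toy_search(idx, n, ts);
        int64_t next_ts;
        int requested;

        if (!sample || idx[sample].seekable)
            return sample;

        next_ts   = ts - step;
        requested = toy_search(idx, n, next_ts);
        if (!idx[requested].seekable && sample != requested)
            return sample;
        ts = next_ts;
    }
}
```

On an index with no seekable entries the old loop always ends at sample 0, while the relaxed loop stays on the sample nearest to the requested time. When an earlier entry is seekable, both return it.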

[FFmpeg-devel] [PATCH 1/3] avformat/mp3dec: only call ffio_ensure_seekback once

2024-05-14 Thread Marton Balint
Otherwise the subsequent ffio_ensure_seekback calls destroy the buffer of the
earlier ones. The worst-case seekback of ~66 kB is small enough that it is
easier to request it entirely up front.

Fixes ticket #10837, a regression since
0d17f5228f4d3854066ec1001f69c7d1714b0df9.

Signed-off-by: Marton Balint 
---
 libavformat/mp3dec.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
index ec6cf567bc..78d6c8c71c 100644
--- a/libavformat/mp3dec.c
+++ b/libavformat/mp3dec.c
@@ -32,6 +32,7 @@
 #include "replaygain.h"
 
 #include "libavcodec/codec_id.h"
+#include "libavcodec/mpegaudio.h"
 #include "libavcodec/mpegaudiodecheader.h"
 
 #define XING_FLAG_FRAMES 0x01
@@ -400,15 +401,16 @@ static int mp3_read_header(AVFormatContext *s)
 if (ret < 0)
 return ret;
 
+ret = ffio_ensure_seekback(s->pb, 64 * 1024 + MPA_MAX_CODED_FRAME_SIZE + 4);
+if (ret < 0)
+return ret;
+
 off = avio_tell(s->pb);
 for (i = 0; i < 64 * 1024; i++) {
 uint32_t header, header2;
 int frame_size;
-if (!(i&1023))
-ffio_ensure_seekback(s->pb, i + 1024 + 4);
 frame_size = check(s->pb, off + i, &header);
 if (frame_size > 0) {
-ffio_ensure_seekback(s->pb, i + 1024 + frame_size + 4);
 ret = check(s->pb, off + i + frame_size, &header2);
 if (ret >= 0 &&
 (header & MP3_MASK) == (header2 & MP3_MASK))
-- 
2.35.3
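
The "~66kB" figure in the commit message can be checked with a few lines of arithmetic. The value 1792 used for MPA_MAX_CODED_FRAME_SIZE is an assumption from memory of libavcodec/mpegaudio.h and should be verified against the header; the macro and function names below are hypothetical.

```c
#include <assert.h>

/* Assumed value of MPA_MAX_CODED_FRAME_SIZE from libavcodec/mpegaudio.h;
 * check the header before relying on it. */
#define ASSUMED_MPA_MAX_CODED_FRAME_SIZE 1792

/* Bytes scanned for junk before the first frame, as in mp3_read_header(). */
#define MP3_SCAN_WINDOW (64 * 1024)

/* Worst case: a header found at the very end of the window, plus one maximal
 * frame, plus the 4-byte header of the frame that follows it. */
static int mp3_seekback_size(void)
{
    return MP3_SCAN_WINDOW + ASSUMED_MPA_MAX_CODED_FRAME_SIZE + 4;
}
```

Under that assumption the single request is 67332 bytes, slightly less than 66 KiB, which is why one up-front call is cheaper than growing the buffer repeatedly.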



[FFmpeg-devel] [PATCH 2/3] avformat/mp3dec: simplify inner frame size check in mp3_read_header

2024-05-14 Thread Marton Balint
We are protecting the checked buffer with ffio_ensure_seekback(), so if the
inner check fails with a seek error, that likely means the end of file was
reached when checking for the next frame. This could also be the result of a
wrongly guessed (larger than normal) frame size, so let's continue the loop
instead of breaking out early. It will end sooner or later anyway.

Signed-off-by: Marton Balint 
---
 libavformat/mp3dec.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
index 78d6c8c71c..4abc73966f 100644
--- a/libavformat/mp3dec.c
+++ b/libavformat/mp3dec.c
@@ -412,14 +412,8 @@ static int mp3_read_header(AVFormatContext *s)
 frame_size = check(s->pb, off + i, &header);
 if (frame_size > 0) {
 ret = check(s->pb, off + i + frame_size, &header2);
-if (ret >= 0 &&
-(header & MP3_MASK) == (header2 & MP3_MASK))
-{
+if (ret >= 0 && (header & MP3_MASK) == (header2 & MP3_MASK))
 break;
-} else if (ret == CHECK_SEEK_FAILED) {
-av_log(s, AV_LOG_ERROR, "Invalid frame size (%d): Could not seek to %"PRId64".\n", frame_size, off + i + frame_size);
-return AVERROR(EINVAL);
-}
 } else if (frame_size == CHECK_SEEK_FAILED) {
 av_log(s, AV_LOG_ERROR, "Failed to read frame size: Could not seek to %"PRId64".\n", (int64_t) (i + 1024 + frame_size + 4));
 return AVERROR(EINVAL);
-- 
2.35.3



[FFmpeg-devel] [PATCH 3/3] avformat/mp3dec: change bogus error message if read_header encounters EOF

2024-05-14 Thread Marton Balint
Because of ffio_ensure_seekback() a seek error normally should only happen if
the end of file is reached during checking for the junk run-in. Also use proper
error code.

Signed-off-by: Marton Balint 
---
 libavformat/mp3dec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
index 4abc73966f..f421e03926 100644
--- a/libavformat/mp3dec.c
+++ b/libavformat/mp3dec.c
@@ -415,8 +415,8 @@ static int mp3_read_header(AVFormatContext *s)
 if (ret >= 0 && (header & MP3_MASK) == (header2 & MP3_MASK))
 break;
 } else if (frame_size == CHECK_SEEK_FAILED) {
-av_log(s, AV_LOG_ERROR, "Failed to read frame size: Could not seek to %"PRId64".\n", (int64_t) (i + 1024 + frame_size + 4));
-return AVERROR(EINVAL);
+av_log(s, AV_LOG_ERROR, "Failed to find two consecutive MPEG audio frames.\n");
+return AVERROR_INVALIDDATA;
 }
 }
 if (i == 64 * 1024) {
-- 
2.35.3



[FFmpeg-devel] [PATCH] avformat/mov: avoid seeking back to 0 on HEVC open GOP files

2024-05-14 Thread llyyr . public
From: llyyr 

ab77b878f1 attempted to fix the issue of broken packets being sent to
the decoder by implementing logic that kept attempting to PTS-step
backwards until it reached a valid point, however applying this
heuristic meant that in files that had no valid points (such as HEVC
videos shot on iPhones), we'd seek back to sample 0 on every seek
attempt. This meant that files that were previously seekable, albeit
with some skipped frames, were not seekable at all now.

Relax this heuristic a bit by giving up on seeking to a valid point if
we've tried a different sample and we still don't have a valid point to
seek to. This may cause some frames to be skipped on seeking, but it's
better than not being able to seek at all in such files.

Fixes: ab77b878f1 ("avformat/mov: fix seeking with HEVC open GOP files")
Fixes: #10585
---
 libavformat/mov.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/libavformat/mov.c b/libavformat/mov.c
index b3fa748f27e8..6174a04c3169 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -10133,7 +10133,7 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
 {
 MOVStreamContext *sc = st->priv_data;
 FFStream *const sti = ffstream(st);
-int sample, time_sample, ret;
+int sample, time_sample, ret, next_ts, requested_sample;
 unsigned int i;
 
 // Here we consider timestamp to be PTS, hence try to offset it so that we
@@ -10154,7 +10154,17 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
 
 if (!sample || can_seek_to_key_sample(st, sample, timestamp))
 break;
-timestamp -= FFMAX(sc->min_sample_duration, 1);
+
+next_ts = timestamp - FFMAX(sc->min_sample_duration, 1);
+requested_sample = av_index_search_timestamp(st, next_ts, flags);
+
+// If we've reached a different sample trying to find a good pts to
+// seek to, give up searching because we'll end up seeking back to
+// sample 0 on every seek.
+if (!can_seek_to_key_sample(st, requested_sample, next_ts) && sample != requested_sample)
+break;
+
+timestamp = next_ts;
 }
 
 mov_current_sample_set(sc, sample);
-- 
2.45.0



[FFmpeg-devel] [PATCH] lavu/riscv: fix parsing the unaligned read capability

2024-05-14 Thread Rémi Denis-Courmont
Pointed-out-by: Stefan O'Rear 
---
 libavutil/riscv/cpu.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index 7b8aa7ac21..04ac404bbf 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -77,8 +77,12 @@ int ff_get_cpu_flags_riscv(void)
 if (pairs[1].value & RISCV_HWPROBE_EXT_ZVBB)
 ret |= AV_CPU_FLAG_RV_ZVBB;
 #endif
-if (pairs[2].value & RISCV_HWPROBE_MISALIGNED_FAST)
-ret |= AV_CPU_FLAG_RV_MISALIGNED;
+switch (pairs[2].value & RISCV_HWPROBE_MISALIGNED_MASK) {
+case RISCV_HWPROBE_MISALIGNED_FAST:
+ret |= AV_CPU_FLAG_RV_MISALIGNED;
+break;
+default:
+}
 }
 #elif HAVE_GETAUXVAL
 {
-- 
2.43.0
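
The patch carries no description, so a short sketch of why a mask-and-compare is needed may be useful. It assumes that the misaligned-access result is an enumerated field rather than a set of flags, with the numeric values below taken from my reading of the Linux <asm/hwprobe.h> UAPI header; treat the values as assumptions and check the header on the target system. The names are shortened and hypothetical to avoid clashing with the real macros.

```c
#include <assert.h>

/* Assumed field values (see lead-in); not copied from a verified header. */
enum {
    MISALIGNED_UNKNOWN     = 0,
    MISALIGNED_EMULATED    = 1,
    MISALIGNED_SLOW        = 2,
    MISALIGNED_FAST        = 3,
    MISALIGNED_UNSUPPORTED = 4,
    MISALIGNED_MASK        = 7,
};

/* Old test: treats the field as a bit flag. */
static int fast_by_bit_test(unsigned long value)
{
    return (value & MISALIGNED_FAST) != 0;
}

/* New test: extracts the field and compares it with the enumerated value. */
static int fast_by_field_compare(unsigned long value)
{
    return (value & MISALIGNED_MASK) == MISALIGNED_FAST;
}
```

With these values the bit test also fires for the emulated and slow cases, because both share bits with the value for fast; the field comparison accepts only the fast case.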



Re: [FFmpeg-devel] [PATCHv2 2/2] lavc/flacdsp: R-V V flac_wasted33

2024-05-14 Thread Rémi Denis-Courmont
On Sunday, 12 May 2024, 22:54:21 EEST, Rémi Denis-Courmont wrote:
> T-Head C908:
> flac_wasted_33_c:   786.2
> flac_wasted_33_rvv_i64: 486.5

Fails with a minority of checkasm seeds.

-- 
Rémi Denis-Courmont
http://www.remlab.net/





[FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-14 Thread Stone Chen
Implements AVX2 DMVR (decoder-side motion vector refinement) SAD functions.
DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128. To reduce
complexity, SAD is only calculated on even rows. This is calculated for all
video bitdepths, but the values passed to the function are always 16-bit (even
if the original video bitdepth is 8). The AVX2 implementation uses min/max/sub.

Benchmarks ( AMD 7940HS )
Before:
BQTerrace_1920x1080_60_10_420_22_RA.vvc | 80.7 |
Chimera_8bit_1080P_1000_frames.vvc | 158.0 |
NovosobornayaSquare_1920x1080.bin | 159.7 |
RitualDance_1920x1080_60_10_420_37_RA.266 | 146.3 |

After:
BQTerrace_1920x1080_60_10_420_22_RA.vvc | 82.7 |
Chimera_8bit_1080P_1000_frames.vvc | 167.0 |
NovosobornayaSquare_1920x1080.bin | 166.3 |
RitualDance_1920x1080_60_10_420_37_RA.266 | 154.0 |
---
 libavcodec/x86/vvc/Makefile  |   3 +-
 libavcodec/x86/vvc/vvc_sad.asm   | 157 +++
 libavcodec/x86/vvc/vvcdsp_init.c |   6 ++
 3 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/x86/vvc/vvc_sad.asm

diff --git a/libavcodec/x86/vvc/Makefile b/libavcodec/x86/vvc/Makefile
index d6a66f860a..7b2438ce17 100644
--- a/libavcodec/x86/vvc/Makefile
+++ b/libavcodec/x86/vvc/Makefile
@@ -5,4 +5,5 @@ OBJS-$(CONFIG_VVC_DECODER) += x86/vvc/vvcdsp_init.o \
   x86/h26x/h2656dsp.o
 X86ASM-OBJS-$(CONFIG_VVC_DECODER)  += x86/vvc/vvc_alf.o  \
   x86/vvc/vvc_mc.o   \
-  x86/h26x/h2656_inter.o
+  x86/vvc/vvc_sad.o  \
+  x86/h26x/h2656_inter.o 
diff --git a/libavcodec/x86/vvc/vvc_sad.asm b/libavcodec/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..530142ad35
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,157 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpeg is free software; you can redistribute it and/or
+; * modify it under the terms of the GNU Lesser General Public
+; * License as published by the Free Software Foundation; either
+; * version 2.1 of the License, or (at your option) any later version.
+; *
+; * FFmpeg is distributed in the hope that it will be useful,
+; * but WITHOUT ANY WARRANTY; without even the implied warranty of
+; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+; * Lesser General Public License for more details.
+; *
+; * You should have received a copy of the GNU Lesser General Public
+; * License along with FFmpeg; if not, write to the Free Software
+; * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+; */
+
+%include "libavutil/x86/x86util.asm"
+
+%define MAX_PB_SIZE 128
+%define ROWS 2  ; DMVR SAD is only calculated on even rows to reduce complexity
+
+SECTION .text
+
+%macro MIN_MAX_SAD 3 ; 
+vpminuw   %1, %2, %3
+vpmaxuw   %3, %2, %3
+vpsubusw  %3, %3, %1
+%endmacro
+
+%macro HORIZ_ADD 3  ; xm0, xm1, m1
+vextracti128  %1, %3, q0001  ;32  1  0
+vpaddd        %1, %2         ; xm0 (7 + 3) (6 + 2) (5 + 1)   (4 + 0)
+vpshufd       %2, %1, q0032  ; xm1-  - (7 + 3)   (6 + 2)
+vpaddd        %1, %1, %2     ; xm0_  _ (5 1 7 3) (4 0 6 2)
+vpshufd       %2, %1, q0001  ; xm1_  _ (5 1 7 3) (5 1 7 3)
+vpaddd        %1, %1, %2     ;   (01234567)
+%endmacro
+
+%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2
+sub %3, 2
+sub %4, 2
+
+mov %5, 2
+mov %6, 2
+
+add %5, %4   
+sub %6, %4
+
+imul%5, 128
+imul%6, 128
+
+add %5, 2
+add %6, 2
+
+add %5, %3
+sub %6, %3
+
+lea %1, [%1 + %5 * 2]
+lea %2, [%2 + %6 * 2]
+%endmacro
+
+%if ARCH_X86_64
+%if HAVE_AVX2_EXTERNAL
+
+INIT_YMM avx2
+
+cglobal vvc_sad, 6, 11, 14, src1, src2, dx, dy, block_w, block_h, off1, off2, row_idx, dx2, dy2
+movsxd   dx2q, dxd
+movsxd   dy2q, dyd
+INIT_OFFSET src1q, src2q, dx2q, dy2q, off1q, off2q
+pxor   m3, m3
+pxor   m8, m8
+
+cmp  block_wd, 16
+jge  vvc_sad_16_128
+
+vvc_sad_8:
+.loop_height:
+movu  xm0, [src1q]
+movu  xm1, [src2q]
+MIN_MAX_SAD   xm2, xm0, xm1
+vpmovzxwd  m1, xm1
+vpaddd m3, m1
+
+movu  xm5, [src1q + MAX_PB_SIZE * ROWS * 2]
+movu  xm6, [src2q + MAX_PB_SIZE * ROWS * 2]
+MIN_MAX_SAD   xm7, xm5, xm6
+vpmovzxwd  

[FFmpeg-devel] [PATCH v3 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-14 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.

Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 63.0
vvc_sad_8x8_avx2: 3.0
vvc_sad_16x16_c: 263.0
vvc_sad_16x16_avx2: 23.0
vvc_sad_32x32_c: 1003.0
vvc_sad_32x32_avx2: 83.0
vvc_sad_64x64_c: 3923.0
vvc_sad_64x64_avx2: 373.0
vvc_sad_128x128_c: 17533.0
vvc_sad_128x128_avx2: 1683.0
---
 tests/checkasm/vvc_mc.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/tests/checkasm/vvc_mc.c b/tests/checkasm/vvc_mc.c
index 97f57cb401..e251400bfc 100644
--- a/tests/checkasm/vvc_mc.c
+++ b/tests/checkasm/vvc_mc.c
@@ -322,8 +322,46 @@ static void check_avg(void)
 report("avg");
 }
 
+static void check_vvc_sad(void)
+{
+const int bit_depth = 10;
+VVCDSPContext c;
+LOCAL_ALIGNED_32(uint16_t, src0, [MAX_CTU_SIZE * MAX_CTU_SIZE * 4]);
+LOCAL_ALIGNED_32(uint16_t, src1, [MAX_CTU_SIZE * MAX_CTU_SIZE * 4]);
+declare_func(int, const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
+
+ff_vvc_dsp_init(&c, bit_depth);
+memset(src0, 0, MAX_CTU_SIZE * MAX_CTU_SIZE * 4);
+memset(src1, 0, MAX_CTU_SIZE * MAX_CTU_SIZE * 4);
+
+randomize_pixels(src0, src1, MAX_CTU_SIZE * MAX_CTU_SIZE * 2);
+ for (int h = 8; h <= MAX_CTU_SIZE; h *= 2) {
+for (int w = 8; w <= MAX_CTU_SIZE; w *= 2) {
+for(int offy = 0; offy <= 4; offy++) {
+for(int offx = 0; offx <= 4; offx++) {
+if(check_func(c.inter.sad, "vvc_sad_%dx%d", w, h)) {
+int result0;
+int result1;
+
+result0 =  call_ref(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
+result1 =  call_new(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
+
+if (result1 != result0)
+fail();
+if(w == h && offx == 0 && offy == 0)
+bench_new(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
+}
+}
+}
+}
+ }
+
+report("check_vvc_sad");
+}
+
 void checkasm_check_vvc_mc(void)
 {
+check_vvc_sad();
 check_put_vvc_luma();
 check_put_vvc_luma_uni();
 check_put_vvc_chroma();
-- 
2.45.0



Re: [FFmpeg-devel] [PATCH v8 05/15] avcodec/vaapi_encode: move the dpb logic from VAAPI to base layer

2024-05-14 Thread Mark Thompson
On 18/04/2024 09:58, tong1.wu-at-intel@ffmpeg.org wrote:
> From: Tong Wu 
> 
> Move receive_packet function to base. This requires adding *alloc,
> *issue, *output, *free as hardware callbacks. HWBaseEncodePicture is
> introduced as the base layer structure. The related parameters in
> VAAPIEncodeContext are also extracted to HWBaseEncodeContext. Then DPB
> management logic can be fully extracted to base layer as-is.
> 
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/Makefile |   2 +-
>  libavcodec/hw_base_encode.c | 600 
>  libavcodec/hw_base_encode.h | 123 +
>  libavcodec/vaapi_encode.c   | 793 +---
>  libavcodec/vaapi_encode.h   | 102 +---
>  libavcodec/vaapi_encode_av1.c   |  51 +-
>  libavcodec/vaapi_encode_h264.c  | 176 +++
>  libavcodec/vaapi_encode_h265.c  | 121 ++---
>  libavcodec/vaapi_encode_mjpeg.c |   7 +-
>  libavcodec/vaapi_encode_mpeg2.c |  47 +-
>  libavcodec/vaapi_encode_vp8.c   |  18 +-
>  libavcodec/vaapi_encode_vp9.c   |  54 +--
>  12 files changed, 1097 insertions(+), 997 deletions(-)
>  create mode 100644 libavcodec/hw_base_encode.c
> 
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index 7f6de4470e..a2174dcb2f 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -162,7 +162,7 @@ OBJS-$(CONFIG_STARTCODE)   += startcode.o
>  OBJS-$(CONFIG_TEXTUREDSP)  += texturedsp.o
>  OBJS-$(CONFIG_TEXTUREDSPENC)   += texturedspenc.o
>  OBJS-$(CONFIG_TPELDSP) += tpeldsp.o
> -OBJS-$(CONFIG_VAAPI_ENCODE)+= vaapi_encode.o
> +OBJS-$(CONFIG_VAAPI_ENCODE)+= vaapi_encode.o hw_base_encode.o
>  OBJS-$(CONFIG_AV1_AMF_ENCODER) += amfenc_av1.o
>  OBJS-$(CONFIG_VC1DSP)  += vc1dsp.o
>  OBJS-$(CONFIG_VIDEODSP)+= videodsp.o
> diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
> new file mode 100644
> index 00..1d9a255f69
> --- /dev/null
> +++ b/libavcodec/hw_base_encode.c
> @@ -0,0 +1,600 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/avassert.h"
> +#include "libavutil/common.h"
> +#include "libavutil/internal.h"
> +#include "libavutil/log.h"
> +#include "libavutil/mem.h"
> +#include "libavutil/pixdesc.h"
> +
> +#include "encode.h"
> +#include "avcodec.h"
> +#include "hw_base_encode.h"
> +
> ...

Everything above here looks like a copy of the VAAPI code with the names 
changed, good if so.  (If not then please highlight any differences.)

> +
> +static int hw_base_encode_send_frame(AVCodecContext *avctx, AVFrame *frame)
> +{
> +HWBaseEncodeContext *ctx = avctx->priv_data;
> +HWBaseEncodePicture *pic;
> +int err;
> +
> +if (frame) {
> +av_log(avctx, AV_LOG_DEBUG, "Input frame: %ux%u (%"PRId64").\n",
> +   frame->width, frame->height, frame->pts);
> +
> +err = hw_base_encode_check_frame(avctx, frame);
> +if (err < 0)
> +return err;
> +
> +pic = ctx->op->alloc(avctx, frame);
> +if (!pic)
> +return AVERROR(ENOMEM);

Can you push the allocation of this picture out into the base layer?  
vaapi_encode_alloc() and d3d12va_encode_alloc() are identical except for the 
types and the input_surface setting.

> +
> +pic->input_image = av_frame_alloc();
> +if (!pic->input_image) {
> +err = AVERROR(ENOMEM);
> +goto fail;
> +}
> +
> +pic->recon_image = av_frame_alloc();
> +if (!pic->recon_image) {
> +err = AVERROR(ENOMEM);
> +goto fail;
> +}
> +
> +if (ctx->input_order == 0 || frame->pict_type == AV_PICTURE_TYPE_I)
> +pic->force_idr = 1;
> +
> +pic->pts = frame->pts;
> +pic->duration = frame->duration;
> +
> +if (avctx->flags & AV_CODEC_FLAG_COPY_OPAQUE) {
> +err = av_buffer_replace(&pic->opaque_ref, frame->opaque_ref);
> +if (err < 0)
> +goto fail;
> +
> +pic->opaque = frame->opaque;
> +}
> +
> +av_frame_move_ref(pic->input_image, frame);
> +
> +if (ctx->input_order == 0)
> +ctx->f

Re: [FFmpeg-devel] [PATCH v8 06/15] avcodec/vaapi_encode: extract the init function to base layer

2024-05-14 Thread Mark Thompson
On 18/04/2024 09:59, tong1.wu-at-intel@ffmpeg.org wrote:
> From: Tong Wu 
> 
> Related parameters are also moved to base layer.
> 
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/hw_base_encode.c | 33 
>  libavcodec/hw_base_encode.h | 11 ++
>  libavcodec/vaapi_encode.c   | 68 ++---
>  libavcodec/vaapi_encode.h   |  6 ---
>  libavcodec/vaapi_encode_av1.c   |  2 +-
>  libavcodec/vaapi_encode_h264.c  |  2 +-
>  libavcodec/vaapi_encode_h265.c  |  2 +-
>  libavcodec/vaapi_encode_mjpeg.c |  6 ++-
>  8 files changed, 72 insertions(+), 58 deletions(-)
> 
> diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
> index 1d9a255f69..14f3ecfc94 100644
> --- a/libavcodec/hw_base_encode.c
> +++ b/libavcodec/hw_base_encode.c
> @@ -598,3 +598,36 @@ end:
>  
>  return 0;
>  }
> +
> +int ff_hw_base_encode_init(AVCodecContext *avctx)
> +{
> +HWBaseEncodeContext *ctx = avctx->priv_data;
> +
> +ctx->frame = av_frame_alloc();
> +if (!ctx->frame)
> +return AVERROR(ENOMEM);
> +
> +if (!avctx->hw_frames_ctx) {
> +av_log(avctx, AV_LOG_ERROR, "A hardware frames reference is "
> +   "required to associate the encoding device.\n");
> +return AVERROR(EINVAL);
> +}
> +
> +ctx->input_frames_ref = av_buffer_ref(avctx->hw_frames_ctx);
> +if (!ctx->input_frames_ref)
> +return AVERROR(ENOMEM);
> +
> +ctx->input_frames = (AVHWFramesContext *)ctx->input_frames_ref->data;
> +
> +ctx->device_ref = av_buffer_ref(ctx->input_frames->device_ref);
> +if (!ctx->device_ref)
> +return AVERROR(ENOMEM);
> +
> +ctx->device = (AVHWDeviceContext *)ctx->device_ref->data;
> +
> +ctx->tail_pkt = av_packet_alloc();
> +if (!ctx->tail_pkt)
> +return AVERROR(ENOMEM);
> +
> +return 0;
> +}
> diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
> index b5b676b9a8..f7e385e840 100644
> --- a/libavcodec/hw_base_encode.h
> +++ b/libavcodec/hw_base_encode.h
> @@ -19,6 +19,7 @@
>  #ifndef AVCODEC_HW_BASE_ENCODE_H
>  #define AVCODEC_HW_BASE_ENCODE_H
>  
> +#include "libavutil/hwcontext.h"
>  #include "libavutil/fifo.h"
>  
>  #define MAX_DPB_SIZE 16
> @@ -117,6 +118,14 @@ typedef struct HWBaseEncodeContext {
>  // Hardware-specific hooks.
>  const struct HWEncodePictureOperation *op;
>  
> +// The hardware device context.
> +AVBufferRef *device_ref;
> +AVHWDeviceContext *device;
> +
> +// The hardware frame context containing the input frames.
> +AVBufferRef *input_frames_ref;
> +AVHWFramesContext *input_frames;
> +
>  // Current encoding window, in display (input) order.
>  HWBaseEncodePicture *pic_start, *pic_end;
>  // The next picture to use as the previous reference picture in
> @@ -183,6 +192,8 @@ typedef struct HWBaseEncodeContext {
>  
>  int ff_hw_base_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt);
>  
> +int ff_hw_base_encode_init(AVCodecContext *avctx);
> +
>  #define HW_BASE_ENCODE_COMMON_OPTIONS \
>  { "async_depth", "Maximum processing parallelism. " \
>"Increase this to improve single channel performance.", \

Maybe this patch should be merged with 9/15 to keep the init/close symmetry?  
It's not clear that the intermediate makes sense, and it has some churn.

> ...

Thanks,

- Mark


[FFmpeg-devel] [PATCHv3 1/2] lavc/flacdsp: R-V V flac_wasted32

2024-05-14 Thread Rémi Denis-Courmont
T-Head C908:
flac_wasted_32_c:   949.0
flac_wasted_32_rvv_i32: 278.7
---
 libavcodec/riscv/flacdsp_init.c |  7 ++-
 libavcodec/riscv/flacdsp_rvv.S  | 15 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c
index 66eb062620..454787470b 100644
--- a/libavcodec/riscv/flacdsp_init.c
+++ b/libavcodec/riscv/flacdsp_init.c
@@ -31,6 +31,7 @@ void ff_flac_lpc32_rvv(int32_t *decoded, const int coeffs[32],
int pred_order, int qlevel, int len);
 void ff_flac_lpc32_rvv_simple(int32_t *decoded, const int coeffs[32],
   int pred_order, int qlevel, int len);
+void ff_flac_wasted32_rvv(int32_t *, int shift, int len);
 void ff_flac_decorrelate_indep2_16_rvv(uint8_t **out, int32_t **in,
int channels, int len, int shift);
 void ff_flac_decorrelate_indep4_16_rvv(uint8_t **out, int32_t **in,
@@ -79,7 +80,11 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
 else
 c->lpc32 = ff_flac_lpc32_rvv;
 }
+# endif
+
+c->wasted32 = ff_flac_wasted32_rvv;
 
+# if (__riscv_xlen >= 64)
 switch (fmt) {
 case AV_SAMPLE_FMT_S16:
 switch (channels) {
@@ -119,8 +124,8 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
 c->decorrelate[2] = ff_flac_decorrelate_rs_32_rvv;
 c->decorrelate[3] = ff_flac_decorrelate_ms_32_rvv;
 break;
-# endif
 }
+# endif
 }
 #endif
 }
diff --git a/libavcodec/riscv/flacdsp_rvv.S b/libavcodec/riscv/flacdsp_rvv.S
index 25803f00f8..d7009cdec2 100644
--- a/libavcodec/riscv/flacdsp_rvv.S
+++ b/libavcodec/riscv/flacdsp_rvv.S
@@ -100,7 +100,22 @@ func ff_flac_lpc32_rvv_simple, zve64x
 
 ret
 endfunc
+#endif
+
+func ff_flac_wasted32_rvv, zve32x
+1:
+vsetvli t0, a2, e32, m8, ta, ma
+vle32.v v8, (a0)
+sub a2, a2, t0
+vsll.vx v8, v8, a1
+vse32.v v8, (a0)
+sh2add  a0, t0, a0
+bnez    a2, 1b
 
+ret
+endfunc
+
+#if (__riscv_xlen == 64)
 func ff_flac_decorrelate_indep2_16_rvv, zve32x
 ld  a0,  (a0)
 ld  a2, 8(a1)
-- 
2.43.0



[FFmpeg-devel] [PATCHv3 2/2] lavc/flacdsp: R-V V flac_wasted33

2024-05-14 Thread Rémi Denis-Courmont
T-Head C908:
flac_wasted_33_c:   786.2
flac_wasted_33_rvv_i64: 486.5
---
 libavcodec/riscv/flacdsp_init.c |  4 
 libavcodec/riscv/flacdsp_rvv.S  | 32 
 2 files changed, 36 insertions(+)

diff --git a/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c
index 454787470b..4f1652dbe7 100644
--- a/libavcodec/riscv/flacdsp_init.c
+++ b/libavcodec/riscv/flacdsp_init.c
@@ -32,6 +32,7 @@ void ff_flac_lpc32_rvv(int32_t *decoded, const int coeffs[32],
 void ff_flac_lpc32_rvv_simple(int32_t *decoded, const int coeffs[32],
   int pred_order, int qlevel, int len);
 void ff_flac_wasted32_rvv(int32_t *, int shift, int len);
+void ff_flac_wasted33_rvv(int64_t *, const int32_t *, int shift, int len);
 void ff_flac_decorrelate_indep2_16_rvv(uint8_t **out, int32_t **in,
int channels, int len, int shift);
 void ff_flac_decorrelate_indep4_16_rvv(uint8_t **out, int32_t **in,
@@ -84,6 +85,9 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
 
 c->wasted32 = ff_flac_wasted32_rvv;
 
+if (flags & AV_CPU_FLAG_RVV_I64)
+c->wasted33 = ff_flac_wasted33_rvv;
+
 # if (__riscv_xlen >= 64)
 switch (fmt) {
 case AV_SAMPLE_FMT_S16:
diff --git a/libavcodec/riscv/flacdsp_rvv.S b/libavcodec/riscv/flacdsp_rvv.S
index d7009cdec2..6287faa260 100644
--- a/libavcodec/riscv/flacdsp_rvv.S
+++ b/libavcodec/riscv/flacdsp_rvv.S
@@ -115,6 +115,38 @@ func ff_flac_wasted32_rvv, zve32x
 ret
 endfunc
 
+func ff_flac_wasted33_rvv, zve64x
+srli t0, a2, 5
+li   t1, 1
+bnez t0, 2f
+sll  a2, t1, a2
+1:
+vsetvli  t0, a3, e32, m4, ta, ma
+vle32.v  v8, (a1)
+sub  a3, a3, t0
+vwmulsu.vx   v16, v8, a2
+sh2add   a1, t0, a1
+vse64.v  v16, (a0)
+sh3add   a0, t0, a0
+bnez a3, 1b
+
+ret
+
+2:  // Pessimistic case: wasted >= 32
+vsetvli  t0, a3, e32, m4, ta, ma
+vle32.v  v8, (a1)
+sub  a3, a3, t0
+vwcvtu.x.x.v v16, v8
+sh2add   a1, t0, a1
+vsetvli  zero, zero, e64, m8, ta, ma
+vsll.vx  v16, v16, a2
+vse64.v  v16, (a0)
+sh3add   a0, t0, a0
+bnez a3, 2b
+
+ret
+endfunc
+
 #if (__riscv_xlen == 64)
 func ff_flac_decorrelate_indep2_16_rvv, zve32x
 ld  a0,  (a0)
-- 
2.43.0
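
As a reading aid for the two loops in ff_flac_wasted33_rvv, the following is
a scalar sketch of what each path computes. It is an illustration only: the
function name is hypothetical, and the argument order (destination, source,
shift, length) is taken from the prototype added in flacdsp_init.c. The fast
path mirrors the widening multiply by 2^wasted; the second path mirrors the
widen-then-shift sequence. Zero-extension is enough there because a shift of
32 or more discards the upper half of the widened value anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar sketch of the two code paths (hypothetical name). */
static void wasted33_sketch(int64_t *dst, const int32_t *src,
                            int wasted, int len)
{
    assert(wasted >= 0 && wasted < 64);

    if (wasted < 32) {
        /* Fast path: signed sample times unsigned 2^wasted (vwmulsu.vx). */
        const uint32_t scale = UINT32_C(1) << wasted;
        for (int i = 0; i < len; i++)
            dst[i] = (int64_t)src[i] * scale;
    } else {
        /* Pessimistic path: widen to 64 bits, then shift (vwcvtu + vsll). */
        for (int i = 0; i < len; i++)
            dst[i] = (int64_t)((uint64_t)(uint32_t)src[i] << wasted);
    }
}
```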



Re: [FFmpeg-devel] [PATCH v8 11/15] avcodec/vaapi_encode: extract a get_recon_format function to base layer

2024-05-14 Thread Mark Thompson
On 18/04/2024 09:59, tong1.wu-at-intel@ffmpeg.org wrote:
> From: Tong Wu 
> 
> Surface size and block size parameters are also moved to base layer.
> 
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/hw_base_encode.c | 58 +++
>  libavcodec/hw_base_encode.h | 12 +
>  libavcodec/vaapi_encode.c   | 81 -
>  libavcodec/vaapi_encode.h   | 10 
>  libavcodec/vaapi_encode_av1.c   | 10 ++--
>  libavcodec/vaapi_encode_h264.c  | 11 +++--
>  libavcodec/vaapi_encode_h265.c  | 25 +-
>  libavcodec/vaapi_encode_mjpeg.c |  5 +-
>  libavcodec/vaapi_encode_vp9.c   |  6 +--
>  9 files changed, 118 insertions(+), 100 deletions(-)
> 
> diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
> index a4223d90f0..af85bb99aa 100644
> --- a/libavcodec/hw_base_encode.c
> +++ b/libavcodec/hw_base_encode.c
> @@ -693,6 +693,64 @@ int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32
>  return 0;
>  }
>  
> +int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, enum AVPixelFormat *fmt)
> +{
> +HWBaseEncodeContext *ctx = avctx->priv_data;
> +AVHWFramesConstraints *constraints = NULL;
> +enum AVPixelFormat recon_format;
> +int err, i;
> +
> +constraints = av_hwdevice_get_hwframe_constraints(ctx->device_ref,
> +  hwconfig);

Does this mechanism actually make sense for D3D12?

VAAPI is currently the only implementation of this function with non-null 
hwconfig, and this is really relying on it to get useful information (otherwise 
the formats are just everything the device can allocate as a surface and the 
sizes are 0/INT_MAX).

If D3D12 has something which would fit into the hwconfig method then this could 
work very nicely as well, but if it doesn't then presumably it does have some 
other calls to check things like the maximum frame size supported by the 
encoder and we should be using those rather than making this code generic?

(Also consider Vulkan if possible; if two thirds of the cases want it then 
maybe we should do this even if it doesn't fit in one of them.)
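
To make the 0/INT_MAX point concrete, here is a small self-contained sketch
of the range check from the quoted code. SizeLimits is a hypothetical
stand-in for the size fields of AVHWFramesConstraints, and the 4096 limit in
the usage note is an invented example value, not a property of any real
device.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the size fields of AVHWFramesConstraints. */
typedef struct SizeLimits {
    int min_width, min_height;
    int max_width, max_height;
} SizeLimits;

/* Defaults described in the review: 0 / INT_MAX when no hwconfig is given. */
static SizeLimits unconstrained_limits(void)
{
    SizeLimits l = { 0, 0, INT_MAX, INT_MAX };
    return l;
}

/* Same comparison as the quoted range check; returns 1 if the size fits. */
static int surface_size_ok(const SizeLimits *l, int width, int height)
{
    assert(l);
    return width  >= l->min_width  && width  <= l->max_width &&
           height >= l->min_height && height <= l->max_height;
}
```

With the default limits every non-negative size passes, so the check only
rejects anything when the backend supplies real limits (for example a
made-up 16..4096 range would reject 8192x4320).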

> +if (!constraints) {
> +err = AVERROR(ENOMEM);
> +goto fail;
> +}
> +
> +// Probably we can use the input surface format as the surface format
> +// of the reconstructed frames.  If not, we just pick the first (only?)
> +// format in the valid list and hope that it all works.
> +recon_format = AV_PIX_FMT_NONE;
> +if (constraints->valid_sw_formats) {
> +for (i = 0; constraints->valid_sw_formats[i] != AV_PIX_FMT_NONE; i++) {
> +if (ctx->input_frames->sw_format ==
> +constraints->valid_sw_formats[i]) {
> +recon_format = ctx->input_frames->sw_format;
> +break;
> +}
> +}
> +if (recon_format == AV_PIX_FMT_NONE) {
> +// No match.  Just use the first in the supported list and
> +// hope for the best.
> +recon_format = constraints->valid_sw_formats[0];
> +}
> +} else {
> +// No idea what to use; copy input format.
> +recon_format = ctx->input_frames->sw_format;
> +}
> +av_log(avctx, AV_LOG_DEBUG, "Using %s as format of "
> +   "reconstructed frames.\n", av_get_pix_fmt_name(recon_format));
> +
> +if (ctx->surface_width  < constraints->min_width  ||
> +ctx->surface_height < constraints->min_height ||
> +ctx->surface_width  > constraints->max_width ||
> +ctx->surface_height > constraints->max_height) {
> +av_log(avctx, AV_LOG_ERROR, "Hardware does not support encoding at "
> +   "size %dx%d (constraints: width %d-%d height %d-%d).\n",
> +   ctx->surface_width, ctx->surface_height,
> +   constraints->min_width,  constraints->max_width,
> +   constraints->min_height, constraints->max_height);
> +err = AVERROR(EINVAL);
> +goto fail;
> +}
> +
> +*fmt = recon_format;
> +err = 0;
> +fail:
> +av_hwframe_constraints_free(&constraints);
> +return err;
> +}
> +
>  int ff_hw_base_encode_init(AVCodecContext *avctx)
>  {
>  HWBaseEncodeContext *ctx = avctx->priv_data;
> diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
> index d717f955d8..7686cf9501 100644
> --- a/libavcodec/hw_base_encode.h
> +++ b/libavcodec/hw_base_encode.h
> @@ -126,6 +126,16 @@ typedef struct HWBaseEncodeContext {
>  // Desired B frame reference depth.
>  int desired_b_depth;
>  
> +// The required size of surfaces.  This is probably the input
> +// size (AVCodecContext.width|height) aligned up to whatever
> +// block size is required by the codec.
> +int surface_width;
> +int surface_height;
> +
> +// The block size for slice calculations.
> +int slic

Re: [FFmpeg-devel] [PATCH v8 12/15] avcodec/vaapi_encode: extract a free funtion to base layer

2024-05-14 Thread Mark Thompson
On 18/04/2024 09:59, tong1.wu-at-intel@ffmpeg.org wrote:
> From: Tong Wu 
> 
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/hw_base_encode.c | 11 +++
>  libavcodec/hw_base_encode.h |  2 ++
>  libavcodec/vaapi_encode.c   |  6 +-
>  3 files changed, 14 insertions(+), 5 deletions(-)

"... free funtion to ..."

While I do approve of fun, maybe this should be a function.

> diff --git a/libavcodec/hw_base_encode.c b/libavcodec/hw_base_encode.c
> index af85bb99aa..812668f3f2 100644
> --- a/libavcodec/hw_base_encode.c
> +++ b/libavcodec/hw_base_encode.c
> @@ -751,6 +751,17 @@ fail:
>  return err;
>  }
>  
> +int ff_hw_base_encode_free(AVCodecContext *avctx, HWBaseEncodePicture *pic)
> +{
> +av_frame_free(&pic->input_image);
> +av_frame_free(&pic->recon_image);
> +
> +av_buffer_unref(&pic->opaque_ref);
> +av_freep(&pic->priv_data);
> +
> +return 0;
> +}
> +
>  int ff_hw_base_encode_init(AVCodecContext *avctx)
>  {
>  HWBaseEncodeContext *ctx = avctx->priv_data;
> diff --git a/libavcodec/hw_base_encode.h b/libavcodec/hw_base_encode.h
> index 7686cf9501..d566980efc 100644
> --- a/libavcodec/hw_base_encode.h
> +++ b/libavcodec/hw_base_encode.h
> @@ -222,6 +222,8 @@ int ff_hw_base_init_gop_structure(AVCodecContext *avctx, uint32_t ref_l0, uint32
>  
>  int ff_hw_base_get_recon_format(AVCodecContext *avctx, const void *hwconfig, enum AVPixelFormat *fmt);
>  
> +int ff_hw_base_encode_free(AVCodecContext *avctx, HWBaseEncodePicture *pic);
> +
>  int ff_hw_base_encode_init(AVCodecContext *avctx);
>  
>  int ff_hw_base_encode_close(AVCodecContext *avctx);
> diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
> index ee4cf42baf..08792c07c5 100644
> --- a/libavcodec/vaapi_encode.c
> +++ b/libavcodec/vaapi_encode.c
> @@ -878,17 +878,13 @@ static int vaapi_encode_free(AVCodecContext *avctx,
>  av_freep(&pic->slices[i].codec_slice_params);
>  }
>  
> -av_frame_free(&base_pic->input_image);
> -av_frame_free(&base_pic->recon_image);
> -
> -av_buffer_unref(&base_pic->opaque_ref);
> +ff_hw_base_encode_free(avctx, base_pic);
>  
>  av_freep(&pic->param_buffers);
>  av_freep(&pic->slices);
>  // Output buffer should already be destroyed.
>  av_assert0(pic->output_buffer == VA_INVALID_ID);
>  
> -av_freep(&base_pic->priv_data);
>  av_freep(&pic->codec_picture_params);
>  av_freep(&pic->roi);
>  

Thanks,

- Mark


[FFmpeg-devel] [PATCH 1/2] avutil/channel_layout: add a helper function to get the ambisonic order of a layout

2024-05-14 Thread James Almer
Signed-off-by: James Almer 
---
 libavutil/channel_layout.c | 17 -
 libavutil/channel_layout.h | 10 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
index fd6718e0e7..e213f68666 100644
--- a/libavutil/channel_layout.c
+++ b/libavutil/channel_layout.c
@@ -473,15 +473,14 @@ static int has_channel_names(const AVChannelLayout *channel_layout)
 return 0;
 }
 
-/**
- * If the layout is n-th order standard-order ambisonic, with optional
- * extra non-diegetic channels at the end, return the order.
- * Return a negative error code otherwise.
- */
-static int ambisonic_order(const AVChannelLayout *channel_layout)
+int av_channel_layout_get_ambisonic_order(const AVChannelLayout *channel_layout)
 {
 int i, highest_ambi, order;
 
+if (channel_layout->order != AV_CHANNEL_ORDER_AMBISONIC &&
+channel_layout->order != AV_CHANNEL_ORDER_CUSTOM)
+return AVERROR(EINVAL);
+
 highest_ambi = -1;
 if (channel_layout->order == AV_CHANNEL_ORDER_AMBISONIC)
 highest_ambi = channel_layout->nb_channels - av_popcount64(channel_layout->u.mask) - 1;
@@ -536,7 +535,7 @@ static enum AVChannelOrder canonical_order(AVChannelLayout *channel_layout)
 if (masked_description(channel_layout, 0) > 0)
 return AV_CHANNEL_ORDER_NATIVE;
 
-order = ambisonic_order(channel_layout);
+order = av_channel_layout_get_ambisonic_order(channel_layout);
 if (order >= 0 && masked_description(channel_layout, (order + 1) * (order + 1)) >= 0)
 return AV_CHANNEL_ORDER_AMBISONIC;
 
@@ -551,7 +550,7 @@ static enum AVChannelOrder canonical_order(AVChannelLayout *channel_layout)
 static int try_describe_ambisonic(AVBPrint *bp, const AVChannelLayout *channel_layout)
 {
 int nb_ambi_channels;
-int order = ambisonic_order(channel_layout);
+int order = av_channel_layout_get_ambisonic_order(channel_layout);
 if (order < 0)
 return order;
 
@@ -945,7 +944,7 @@ int av_channel_layout_retype(AVChannelLayout *channel_layout, enum AVChannelOrde
 if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM) {
 int64_t mask;
 int nb_channels = channel_layout->nb_channels;
-int order = ambisonic_order(channel_layout);
+int order = av_channel_layout_get_ambisonic_order(channel_layout);
 if (order < 0)
 return AVERROR(ENOSYS);
 mask = masked_description(channel_layout, (order + 1) * (order + 1));
diff --git a/libavutil/channel_layout.h b/libavutil/channel_layout.h
index 8a078d1601..c2ab236488 100644
--- a/libavutil/channel_layout.h
+++ b/libavutil/channel_layout.h
@@ -679,6 +679,16 @@ int av_channel_layout_check(const AVChannelLayout *channel_layout);
  */
 int av_channel_layout_compare(const AVChannelLayout *chl, const AVChannelLayout *chl1);
 
+/**
+ * Return the order if the layout is n-th order standard-order ambisonic.
+ * The presence of optional extra non-diegetic channels at the end is not taken
+ * into account.
+ *
+ * @param channel_layout input channel layout
+ * @return the order of the layout, a negative error code otherwise.
+ */
+int av_channel_layout_get_ambisonic_order(const AVChannelLayout *channel_layout);
+
 /**
  * The conversion must be lossless.
  */
-- 
2.45.0



[FFmpeg-devel] [PATCH 2/2] avformat/movenc: add support for writing SA3D boxes

2024-05-14 Thread James Almer
Signed-off-by: James Almer 
---
 libavformat/movenc.c | 61 
 1 file changed, 61 insertions(+)

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index f907f67752..2aec9a8d17 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -916,6 +916,62 @@ static int mov_write_dmlp_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack *tra
 return update_size(pb, pos);
 }
 
+static int mov_write_SA3D_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack *track)
+{
+AVChannelLayout ch_layout = { 0 };
+const AVDictionaryEntry *str = av_dict_get(track->st->metadata, "SA3D", NULL, 0);
+int64_t pos;
+int ambisonic_order, ambi_channels, non_diegetic_channels;
+int i, ret;
+
+if (!str)
+return 0;
+
+ret = av_channel_layout_from_string(&ch_layout, str->value);
+if (ret < 0) {
+if (ret == AVERROR(EINVAL)) {
+invalid:
+av_log(s, AV_LOG_ERROR, "Invalid SA3D layout: \"%s\"\n", str->value);
+ret = 0;
+}
+av_channel_layout_uninit(&ch_layout);
+return ret;
+   }
+
+if (track->st->codecpar->ch_layout.nb_channels != ch_layout.nb_channels)
+goto invalid;
+
+ambisonic_order = av_channel_layout_get_ambisonic_order(&ch_layout);
+if (ambisonic_order < 0)
+goto invalid;
+
+ambi_channels = (ambisonic_order + 1LL) * (ambisonic_order + 1LL);
+non_diegetic_channels = ch_layout.nb_channels - ambi_channels;
+if (non_diegetic_channels && non_diegetic_channels != 2)
+goto invalid;
+
+av_log(s, AV_LOG_VERBOSE, "Inserting SA3D box with layout: \"%s\"\n", str->value);
+
+pos = avio_tell(pb);
+
+avio_wb32(pb, 0); // Size
+ffio_wfourcc(pb, "SA3D");
+avio_w8(pb, 0); // version
+avio_w8(pb, (!!non_diegetic_channels) << 7); // head_locked_stereo and ambisonic_type
+avio_wb32(pb, ambisonic_order); // ambisonic_order
+avio_w8(pb, 0); // ambisonic_channel_ordering
+avio_w8(pb, 0); // ambisonic_normalization
+avio_wb32(pb, ch_layout.nb_channels); // num_channels
+for (i = 0; i < ambi_channels; i++)
+avio_wb32(pb, av_channel_layout_channel_from_index(&ch_layout, i) - AV_CHAN_AMBISONIC_BASE);
+for (; i < ch_layout.nb_channels; i++)
+avio_wb32(pb, i);
+
+av_channel_layout_uninit(&ch_layout);
+
+return update_size(pb, pos);
+}
+
 static int mov_write_chan_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack *track)
 {
 uint32_t layout_tag, bitmap, *channel_desc;
@@ -1419,6 +1475,11 @@ static int mov_write_audio_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContex
 if (ret < 0)
 return ret;
 
+if (track->mode == MODE_MP4 && track->par->codec_type == AVMEDIA_TYPE_AUDIO
+&& ((ret = mov_write_SA3D_tag(s, pb, track)) < 0)) {
+return ret;
+}
+
 if (track->mode == MODE_MOV && track->par->codec_type == AVMEDIA_TYPE_AUDIO
 && ((ret = mov_write_chan_tag(s, pb, track)) < 0)) {
 return ret;
-- 
2.45.0
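
The channel-count arithmetic in mov_write_SA3D_tag can be summarised with a
short standalone sketch. The helper names are hypothetical; the rules are
read off the patch above: a full ambisonic layout of order n has (n + 1)^2
channels, the only accepted extras are zero or two non-diegetic channels,
and the box size is the sum of the fields written with the avio_* calls.

```c
#include <assert.h>
#include <stdint.h>

/* Number of ambisonic channels for a full layout of the given order. */
static int ambi_channel_count(int order)
{
    assert(order >= 0);
    return (order + 1) * (order + 1);
}

/* Returns 1 if nb_channels passes the check used by the patch:
 * either no extra channels or exactly two (head-locked stereo). */
static int sa3d_layout_ok(int order, int nb_channels)
{
    int extra = nb_channels - ambi_channel_count(order);
    return extra == 0 || extra == 2;
}

/* Box size in bytes: size, fourcc, version, type flags, order,
 * channel ordering, normalization, num_channels, then the channel map. */
static uint32_t sa3d_box_size(int nb_channels)
{
    return 4 + 4 + 1 + 1 + 4 + 1 + 1 + 4 + 4u * (uint32_t)nb_channels;
}
```

For first-order ambisonics this accepts 4 or 6 channels and rejects 5.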



Re: [FFmpeg-devel] [PATCH] avformat/pcmdec: add pts and dts calculation for pcmdec

2024-05-14 Thread Hiccup Zhu
The purpose of this patch is to calculate pts and dts when using pcmdemux.
Is there anything wrong with doing this, or do you have any suggestions for
improvement?
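
Editorial sketch, not part of the thread: the calculation in the quoted
patch amounts to a running sample counter. Each packet is stamped with
the number of samples read so far, and the counter then advances by
pkt->size / block_align. The struct and function names below are
hypothetical, the block_align guard is an addition for the example, and
the result is only a valid timestamp if the stream time base is
1/sample_rate, which is assumed here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the nb_samples field added by the patch */
typedef struct PcmClock {
    int64_t nb_samples;
} PcmClock;

/*
 * Returns the pts/dts for a packet of pkt_size bytes and advances the
 * counter by the number of whole sample frames in that packet.
 */
static int64_t pcm_clock_next_pts(PcmClock *c, int pkt_size, int block_align)
{
    int64_t pts = c->nb_samples;

    /* Guard added for the sketch: avoid dividing by zero */
    if (block_align > 0)
        c->nb_samples += pkt_size / block_align;

    return pts;
}
```

With 4096-byte packets and a block_align of 4, successive packets get
pts 0, 1024, 2048 and so on.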

Andreas Rheinhardt wrote on Tue, 14 May 2024 at 19:41:

> Shiqi Zhu:
> > Signed-off-by: Shiqi Zhu 
> > ---
> >  libavformat/pcmdec.c | 37 +++--
> >  1 file changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavformat/pcmdec.c b/libavformat/pcmdec.c
> > index 2f6508b75a..d879aefaad 100644
> > --- a/libavformat/pcmdec.c
> > +++ b/libavformat/pcmdec.c
> > @@ -36,6 +36,7 @@ typedef struct PCMAudioDemuxerContext {
> >  AVClass *class;
> >  int sample_rate;
> >  AVChannelLayout ch_layout;
> > +int64_t nb_samples;
> >  } PCMAudioDemuxerContext;
> >
> >  static int pcm_read_header(AVFormatContext *s)
> > @@ -46,6 +47,7 @@ static int pcm_read_header(AVFormatContext *s)
> >  uint8_t *mime_type = NULL;
> >  int ret;
> >
> > +s1->nb_samples = 0;
> >  st = avformat_new_stream(s, NULL);
> >  if (!st)
> >  return AVERROR(ENOMEM);
> > @@ -104,6 +106,37 @@ static int pcm_read_header(AVFormatContext *s)
> >  return 0;
> >  }
> >
> > +static int pcm_dec_read_packet(AVFormatContext *s, AVPacket *pkt)
> > +{
> > +PCMAudioDemuxerContext *s1 = s->priv_data;
> > +AVCodecParameters *par = s->streams[0]->codecpar;
> > +int ret;
> > +
> > +ret = ff_pcm_read_packet(s, pkt);
> > +if (ret < 0)
> > +return ret;
> > +
> > +pkt->time_base = s->streams[0]->time_base;
> > +pkt->dts = pkt->pts = s1->nb_samples;
> > +s1->nb_samples += pkt->size / par->block_align;
> > +
> > +return ret;
> > +}
> > +
> > +static int pcm_dec_read_seek(AVFormatContext *s,
> > + int stream_index, int64_t timestamp, int flags)
> > +{
> > +PCMAudioDemuxerContext *s1 = s->priv_data;
> > +int ret;
> > +
> > +ret = ff_pcm_read_seek(s, stream_index, timestamp, flags);
> > +if (ret < 0)
> > +return ret;
> > +
> > +s1->nb_samples = ffstream(s->streams[0])->cur_dts;
> > +return ret;
> > +}
> > +
> >  static const AVOption pcm_options[] = {
> >  { "sample_rate", "", offsetof(PCMAudioDemuxerContext, sample_rate), AV_OPT_TYPE_INT, {.i64 = 44100}, 0, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },
> >  { "ch_layout",   "", offsetof(PCMAudioDemuxerContext, ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = "mono"}, 0, 0, AV_OPT_FLAG_DECODING_PARAM },
> > @@ -126,8 +159,8 @@ const FFInputFormat ff_pcm_ ## name_ ## _demuxer = {\
> >  .p.priv_class   = &pcm_demuxer_class,   \
> >  .priv_data_size = sizeof(PCMAudioDemuxerContext),   \
> >  .read_header= pcm_read_header,  \
> > -.read_packet= ff_pcm_read_packet,   \
> > -.read_seek  = ff_pcm_read_seek, \
> > +.read_packet= pcm_dec_read_packet,  \
> > +.read_seek  = pcm_dec_read_seek,\
> >  .raw_codec_id   = codec,\
> >  __VA_ARGS__ \
> >  };
>
> A quick test shows that PTS and DTS are already set generically for pcm
> formats (unless the AVFMT_FLAG_NOFILLIN flag is set). If it is not in
> your usecase, then you should provide details about this (preferably by
> opening a ticket on trac).
>
> - Andreas
>




Re: [FFmpeg-devel] [PATCH] avformat/pcmdec: add pts and dts calculation for pcmdec

2024-05-14 Thread Andreas Rheinhardt
Hiccup Zhu:
> The purpose of this patch is to calculate pts and dts when using pcmdemux.
> Is there anything wrong with doing this, or do you have any suggestions for
> improvement?
> 

1. Don't top-post on this list.
2. PTS and DTS are already produced with this demuxer. As has been said:
If it isn't for you, open a ticket about it.

> Andreas Rheinhardt wrote on Tue, 14 May 2024 at 19:41:
> 
>> Shiqi Zhu:
>>> Signed-off-by: Shiqi Zhu 
>>> ---
>>>  libavformat/pcmdec.c | 37 +++--
>>>  1 file changed, 35 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/libavformat/pcmdec.c b/libavformat/pcmdec.c
>>> index 2f6508b75a..d879aefaad 100644
>>> --- a/libavformat/pcmdec.c
>>> +++ b/libavformat/pcmdec.c
>>> @@ -36,6 +36,7 @@ typedef struct PCMAudioDemuxerContext {
>>>  AVClass *class;
>>>  int sample_rate;
>>>  AVChannelLayout ch_layout;
>>> +int64_t nb_samples;
>>>  } PCMAudioDemuxerContext;
>>>
>>>  static int pcm_read_header(AVFormatContext *s)
>>> @@ -46,6 +47,7 @@ static int pcm_read_header(AVFormatContext *s)
>>>  uint8_t *mime_type = NULL;
>>>  int ret;
>>>
>>> +s1->nb_samples = 0;
>>>  st = avformat_new_stream(s, NULL);
>>>  if (!st)
>>>  return AVERROR(ENOMEM);
>>> @@ -104,6 +106,37 @@ static int pcm_read_header(AVFormatContext *s)
>>>  return 0;
>>>  }
>>>
>>> +static int pcm_dec_read_packet(AVFormatContext *s, AVPacket *pkt)
>>> +{
>>> +PCMAudioDemuxerContext *s1 = s->priv_data;
>>> +AVCodecParameters *par = s->streams[0]->codecpar;
>>> +int ret;
>>> +
>>> +ret = ff_pcm_read_packet(s, pkt);
>>> +if (ret < 0)
>>> +return ret;
>>> +
>>> +pkt->time_base = s->streams[0]->time_base;
>>> +pkt->dts = pkt->pts = s1->nb_samples;
>>> +s1->nb_samples += pkt->size / par->block_align;
>>> +
>>> +return ret;
>>> +}
>>> +
>>> +static int pcm_dec_read_seek(AVFormatContext *s,
>>> + int stream_index, int64_t timestamp, int flags)
>>> +{
>>> +PCMAudioDemuxerContext *s1 = s->priv_data;
>>> +int ret;
>>> +
>>> +ret = ff_pcm_read_seek(s, stream_index, timestamp, flags);
>>> +if (ret < 0)
>>> +return ret;
>>> +
>>> +s1->nb_samples = ffstream(s->streams[0])->cur_dts;
>>> +return ret;
>>> +}
>>> +
>>>  static const AVOption pcm_options[] = {
>>>  { "sample_rate", "", offsetof(PCMAudioDemuxerContext, sample_rate), AV_OPT_TYPE_INT, {.i64 = 44100}, 0, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },
>>>  { "ch_layout",   "", offsetof(PCMAudioDemuxerContext, ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = "mono"}, 0, 0, AV_OPT_FLAG_DECODING_PARAM },
>>> @@ -126,8 +159,8 @@ const FFInputFormat ff_pcm_ ## name_ ## _demuxer = {\
>>>  .p.priv_class   = &pcm_demuxer_class,   \
>>>  .priv_data_size = sizeof(PCMAudioDemuxerContext),   \
>>>  .read_header= pcm_read_header,  \
>>> -.read_packet= ff_pcm_read_packet,   \
>>> -.read_seek  = ff_pcm_read_seek, \
>>> +.read_packet= pcm_dec_read_packet,  \
>>> +.read_seek  = pcm_dec_read_seek,\
>>>  .raw_codec_id   = codec,\
>>>  __VA_ARGS__ \
>>>  };
>>
>> A quick test shows that PTS and DTS are already set generically for pcm
>> formats (unless the AVFMT_FLAG_NOFILLIN flag is set). If it is not in
>> your usecase, then you should provide details about this (preferably by
>> opening a ticket on trac).
>>
>> - Andreas
>>
> 
> 



Re: [FFmpeg-devel] [PATCH] avformat/mov: avoid seeking back to 0 on HEVC open GOP files

2024-05-14 Thread Philip Langdale via ffmpeg-devel
On Wed, 15 May 2024 01:36:43 +0530
llyyr.pub...@gmail.com wrote:

> From: llyyr 
> 
> ab77b878f1 attempted to fix the issue of broken packets being sent to
> the decoder by implementing logic that kept attempting to PTS-step
> backwards until it reached a valid point, however applying this
> heuristic meant that in files that had no valid points (such as HEVC
> videos shot on iPhones), we'd seek back to sample 0 on every seek
> attempt. This meant that files that were previously seekable, albeit
> with some skipped frames, were not seekable at all now.
> 
> Relax this heuristic a bit by giving up on seeking to a valid point if
> we've tried a different sample and we still don't have a valid point
> to seek to. This may cause some frames to be skipped on seeking but it's
> better than not being able to seek at all in such files.
> 
> Fixes: ab77b878f1 ("avformat/mov: fix seeking with HEVC open GOP files")
> Fixes: #10585
> ---
>  libavformat/mov.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/libavformat/mov.c b/libavformat/mov.c
> index b3fa748f27e8..6174a04c3169 100644
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -10133,7 +10133,7 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
>  {
>  MOVStreamContext *sc = st->priv_data;
>  FFStream *const sti = ffstream(st);
> -int sample, time_sample, ret;
> +int sample, time_sample, ret, next_ts, requested_sample;
>  unsigned int i;
>  
>  // Here we consider timestamp to be PTS, hence try to offset it so that we
> @@ -10154,7 +10154,17 @@ static int mov_seek_stream(AVFormatContext *s, AVStream *st, int64_t timestamp,
>  if (!sample || can_seek_to_key_sample(st, sample, timestamp))
>  break;
> -timestamp -= FFMAX(sc->min_sample_duration, 1);
> +
> +next_ts = timestamp - FFMAX(sc->min_sample_duration, 1);
> +requested_sample = av_index_search_timestamp(st, next_ts, flags);
> +
> +// If we've reached a different sample trying to find a good pts to
> +// seek to, give up searching because we'll end up seeking back to
> +// sample 0 on every seek.
> +if (!can_seek_to_key_sample(st, requested_sample, next_ts) && sample != requested_sample)
> +break;
> +
> +timestamp = next_ts;
>  }
>  
>  mov_current_sample_set(sc, sample);

LGTM.

I know it's been a _long time_ since you first sent this; I'll push next
week if there aren't any other comments.

--phil
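
Editorial sketch, not part of the thread: the difference between the old
and the relaxed heuristic in the quoted patch can be seen on a toy index.
Everything here is a simplified assumption: ts[] is a made-up sample
index, search() returns the last entry at or before the timestamp
(without the keyframe handling of av_index_search_timestamp()), valid[]
stands in for can_seek_to_key_sample(), and the check for a negative
search result in seek_new() is an addition for the example.

```c
#define NB_SAMPLES 6

/* Toy index: one sample every 10 ticks */
static const int ts[NB_SAMPLES] = { 0, 10, 20, 30, 40, 50 };
/* Toy stand-in for can_seek_to_key_sample(); all zero = no valid point */
static int valid[NB_SAMPLES];

/* Last sample whose timestamp is <= t, or -1 if there is none */
static int search(int t)
{
    int best = -1;
    for (int i = 0; i < NB_SAMPLES; i++)
        if (ts[i] <= t)
            best = i;
    return best;
}

/* Old behaviour: keep stepping back until a valid sample or sample 0 */
static int seek_old(int t, int step)
{
    for (;;) {
        int sample = search(t);
        if (sample < 0)
            return -1;
        if (!sample || valid[sample])
            return sample;
        t -= step;
    }
}

/* Relaxed behaviour: give up once stepping back reaches a different
 * sample that is not a valid seek point either */
static int seek_new(int t, int step)
{
    for (;;) {
        int sample = search(t);
        int next_ts, requested;
        if (sample < 0)
            return -1;
        if (!sample || valid[sample])
            return sample;
        next_ts   = t - step;
        requested = search(next_ts);
        if (requested < 0 || (!valid[requested] && requested != sample))
            return sample;
        t = next_ts;
    }
}
```

With no valid points at all, seek_old(45, 10) walks back to sample 0,
while seek_new(45, 10) stays on sample 4. If sample 3 is marked valid,
both land on sample 3.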


[FFmpeg-devel] [PATCH] checkasm/h264dsp: use int64_t scale values

2024-05-14 Thread James Almer
Fixes "signed integer overflow: [varies] * 104858 cannot be represented in
type 'int'" errors under ubsan.

Signed-off-by: James Almer 
---
 tests/checkasm/h264dsp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
index cb180cc44f..0cc1f32740 100644
--- a/tests/checkasm/h264dsp.c
+++ b/tests/checkasm/h264dsp.c
@@ -83,7 +83,7 @@ static void dct4x4_##size(dctcoef *coef) \
 }\
 for (y = 0; y < 4; y++) {\
 for (x = 0; x < 4; x++) {\
-static const int scale[] = { 13107 * 10, 8066 * 13, 5243 * 16 }; \
+const int64_t scale[] = { 13107 * 10, 8066 * 13, 5243 * 16 };\
 const int idx = (y & 1) + (x & 1);   \
 coef[y*4 + x] = (coef[y*4 + x] * scale[idx] + (1 << 14)) >> 15;  \
 }\
-- 
2.45.0
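
Editorial sketch, not part of the patch: the overflow comes from
multiplying a coefficient by a scale factor in plain int. With 64-bit
scale values the coefficient is promoted before the multiplication, so
the product stays in range. The function name scale_coef() and the
int32_t parameter are assumptions for this example; in the test the
coefficient type is dctcoef and the expression lives inside a macro.

```c
#include <stdint.h>

/* Same three factors as in the patch; 8066 * 13 is the 104858 from the
 * ubsan message */
static const int64_t scale[3] = { 13107 * 10, 8066 * 13, 5243 * 16 };

/*
 * Hypothetical helper: scales one coefficient at position (x, y) of a
 * 4x4 block with rounding, using 64-bit intermediate arithmetic.
 */
static int64_t scale_coef(int32_t coef, int x, int y)
{
    const int idx = (y & 1) + (x & 1);

    /* coef is converted to int64_t here because scale[] is int64_t */
    return (coef * scale[idx] + (1 << 14)) >> 15;
}
```

As an example, a coefficient of 30000 at (0, 0) gives the product
30000 * 131070 = 3932100000, which does not fit in a 32-bit int but is
fine as int64_t; the scaled result is 119998.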



[FFmpeg-devel] [PATCH 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread uk7b
From: sunyuechi 

C908:
vp9_tm_4x4_8bpp_c: 116.5
vp9_tm_4x4_8bpp_rvv_i32: 43.5
vp9_tm_8x8_8bpp_c: 416.2
vp9_tm_8x8_8bpp_rvv_i32: 86.0
vp9_tm_16x16_8bpp_c: 1665.5
vp9_tm_16x16_8bpp_rvv_i32: 187.2
vp9_tm_32x32_8bpp_c: 6974.2
vp9_tm_32x32_8bpp_rvv_i32: 625.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 118 +++
 libavcodec/riscv/vp9dsp.h|   8 +++
 libavcodec/riscv/vp9dsp_init.c   |   4 ++
 3 files changed, 130 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index ca156d65cd..280c497687 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -173,3 +173,121 @@ func ff_h_8x8_rvv, zve32x
 
 ret
 endfunc
+
+.macro tm_sum4 dst1, dst2, dst3, dst4, top, n1
+lbu  t1, \n1(a2)
+lbu  t2, (\n1-1)(a2)
+lbu  t3, (\n1-2)(a2)
+lbu  t4, (\n1-3)(a2)
+sub  t1, t1, a4
+sub  t2, t2, a4
+sub  t3, t3, a4
+sub  t4, t4, a4
+vadd.vx  \dst1, \top, t1
+vadd.vx  \dst2, \top, t2
+vadd.vx  \dst3, \top, t3
+vadd.vx  \dst4, \top, t4
+.endm
+
+func ff_tm_32x32_rvv, zve32x
+lbu  a4, -1(a3)
+li   t5, 32
+
+.irp offset 31, 23, 15, 7
+vsetvli  zero, t5, e16, m4, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2 v28, v8
+
+tm_sum4  v0, v4, v8, v12, v28, \offset
+tm_sum4  v16, v20, v24, v28, v28, (\offset-4)
+
+.irp n 0, 4, 8, 12, 16, 20, 24, 28
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, m2, ta, ma
+.irp n 0, 4, 8, 12, 16, 20, 24, 28
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+.endr
+
+ret
+endfunc
+
+func ff_tm_16x16_rvv, zve32x
+vsetivli  zero, 16, e16, m2, ta, ma
+vle8.v    v8, (a3)
+vzext.vf2 v30, v8
+lbu   a4, -1(a3)
+
+tm_sum4   v0, v2, v4, v6, v30, 15
+tm_sum4   v8, v10, v12, v14, v30, 11
+tm_sum4   v16, v18, v20, v22, v30, 7
+tm_sum4   v24, v26, v28, v30, v30, 3
+
+.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, m1, ta, ma
+.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v30, v30, 0
+vse8.v   v30, (a0)
+
+ret
+endfunc
+
+func ff_tm_8x8_rvv, zve32x
+vsetivli zero, 8, e16, m1, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2 v28, v8
+lbu  a4, -1(a3)
+
+tm_sum4  v16, v17, v18, v19, v28, 7
+tm_sum4  v20, v21, v22, v23, v28, 3
+
+.irp n 16, 17, 18, 19, 20, 21, 22, 23
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, mf2, ta, ma
+.irp n 16, 17, 18, 19, 20, 21, 22
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v24, v23, 0
+vse8.v   v24, (a0)
+
+ret
+endfunc
+
+func ff_tm_4x4_rvv, zve32x
+vsetivli zero, 4, e16, mf2, ta, ma
+vle8.v   v8, (a3)
+vzext.vf2 v28, v8
+lbu  a4, -1(a3)
+
+tm_sum4  v16, v17, v18, v19, v28, 3
+
+.irp n 16, 17, 18, 19
+vmax.vx  v\n, v\n, zero
+.endr
+
+vsetvli  zero, zero, e8, mf4, ta, ma
+.irp n 16, 17, 18
+vnclipu.wi   v\n, v\n, 0
+vse8.v   v\n, (a0)
+add  a0, a0, a1
+.endr
+vnclipu.wi   v24, v19, 0
+vse8.v   v24, (a0)
+
+ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 0ad961c7e0..79330b4968 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -72,6 +72,14 @@ void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
 const uint8_t *a);
 void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
   const uint8_t *a);
+void ff_tm_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+ const uint8_t *a);
+void ff_tm_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+ const uint8_t *a);
+void ff_tm_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+   const uint8_t *a);
+void ff_tm_4x4_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+   const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx) \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t 

Re: [FFmpeg-devel] [PATCH 4/9] lavc/vp9dsp: R-V V ipred tm

2024-05-14 Thread flow gg
Updated to clean up the code.
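
Editorial sketch, not part of the thread: for readers following the
assembly quoted below, this is a scalar reading of what it computes.
Each output pixel is top[x] + left - topleft, clamped to 8 bits, where
the left column is read from its last entry upwards (the assembly starts
at offset size - 1 and counts down). The names tm_pred_ref() and
clip_u8() are made up for this example and are not the FFmpeg C
reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp to the 8-bit pixel range; the assembly does this with
 * vmax.vx (lower bound) followed by vnclipu.wi (saturating narrow) */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical scalar TM predictor for a size x size block.
 * top[-1] must be readable: it holds the top-left pixel.
 */
static void tm_pred_ref(uint8_t *dst, ptrdiff_t stride,
                        const uint8_t *left, const uint8_t *top, int size)
{
    const int tl = top[-1];

    for (int y = 0; y < size; y++) {
        /* Row y uses left[size - 1 - y], as in the tm_sum4 macro */
        const int l_m_tl = left[size - 1 - y] - tl;

        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(top[x] + l_m_tl);
        dst += stride;
    }
}
```

A caller would pass a pointer into the row above the block so that
top[-1] is the top-left neighbour, for example tm_pred_ref(dst, stride,
left, above + 1, 4) for a 4x4 block.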

 wrote on Wed, 15 May 2024 at 11:56:

> From: sunyuechi 
>
> C908:
> vp9_tm_4x4_8bpp_c: 116.5
> vp9_tm_4x4_8bpp_rvv_i32: 43.5
> vp9_tm_8x8_8bpp_c: 416.2
> vp9_tm_8x8_8bpp_rvv_i32: 86.0
> vp9_tm_16x16_8bpp_c: 1665.5
> vp9_tm_16x16_8bpp_rvv_i32: 187.2
> vp9_tm_32x32_8bpp_c: 6974.2
> vp9_tm_32x32_8bpp_rvv_i32: 625.7
> ---
>  libavcodec/riscv/vp9_intra_rvv.S | 118 +++
>  libavcodec/riscv/vp9dsp.h|   8 +++
>  libavcodec/riscv/vp9dsp_init.c   |   4 ++
>  3 files changed, 130 insertions(+)
>
> diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
> index ca156d65cd..280c497687 100644
> --- a/libavcodec/riscv/vp9_intra_rvv.S
> +++ b/libavcodec/riscv/vp9_intra_rvv.S
> @@ -173,3 +173,121 @@ func ff_h_8x8_rvv, zve32x
>
>  ret
>  endfunc
> +
> +.macro tm_sum4 dst1, dst2, dst3, dst4, top, n1
> +lbu  t1, \n1(a2)
> +lbu  t2, (\n1-1)(a2)
> +lbu  t3, (\n1-2)(a2)
> +lbu  t4, (\n1-3)(a2)
> +sub  t1, t1, a4
> +sub  t2, t2, a4
> +sub  t3, t3, a4
> +sub  t4, t4, a4
> +vadd.vx  \dst1, \top, t1
> +vadd.vx  \dst2, \top, t2
> +vadd.vx  \dst3, \top, t3
> +vadd.vx  \dst4, \top, t4
> +.endm
> +
> +func ff_tm_32x32_rvv, zve32x
> +lbu  a4, -1(a3)
> +li   t5, 32
> +
> +.irp offset 31, 23, 15, 7
> +vsetvli  zero, t5, e16, m4, ta, ma
> +vle8.v   v8, (a3)
> +vzext.vf2 v28, v8
> +
> +tm_sum4  v0, v4, v8, v12, v28, \offset
> +tm_sum4  v16, v20, v24, v28, v28, (\offset-4)
> +
> +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, m2, ta, ma
> +.irp n 0, 4, 8, 12, 16, 20, 24, 28
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +.endr
> +
> +ret
> +endfunc
> +
> +func ff_tm_16x16_rvv, zve32x
> +vsetivli  zero, 16, e16, m2, ta, ma
> +vle8.v    v8, (a3)
> +vzext.vf2 v30, v8
> +lbu   a4, -1(a3)
> +
> +tm_sum4   v0, v2, v4, v6, v30, 15
> +tm_sum4   v8, v10, v12, v14, v30, 11
> +tm_sum4   v16, v18, v20, v22, v30, 7
> +tm_sum4   v24, v26, v28, v30, v30, 3
> +
> +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, m1, ta, ma
> +.irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +vnclipu.wi   v30, v30, 0
> +vse8.v   v30, (a0)
> +
> +ret
> +endfunc
> +
> +func ff_tm_8x8_rvv, zve32x
> +vsetivli zero, 8, e16, m1, ta, ma
> +vle8.v   v8, (a3)
> +vzext.vf2 v28, v8
> +lbu  a4, -1(a3)
> +
> +tm_sum4  v16, v17, v18, v19, v28, 7
> +tm_sum4  v20, v21, v22, v23, v28, 3
> +
> +.irp n 16, 17, 18, 19, 20, 21, 22, 23
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, mf2, ta, ma
> +.irp n 16, 17, 18, 19, 20, 21, 22
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +vnclipu.wi   v24, v23, 0
> +vse8.v   v24, (a0)
> +
> +ret
> +endfunc
> +
> +func ff_tm_4x4_rvv, zve32x
> +vsetivli zero, 4, e16, mf2, ta, ma
> +vle8.v   v8, (a3)
> +vzext.vf2v28, v8
> +lbu  a4, -1(a3)
> +
> +tm_sum4  v16, v17, v18, v19, v28, 3
> +
> +.irp n 16, 17, 18, 19
> +vmax.vx  v\n, v\n, zero
> +.endr
> +
> +vsetvli  zero, zero, e8, mf4, ta, ma
> +.irp n 16, 17, 18
> +vnclipu.wi   v\n, v\n, 0
> +vse8.v   v\n, (a0)
> +add  a0, a0, a1
> +.endr
> +vnclipu.wi   v24, v19, 0
> +vse8.v   v24, (a0)
> +
> +ret
> +endfunc
> diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
> index 0ad961c7e0..79330b4968 100644
> --- a/libavcodec/riscv/vp9dsp.h
> +++ b/libavcodec/riscv/vp9dsp.h
> @@ -72,6 +72,14 @@ void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
>  const uint8_t *a);
>  void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
>const uint8_t *a);
> +void ff_tm_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
> + const uint8_t *a);
> +void ff_tm_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
> + const uint8_t *a);
> +void