date:20220818

[FFmpeg-cvslog] avcodec/mpegvideo_dec: Don't sync AVCodecContext fields manually

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Mon Aug 15 17:58:23 2022 +0200| [afd9da24d9da6b4a194c03779e8863b8a66ed745] | 
committer: Andreas Rheinhardt

avcodec/mpegvideo_dec: Don't sync AVCodecContext fields manually

They are already synced generically in update_context_from_thread()
in pthread_frame.c.

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=afd9da24d9da6b4a194c03779e8863b8a66ed745
---

 libavcodec/mpegvideo_dec.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/libavcodec/mpegvideo_dec.c b/libavcodec/mpegvideo_dec.c
index 08385764b4..406c3feacf 100644
--- a/libavcodec/mpegvideo_dec.c
+++ b/libavcodec/mpegvideo_dec.c
@@ -90,11 +90,6 @@ int ff_mpeg_update_thread_context(AVCodecContext *dst,
 return ret;
 }
 
-s->avctx->coded_height  = s1->avctx->coded_height;
-s->avctx->coded_width   = s1->avctx->coded_width;
-s->avctx->width = s1->avctx->width;
-s->avctx->height= s1->avctx->height;
-
 s->quarter_sample   = s1->quarter_sample;
 
 s->coded_picture_number = s1->coded_picture_number;

___
ffmpeg-cvslog mailing list
ffmpeg-cvslog@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-cvslog

To unsubscribe, visit link above, or email
ffmpeg-cvslog-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-cvslog] avcodec/mpegvideo_dec: Remove commented-out cruft

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Mon Aug 15 12:04:31 2022 +0200| [22e157c1c6d540040dc356f5e218f021060ccf46] | 
committer: Andreas Rheinhardt

avcodec/mpegvideo_dec: Remove commented-out cruft

The fields in question were removed in
759001c534287a96dc96d1e274665feb7059145d.

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=22e157c1c6d540040dc356f5e218f021060ccf46
---

 libavcodec/mpegvideo_dec.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/libavcodec/mpegvideo_dec.c b/libavcodec/mpegvideo_dec.c
index 7566fe69f9..08385764b4 100644
--- a/libavcodec/mpegvideo_dec.c
+++ b/libavcodec/mpegvideo_dec.c
@@ -73,8 +73,6 @@ int ff_mpeg_update_thread_context(AVCodecContext *dst,
 s->bitstream_buffer_size = s->allocated_bitstream_buffer_size = 0;
 
 if (s1->context_initialized) {
-// s->picture_range_start  += MAX_PICTURE_COUNT;
-// s->picture_range_end+= MAX_PICTURE_COUNT;
 ff_mpv_idct_init(s);
 if ((err = ff_mpv_common_init(s)) < 0) {
 memset(s, 0, sizeof(*s));


[FFmpeg-cvslog] doc: fix binary values of SI prefixes

2022-08-18 Thread Chema Gonzalez

ffmpeg | branch: master | Chema Gonzalez  | Wed Aug 17 
10:05:39 2022 -0700| [59225b459fcf6c8b20ef7585cd87e73cb6a4113d] | committer: 
Andreas Rheinhardt

doc: fix binary values of SI prefixes

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=59225b459fcf6c8b20ef7585cd87e73cb6a4113d
---

 doc/utils.texi | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/utils.texi b/doc/utils.texi
index 232a0608b3..627b55d154 100644
--- a/doc/utils.texi
+++ b/doc/utils.texi
@@ -1073,13 +1073,13 @@ indication of the corresponding powers of 10 and of 2.
 @item T
 10^12 / 2^40
 @item P
-10^15 / 2^40
+10^15 / 2^50
 @item E
-10^18 / 2^50
+10^18 / 2^60
 @item Z
-10^21 / 2^60
+10^21 / 2^70
 @item Y
-10^24 / 2^70
+10^24 / 2^80
 @end table
 
 @c man end EXPRESSION EVALUATION

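
The corrected table follows one rule: the n-th prefix in the sequence K, M, G, T, P, E, Z, Y stands for 10^(3n) in decimal and 2^(10n) in binary. The sketch below only illustrates that rule; the helper names are hypothetical and are not part of FFmpeg.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg API: 1-based position of an SI prefix
 * letter in the sequence K, M, G, T, P, E, Z, Y, or 0 if it is unknown. */
static int si_prefix_rank(char prefix)
{
    static const char order[] = "KMGTPEZY";
    const char *p = prefix ? strchr(order, prefix) : NULL;
    return p ? (int)(p - order) + 1 : 0;
}

/* Store the decimal exponent (power of 10) and the binary exponent
 * (power of 2) for a prefix. Returns 0 on success, -1 if unknown. */
static int si_exponents(char prefix, int *dec, int *bin)
{
    int rank = si_prefix_rank(prefix);
    assert(dec && bin);
    if (!rank)
        return -1;
    *dec = 3 * rank;   /* K -> 3, M -> 6, ... */
    *bin = 10 * rank;  /* K -> 10, M -> 20, ... */
    return 0;
}
```

For 'P' this gives 15 and 50, and for 'Y' it gives 24 and 80, matching the fixed table rows.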

[FFmpeg-cvslog] avcodec/ffv1enc: Remove redundant wrapper

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Sun Aug 14 15:28:56 2022 +0200| [3553b70d6d1282c1118e12f78b90c402e0d5f25c] | 
committer: Andreas Rheinhardt

avcodec/ffv1enc: Remove redundant wrapper

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=3553b70d6d1282c1118e12f78b90c402e0d5f25c
---

 libavcodec/ffv1enc.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index 6f8b8275b5..90593fbaf1 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -1240,12 +1240,6 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 return 0;
 }
 
-static av_cold int encode_close(AVCodecContext *avctx)
-{
-ff_ffv1_close(avctx);
-return 0;
-}
-
 #define OFFSET(x) offsetof(FFV1Context, x)
 #define VE AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM
 static const AVOption options[] = {
@@ -1281,7 +1275,7 @@ const FFCodec ff_ffv1_encoder = {
 .priv_data_size = sizeof(FFV1Context),
 .init   = encode_init,
 FF_CODEC_ENCODE_CB(encode_frame),
-.close  = encode_close,
+.close  = ff_ffv1_close,
 .p.capabilities = AV_CODEC_CAP_SLICE_THREADS | AV_CODEC_CAP_DELAY,
 .p.pix_fmts = (const enum AVPixelFormat[]) {
 AV_PIX_FMT_YUV420P,   AV_PIX_FMT_YUVA420P,  AV_PIX_FMT_YUVA422P,  AV_PIX_FMT_YUV444P,


[FFmpeg-cvslog] avcodec/ffv1enc: Don't create and keep unnecessary reference

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Sun Aug 14 13:37:56 2022 +0200| [7e9a79044105a649c6091049909ad242e6b35d2e] | 
committer: Andreas Rheinhardt

avcodec/ffv1enc: Don't create and keep unnecessary reference

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=7e9a79044105a649c6091049909ad242e6b35d2e
---

 libavcodec/ffv1.h|  1 +
 libavcodec/ffv1enc.c | 15 ++-
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h
index ac80fa85ce..3532815501 100644
--- a/libavcodec/ffv1.h
+++ b/libavcodec/ffv1.h
@@ -91,6 +91,7 @@ typedef struct FFV1Context {
 struct FFV1Context *fsrc;
 
 AVFrame *cur;
+const AVFrame *cur_enc_frame;
 int plane_count;
 int ac;  ///< 1=range coder <-> 0=golomb rice
 int ac_byte_count;   ///< number of bytes used for AC coding
diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index ec06636db5..6f8b8275b5 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -916,12 +916,12 @@ static void encode_slice_header(FFV1Context *f, FFV1Context *fs)
 put_symbol(c, state, f->plane[j].quant_table_index, 0);
 av_assert0(f->plane[j].quant_table_index == f->context_model);
 }
-if (!f->picture.f->interlaced_frame)
+if (!f->cur_enc_frame->interlaced_frame)
 put_symbol(c, state, 3, 0);
 else
-put_symbol(c, state, 1 + !f->picture.f->top_field_first, 0);
-put_symbol(c, state, f->picture.f->sample_aspect_ratio.num, 0);
-put_symbol(c, state, f->picture.f->sample_aspect_ratio.den, 0);
+put_symbol(c, state, 1 + !f->cur_enc_frame->top_field_first, 0);
+put_symbol(c, state, f->cur_enc_frame->sample_aspect_ratio.num, 0);
+put_symbol(c, state, f->cur_enc_frame->sample_aspect_ratio.den, 0);
 if (f->version > 3) {
 put_rac(c, state, fs->slice_coding_mode == 1);
 if (fs->slice_coding_mode == 1)
@@ -1024,7 +1024,7 @@ static int encode_slice(AVCodecContext *c, void *arg)
 int height   = fs->slice_height;
 int x= fs->slice_x;
 int y= fs->slice_y;
-const AVFrame *const p = f->picture.f;
+const AVFrame *const p = f->cur_enc_frame;
 const int ps = av_pix_fmt_desc_get(c->pix_fmt)->comp[0].step;
 int ret;
 RangeCoder c_bak = fs->c;
@@ -1098,7 +1098,6 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 {
 FFV1Context *f  = avctx->priv_data;
 RangeCoder *const c = &f->slice_context[0]->c;
-AVFrame *const p= f->picture.f;
 uint8_t keystate= 128;
 uint8_t *buf_p;
 int i, ret;
@@ -1165,9 +1164,7 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 ff_init_range_encoder(c, pkt->data, pkt->size);
 ff_build_rac_states(c, 0.05 * (1LL << 32), 256 - 8);
 
-av_frame_unref(p);
-if ((ret = av_frame_ref(p, pict)) < 0)
-return ret;
+f->cur_enc_frame = pict;
 
 if (avctx->gop_size == 0 || f->picture_number % avctx->gop_size == 0) {
 put_rac(c, &keystate, 1);


[FFmpeg-cvslog] avcodec/get_buffer: Don't get AVPixFmtDescriptor unnecessarily

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Mon Aug 15 10:18:22 2022 +0200| [f76cef5c518a9874ec4e3b4b36c5b909c3452919] | 
committer: Andreas Rheinhardt

avcodec/get_buffer: Don't get AVPixFmtDescriptor unnecessarily

It is unused since 3575a495f6dcc395656343380e13c57d48b9f976
(and the error message is dangerous: av_get_pix_fmt_name(format)
returns NULL iff av_pix_fmt_desc_get(format) returns NULL
and using a NULL string for %s would be UB).

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=f76cef5c518a9874ec4e3b4b36c5b909c3452919
---

 libavcodec/get_buffer.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/libavcodec/get_buffer.c b/libavcodec/get_buffer.c
index 3e45a0479f..a04fd878de 100644
--- a/libavcodec/get_buffer.c
+++ b/libavcodec/get_buffer.c
@@ -246,7 +246,6 @@ fail:
 static int video_get_buffer(AVCodecContext *s, AVFrame *pic)
 {
 FramePool *pool = (FramePool*)s->internal->pool->data;
-const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pic->format);
 int i;
 
 if (pic->data[0] || pic->data[1] || pic->data[2] || pic->data[3]) {
@@ -254,13 +253,6 @@ static int video_get_buffer(AVCodecContext *s, AVFrame *pic)
 return -1;
 }
 
-if (!desc) {
-av_log(s, AV_LOG_ERROR,
-"Unable to get pixel format descriptor for format %s\n",
-av_get_pix_fmt_name(pic->format));
-return AVERROR(EINVAL);
-}
-
 memset(pic->data, 0, sizeof(pic->data));
 pic->extended_data = pic->data;
 

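
The hazard named in the commit message is general: passing a NULL pointer for a %s conversion is undefined behaviour. A minimal sketch of the usual guard, assuming a hypothetical lookup function fmt_name() that stands in for a name lookup such as av_get_pix_fmt_name() (the table contents are illustrative only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a name lookup: returns NULL for unknown ids. */
static const char *fmt_name(int id)
{
    static const char *const names[] = { "yuv420p", "yuyv422", "rgb24" };
    if (id < 0 || (size_t)id >= sizeof(names) / sizeof(names[0]))
        return NULL;
    return names[id];
}

/* Build a message without ever handing NULL to %s.
 * Returns what snprintf() returns. */
static int describe_format(char *buf, size_t size, int id)
{
    const char *name = fmt_name(id);
    assert(buf && size > 0);
    return snprintf(buf, size, "format %s (id %d)",
                    name ? name : "unknown", id);
}
```

The commit itself takes the simpler route of deleting the unused check, so no message is printed at all.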

[FFmpeg-cvslog] avcodec/mpegpicture: Reset fields explicitly instead of memsetting them

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Sun Aug 14 00:06:16 2022 +0200| [e50684318390b5cffc68da131f7630f11814b808] | 
committer: Andreas Rheinhardt

avcodec/mpegpicture: Reset fields explicitly instead of memsetting them

Improves the grepability of the code.
(Furthermore, I hope that no compiler will really call memset
for 28 bytes.)

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=e50684318390b5cffc68da131f7630f11814b808
---

 libavcodec/mpegpicture.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/libavcodec/mpegpicture.c b/libavcodec/mpegpicture.c
index 711ce35f9d..977bc65191 100644
--- a/libavcodec/mpegpicture.c
+++ b/libavcodec/mpegpicture.c
@@ -311,8 +311,6 @@ fail:
  */
 void ff_mpeg_unref_picture(AVCodecContext *avctx, Picture *pic)
 {
-int off = offsetof(Picture, hwaccel_priv_buf) + sizeof(pic->hwaccel_priv_buf);
-
 pic->tf.f = pic->f;
 /* WM Image / Screen codecs allocate internal buffers with different
  * dimensions / colorspaces; ignore user-defined callbacks for these. */
@@ -328,7 +326,12 @@ void ff_mpeg_unref_picture(AVCodecContext *avctx, Picture *pic)
 if (pic->needs_realloc)
 free_picture_tables(pic);
 
-memset((uint8_t*)pic + off, 0, sizeof(*pic) - off);
+pic->hwaccel_picture_private = NULL;
+pic->field_picture = 0;
+pic->b_frame_score = 0;
+pic->needs_realloc = 0;
+pic->reference = 0;
+pic->shared= 0;
 }
 
 int ff_update_picture_tables(Picture *dst, const Picture *src)

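
The trade-off described in the commit message can be seen on a small example. The struct below is hypothetical (it is not FFmpeg's Picture); it only contrasts zeroing "everything after field X" with resetting each field by name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: fields up to and including `owner` survive a
 * reset, everything after it is cleared. */
typedef struct Item {
    void *buffer;     /* kept */
    void *owner;      /* last field that is kept */
    void *priv;       /* reset */
    int   score;      /* reset */
    int   reference;  /* reset */
} Item;

/* Offset-based style: compact, but searching for "reference" does not
 * reveal this function as a place where the field is written. It also
 * relies on all-bits-zero being a null pointer, which holds on common
 * platforms but is not guaranteed by the C standard. */
static void item_reset_memset(Item *it)
{
    size_t off = offsetof(Item, owner) + sizeof(it->owner);
    assert(it);
    memset((unsigned char *)it + off, 0, sizeof(*it) - off);
}

/* Explicit style: every write is visible to grep and to the compiler. */
static void item_reset_explicit(Item *it)
{
    assert(it);
    it->priv      = NULL;
    it->score     = 0;
    it->reference = 0;
}
```

Both functions leave the struct in the same state on typical platforms; the explicit form must be updated by hand whenever a field is added.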

[FFmpeg-cvslog] avcodec/h263dec: Don't set frame parameters redundantly

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Sat Aug 13 20:59:07 2022 +0200| [f0ea5094afa3b056cf9c2f71cacadb4fbb7a6a95] | 
committer: Andreas Rheinhardt

avcodec/h263dec: Don't set frame parameters redundantly

This frame will be reset later in ff_mpv_frame_start()
anyway.

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=f0ea5094afa3b056cf9c2f71cacadb4fbb7a6a95
---

 libavcodec/h263dec.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/libavcodec/h263dec.c b/libavcodec/h263dec.c
index a65f16caea..a14d7811f5 100644
--- a/libavcodec/h263dec.c
+++ b/libavcodec/h263dec.c
@@ -583,10 +583,6 @@ retry:
 s->codec_id == AV_CODEC_ID_H263I)
 s->gob_index = H263_GOB_HEIGHT(s->height);
 
-// for skipping the frame
-s->current_picture.f->pict_type = s->pict_type;
-s->current_picture.f->key_frame = s->pict_type == AV_PICTURE_TYPE_I;
-
 /* skip B-frames if we don't have reference frames */
 if (!s->last_picture_ptr &&
 (s->pict_type == AV_PICTURE_TYPE_B || s->droppable))


[FFmpeg-cvslog] avcodec/h263dec: Remove redundant code to set cur_pic_ptr

2022-08-18 Thread Andreas Rheinhardt

ffmpeg | branch: master | Andreas Rheinhardt  | 
Sat Aug 13 20:37:11 2022 +0200| [74d623914f02aa79447df43a742efd0929dded04] | 
committer: Andreas Rheinhardt

avcodec/h263dec: Remove redundant code to set cur_pic_ptr

It is done later in ff_mpv_frame_start() (and nobody uses
current_picture_ptr between setting it in ff_mpv_frame_start()).

(The reason the vsynth*-h263-obmc ref files change is because
the call to ff_find_unused_picture() now happens after the older
pictures have been unreferenced in ff_mpv_frame_start(),
so that their slots in the picture array can be immediately
reused; the obmc code is somehow buggy and changes its output
depending on the earlier contents of the motion_val buffer.)

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=74d623914f02aa79447df43a742efd0929dded04
---

 libavcodec/h263dec.c   | 7 ---
 tests/ref/vsynth/vsynth1-h263-obmc | 4 ++--
 tests/ref/vsynth/vsynth2-h263-obmc | 4 ++--
 tests/ref/vsynth/vsynth_lena-h263-obmc | 4 ++--
 4 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/libavcodec/h263dec.c b/libavcodec/h263dec.c
index 8db0eccd89..a65f16caea 100644
--- a/libavcodec/h263dec.c
+++ b/libavcodec/h263dec.c
@@ -543,13 +543,6 @@ retry:
 return ret;
 }
 
-if (!s->current_picture_ptr || s->current_picture_ptr->f->data[0]) {
-int i = ff_find_unused_picture(s->avctx, s->picture, 0);
-if (i < 0)
-return i;
-s->current_picture_ptr = &s->picture[i];
-}
-
 avctx->has_b_frames = !s->low_delay;
 
 if (CONFIG_MPEG4_DECODER && avctx->codec_id == AV_CODEC_ID_MPEG4) {
diff --git a/tests/ref/vsynth/vsynth1-h263-obmc b/tests/ref/vsynth/vsynth1-h263-obmc
index b7a267a8cb..aed283ed53 100644
--- a/tests/ref/vsynth/vsynth1-h263-obmc
+++ b/tests/ref/vsynth/vsynth1-h263-obmc
@@ -1,4 +1,4 @@
 7dec64380f375e5118b66f3b1e24 *tests/data/fate/vsynth1-h263-obmc.avi
 657320 tests/data/fate/vsynth1-h263-obmc.avi
-844f7ee27fa122e199fe20987b41a15c *tests/data/fate/vsynth1-h263-obmc.out.rawvideo
-stddev:8.16 PSNR: 29.89 MAXDIFF:  113 bytes:  7603200/  7603200
+2a69f6b37378aa34418dfd04ec98c1c8 *tests/data/fate/vsynth1-h263-obmc.out.rawvideo
+stddev:8.38 PSNR: 29.66 MAXDIFF:  116 bytes:  7603200/  7603200
diff --git a/tests/ref/vsynth/vsynth2-h263-obmc b/tests/ref/vsynth/vsynth2-h263-obmc
index 2cef7f551b..c0dcc3239e 100644
--- a/tests/ref/vsynth/vsynth2-h263-obmc
+++ b/tests/ref/vsynth/vsynth2-h263-obmc
@@ -1,4 +1,4 @@
 2d8a58b295e03f94e6a41468b2d3909e *tests/data/fate/vsynth2-h263-obmc.avi
 208522 tests/data/fate/vsynth2-h263-obmc.avi
-4a939ef99fc759293f2e609bfcacd2a4 *tests/data/fate/vsynth2-h263-obmc.out.rawvideo
-stddev:6.10 PSNR: 32.41 MAXDIFF:   90 bytes:  7603200/  7603200
+3500b4227c1e6309ca5213414599266f *tests/data/fate/vsynth2-h263-obmc.out.rawvideo
+stddev:6.19 PSNR: 32.29 MAXDIFF:  111 bytes:  7603200/  7603200
diff --git a/tests/ref/vsynth/vsynth_lena-h263-obmc b/tests/ref/vsynth/vsynth_lena-h263-obmc
index 5b963107f6..78d7cc7277 100644
--- a/tests/ref/vsynth/vsynth_lena-h263-obmc
+++ b/tests/ref/vsynth/vsynth_lena-h263-obmc
@@ -1,4 +1,4 @@
 3c6946f808412ac320be9e0c36051ea2 *tests/data/fate/vsynth_lena-h263-obmc.avi
 154730 tests/data/fate/vsynth_lena-h263-obmc.avi
-588d992d9d8096da8bdc5027268da914 *tests/data/fate/vsynth_lena-h263-obmc.out.rawvideo
-stddev:5.39 PSNR: 33.49 MAXDIFF:   82 bytes:  7603200/  7603200
+737af7fb166e2260ba049ae6bc30673d *tests/data/fate/vsynth_lena-h263-obmc.out.rawvideo
+stddev:5.42 PSNR: 33.44 MAXDIFF:   77 bytes:  7603200/  7603200


[FFmpeg-cvslog] checkasm/sw_scale: hscale does not requires cpuflag test.

2022-08-18 Thread Alan Kelly

ffmpeg | branch: master | Alan Kelly  | Fri 
Jul 15 17:01:31 2022 +0200| [da0a37bab7434ef485146ce8575c7948db1fe3e2] | 
committer: Anton Khirnov

checkasm/sw_scale: hscale does not requires cpuflag test.

This is done in ff_shuffle_filter_coefficients.

Signed-off-by: Anton Khirnov 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=da0a37bab7434ef485146ce8575c7948db1fe3e2
---

 tests/checkasm/sw_scale.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index 9c07dd0421..86d266fb3e 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -278,8 +278,6 @@ static void check_hscale(void)
   const uint8_t *src, const int16_t *filter,
   const int32_t *filterPos, int filterSize);
 
-int cpu_flags = av_get_cpu_flags();
-
 ctx = sws_alloc_context();
 if (sws_init_context(ctx, NULL, NULL) < 0)
 fail();
@@ -328,8 +326,7 @@ static void check_hscale(void)
 ctx->dstW = ctx->chrDstW = input_sizes[dstWi];
 ff_sws_init_scale(ctx);
 memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
-if ((cpu_flags & AV_CPU_FLAG_AVX2) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER))
-ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, ctx->dstW);
+ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, ctx->dstW);
 
 if (check_func(ctx->hcScale, "hscale_%d_to_%d__fs_%d_dstW_%d", ctx->srcBpc, ctx->dstBpc + 1, width, ctx->dstW)) {
 memset(dst0, 0, SRC_PIXELS * sizeof(dst0[0]));


[FFmpeg-cvslog] libswscale: Enable hscale_avx2 for all input sizes.

2022-08-18 Thread Alan Kelly

ffmpeg | branch: master | Alan Kelly  | Fri 
Jul 15 16:59:43 2022 +0200| [a38293e4448c9389e604af9858984361a5677a20] | 
committer: Anton Khirnov

libswscale: Enable hscale_avx2 for all input sizes.

ff_shuffle_filter_coefficients shuffles the tail as required.

Signed-off-by: Anton Khirnov 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=a38293e4448c9389e604af9858984361a5677a20
---

 libswscale/utils.c| 19 ---
 libswscale/x86/swscale.c  |  6 ++
 tests/checkasm/sw_scale.c |  2 +-
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/libswscale/utils.c b/libswscale/utils.c
index 34503e57f4..baa1791ebe 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -268,8 +268,7 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos,
 #if ARCH_X86_64
 int i, j, k;
 int cpu_flags = av_get_cpu_flags();
-// avx2 hscale filter processes 16 pixel blocks.
-if (!filter || dstW % 16 != 0)
+if (!filter)
 return 0;
 if (EXTERNAL_AVX2_FAST(cpu_flags) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER)) {
 if ((c->srcBpc == 8) && (c->dstBpc <= 14)) {
@@ -281,9 +280,11 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos,
}
// Do not swap filterPos for pixels which won't be processed by
// the main loop.
-   for (i = 0; i + 8 <= dstW; i += 8) {
+   for (i = 0; i + 16 <= dstW; i += 16) {
FFSWAP(int, filterPos[i + 2], filterPos[i + 4]);
FFSWAP(int, filterPos[i + 3], filterPos[i + 5]);
+   FFSWAP(int, filterPos[i + 10], filterPos[i + 12]);
+   FFSWAP(int, filterPos[i + 11], filterPos[i + 13]);
}
if (filterSize > 4) {
// 16 pixels are processed at a time.
@@ -297,6 +298,18 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos,
}
}
}
+   // 4 pixels are processed at a time in the tail.
+   for (; i < dstW; i += 4) {
+   // 4 filter coeffs are processed at a time.
+   int rem = dstW - i >= 4 ? 4 : dstW - i;
+   for (k = 0; k + 4 <= filterSize; k += 4) {
+   for (j = 0; j < rem; ++j) {
+   int from = (i + j) * filterSize + k;
+   int to = i * filterSize + j * 4 + k * 4;
+   memcpy(&filter[to], &filterCopy[from], 4 * sizeof(int16_t));
+   }
+   }
+   }
}
av_free(filterCopy);
 }
diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 89ef9f5d2b..ec1ca0e01c 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -625,10 +625,8 @@ switch(c->dstBpc){ \
 
 if (EXTERNAL_AVX2_FAST(cpu_flags) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER)) {
 if ((c->srcBpc == 8) && (c->dstBpc <= 14)) {
-if (c->chrDstW % 16 == 0)
-ASSIGN_AVX2_SCALE_FUNC(c->hcScale, c->hChrFilterSize);
-if (c->dstW % 16 == 0)
-ASSIGN_AVX2_SCALE_FUNC(c->hyScale, c->hLumFilterSize);
+ASSIGN_AVX2_SCALE_FUNC(c->hcScale, c->hChrFilterSize);
+ASSIGN_AVX2_SCALE_FUNC(c->hyScale, c->hLumFilterSize);
 }
 }
 
diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index cbe4460a99..9c07dd0421 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -329,7 +329,7 @@ static void check_hscale(void)
 ff_sws_init_scale(ctx);
 memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
 if ((cpu_flags & AV_CPU_FLAG_AVX2) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER))
-ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);
+ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, ctx->dstW);
 
 if (check_func(ctx->hcScale, "hscale_%d_to_%d__fs_%d_dstW_%d", ctx->srcBpc, ctx->dstBpc + 1, width, ctx->dstW)) {
 memset(dst0, 0, SRC_PIXELS * sizeof(dst0[0]));

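
The index arithmetic of the new tail loop can be tried in isolation. The sketch below copies only the `from`/`to` mapping from the patch into a standalone function; the function name and parameter names are hypothetical. Within each block of 4 pixels, the 4-coefficient groups of all pixels are stored next to each other, one group index `k` after another.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical standalone version of the tail shuffle.
 * src is pixel-major: coefficient k of pixel p sits at p * filter_size + k.
 * dst receives, per block of 4 pixels, coefficients k..k+3 of each pixel
 * back to back. The index arithmetic implies dst must have room for a
 * whole block even when fewer than 4 pixels remain. */
static void shuffle_tail(int16_t *dst, const int16_t *src,
                         int filter_size, int start, int dst_w)
{
    assert(dst && src && dst != src);
    assert(filter_size > 0 && filter_size % 4 == 0);

    for (int i = start; i < dst_w; i += 4) {
        int rem = dst_w - i >= 4 ? 4 : dst_w - i;
        for (int k = 0; k + 4 <= filter_size; k += 4) {
            for (int j = 0; j < rem; j++) {
                int from = (i + j) * filter_size + k;
                int to   = i * filter_size + j * 4 + k * 4;
                memcpy(&dst[to], &src[from], 4 * sizeof(*dst));
            }
        }
    }
}
```

With filter_size 8 and 4 pixels, entries 0..15 of dst hold coefficients 0..3 of pixels 0..3, and entries 16..31 hold coefficients 4..7 of the same pixels.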

[FFmpeg-cvslog] sws: allow avx2 hscale to process inputs of any size.

2022-08-18 Thread Alan Kelly

ffmpeg | branch: master | Alan Kelly  | Tue 
Apr 26 10:00:02 2022 +0200| [a6724285fd45111436dd5242eab2c489182aa5c2] | 
committer: Anton Khirnov

sws: allow avx2 hscale to process inputs of any size.

The main loop processes blocks of 16 pixels. The tail processes blocks
of size 4.

Signed-off-by: Anton Khirnov 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=a6724285fd45111436dd5242eab2c489182aa5c2
---

 libswscale/x86/scale_avx2.asm | 44 ++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/libswscale/x86/scale_avx2.asm b/libswscale/x86/scale_avx2.asm
index 20acdbd633..37095e596a 100644
--- a/libswscale/x86/scale_avx2.asm
+++ b/libswscale/x86/scale_avx2.asm
@@ -53,6 +53,9 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 mova m14, [four]
 shr fltsized, 2
 %endif
+cmp wq, 0x10
+jl .tail_loop
+sub wq, 0x10
 .loop:
 movu m1, [fltposq]
 movu m2, [fltposq+32]
@@ -101,7 +104,46 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 add fltposq, 0x40
 add countq, 0x10
 cmp countq, wq
-jl .loop
+jle .loop
+
+add wq, 0x10
+cmp countq, wq
+jge .end
+
+.tail_loop:
+movu xm1, [fltposq]
+%ifidn %1, X4
+pxor xm9, xm9
+pxor xm10, xm10
+xor innerq, innerq
+.tail_innerloop:
+%endif
+vpcmpeqd  xm13, xm13
+vpgatherdd xm3,[srcmemq + xm1], xm13
+vpunpcklbw xm5, xm3, xm0
+vpunpckhbw xm6, xm3, xm0
+vpmaddwd xm5, xm5, [filterq]
+vpmaddwd xm6, xm6, [filterq + 0x10]
+add filterq, 0x20
+%ifidn %1, X4
+paddd xm9, xm5
+paddd xm10, xm6
+paddd xm1, xm14
+add innerq, 1
+cmp innerq, fltsizeq
+jl .tail_innerloop
+vphaddd xm5, xm9, xm10
+%else
+vphaddd xm5, xm5, xm6
+%endif
+vpsrad  xm5, 7
+vpackssdw xm5, xm5, xm5
+vmovq [dstq + countq * 2], xm5
+add fltposq, 0x10
+add countq, 0x4
+cmp countq, wq
+jl .tail_loop
+.end:
 REP_RET
 %endmacro
 

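
The control flow added to the assembly (a main loop over blocks of 16 pixels, then a tail loop over blocks of 4) can be modelled in scalar C. This is only a sketch of the loop structure under simplifying assumptions: it uses a plain, unshuffled filter layout, the function names are hypothetical, and it is not claimed to be bit-exact with the AVX2 code.

```c
#include <assert.h>
#include <stdint.h>

/* One output sample: dot product of filter_size source bytes with the
 * pixel's coefficients, scaled down by 7 bits and clamped to 15 bits.
 * (Right-shifting a negative value is implementation-defined in C.) */
static int16_t hscale_one(const uint8_t *src, const int16_t *filter,
                          int32_t pos, int filter_size)
{
    int32_t val = 0;
    for (int k = 0; k < filter_size; k++)
        val += src[pos + k] * filter[k];
    val >>= 7;
    return (int16_t)(val > 32767 ? 32767 : val);
}

/* Returns the number of samples written: w rounded up to a multiple
 * of 4. dst, filter and filter_pos must be padded accordingly. */
static int hscale_blocks(int16_t *dst, int w, const uint8_t *src,
                         const int16_t *filter, const int32_t *filter_pos,
                         int filter_size)
{
    int i = 0;
    assert(w >= 0 && filter_size > 0);

    /* Main loop: full blocks of 16 pixels. */
    for (; i + 16 <= w; i += 16)
        for (int j = 0; j < 16; j++)
            dst[i + j] = hscale_one(src, filter + (i + j) * filter_size,
                                    filter_pos[i + j], filter_size);

    /* Tail loop: blocks of 4 pixels until w is covered. */
    for (; i < w; i += 4)
        for (int j = 0; j < 4; j++)
            dst[i + j] = hscale_one(src, filter + (i + j) * filter_size,
                                    filter_pos[i + j], filter_size);
    return i;
}
```

For w = 22 the main loop runs once and the tail loop twice, so 24 samples are written; this is why the buffers in the sketch need padding up to a multiple of 4.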

[FFmpeg-cvslog] sws: Replace call to yuv2yuvX_mmx by yuv2yuvX_mmxext

2022-08-18 Thread Alan Kelly

ffmpeg | branch: master | Alan Kelly  | Wed 
Aug 17 11:20:39 2022 +0200| [51a34e8525fea2bbc29b42831d7a17f34e8518d3] | 
committer: Andreas Rheinhardt

sws: Replace call to yuv2yuvX_mmx by yuv2yuvX_mmxext

Signed-off-by: Andreas Rheinhardt 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=51a34e8525fea2bbc29b42831d7a17f34e8518d3
---

 libswscale/x86/swscale.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 32d441245d..89ef9f5d2b 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -205,20 +205,17 @@ static void yuv2yuvX_ ##opt(const int16_t *filter, int filterSize, \
 int remainder = (dstW % step); \
 int pixelsProcessed = dstW - remainder; \
 if(((uintptr_t)dest) & 15){ \
-yuv2yuvX_mmx(filter, filterSize, src, dest, dstW, dither, offset); \
+yuv2yuvX_mmxext(filter, filterSize, src, dest, dstW, dither, offset); \
 return; \
 } \
 if(pixelsProcessed > 0) \
 ff_yuv2yuvX_ ##opt(filter, filterSize - 1, 0, dest - offset, pixelsProcessed + offset, dither, offset); \
 if(remainder > 0){ \
-  ff_yuv2yuvX_mmx(filter, filterSize - 1, pixelsProcessed, dest - offset, pixelsProcessed + remainder + offset, dither, offset); \
+  ff_yuv2yuvX_mmxext(filter, filterSize - 1, pixelsProcessed, dest - offset, pixelsProcessed + remainder + offset, dither, offset); \
 } \
 return; \
 }
 
-#if HAVE_MMX_EXTERNAL
-YUV2YUVX_FUNC_MMX(mmx, 16)
-#endif
 #if HAVE_MMXEXT_EXTERNAL
 YUV2YUVX_FUNC_MMX(mmxext, 16)
 #endif

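
The macro touched by this patch splits each row between a wide kernel and a narrower fallback. A minimal sketch of that split, using the same `dstW % step` arithmetic and the same 16-byte alignment test as the macro; the struct and function names are hypothetical, and the step value passed in is up to the caller.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Split {
    int wide;      /* pixels handled by the wide (SIMD) kernel */
    int fallback;  /* pixels handled by the narrower fallback kernel */
} Split;

/* Decide how many pixels each kernel processes for one row. */
static Split split_work(const void *dest, int dst_w, int step)
{
    Split s;
    assert(dst_w >= 0 && step > 0);
    if ((uintptr_t)dest & 15) {
        /* Unaligned destination: the fallback handles the whole row. */
        s.wide     = 0;
        s.fallback = dst_w;
    } else {
        s.fallback = dst_w % step;        /* remainder */
        s.wide     = dst_w - s.fallback;  /* pixelsProcessed */
    }
    return s;
}
```

After the patch, the fallback role in both branches is played by the mmxext version rather than the plain mmx one.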

[FFmpeg-cvslog] lavc/aarch64: hevc_add_res add 12bit variants

2022-08-18 Thread J . Dekker

ffmpeg | branch: master | J. Dekker  | Tue Aug 16 07:01:53 
2022 +0200| [ce2f47318bdd1586f538059ed36fbf61e825023d] | committer: J. Dekker

lavc/aarch64: hevc_add_res add 12bit variants

hevc_add_res_4x4_12_c: 46.0
hevc_add_res_4x4_12_neon: 18.7
hevc_add_res_8x8_12_c: 194.7
hevc_add_res_8x8_12_neon: 25.2
hevc_add_res_16x16_12_c: 716.0
hevc_add_res_16x16_12_neon: 69.7
hevc_add_res_32x32_12_c: 3820.7
hevc_add_res_32x32_12_neon: 261.0

Signed-off-by: J. Dekker 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=ce2f47318bdd1586f538059ed36fbf61e825023d
---

 libavcodec/aarch64/hevcdsp_idct_neon.S| 158 +-
 libavcodec/aarch64/hevcdsp_init_aarch64.c |  14 +++
 2 files changed, 102 insertions(+), 70 deletions(-)

diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index 484eea8437..124c50998a 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -5,7 +5,7 @@
  *
  * Ported from arm/hevcdsp_idct_neon.S by
  * Copyright (c) 2020 Reimar Döffinger
- * Copyright (c) 2020 Josh Dekker
+ * Copyright (c) 2020 J. Dekker
  *
  * This file is part of FFmpeg.
  *
@@ -37,11 +37,11 @@ const trans, align=4
 .short  31, 22, 13, 4
 endconst
 
-.macro clip10 in1, in2, c1, c2
-smax    \in1, \in1, \c1
-smax    \in2, \in2, \c1
-smin    \in1, \in1, \c2
-smin    \in2, \in2, \c2
+.macro clip2 in1, in2, min, max
+smax    \in1, \in1, \min
+smax    \in2, \in2, \min
+smin    \in1, \in1, \max
+smin    \in2, \in2, \max
 .endm
 
 function ff_hevc_add_residual_4x4_8_neon, export=1
@@ -64,25 +64,6 @@ function ff_hevc_add_residual_4x4_8_neon, export=1
 ret
 endfunc
 
-function ff_hevc_add_residual_4x4_10_neon, export=1
-mov x12,  x0
-ld1 {v0.8h-v1.8h}, [x1]
-ld1 {v2.d}[0], [x12], x2
-ld1 {v2.d}[1], [x12], x2
-ld1 {v3.d}[0], [x12], x2
-sqadd   v0.8h, v0.8h, v2.8h
-ld1 {v3.d}[1], [x12], x2
-movi    v4.8h, #0
-sqadd   v1.8h, v1.8h, v3.8h
-mvni    v5.8h, #0xFC, lsl #8 // movi #0x3FF
-clip10  v0.8h, v1.8h, v4.8h, v5.8h
-st1 {v0.d}[0], [x0],  x2
-st1 {v0.d}[1], [x0],  x2
-st1 {v1.d}[0], [x0],  x2
-st1 {v1.d}[1], [x0],  x2
-ret
-endfunc
-
 function ff_hevc_add_residual_8x8_8_neon, export=1
 add x12, x0, x2
 add x2, x2, x2
@@ -103,25 +84,6 @@ function ff_hevc_add_residual_8x8_8_neon, export=1
 ret
 endfunc
 
-function ff_hevc_add_residual_8x8_10_neon, export=1
-add x12, x0, x2
-add x2,  x2, x2
-mov x3,  #8
-movi    v4.8h, #0
-mvni    v5.8h, #0xFC, lsl #8 // movi #0x3FF
-1:  subs    x3,  x3, #2
-ld1 {v0.8h-v1.8h}, [x1], #32
-ld1 {v2.8h}, [x0]
-sqadd   v0.8h, v0.8h, v2.8h
-ld1 {v3.8h}, [x12]
-sqadd   v1.8h, v1.8h, v3.8h
-clip10  v0.8h, v1.8h, v4.8h, v5.8h
-st1 {v0.8h}, [x0],  x2
-st1 {v1.8h}, [x12], x2
-bne 1b
-ret
-endfunc
-
 function ff_hevc_add_residual_16x16_8_neon, export=1
 mov x3,  #16
 add x12, x0, x2
@@ -148,28 +110,6 @@ function ff_hevc_add_residual_16x16_8_neon, export=1
 ret
 endfunc
 
-function ff_hevc_add_residual_16x16_10_neon, export=1
-mov x3,  #16
-movi    v20.8h, #0
-mvni    v21.8h, #0xFC, lsl #8 // movi #0x3FF
-add x12,  x0, x2
-add x2,  x2, x2
-1:  subs    x3,  x3, #2
-ld1 {v16.8h-v17.8h}, [x0]
-ld1 {v0.8h-v3.8h},   [x1], #64
-sqadd   v0.8h, v0.8h, v16.8h
-ld1 {v18.8h-v19.8h}, [x12]
-sqadd   v1.8h, v1.8h, v17.8h
-sqadd   v2.8h, v2.8h, v18.8h
-sqadd   v3.8h, v3.8h, v19.8h
-clip10  v0.8h, v1.8h, v20.8h, v21.8h
-clip10  v2.8h, v3.8h, v20.8h, v21.8h
-st1 {v0.8h-v1.8h}, [x0],  x2
-st1 {v2.8h-v3.8h}, [x12], x2
-bne 1b
-ret
-endfunc
-
 function ff_hevc_add_residual_32x32_8_neon, export=1
 add x12,  x0, x2
 add x2,  x2, x2
@@ -209,10 +149,88 @@ function ff_hevc_add_residual_32x32_8_neon, export=1
 ret
 endfunc
 
-function ff_hevc_add_residual_32x32_10_neon, export=1
+.macro add_res bitdepth
+function ff_hevc_add_residual_4x4_\bitdepth\()_neon,
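
What the add_res functions compute can be stated in a few lines of scalar C: add the residual to each pixel and clip to the range of the bit depth. The sketch below is a reference model under assumptions, not the FFmpeg C implementation: the function name is hypothetical, and the stride is counted in pixels here rather than in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a size x size block of residuals to dst and clip every result to
 * [0, (1 << bit_depth) - 1]. stride is in pixels (an assumption of this
 * sketch). */
static void add_residual(uint16_t *dst, const int16_t *res,
                         ptrdiff_t stride, int size, int bit_depth)
{
    assert(dst && res && size > 0);
    assert(bit_depth > 8 && bit_depth <= 15);
    const int max = (1 << bit_depth) - 1;

    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++) {
            int v = dst[x] + res[x];
            dst[x] = (uint16_t)(v < 0 ? 0 : v > max ? max : v);
        }
        dst += stride;
        res += size;
    }
}
```

Only the upper clip bound changes between bit depths (1023 for 10 bits, 4095 for 12 bits), which fits the rename of the clip10 macro to clip2 with explicit min and max arguments.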

[FFmpeg-cvslog] aarch64: me_cmp: Remove a leftover unnecessary instruction

2022-08-18 Thread Martin Storsjö

ffmpeg | branch: master | Martin Storsjö  | Thu Aug 18 
12:14:15 2022 +0300| [48be6616d0536c5b0ff3ee58caee4c024ca64116] | committer: 
Martin Storsjö

aarch64: me_cmp: Remove a leftover unnecessary instruction

This was missed in a2e45ad407c526cd5ce2f3a361fb98084228cd6e.

Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=48be6616d0536c5b0ff3ee58caee4c024ca64116
---

 libavcodec/aarch64/me_cmp_neon.S | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index b89c25438e..4198985c6c 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -328,7 +328,6 @@ function ff_pix_abs16_y2_neon, export=1
 // initialize buffers
 movi    v29.8h, #0  // clear the accumulator
 movi    v28.8h, #0  // clear the accumulator
-movi    d18, #0
 add x5, x2, x3  // pix2 + stride
 cmp w4, #4
 b.lt    2f
@@ -386,9 +385,8 @@ function ff_pix_abs16_y2_neon, export=1
 3:
 add v29.8h, v29.8h, v28.8h  // Add vectors together
 uaddlv  s16, v29.8h // Add up vector values
-add d18, d18, d16
 
-fmov w0, s18
+fmov w0, s16
 
 ret
 endfunc


[FFmpeg-cvslog] lavc/aarch64: Add neon implementation for pix_abs8

2022-08-18 Thread Hubert Mazur

ffmpeg | branch: master | Hubert Mazur  | Tue Aug 16 
14:20:16 2022 +0200| [70efa4d01188b61efc0b82e7241a59a32c7e2e22] | committer: 
Martin Storsjö

lavc/aarch64: Add neon implementation for pix_abs8

Provide optimized implementation of pix_abs8 function for arm64.

Performance comparison tests are shown below.
- pix_abs_1_0_c: 101.2
- pix_abs_1_0_neon: 22.5
- sad_1_c: 101.2
- sad_1_neon: 22.5

Benchmarks and tests are run with checkasm tool on AWS Graviton 3.

Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=70efa4d01188b61efc0b82e7241a59a32c7e2e22
---

 libavcodec/aarch64/me_cmp_init_aarch64.c |  4 +++
 libavcodec/aarch64/me_cmp_neon.S | 47 
 2 files changed, 51 insertions(+)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index 7c03ce8c50..fb7c3f5059 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -31,6 +31,8 @@ int ff_pix_abs16_x2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *
                          ptrdiff_t stride, int h);
 int ff_pix_abs16_y2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                          ptrdiff_t stride, int h);
+int ff_pix_abs8_neon(MpegEncContext *s, const uint8_t *blk1, const uint8_t *blk2,
+                     ptrdiff_t stride, int h);
 
 int sse16_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                ptrdiff_t stride, int h);
@@ -48,8 +50,10 @@ av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 c->pix_abs[0][1] = ff_pix_abs16_x2_neon;
 c->pix_abs[0][2] = ff_pix_abs16_y2_neon;
 c->pix_abs[0][3] = ff_pix_abs16_xy2_neon;
+c->pix_abs[1][0] = ff_pix_abs8_neon;
 
 c->sad[0] = ff_pix_abs16_neon;
+c->sad[1] = ff_pix_abs8_neon;
 c->sse[0] = sse16_neon;
 c->sse[1] = sse8_neon;
 c->sse[2] = sse4_neon;
diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index c0647c49e9..b89c25438e 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -72,6 +72,53 @@ function ff_pix_abs16_neon, export=1
 ret
 endfunc
 
+function ff_pix_abs8_neon, export=1
+// x0   unused
+// x1   uint8_t *pix1
+// x2   uint8_t *pix2
+// x3   ptrdiff_t stride
+// w4   int h
+
+movi v30.8h, #0
+cmp w4, #4
+b.lt 2f
+
+// make 4 iterations at once
+1:
+ld1 {v0.8b}, [x1], x3   // Load pix1 for first iteration
+ld1 {v1.8b}, [x2], x3   // Load pix2 for first iteration
+ld1 {v2.8b}, [x1], x3   // Load pix1 for second iteration
+uabal v30.8h, v0.8b, v1.8b   // Absolute difference, first iteration
+ld1 {v3.8b}, [x2], x3   // Load pix2 for second iteration
+ld1 {v4.8b}, [x1], x3   // Load pix1 for third iteration
+uabal v30.8h, v2.8b, v3.8b   // Absolute difference, second iteration
+ld1 {v5.8b}, [x2], x3   // Load pix2 for third iteration
+sub w4, w4, #4   // h -= 4
+ld1 {v6.8b}, [x1], x3   // Load pix1 for fourth iteration
+ld1 {v7.8b}, [x2], x3   // Load pix2 for fourth iteration
+uabal v30.8h, v4.8b, v5.8b   // Absolute difference, third iteration
+cmp w4, #4
+uabal v30.8h, v6.8b, v7.8b   // Absolute difference, fourth iteration
+b.ge 1b
+
+cbz w4, 3f
+
+// iterate by one
+2:
+ld1 {v0.8b}, [x1], x3   // Load pix1
+ld1 {v1.8b}, [x2], x3   // Load pix2
+
+subs w4, w4, #1
+uabal v30.8h, v0.8b, v1.8b
+b.ne 2b
+
+3:
+uaddlv s20, v30.8h   // Add up vector
+fmov w0, s20
+
+ret
+endfunc
+
 function ff_pix_abs16_xy2_neon, export=1
 // x0   unused
 // x1   uint8_t *pix1
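
For orientation, the value that ff_pix_abs8_neon accumulates (a sum of absolute differences over an 8-pixel-wide block of h rows) can be written as a plain scalar loop. This is only a sketch for comparison, not FFmpeg's C implementation: the name pix_abs8_ref is hypothetical and the unused context argument (x0 in the assembly) is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences over an
 * 8-pixel-wide block of h rows, each row `stride` bytes apart. */
static int pix_abs8_ref(const uint8_t *pix1, const uint8_t *pix2,
                        ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++)
            sum += abs(pix1[x] - pix2[x]);   /* what uabal accumulates per lane */
        pix1 += stride;                      /* post-increment by stride, as in ld1 */
        pix2 += stride;
    }
    return sum;
}
```

Swapping the two inputs gives the same result, since only the magnitude of each difference is accumulated.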


[FFmpeg-cvslog] lavc/aarch64: Add neon implementation for sse8

2022-08-18 Thread Hubert Mazur

ffmpeg | branch: master | Hubert Mazur  | Tue Aug 16 
14:20:15 2022 +0200| [74312e80d74eebf095d0092a6bb2f1f207626174] | committer: 
Martin Storsjö

lavc/aarch64: Add neon implementation for sse8

Provide optimized implementation of sse8 function for arm64.

Performance comparison tests are shown below.
- sse_1_c: 130.7
- sse_1_neon: 29.7

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur 
Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=74312e80d74eebf095d0092a6bb2f1f207626174
---

 libavcodec/aarch64/me_cmp_init_aarch64.c |  3 ++
 libavcodec/aarch64/me_cmp_neon.S | 64 
 2 files changed, 67 insertions(+)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index 30737e2436..7c03ce8c50 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -34,6 +34,8 @@ int ff_pix_abs16_y2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *
 
 int sse16_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                ptrdiff_t stride, int h);
+int sse8_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+              ptrdiff_t stride, int h);
 int sse4_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
               ptrdiff_t stride, int h);
 
@@ -49,6 +51,7 @@ av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 
 c->sad[0] = ff_pix_abs16_neon;
 c->sse[0] = sse16_neon;
+c->sse[1] = sse8_neon;
 c->sse[2] = sse4_neon;
 }
 }
diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index 26490e189f..c0647c49e9 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -420,6 +420,70 @@ function sse16_neon, export=1
 ret
 endfunc
 
+function sse8_neon, export=1
+// x0 - unused
+// x1 - pix1
+// x2 - pix2
+// x3 - stride
+// w4 - h
+
+movi v21.4s, #0
+movi v20.4s, #0
+cmp w4, #4
+b.le 2f
+
+// make 4 iterations at once
+1:
+
+// res = abs(pix1[0] - pix2[0])
+// res * res
+
+ld1 {v0.8b}, [x1], x3   // Load pix1 for first iteration
+ld1 {v1.8b}, [x2], x3   // Load pix2 for first iteration
+ld1 {v2.8b}, [x1], x3   // Load pix1 for second iteration
+ld1 {v3.8b}, [x2], x3   // Load pix2 for second iteration
+uabdl v30.8h, v0.8b, v1.8b   // Absolute difference, first iteration
+ld1 {v4.8b}, [x1], x3   // Load pix1 for third iteration
+ld1 {v5.8b}, [x2], x3   // Load pix2 for third iteration
+uabdl v29.8h, v2.8b, v3.8b   // Absolute difference, second iteration
+umlal v21.4s, v30.4h, v30.4h   // Multiply lower half, first iteration
+ld1 {v6.8b}, [x1], x3   // Load pix1 for fourth iteration
+ld1 {v7.8b}, [x2], x3   // Load pix2 for fourth iteration
+uabdl v28.8h, v4.8b, v5.8b   // Absolute difference, third iteration
+umlal v21.4s, v29.4h, v29.4h   // Multiply lower half, second iteration
+umlal2 v20.4s, v30.8h, v30.8h   // Multiply upper half, first iteration
+uabdl v27.8h, v6.8b, v7.8b   // Absolute difference, fourth iteration
+umlal v21.4s, v28.4h, v28.4h   // Multiply lower half, third iteration
+umlal2 v20.4s, v29.8h, v29.8h   // Multiply upper half, second iteration
+sub w4, w4, #4   // h -= 4
+umlal2 v20.4s, v28.8h, v28.8h   // Multiply upper half, third iteration
+umlal v21.4s, v27.4h, v27.4h   // Multiply lower half, fourth iteration
+cmp w4, #4
+umlal2 v20.4s, v27.8h, v27.8h   // Multiply upper half, fourth iteration
+b.ge 1b
+
+cbz w4, 3f
+
+// iterate by one
+2:
+ld1 {v0.8b}, [x1], x3   // Load pix1
+ld1 {v1.8b}, [x2], x3   // Load pix2
+subs w4, w4, #1
+uabdl v30.8h, v0.8b, v1.8b
+umlal v21.4s, v30.4h, v30.4h
+umlal2 v20.4s, v30.8h, v30.8h
+
+b.ne 2b
+
+3:
+add v21.4s, v21.4s, v20.4s   // Add accumulator vectors together
+uaddlv d17, v21.4s   // Add up vector
+
+fmov w0, s17
+ret
+
+endfunc
+
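
The sse8 routine above keeps two accumulators, one fed by the lower four lanes (umlal) and one by the upper four (umlal2), and only adds them together at label 3. A scalar sketch of that structure, under the assumption that this reading of the assembly is right, could look as follows; sse8_ref is a hypothetical name, not an FFmpeg function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference for an 8-wide sum of squared errors,
 * mirroring the two-accumulator layout of the NEON code. */
static int sse8_ref(const uint8_t *pix1, const uint8_t *pix2,
                    ptrdiff_t stride, int h)
{
    uint32_t lo = 0;   /* plays the role of v21: pixels 0..3 */
    uint32_t hi = 0;   /* plays the role of v20: pixels 4..7 */

    assert(h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 4; x++) {
            int d_lo = abs(pix1[x]     - pix2[x]);
            int d_hi = abs(pix1[x + 4] - pix2[x + 4]);
            lo += (uint32_t)(d_lo * d_lo);
            hi += (uint32_t)(d_hi * d_hi);
        }
        pix1 += stride;
        pix2 += stride;
    }
    return (int)(lo + hi);   /* final add of the two accumulators */
}
```

Splitting the sum does not change the result; in the assembly it mainly lets consecutive multiply-accumulate instructions target different registers.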

[FFmpeg-cvslog] lavc/aarch64: Add neon implementation for pix_abs16_y2

2022-08-18 Thread Hubert Mazur

ffmpeg | branch: master | Hubert Mazur  | Tue Aug 16 
14:20:14 2022 +0200| [a2e45ad407c526cd5ce2f3a361fb98084228cd6e] | committer: 
Martin Storsjö

lavc/aarch64: Add neon implementation for pix_abs16_y2

Provide optimized implementation of pix_abs16_y2 function for arm64.

Performance comparison tests are shown below.
pix_abs_0_2_c: 317.2
pix_abs_0_2_neon: 37.5

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur 
Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=a2e45ad407c526cd5ce2f3a361fb98084228cd6e
---

 libavcodec/aarch64/me_cmp_init_aarch64.c |  3 ++
 libavcodec/aarch64/me_cmp_neon.S | 75 
 2 files changed, 78 insertions(+)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index 57722b6a9a..30737e2436 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -29,6 +29,8 @@ int ff_pix_abs16_xy2_neon(MpegEncContext *s, const uint8_t *blk1, const uint8_t
                           ptrdiff_t stride, int h);
 int ff_pix_abs16_x2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                          ptrdiff_t stride, int h);
+int ff_pix_abs16_y2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+                         ptrdiff_t stride, int h);
 
 int sse16_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                ptrdiff_t stride, int h);
@@ -42,6 +44,7 @@ av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 if (have_neon(cpu_flags)) {
 c->pix_abs[0][0] = ff_pix_abs16_neon;
 c->pix_abs[0][1] = ff_pix_abs16_x2_neon;
+c->pix_abs[0][2] = ff_pix_abs16_y2_neon;
 c->pix_abs[0][3] = ff_pix_abs16_xy2_neon;
 
 c->sad[0] = ff_pix_abs16_neon;
diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index f3201739b8..26490e189f 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -271,6 +271,81 @@ function ff_pix_abs16_x2_neon, export=1
 ret
 endfunc
 
+function ff_pix_abs16_y2_neon, export=1
+// x0   unused
+// x1   uint8_t *pix1
+// x2   uint8_t *pix2
+// x3   ptrdiff_t stride
+// w4   int h
+
+// initialize buffers
+movi v29.8h, #0   // clear the accumulator
+movi v28.8h, #0   // clear the accumulator
+movi d18, #0
+add x5, x2, x3   // pix2 + stride
+cmp w4, #4
+b.lt 2f
+
+// make 4 iterations at once
+1:
+
+// abs(pix1[0] - avg2(pix2[0], pix2[0 + stride]))
+// avg2(a, b) = (((a) + (b) + 1) >> 1)
+// abs(x) = (x < 0 ? (-x) : (x))
+
+ld1 {v1.16b}, [x2], x3   // Load pix2 for first iteration
+ld1 {v2.16b}, [x5], x3   // Load pix3 for first iteration
+ld1 {v0.16b}, [x1], x3   // Load pix1 for first iteration
+urhadd v30.16b, v1.16b, v2.16b   // Rounding halving add, first iteration
+ld1 {v4.16b}, [x2], x3   // Load pix2 for second iteration
+ld1 {v5.16b}, [x5], x3   // Load pix3 for second iteration
+uabal v29.8h, v0.8b, v30.8b   // Absolute difference of lower half, first iteration
+uabal2 v28.8h, v0.16b, v30.16b   // Absolute difference of upper half, first iteration
+ld1 {v3.16b}, [x1], x3   // Load pix1 for second iteration
+urhadd v27.16b, v4.16b, v5.16b   // Rounding halving add, second iteration
+ld1 {v7.16b}, [x2], x3   // Load pix2 for third iteration
+ld1 {v20.16b}, [x5], x3   // Load pix3 for third iteration
+uabal v29.8h, v3.8b, v27.8b   // Absolute difference of lower half for second iteration
+uabal2 v28.8h, v3.16b, v27.16b   // Absolute difference of upper half for second iteration
+ld1 {v6.16b}, [x1], x3   // Load pix1 for third iteration
+urhadd v26.16b, v7.16b, v20.16b   // Rounding halving add, third iteration
+ld1 {v22.16b}, [x2], x3   // Load pix2 for fourth iteration
+ld1 {v23.16b}, [x5], x3   // Load pix3 for fourth iteration
+uabal v29.8h, v6.8b, v26.8b   // Absolute difference of lower half for third iteration
+uabal2 v28.8h, v6.16b, v26.16b   // Absolute difference of upper half for third iteration
+ld1 {v21.16b},
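
The formula given in the comments of this function (absolute difference against avg2 of a pixel and the pixel one row below) can be sketched in scalar C as follows. The function name pix_abs16_y2_ref is hypothetical and the context argument is omitted; note that the second input must provide h + 1 readable rows, because each row is averaged with the next one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounding average, as in the comment: avg2(a, b) = (a + b + 1) >> 1.
 * This is what urhadd computes per byte lane. */
static int avg2(int a, int b)
{
    return (a + b + 1) >> 1;
}

/* Hypothetical scalar reference: SAD of a 16-wide block against the
 * vertical half-pel interpolation of pix2. */
static int pix_abs16_y2_ref(const uint8_t *pix1, const uint8_t *pix2,
                            ptrdiff_t stride, int h)
{
    const uint8_t *pix3 = pix2 + stride;   /* row below pix2 (x5 in the asm) */
    int sum = 0;

    assert(h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - avg2(pix2[x], pix3[x]));
        pix1 += stride;
        pix2 += stride;
        pix3 += stride;
    }
    return sum;
}
```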

[FFmpeg-cvslog] lavc/aarch64: Add neon implementation for sse16

2022-08-18 Thread Hubert Mazur

ffmpeg | branch: master | Hubert Mazur  | Tue Aug 16 
14:20:12 2022 +0200| [ad251fd26243d93093206a511cb547f46b967e4c] | committer: 
Martin Storsjö

lavc/aarch64: Add neon implementation for sse16

Provide neon implementation for sse16 function.

Performance comparison tests are shown below.
- sse_0_c: 268.2
- sse_0_neon: 43.5

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur 
Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=ad251fd26243d93093206a511cb547f46b967e4c
---

 libavcodec/aarch64/me_cmp_init_aarch64.c |  4 ++
 libavcodec/aarch64/me_cmp_neon.S | 74 
 2 files changed, 78 insertions(+)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index dfb9583320..ab2a1909ba 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -30,6 +30,9 @@ int ff_pix_abs16_xy2_neon(MpegEncContext *s, const uint8_t *blk1, const uint8_t
 int ff_pix_abs16_x2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                          ptrdiff_t stride, int h);
 
+int sse16_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+               ptrdiff_t stride, int h);
+
 av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 {
     int cpu_flags = av_get_cpu_flags();
@@ -40,5 +43,6 @@ av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 c->pix_abs[0][3] = ff_pix_abs16_xy2_neon;
 
 c->sad[0] = ff_pix_abs16_neon;
+c->sse[0] = sse16_neon;
 }
 }
diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index cda7ce0408..b98b2b7e03 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -270,3 +270,77 @@ function ff_pix_abs16_x2_neon, export=1
 
 ret
 endfunc
+
+function sse16_neon, export=1
+// x0 - unused
+// x1 - pix1
+// x2 - pix2
+// x3 - stride
+// w4 - h
+
+cmp w4, #4
+movi v17.4s, #0
+b.lt 2f
+
+// Make 4 iterations at once
+1:
+
+// res = abs(pix1[0] - pix2[0])
+// res * res
+
+ld1 {v0.16b}, [x1], x3   // Load pix1 vector for first iteration
+ld1 {v1.16b}, [x2], x3   // Load pix2 vector for first iteration
+ld1 {v2.16b}, [x1], x3   // Load pix1 vector for second iteration
+uabd v30.16b, v0.16b, v1.16b   // Absolute difference, first iteration
+ld1 {v3.16b}, [x2], x3   // Load pix2 vector for second iteration
+umull v29.8h, v30.8b, v30.8b   // Multiply lower half of vectors, first iteration
+umull2 v28.8h, v30.16b, v30.16b   // Multiply upper half of vectors, first iteration
+uabd v27.16b, v2.16b, v3.16b   // Absolute difference, second iteration
+uadalp v17.4s, v29.8h   // Pairwise add, first iteration
+ld1 {v4.16b}, [x1], x3   // Load pix1 for third iteration
+umull v26.8h, v27.8b, v27.8b   // Multiply lower half, second iteration
+umull2 v25.8h, v27.16b, v27.16b   // Multiply upper half, second iteration
+ld1 {v5.16b}, [x2], x3   // Load pix2 for third iteration
+uadalp v17.4s, v26.8h   // Pairwise add and accumulate, second iteration
+uabd v24.16b, v4.16b, v5.16b   // Absolute difference, third iteration
+ld1 {v6.16b}, [x1], x3   // Load pix1 for fourth iteration
+uadalp v17.4s, v25.8h   // Pairwise add and accumulate, second iteration
+umull v23.8h, v24.8b, v24.8b   // Multiply lower half, third iteration
+umull2 v22.8h, v24.16b, v24.16b   // Multiply upper half, third iteration
+uadalp v17.4s, v23.8h   // Pairwise add and accumulate, third iteration
+ld1 {v7.16b}, [x2], x3   // Load pix2 for fourth iteration
+uadalp v17.4s, v22.8h   // Pairwise add and accumulate, third iteration
+uabd v21.16b, v6.16b, v7.16b   // Absolute difference, fourth iteration
+uadalp v17.4s, v28.8h   // Pairwise add and accumulate, first iteration
+umull v20.8h, v21.8b, v21.8b   // Multiply lower half, fourth iteration
+sub w4, w4, #4   // h -= 4
+umull2 v19.8h, v21.16b, v21.16b   // Multiply upper half, fourth iteration
+uadalp  v17.4s, v20.8h
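
One detail worth spelling out is why 16-bit intermediate lanes are enough in the sse16 loop. The sketch below is a reading of the instruction sequence rather than anything stated in the commit, and the helper names are hypothetical: the absolute difference of two bytes is at most 255, so its square is at most 65025 and fits in 16 bits (the umull result), while the running sum is kept in 32-bit lanes (the uadalp destination).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: squared absolute difference of two bytes.
 * Models uabd (8-bit result) followed by umull (8 x 8 -> 16 bits). */
static uint16_t sq_abs_diff_u8(uint8_t a, uint8_t b)
{
    unsigned d = a > b ? (unsigned)(a - b) : (unsigned)(b - a);

    assert(d <= 255);              /* uabd result always fits a byte */
    return (uint16_t)(d * d);      /* at most 255 * 255 = 65025 < 65536 */
}

/* Hypothetical helper: one 16-pixel row, accumulated in 32 bits
 * the way uadalp widens 16-bit products into the .4s accumulator. */
static uint32_t sse16_row(const uint8_t *pix1, const uint8_t *pix2)
{
    uint32_t acc = 0;

    for (int x = 0; x < 16; x++)
        acc += sq_abs_diff_u8(pix1[x], pix2[x]);
    return acc;
}
```

A single worst-case row already sums to 16 * 65025 = 1040400, which no longer fits in 16 bits, so the widening accumulate is needed from the first row on.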

[FFmpeg-cvslog] lavc/aarch64: Add neon implementation for sse4

2022-08-18 Thread Hubert Mazur

ffmpeg | branch: master | Hubert Mazur  | Tue Aug 16 
14:20:13 2022 +0200| [d7abb7d143fd1fbacb0084a8936bc4029afe5111] | committer: 
Martin Storsjö

lavc/aarch64: Add neon implementation for sse4

Provide neon implementation for sse4 function.

Performance comparison tests are shown below.
- sse_2_c: 80.7
- sse_2_neon: 31.0

Benchmarks and tests are run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur 
Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=d7abb7d143fd1fbacb0084a8936bc4029afe5111
---

 libavcodec/aarch64/me_cmp_init_aarch64.c |  3 ++
 libavcodec/aarch64/me_cmp_neon.S | 56 
 2 files changed, 59 insertions(+)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index ab2a1909ba..57722b6a9a 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -32,6 +32,8 @@ int ff_pix_abs16_x2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *
 
 int sse16_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
                ptrdiff_t stride, int h);
+int sse4_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+              ptrdiff_t stride, int h);
 
 av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 {
@@ -44,5 +46,6 @@ av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 
 c->sad[0] = ff_pix_abs16_neon;
 c->sse[0] = sse16_neon;
+c->sse[2] = sse4_neon;
 }
 }
diff --git a/libavcodec/aarch64/me_cmp_neon.S b/libavcodec/aarch64/me_cmp_neon.S
index b98b2b7e03..f3201739b8 100644
--- a/libavcodec/aarch64/me_cmp_neon.S
+++ b/libavcodec/aarch64/me_cmp_neon.S
@@ -344,3 +344,59 @@ function sse16_neon, export=1
 
 ret
 endfunc
+
+function sse4_neon, export=1
+// x0 - unused
+// x1 - pix1
+// x2 - pix2
+// x3 - stride
+// w4 - h
+
+movi v16.4s, #0   // clear the result accumulator
+cmp w4, #4
+b.le 2f
+
+// make 4 iterations at once
+1:
+
+// res = abs(pix1[0] - pix2[0])
+// res * res
+
+ld1 {v0.s}[0], [x1], x3   // Load pix1, first iteration
+ld1 {v1.s}[0], [x2], x3   // Load pix2, first iteration
+ld1 {v2.s}[0], [x1], x3   // Load pix1, second iteration
+ld1 {v3.s}[0], [x2], x3   // Load pix2, second iteration
+uabdl v30.8h, v0.8b, v1.8b   // Absolute difference, first iteration
+ld1 {v4.s}[0], [x1], x3   // Load pix1, third iteration
+ld1 {v5.s}[0], [x2], x3   // Load pix2, third iteration
+uabdl v29.8h, v2.8b, v3.8b   // Absolute difference, second iteration
+umlal v16.4s, v30.4h, v30.4h   // Multiply vectors, first iteration
+ld1 {v6.s}[0], [x1], x3   // Load pix1, fourth iteration
+ld1 {v7.s}[0], [x2], x3   // Load pix2, fourth iteration
+uabdl v28.8h, v4.8b, v5.8b   // Absolute difference, third iteration
+umlal v16.4s, v29.4h, v29.4h   // Multiply and accumulate, second iteration
+sub w4, w4, #4
+uabdl v27.8h, v6.8b, v7.8b   // Absolute difference, fourth iteration
+umlal v16.4s, v28.4h, v28.4h   // Multiply and accumulate, third iteration
+cmp w4, #4
+umlal v16.4s, v27.4h, v27.4h   // Multiply and accumulate, fourth iteration
+b.ge 1b
+
+cbz w4, 3f
+
+// iterate by one
+2:
+ld1 {v0.s}[0], [x1], x3   // Load pix1
+ld1 {v1.s}[0], [x2], x3   // Load pix2
+uabdl v30.8h, v0.8b, v1.8b
+subs w4, w4, #1
+umlal v16.4s, v30.4h, v30.4h
+
+b.ne 2b
+
+3:
+uaddlv d17, v16.4s   // Add vector
+fmov w0, s17
+
+ret
+endfunc
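
The control flow shared by these routines ("make 4 iterations at once", then "iterate by one") is a four-way unrolled main loop followed by a row-by-row tail. A minimal C sketch of that shape for the 4-wide case is given below, with hypothetical names. It enters the unrolled loop when h >= 4, whereas the assembly's entry check is b.le, so there a block with h == 4 is handled by the tail loop instead; the returned sum is the same either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: squared error of one 4-pixel row. */
static int sse4_row(const uint8_t *a, const uint8_t *b)
{
    int sum = 0;

    for (int x = 0; x < 4; x++) {
        int d = a[x] - b[x];
        sum += d * d;
    }
    return sum;
}

/* Hypothetical sketch of the loop structure: four rows per pass of the
 * main loop (label 1 in the asm), then single rows (label 2). */
static int sse4_unrolled(const uint8_t *pix1, const uint8_t *pix2,
                         ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h >= 0);
    while (h >= 4) {
        sum += sse4_row(pix1,              pix2);
        sum += sse4_row(pix1 + stride,     pix2 + stride);
        sum += sse4_row(pix1 + 2 * stride, pix2 + 2 * stride);
        sum += sse4_row(pix1 + 3 * stride, pix2 + 3 * stride);
        pix1 += 4 * stride;
        pix2 += 4 * stride;
        h    -= 4;
    }
    while (h > 0) {        /* remaining 0..3 rows */
        sum += sse4_row(pix1, pix2);
        pix1 += stride;
        pix2 += stride;
        h--;
    }
    return sum;
}
```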


[FFmpeg-cvslog] aarch64: me_cmp: Fix the indentation of function declarations

2022-08-18 Thread Martin Storsjö

ffmpeg | branch: master | Martin Storsjö  | Thu Aug 18 
12:00:20 2022 +0300| [60109d5b3d7bc88703fd4edfa282f25d0653016b] | committer: 
Martin Storsjö

aarch64: me_cmp: Fix the indentation of function declarations

Signed-off-by: Martin Storsjö 

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=60109d5b3d7bc88703fd4edfa282f25d0653016b
---

 libavcodec/aarch64/me_cmp_init_aarch64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/aarch64/me_cmp_init_aarch64.c b/libavcodec/aarch64/me_cmp_init_aarch64.c
index 79c739914f..dfb9583320 100644
--- a/libavcodec/aarch64/me_cmp_init_aarch64.c
+++ b/libavcodec/aarch64/me_cmp_init_aarch64.c
@@ -26,9 +26,9 @@
 int ff_pix_abs16_neon(MpegEncContext *s, const uint8_t *blk1, const uint8_t *blk2,
   ptrdiff_t stride, int h);
 int ff_pix_abs16_xy2_neon(MpegEncContext *s, const uint8_t *blk1, const uint8_t *blk2,
-  ptrdiff_t stride, int h);
+  ptrdiff_t stride, int h);
 int ff_pix_abs16_x2_neon(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
-  ptrdiff_t stride, int h);
+ ptrdiff_t stride, int h);
 
 av_cold void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx)
 {
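
The init hunks in the diffs above all follow one pattern: a table of function pointers is filled in, and entries are overridden when a CPU feature is detected. The sketch below illustrates that pattern in isolation. Every name in it (CmpTable, cmp_init, FLAG_FAST, the sad*_ functions) is a hypothetical stand-in, not FFmpeg's MECmpContext, have_neon() or av_get_cpu_flags(); the only detail taken from the diffs is that index 0 of sad holds the 16-wide routine and index 1 the 8-wide one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical comparison-function signature (context argument omitted). */
typedef int (*cmp_fn)(const uint8_t *pix1, const uint8_t *pix2,
                      ptrdiff_t stride, int h);

/* Hypothetical dispatch table: sad[0] is 16 wide, sad[1] is 8 wide. */
typedef struct CmpTable {
    cmp_fn sad[2];
} CmpTable;

#define FLAG_FAST 1   /* hypothetical CPU feature bit */

static int sad_w(const uint8_t *p1, const uint8_t *p2,
                 ptrdiff_t stride, int h, int w)
{
    int sum = 0;

    assert(w > 0 && h >= 0);
    for (int y = 0; y < h; y++, p1 += stride, p2 += stride)
        for (int x = 0; x < w; x++)
            sum += abs(p1[x] - p2[x]);
    return sum;
}

/* Generic versions and stand-ins for the optimized ones. */
static int sad16_c(const uint8_t *a, const uint8_t *b, ptrdiff_t s, int h)    { return sad_w(a, b, s, h, 16); }
static int sad8_c(const uint8_t *a, const uint8_t *b, ptrdiff_t s, int h)     { return sad_w(a, b, s, h, 8); }
static int sad16_fast(const uint8_t *a, const uint8_t *b, ptrdiff_t s, int h) { return sad_w(a, b, s, h, 16); }
static int sad8_fast(const uint8_t *a, const uint8_t *b, ptrdiff_t s, int h)  { return sad_w(a, b, s, h, 8); }

/* Fill in defaults, then override when the feature bit is set. */
static void cmp_init(CmpTable *c, int cpu_flags)
{
    c->sad[0] = sad16_c;
    c->sad[1] = sad8_c;
    if (cpu_flags & FLAG_FAST) {
        c->sad[0] = sad16_fast;
        c->sad[1] = sad8_fast;
    }
}
```

Callers only go through the table, so adding an optimized routine means adding a declaration and one assignment, which matches the shape of the small init diffs in these commits.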


