Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-20 Thread Wu Jianhua
James Almer:
>On 11/20/2021 5:42 PM, Wu Jianhua wrote:
>> James Almer:
>> On 11/4/2021 1:18 AM, Wu Jianhua wrote:
 Performance data(Less is better):
   exposure_sse:   500491
>>
>>> You reported a better result in the first patch.
>>
>> For they are tested on different baseline, I think it might be better to 
>> only compare these two values.
>>
   exposure_avx2:  449122
>>
>>> This looks like a really low speed up for a function that processes
>>>   twice the amount of floats per loop.
>>

 Signed-off-by: Wu Jianhua 
 ---
libavfilter/x86/vf_exposure.asm| 15 +++
libavfilter/x86/vf_exposure_init.c |  6 ++
2 files changed, 21 insertions(+)

 diff --git a/libavfilter/x86/vf_exposure.asm 
 b/libavfilter/x86/vf_exposure.asm
 index 3351c6fb3b..f271167805 100644
 --- a/libavfilter/x86/vf_exposure.asm
 +++ b/libavfilter/x86/vf_exposure.asm
 @@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
VBROADCASTSS m1, xmm1
%endif

 +%if cpuflag(fma3) || cpuflag(fma4)
>>
>>> Remove the fma4 check if you're not using it.
>>
>> No problem. Avx2 flag is only initialized with fma3, so the fma4 is 
>> redundant indeed.
>>
 +mulps   m0, m0, m1 ; black * scale
 +%endif
 +
.loop:
 +%if cpuflag(fma3) || cpuflag(fma4)
 +movam2, m0
 +vfmsub231ps m2, m1, [ptrq]
 +movu[ptrq], m2
> >
>>> Have you tried to not use FMA for this and just kept the sub + mul even
>>> for AVX2 and see how it performs?
>>
>> Yeah. Definitely. I have had sufficient tests before. The first version is 
>> kept sub + mul
>> for AVX2. After that, I keep trying to find a way out to speed up it 
>> further. Using FMA
>> here would be faster than sub + mul indeed, precisely, improving by 4%-10% 
>> approximately.
>> Not that much better, but still an optimal way I found at the present.

> I tried the checkasm test you wrote and when i made the AVX2 version use
> sub + mul instead of vfmsub231ps i noticed that i could change the
> epsilon value to FLT_EPSILON instead of 0.01f and the test would still
> succeed, meaning the output of the version using vfmsub231ps deviates a
> bit from the normal sub + mul one.

> The speed up is pretty small, so it may be worth just using the sub +
> mul version instead.

Yeah. Small, but it’s not called just one time. Many a little makes a mickle, 
isn’t it?
I might be more prefer to keep this.


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] avcodec/h263: Fix global-buffer-overflow with noout flag2 set

2021-11-20 Thread Andreas Rheinhardt
h263_get_motion_length() forgot to take an absolute value;
as a consequence, a negative index was used to access an array.
This leads to potential crashes, but mostly it just accesses what
is to the left of ff_mvtab (unless one uses ASAN), thereby defeating
the purpose of the AV_CODEC_FLAG2_NO_OUTPUT because the sizes of
the returned packets differ from the sizes the encoder would actually
have produced.

Signed-off-by: Andreas Rheinhardt 
---
Do we need this AV_CODEC_FLAG2_NO_OUTPUT codepath in h263.h and
mpeg4videoenc.c at all? It seems to have never worked and the speed
difference to encoding with output is negligible. (And I have not even
investigated whether the checks for whether said flag is set impact
the performance of ordinary encoding.)

 libavcodec/h263.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libavcodec/h263.h b/libavcodec/h263.h
index 70fd1ffdc0..d6bef8318d 100644
--- a/libavcodec/h263.h
+++ b/libavcodec/h263.h
@@ -100,15 +100,16 @@ void ff_h263_encode_motion(PutBitContext *pb, int val, 
int f_code);
 
 
 static inline int h263_get_motion_length(int val, int f_code){
-int l, bit_size, code;
+int bit_size, code, sign;
 
 if (val == 0) {
 return 1; /* ff_mvtab[0][1] */
 } else {
 bit_size = f_code - 1;
 /* modulo encoding */
-l= INT_BIT - 6 - bit_size;
-val = (val<>l;
+val  = sign_extend(val, 6 + bit_size);
+sign = val >> 31;
+val  = (val ^ sign) - sign; /* val = FFABS(val) */
 val--;
 code = (val >> bit_size) + 1;
 
-- 
2.30.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 1/2] av(codec|device): Don't cast pointers to int

2021-11-20 Thread Michael Niedermayer
On Sat, Nov 20, 2021 at 05:32:56PM +0100, Andreas Rheinhardt wrote:
> C99/C11 6.3.2.3 5: "Any pointer type may be converted to an integer
> type. [...] If the result cannot be represented in the integer type,
> the behavior is undefined." So stop casting pointers to int; use
> uintptr_t instead.
> 
> Signed-off-by: Andreas Rheinhardt 

LGTM

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesnt understand is a sign that one is less smart
than the original author, trying to rewrite it will not make it better.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-20 Thread James Almer

On 11/20/2021 5:42 PM, Wu Jianhua wrote:

James Almer:
On 11/4/2021 1:18 AM, Wu Jianhua wrote:

Performance data(Less is better):
  exposure_sse:   500491



You reported a better result in the first patch.


For they are tested on different baseline, I think it might be better to only 
compare these two values.


  exposure_avx2:  449122



This looks like a really low speed up for a function that processes
  twice the amount of floats per loop.




Signed-off-by: Wu Jianhua 
---
   libavfilter/x86/vf_exposure.asm| 15 +++
   libavfilter/x86/vf_exposure_init.c |  6 ++
   2 files changed, 21 insertions(+)

diff --git a/libavfilter/x86/vf_exposure.asm b/libavfilter/x86/vf_exposure.asm
index 3351c6fb3b..f271167805 100644
--- a/libavfilter/x86/vf_exposure.asm
+++ b/libavfilter/x86/vf_exposure.asm
@@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
   VBROADCASTSS m1, xmm1
   %endif

+%if cpuflag(fma3) || cpuflag(fma4)



Remove the fma4 check if you're not using it.


No problem. Avx2 flag is only initialized with fma3, so the fma4 is redundant 
indeed.


+mulps   m0, m0, m1 ; black * scale
+%endif
+
   .loop:
+%if cpuflag(fma3) || cpuflag(fma4)
+movam2, m0
+vfmsub231ps m2, m1, [ptrq]
+movu[ptrq], m2



Have you tried to not use FMA for this and just kept the sub + mul even
for AVX2 and see how it performs?


Yeah. Definitely. I have had sufficient tests before. The first version is kept 
sub + mul
for AVX2. After that, I keep trying to find a way out to speed up it further. 
Using FMA
here would be faster than sub + mul indeed, precisely, improving by 4%-10% 
approximately.
Not that much better, but still an optimal way I found at the present.


I tried the checkasm test you wrote and when i made the AVX2 version use 
sub + mul instead of vfmsub231ps i noticed that i could change the 
epsilon value to FLT_EPSILON instead of 0.01f and the test would still 
succeed, meaning the output of the version using vfmsub231ps deviates a 
bit from the normal sub + mul one.


The speed up is pretty small, so it may be worth just using the sub + 
mul version instead.





  +%else
   movum2, [ptrq]
   subps   m2, m2, m0
   mulps   m2, m2, m1
   movu[ptrq], m2
  +%endif
   add   ptrq, mmsize
   sublengthq, mmsize/4

  @@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
   %if ARCH_X86_64
   INIT_XMM sse
   EXPOSURE
  +
  +%if HAVE_AVX2_EXTERNAL
  +INIT_YMM avx2
  +EXPOSURE
  +%endif
   %endif
  diff --git a/libavfilter/x86/vf_exposure_init.c 
b/libavfilter/x86/vf_exposure_init.c
  index de1b360f6c..80dae6164e 100644
  --- a/libavfilter/x86/vf_exposure_init.c
  +++ b/libavfilter/x86/vf_exposure_init.c
  @@ -24,6 +24,7 @@
   #include "libavfilter/exposure.h"

   void ff_exposure_sse(float *ptr, int length, float black, float scale);
  +void ff_exposure_avx2(float *ptr, int length, float black, float scale);

   av_cold void ff_exposure_init_x86(ExposureContext *s)
  {
  @@ -32,5 +33,10 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
   #if ARCH_X86_64
   if (EXTERNAL_SSE(cpu_flags))
   s->exposure_func = ff_exposure_sse;
  +
  +#if HAVE_AVX2_EXTERNAL



No need for this preprocessor check.


Got it. I’ll update it.

Thanks for your review.
Jianhua

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".



___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-20 Thread Wu Jianhua
James Almer:
On 11/4/2021 1:18 AM, Wu Jianhua wrote:
>> Performance data(Less is better):
>>  exposure_sse:   500491

>You reported a better result in the first patch.

For they are tested on different baseline, I think it might be better to only 
compare these two values.

>>  exposure_avx2:  449122

> This looks like a really low speed up for a function that processes
>  twice the amount of floats per loop.

> >
> > Signed-off-by: Wu Jianhua 
> > ---
> >   libavfilter/x86/vf_exposure.asm| 15 +++
> >   libavfilter/x86/vf_exposure_init.c |  6 ++
> >   2 files changed, 21 insertions(+)
> >
> > diff --git a/libavfilter/x86/vf_exposure.asm 
> > b/libavfilter/x86/vf_exposure.asm
> > index 3351c6fb3b..f271167805 100644
> > --- a/libavfilter/x86/vf_exposure.asm
> > +++ b/libavfilter/x86/vf_exposure.asm
> > @@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
> >   VBROADCASTSS m1, xmm1
> >   %endif
> >
> > +%if cpuflag(fma3) || cpuflag(fma4)

> Remove the fma4 check if you're not using it.

No problem. Avx2 flag is only initialized with fma3, so the fma4 is redundant 
indeed.

> > +mulps   m0, m0, m1 ; black * scale
> > +%endif
> > +
> >   .loop:
> > +%if cpuflag(fma3) || cpuflag(fma4)
> > +movam2, m0
> > +vfmsub231ps m2, m1, [ptrq]
> > +movu[ptrq], m2

> Have you tried to not use FMA for this and just kept the sub + mul even
> for AVX2 and see how it performs?

Yeah. Definitely. I have had sufficient tests before. The first version is kept 
sub + mul
for AVX2. After that, I keep trying to find a way out to speed up it further. 
Using FMA
here would be faster than sub + mul indeed, precisely, improving by 4%-10% 
approximately.
Not that much better, but still an optimal way I found at the present.

>>  +%else
>>   movum2, [ptrq]
>>   subps   m2, m2, m0
>>   mulps   m2, m2, m1
>>   movu[ptrq], m2
>>  +%endif
>>   add   ptrq, mmsize
>>   sublengthq, mmsize/4
>>
>>  @@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
>>   %if ARCH_X86_64
>>   INIT_XMM sse
>>   EXPOSURE
>>  +
>>  +%if HAVE_AVX2_EXTERNAL
>>  +INIT_YMM avx2
>>  +EXPOSURE
>>  +%endif
>>   %endif
>>  diff --git a/libavfilter/x86/vf_exposure_init.c 
>> b/libavfilter/x86/vf_exposure_init.c
>>  index de1b360f6c..80dae6164e 100644
>>  --- a/libavfilter/x86/vf_exposure_init.c
>>  +++ b/libavfilter/x86/vf_exposure_init.c
>>  @@ -24,6 +24,7 @@
>>   #include "libavfilter/exposure.h"
>>
>>   void ff_exposure_sse(float *ptr, int length, float black, float scale);
>>  +void ff_exposure_avx2(float *ptr, int length, float black, float scale);
>>
>>   av_cold void ff_exposure_init_x86(ExposureContext *s)
> >  {
>>  @@ -32,5 +33,10 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
>>   #if ARCH_X86_64
>>   if (EXTERNAL_SSE(cpu_flags))
>>   s->exposure_func = ff_exposure_sse;
>>  +
>>  +#if HAVE_AVX2_EXTERNAL

> No need for this preprocessor check.

Got it. I’ll update it.

Thanks for your review.
Jianhua

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 5/5] avcodec/h263: Inline constant

2021-11-20 Thread Michael Niedermayer
On Fri, Nov 19, 2021 at 08:16:52PM +0100, Andreas Rheinhardt wrote:
> Signed-off-by: Andreas Rheinhardt 
> ---
>  libavcodec/h263.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/h263.h b/libavcodec/h263.h
> index 491f2e0aac..70fd1ffdc0 100644
> --- a/libavcodec/h263.h
> +++ b/libavcodec/h263.h
> @@ -103,7 +103,7 @@ static inline int h263_get_motion_length(int val, int 
> f_code){
>  int l, bit_size, code;
>  
>  if (val == 0) {
> -return ff_mvtab[0][1];
> +return 1; /* ff_mvtab[0][1] */
>  } else {
>  bit_size = f_code - 1;
>  /* modulo encoding */

ok but i think this is not used in any speed relevant pathes

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm: add check for vf_exposure

2021-11-20 Thread James Almer

On 11/4/2021 1:18 AM, Wu Jianhua wrote:

Signed-off-by: Wu Jianhua 
---
  tests/checkasm/Makefile  |  1 +
  tests/checkasm/checkasm.c|  3 ++
  tests/checkasm/checkasm.h|  1 +
  tests/checkasm/vf_exposure.c | 62 
  tests/fate/checkasm.mak  |  1 +
  5 files changed, 68 insertions(+)
  create mode 100644 tests/checkasm/vf_exposure.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 4ef5fa87da..7b86ffca6b 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -37,6 +37,7 @@ AVFILTEROBJS-$(CONFIG_AFIR_FILTER) += af_afir.o
  AVFILTEROBJS-$(CONFIG_BLEND_FILTER) += vf_blend.o
  AVFILTEROBJS-$(CONFIG_COLORSPACE_FILTER) += vf_colorspace.o
  AVFILTEROBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
+AVFILTEROBJS-$(CONFIG_EXPOSURE_FILTER)   += vf_exposure.o
  AVFILTEROBJS-$(CONFIG_GBLUR_FILTER)  += vf_gblur.o
  AVFILTEROBJS-$(CONFIG_HFLIP_FILTER)  += vf_hflip.o
  AVFILTEROBJS-$(CONFIG_THRESHOLD_FILTER)  += vf_threshold.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index b1353f7cbe..50961d9961 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -169,6 +169,9 @@ static const struct {
  #if CONFIG_EQ_FILTER
  { "vf_eq", checkasm_check_vf_eq },
  #endif
+#if CONFIG_EXPOSURE_FILTER
+{ "vf_exposure", checkasm_check_vf_exposure },
+#endif
  #if CONFIG_GBLUR_FILTER
  { "vf_gblur", checkasm_check_vf_gblur },
  #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 68b0697d3e..b402894ad3 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -78,6 +78,7 @@ void checkasm_check_utvideodsp(void);
  void checkasm_check_v210dec(void);
  void checkasm_check_v210enc(void);
  void checkasm_check_vf_eq(void);
+void checkasm_check_vf_exposure(void);
  void checkasm_check_vf_gblur(void);
  void checkasm_check_vf_hflip(void);
  void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/vf_exposure.c b/tests/checkasm/vf_exposure.c
new file mode 100644
index 00..14f0efed5f
--- /dev/null
+++ b/tests/checkasm/vf_exposure.c
@@ -0,0 +1,62 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+#include 
+#include "checkasm.h"
+#include "libavfilter/exposure.h"
+
+#define PIXELS 256
+#define BUF_SIZE (PIXELS * 4)
+
+#define randomize_buffers(buf, size) \
+do { \
+int j;   \
+float *tmp_buf = (float *)buf;   \
+for (j = 0; j < size; j++)   \
+tmp_buf[j] = (float)(rnd() & 0xFF); \
+} while (0)
+
+void checkasm_check_vf_exposure(void)
+{
+float *dst_ref = av_malloc(BUF_SIZE);
+float *dst_new = av_malloc(BUF_SIZE);


Use stack for these. They are 1k bytes each.


+ExposureContext s;
+
+s.exposure = 0.5f;
+s.black = 0.1f;
+s.scale = 1.f / (exp2f(-s.exposure) - s.black);
+
+randomize_buffers(dst_ref, PIXELS);
+memcpy(dst_new, dst_ref, BUF_SIZE);
+
+ff_exposure_init(&s);
+
+if (check_func(s.exposure_func, "exposure")) {
+declare_func(void, float *dst, int length, float black, float scale);
+call_ref(dst_ref, PIXELS, s.black, s.scale);
+call_new(dst_new, PIXELS, s.black, s.scale);
+if (!float_near_abs_eps_array(dst_ref, dst_new, 0.01f, PIXELS))
+fail();
+bench_new(dst_new, PIXELS, s.black, s.scale);
+}
+report("exposure");
+
+av_freep(&dst_ref);
+av_freep(&dst_new);
+}
diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
index 6e7edbe655..4d4cd6cc88 100644
--- a/tests/fate/checkasm.mak
+++ b/tests/fate/checkasm.mak
@@ -34,6 +34,7 @@ FATE_CHECKASM = fate-checkasm-aacpsdsp
  \
  fate-checkasm-vf_blend  \
  fate-checkasm-vf_colorspace \
  fate-checkasm-vf_eq \
+fate-checkasm-vf_exposure   \
  fate-checkasm-vf_gblur  \
  fate-checkasm-vf_hflip

Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-20 Thread James Almer

On 11/4/2021 1:18 AM, Wu Jianhua wrote:

Performance data(Less is better):
 exposure_sse:   500491


You reported a better result in the first patch.


 exposure_avx2:  449122


This looks like a really low speed up for a function that processes 
twice the amount of floats per loop.




Signed-off-by: Wu Jianhua 
---
  libavfilter/x86/vf_exposure.asm| 15 +++
  libavfilter/x86/vf_exposure_init.c |  6 ++
  2 files changed, 21 insertions(+)

diff --git a/libavfilter/x86/vf_exposure.asm b/libavfilter/x86/vf_exposure.asm
index 3351c6fb3b..f271167805 100644
--- a/libavfilter/x86/vf_exposure.asm
+++ b/libavfilter/x86/vf_exposure.asm
@@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
  VBROADCASTSS m1, xmm1
  %endif
  
+%if cpuflag(fma3) || cpuflag(fma4)


Remove the fma4 check if you're not using it.


+mulps   m0, m0, m1 ; black * scale
+%endif
+
  .loop:
+%if cpuflag(fma3) || cpuflag(fma4)
+movam2, m0
+vfmsub231ps m2, m1, [ptrq]
+movu[ptrq], m2


Have you tried to not use FMA for this and just keep the sub + mul even 
for AVX2 and see how it performs?



+%else
  movum2, [ptrq]
  subps   m2, m2, m0
  mulps   m2, m2, m1
  movu[ptrq], m2
+%endif
  add   ptrq, mmsize
  sublengthq, mmsize/4
  
@@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale

  %if ARCH_X86_64
  INIT_XMM sse
  EXPOSURE
+
+%if HAVE_AVX2_EXTERNAL
+INIT_YMM avx2
+EXPOSURE
+%endif
  %endif
diff --git a/libavfilter/x86/vf_exposure_init.c 
b/libavfilter/x86/vf_exposure_init.c
index de1b360f6c..80dae6164e 100644
--- a/libavfilter/x86/vf_exposure_init.c
+++ b/libavfilter/x86/vf_exposure_init.c
@@ -24,6 +24,7 @@
  #include "libavfilter/exposure.h"
  
  void ff_exposure_sse(float *ptr, int length, float black, float scale);

+void ff_exposure_avx2(float *ptr, int length, float black, float scale);
  
  av_cold void ff_exposure_init_x86(ExposureContext *s)

  {
@@ -32,5 +33,10 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
  #if ARCH_X86_64
  if (EXTERNAL_SSE(cpu_flags))
  s->exposure_func = ff_exposure_sse;
+
+#if HAVE_AVX2_EXTERNAL


No need for this preprocessor check.


+if (EXTERNAL_AVX2_FAST(cpu_flags))
+s->exposure_func = ff_exposure_avx2;
+#endif
  #endif
  }



___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-11-20 Thread James Almer

On 11/20/2021 1:46 PM, James Almer wrote:

On 11/4/2021 1:18 AM, Wu Jianhua wrote:
diff --git a/libavfilter/x86/vf_exposure.asm 
b/libavfilter/x86/vf_exposure.asm

new file mode 100644
index 00..3351c6fb3b
--- /dev/null
+++ b/libavfilter/x86/vf_exposure.asm
@@ -0,0 +1,55 @@
+;* 


+;* x86-optimized functions for exposure filter
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
02110-1301 USA
+;** 


+
+%include "libavutil/x86/x86util.asm"
+
+SECTION .text
+
+;*** 


+; void ff_exposure(float *ptr, int length, float black, float scale);
+;*** 


+%macro EXPOSURE 0
+cglobal exposure, 2, 2, 4, ptr, length, black, scale
+    movsxdifnidn lengthq, lengthd
+%if WIN64
+    VBROADCASTSS m0, xmm2
+    VBROADCASTSS m1, xmm3
+%else
+    VBROADCASTSS m0, xmm0
+    VBROADCASTSS m1, xmm1
+%endif
+
+.loop:
+    movu    m2, [ptrq]
+    subps   m2, m2, m0
+    mulps   m2, m2, m1
+    movu    [ptrq], m2
+    add   ptrq, mmsize
+    sub    lengthq, mmsize/4
+
+    jg .loop
+
+    RET
+%endmacro
+
+%if ARCH_X86_64


Why x86_64 only?


+INIT_XMM sse
+EXPOSURE


Is it not possible to add an AVX version to process eight floats per 
loop? The function is already written in a way that you would only need 
to do


%if HAVE_AVX_EXTERNAL
INIT_YMM avx
EXPOSURE
%endif

For it. And ptr alignment is not a problem seeing you're using unaligned 
movs.


Ignore this part. I need to remember to check entire patchsets before 
starting to send replies...

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-11-20 Thread James Almer

On 11/4/2021 1:18 AM, Wu Jianhua wrote:

Performance data(Less is better):
exposure_c:857394
exposure_sse:  327589

Signed-off-by: Wu Jianhua 
---
  libavfilter/exposure.h | 36 +++
  libavfilter/vf_exposure.c  | 36 +--
  libavfilter/x86/Makefile   |  2 ++
  libavfilter/x86/vf_exposure.asm| 55 ++
  libavfilter/x86/vf_exposure_init.c | 36 +++
  5 files changed, 147 insertions(+), 18 deletions(-)
  create mode 100644 libavfilter/exposure.h
  create mode 100644 libavfilter/x86/vf_exposure.asm
  create mode 100644 libavfilter/x86/vf_exposure_init.c

diff --git a/libavfilter/exposure.h b/libavfilter/exposure.h
new file mode 100644
index 00..e76a517826
--- /dev/null
+++ b/libavfilter/exposure.h
@@ -0,0 +1,36 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFILTER_EXPOSURE_H
+#define AVFILTER_EXPOSURE_H
+#include "avfilter.h"
+
+typedef struct ExposureContext {
+const AVClass *class;
+
+float exposure;
+float black;
+float scale;
+
+void (*exposure_func)(float *ptr, int length, float black, float scale);
+} ExposureContext;
+
+void ff_exposure_init(ExposureContext *s);
+void ff_exposure_init_x86(ExposureContext *s);
+
+#endif
diff --git a/libavfilter/vf_exposure.c b/libavfilter/vf_exposure.c
index 108fba7930..045ae710d3 100644
--- a/libavfilter/vf_exposure.c
+++ b/libavfilter/vf_exposure.c
@@ -26,23 +26,20 @@
  #include "formats.h"
  #include "internal.h"
  #include "video.h"
+#include "exposure.h"
  
-typedef struct ExposureContext {

-const AVClass *class;
-
-float exposure;
-float black;
+static void exposure_c(float *ptr, int length, float black, float scale)
+{
+int i;
  
-float scale;

-int (*do_slice)(AVFilterContext *s, void *arg,
-int jobnr, int nb_jobs);
-} ExposureContext;
+for (i = 0; i < length; i++)
+ptr[i] = (ptr[i] - black) * scale;
+}
  
  static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)

  {
  ExposureContext *s = ctx->priv;
  AVFrame *frame = arg;
-const int width = frame->width;
  const int height = frame->height;
  const int slice_start = (height * jobnr) / nb_jobs;
  const int slice_end = (height * (jobnr + 1)) / nb_jobs;
@@ -52,24 +49,27 @@ static int exposure_slice(AVFilterContext *ctx, void *arg, 
int jobnr, int nb_job
  for (int p = 0; p < 3; p++) {
  const int linesize = frame->linesize[p] / 4;
  float *ptr = (float *)frame->data[p] + slice_start * linesize;
-for (int y = slice_start; y < slice_end; y++) {
-for (int x = 0; x < width; x++)
-ptr[x] = (ptr[x] - black) * scale;
-
-ptr += linesize;
-}
+s->exposure_func(ptr, linesize * (slice_end - slice_start), black, 
scale);
  }
  
  return 0;

  }
  
+void ff_exposure_init(ExposureContext *s)

+{
+s->exposure_func = exposure_c;
+
+if (ARCH_X86)
+ff_exposure_init_x86(s);
+}
+
  static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
  {
  AVFilterContext *ctx = inlink->dst;
  ExposureContext *s = ctx->priv;
  
  s->scale = 1.f / (exp2f(-s->exposure) - s->black);

-ff_filter_execute(ctx, s->do_slice, frame, NULL,
+ff_filter_execute(ctx, exposure_slice, frame, NULL,
FFMIN(frame->height, ff_filter_get_nb_threads(ctx)));
  
  return ff_filter_frame(ctx->outputs[0], frame);

@@ -80,7 +80,7 @@ static av_cold int config_input(AVFilterLink *inlink)
  AVFilterContext *ctx = inlink->dst;
  ExposureContext *s = ctx->priv;
  
-s->do_slice = exposure_slice;

+ff_exposure_init(s);
  
  return 0;

  }
diff --git a/libavfilter/x86/Makefile b/libavfilter/x86/Makefile
index a29941eaeb..e84a388aa5 100644
--- a/libavfilter/x86/Makefile
+++ b/libavfilter/x86/Makefile
@@ -8,6 +8,7 @@ OBJS-$(CONFIG_BWDIF_FILTER)  += 
x86/vf_bwdif_init.o
  OBJS-$(CONFIG_COLORSPACE_FILTER) += x86/colorspacedsp_init.o
  OBJS-$(CONFIG_CONVOLUTION_FILTER)+= x86/vf_convolution_init.o
  OBJS-$(CONFIG_EQ_FILTER) += x86/vf_eq_init.o
+OBJS-$(CONFIG_EXPOSURE_FILTER) 

[FFmpeg-devel] [PATCH 2/2] Revert "Disable warnings for casting pointers to integers, there is nothing wrong with that."

2021-11-20 Thread Andreas Rheinhardt
This reverts commit 5258f64a14713499cf84840b3ab3a1ee7cdcaeb8.
The premise of said commit (that conversions from pointer to int
are ok) is wrong: C99/C11 6.3.2.3 5: "Any pointer type may be converted
to an integer type. [...] If the result cannot be represented in the
integer type, the behavior is undefined." (C90 6.3.4 contains a similar
restriction.) So don't disable -Wpointer-to-int-cast.

Signed-off-by: Andreas Rheinhardt 
---
 configure | 1 -
 1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index d068b11073..82e60b5da3 100755
--- a/configure
+++ b/configure
@@ -6903,7 +6903,6 @@ check_cflags -Wwrite-strings
 check_cflags -Wtype-limits
 check_cflags -Wundef
 check_cflags -Wmissing-prototypes
-check_cflags -Wno-pointer-to-int-cast
 check_cflags -Wstrict-prototypes
 check_cflags -Wempty-body
 
-- 
2.30.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH 1/2] av(codec|device): Don't cast pointers to int

2021-11-20 Thread Andreas Rheinhardt
C99/C11 6.3.2.3 5: "Any pointer type may be converted to an integer
type. [...] If the result cannot be represented in the integer type,
the behavior is undefined." So stop casting pointers to int; use
uintptr_t instead.

Signed-off-by: Andreas Rheinhardt 
---
When did these asserts actually help the last time?

 libavcodec/dvdec.c | 4 ++--
 libavcodec/x86/h264_qpel.c | 8 
 libavcodec/x86/me_cmp_init.c   | 6 +++---
 libavcodec/x86/mpegvideoenc_template.c | 2 +-
 libavdevice/xcbgrab.c  | 2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/libavcodec/dvdec.c b/libavcodec/dvdec.c
index a7424fd1d4..b72a67d01c 100644
--- a/libavcodec/dvdec.c
+++ b/libavcodec/dvdec.c
@@ -399,8 +399,8 @@ static int dv_decode_video_segment(AVCodecContext *avctx, 
void *arg)
 int retried = 0;
 int sta;
 
-av_assert1int) mb_bit_buffer) & 7) == 0);
-av_assert1int) vs_bit_buffer) & 7) == 0);
+av_assert1uintptr_t) mb_bit_buffer) & 7) == 0);
+av_assert1uintptr_t) vs_bit_buffer) & 7) == 0);
 
 retry:
 
diff --git a/libavcodec/x86/h264_qpel.c b/libavcodec/x86/h264_qpel.c
index a2d8a64976..320d98933a 100644
--- a/libavcodec/x86/h264_qpel.c
+++ b/libavcodec/x86/h264_qpel.c
@@ -338,7 +338,7 @@ static void OPNAME ## h264_qpel ## SIZE ## _mc21_ ## 
MMX(uint8_t *dst, const uin
 LOCAL_ALIGNED(ALIGN, uint8_t, temp, [SIZE*(SIZE<8?12:24)*2 + SIZE*SIZE]);\
 uint8_t * const halfHV= temp;\
 int16_t * const halfV= (int16_t*)(temp + SIZE*SIZE);\
-av_assert2(((int)temp & 7) == 0);\
+av_assert2(((uintptr_t)temp & 7) == 0);\
 ff_put_h264_qpel ## SIZE ## _hv_lowpass_ ## MMX(halfHV, halfV, src, SIZE, 
SIZE, stride);\
 ff_ ## OPNAME ## h264_qpel ## SIZE ## _h_lowpass_l2_ ## MMX(dst, src, 
halfHV, stride, SIZE);\
 }\
@@ -348,7 +348,7 @@ static void OPNAME ## h264_qpel ## SIZE ## _mc23_ ## 
MMX(uint8_t *dst, const uin
 LOCAL_ALIGNED(ALIGN, uint8_t, temp, [SIZE*(SIZE<8?12:24)*2 + SIZE*SIZE]);\
 uint8_t * const halfHV= temp;\
 int16_t * const halfV= (int16_t*)(temp + SIZE*SIZE);\
-av_assert2(((int)temp & 7) == 0);\
+av_assert2(((uintptr_t)temp & 7) == 0);\
 ff_put_h264_qpel ## SIZE ## _hv_lowpass_ ## MMX(halfHV, halfV, src, SIZE, 
SIZE, stride);\
 ff_ ## OPNAME ## h264_qpel ## SIZE ## _h_lowpass_l2_ ## MMX(dst, 
src+stride, halfHV, stride, SIZE);\
 }\
@@ -358,7 +358,7 @@ static void OPNAME ## h264_qpel ## SIZE ## _mc12_ ## 
MMX(uint8_t *dst, const uin
 LOCAL_ALIGNED(ALIGN, uint8_t, temp, [SIZE*(SIZE<8?12:24)*2 + SIZE*SIZE]);\
 uint8_t * const halfHV= temp;\
 int16_t * const halfV= (int16_t*)(temp + SIZE*SIZE);\
-av_assert2(((int)temp & 7) == 0);\
+av_assert2(((uintptr_t)temp & 7) == 0);\
 ff_put_h264_qpel ## SIZE ## _hv_lowpass_ ## MMX(halfHV, halfV, src, SIZE, 
SIZE, stride);\
 ff_ ## OPNAME ## pixels ## SIZE ## _l2_shift5_mmxext(dst, halfV+2, halfHV, 
stride, SIZE, SIZE);\
 }\
@@ -368,7 +368,7 @@ static void OPNAME ## h264_qpel ## SIZE ## _mc32_ ## 
MMX(uint8_t *dst, const uin
 LOCAL_ALIGNED(ALIGN, uint8_t, temp, [SIZE*(SIZE<8?12:24)*2 + SIZE*SIZE]);\
 uint8_t * const halfHV= temp;\
 int16_t * const halfV= (int16_t*)(temp + SIZE*SIZE);\
-av_assert2(((int)temp & 7) == 0);\
+av_assert2(((uintptr_t)temp & 7) == 0);\
 ff_put_h264_qpel ## SIZE ## _hv_lowpass_ ## MMX(halfHV, halfV, src, SIZE, 
SIZE, stride);\
 ff_ ## OPNAME ## pixels ## SIZE ## _l2_shift5_mmxext(dst, halfV+3, halfHV, 
stride, SIZE, SIZE);\
 }\
diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 701eb1ab25..9af911bb88 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -131,7 +131,7 @@ static int vsad_intra16_mmx(MpegEncContext *v, uint8_t 
*pix, uint8_t *dummy,
 {
 int tmp;
 
-av_assert2int) pix) & 7) == 0);
+av_assert2(((uintptr_t) pix & 7) == 0);
 av_assert2((stride & 7) == 0);
 
 #define SUM(in0, in1, out0, out1)   \
@@ -195,8 +195,8 @@ static int vsad16_mmx(MpegEncContext *v, uint8_t *pix1, 
uint8_t *pix2,
 {
 int tmp;
 
-av_assert2int) pix1) & 7) == 0);
-av_assert2int) pix2) & 7) == 0);
+av_assert2(((uintptr_t)pix1 & 7) == 0);
+av_assert2(((uintptr_t)pix2 & 7) == 0);
 av_assert2((stride & 7) == 0);
 
 #define SUM(in0, in1, out0, out1)   \
diff --git a/libavcodec/x86/mpegvideoenc_template.c 
b/libavcodec/x86/mpegvideoenc_template.c
index b32b1b0350..30c06a6b2c 100644
--- a/libavcodec/x86/mpegvideoenc_template.c
+++ b/libavcodec/x86/mpegvideoenc_template.c
@@ -109,7 +109,7 @@ static int RENAME(dct_quantize)(MpegEncContext *s,
 const uint16_t *qmat, *bias;
 LOCAL_ALIGNED_16(int16_t, temp_block, [64]);
 
-av_assert2((7&(int)(&temp_block[0])) == 0); //did gcc align it correctly?
+av_assert2((7&(uintptr_t)(&temp_block[0])) == 0); //did gcc align it 
correctly?
 
 //s->fdct (block);
 RENAME_FDCT(ff_fdct)(block); // cannot be an

[FFmpeg-devel] [PATCH] avformat/cafdec: Avoid unnecessary avio_tell() calls

2021-11-20 Thread Andreas Rheinhardt
Signed-off-by: Andreas Rheinhardt 
---
 libavformat/cafdec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavformat/cafdec.c b/libavformat/cafdec.c
index 3ff0558afe..4b5b15b58d 100644
--- a/libavformat/cafdec.c
+++ b/libavformat/cafdec.c
@@ -224,7 +224,7 @@ static int read_pakt_chunk(AVFormatContext *s, int64_t size)
 av_log(s, AV_LOG_ERROR, "error reading packet table\n");
 return AVERROR_INVALIDDATA;
 }
-avio_skip(pb, ccount + size - avio_tell(pb));
+avio_seek(pb, ccount + size, SEEK_SET);
 
 caf->num_bytes = pos;
 return 0;
@@ -331,7 +331,7 @@ static int read_header(AVFormatContext *s)
 if (size > 0 && (pb->seekable & AVIO_SEEKABLE_NORMAL)) {
 if (pos > INT64_MAX - size)
 return AVERROR_INVALIDDATA;
-avio_skip(pb, FFMAX(0, pos + size - avio_tell(pb)));
+avio_seek(pb, pos + size, SEEK_SET);
 }
 }
 
-- 
2.30.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-11-20 Thread Paul B Mahol
will apply soon
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] fftools/ffmpeg: Take type limitations of AVFifo API into account

2021-11-20 Thread Andreas Rheinhardt
The types used by the AVFifo API are inconsistent:
av_fifo_(space|size)() returns an int; av_fifo_alloc() takes an
unsigned, other parts use size_t. This commit therefore ensures
that the size of the muxing_queue FIFO never exceeds INT_MAX.

While just at it, also make sure not to call av_fifo_size()
unnecessarily often.

Signed-off-by: Andreas Rheinhardt 
---
 fftools/ffmpeg.c | 9 -
 fftools/ffmpeg_opt.c | 1 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index c9a9cdfcd6..662cd7fda3 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -753,14 +753,13 @@ static void write_packet(OutputFile *of, AVPacket *pkt, 
OutputStream *ost, int u
 AVPacket *tmp_pkt;
 /* the muxer is not initialized yet, buffer the packet */
 if (!av_fifo_space(ost->muxing_queue)) {
+size_t cur_size = av_fifo_size(ost->muxing_queue);
 unsigned int are_we_over_size =
 (ost->muxing_queue_data_size + pkt->size) > 
ost->muxing_queue_data_threshold;
-int new_size = are_we_over_size ?
-   FFMIN(2 * av_fifo_size(ost->muxing_queue),
- ost->max_muxing_queue_size) :
-   2 * av_fifo_size(ost->muxing_queue);
+size_t limit= are_we_over_size ? ost->max_muxing_queue_size : 
INT_MAX;
+size_t new_size = FFMIN(2 * cur_size, limit);
 
-if (new_size <= av_fifo_size(ost->muxing_queue)) {
+if (new_size <= cur_size) {
 av_log(NULL, AV_LOG_ERROR,
"Too many packets buffered for output stream %d:%d.\n",
ost->file_index, ost->st->index);
diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c
index 6732a29625..de6af0aea9 100644
--- a/fftools/ffmpeg_opt.c
+++ b/fftools/ffmpeg_opt.c
@@ -1637,6 +1637,7 @@ static OutputStream *new_output_stream(OptionsContext *o, 
AVFormatContext *oc, e
 
 ost->max_muxing_queue_size = 128;
 MATCH_PER_STREAM_OPT(max_muxing_queue_size, i, ost->max_muxing_queue_size, 
oc, st);
+ost->max_muxing_queue_size = FFMIN(ost->max_muxing_queue_size, INT_MAX / 
sizeof(ost->pkt));
 ost->max_muxing_queue_size *= sizeof(ost->pkt);
 
 ost->muxing_queue_data_size = 0;
-- 
2.30.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/2] libRIST: allow setting fifo size and fail on overflow.

2021-11-20 Thread Gijs Peskens

Thanks for reviewing again!

On 18-11-2021 20:35, Marton Balint wrote:



On Wed, 17 Nov 2021, Gijs Peskens wrote:


Introduce fifo_size and overrun_nonfatal params to configure fifo buffer
behavior.
Fifo size is used to left shift 2, since libRIST only accepts powers 
of 2.


Can you please make fifo_size simply mean the number of packets? User 
facing options should be easily understandable by the user. Even if 
librist (at the moment) only supports power of two values.


Yes, will do. libRIST will error out with a log message when it's not 
accepted, so I'd forego checking if the chosen number is a power of 2 on 
the ffmpeg side.


Use newly introduced RIST_DATA_FLAGS_OVERFLOW flag to check for overrun
and error out in that case.
---
libavformat/librist.c | 36 
1 file changed, 36 insertions(+)

diff --git a/libavformat/librist.c b/libavformat/librist.c
index 378b635ea7..6f68868f3c 100644
--- a/libavformat/librist.c
+++ b/libavformat/librist.c
@@ -43,6 +43,9 @@
    ((patch) + ((minor)* 0x100) + ((major) *0x1))
#define FF_LIBRIST_VERSION 
FF_LIBRIST_MAKE_VERSION(LIBRIST_API_VERSION_MAJOR, 
LIBRIST_API_VERSION_MINOR, LIBRIST_API_VERSION_PATCH)

#define FF_LIBRIST_VERSION_41 FF_LIBRIST_MAKE_VERSION(4, 1, 0)
+#define FF_LIBRIST_VERSION_42 FF_LIBRIST_MAKE_VERSION(4, 2, 0)
+
+#define FF_LIBRIST_FIFO_DEFAULT_SHIFT 13

typedef struct RISTContext {
    const AVClass *class;
@@ -52,6 +55,8 @@ typedef struct RISTContext {
    int packet_size;
    int log_level;
    int encryption;
+    int fifo_shift;
+    bool overrun_nonfatal;
    char *secret;

    struct rist_logging_settings logging_settings;
@@ -70,6 +75,8 @@ static const AVOption librist_options[] = {
    { "main",    NULL,  0, AV_OPT_TYPE_CONST, 
{.i64=RIST_PROFILE_MAIN}, 0, 0, .flags = D|E, "profile" },
    { "advanced",    NULL,  0, AV_OPT_TYPE_CONST, 
{.i64=RIST_PROFILE_ADVANCED}, 0, 0, .flags = D|E, "profile" },
    { "buffer_size", "set buffer_size in ms", OFFSET(buffer_size), 
AV_OPT_TYPE_INT, {.i64=0}, 0, 3, .flags = D|E },
+    { "fifo_size", "Set libRIST fifo buffer size, applied as: 
buffer_size=2^fifo_size", OFFSET(fifo_shift), AV_OPT_TYPE_INT, 
{.i64=FF_LIBRIST_FIFO_DEFAULT_SHIFT}, 10, 63, .flags = D|E },
+    { "overrun_nonfatal", "survive in case of libRIST receiving 
circular buffer overrun", OFFSET(overrun_nonfatal), AV_OPT_TYPE_BOOL, 
{.i64 = 0}, 0, 1,    D },


This changes existing default behaviour slightly by exiting in case of 
an overflow. I guess that it makes it consistent with udp.c, so fine 
with me if you think it is better that way.
If desired I can reverse the defaults so existing workflows are not 
impacted (though I doubt many exist due to the young age of the librist 
module).

Though the consistency with udp feels more logical to me.


    { "pkt_size",    "set packet size", OFFSET(packet_size), 
AV_OPT_TYPE_INT, {.i64=1316},  1, 
MAX_PAYLOAD_SIZE,    .flags = D|E },
    { "log_level",   "set loglevel",    OFFSET(log_level), 
AV_OPT_TYPE_INT,   {.i64=RIST_LOG_INFO},    -1, INT_MAX, .flags = 
D|E },
    { "secret", "set encryption secret",OFFSET(secret), 
AV_OPT_TYPE_STRING,{.str=NULL},  0, 0, .flags = D|E },
@@ -161,6 +168,19 @@ static int librist_open(URLContext *h, const 
char *uri, int flags)

    if (ret < 0)
    goto err;

+    //Prior to 4.2.0 there was a bug in libRIST which made this call 
always fail.

+#if FF_LIBRIST_VERSION >= FF_LIBRIST_VERSION_42
+    if (flags & AVIO_FLAG_READ) {
+    ret = rist_receiver_set_output_fifo_size(s->ctx, 2 << 
s->fifo_shift);

+    if (ret != 0)
+    goto err;
+    }
+#else
+    if (s->fifo_buffer_size != FF_LIBRIST_FIFO_DEFAULT) {
+    av_log(h, AV_LOG_ERROR, "libRIST prior to 0.2.7 has a bug 
which fails setting the fifo buffer size");

+    }
+#endif
+
    if (((s->encryption == 128 || s->encryption == 256) && 
!s->secret) ||
    ((peer_config->key_size == 128 || peer_config->key_size == 
256) && !peer_config->secret[0])) {
    av_log(h, AV_LOG_ERROR, "secret is mandatory if encryption is 
enabled\n");
@@ -223,8 +243,24 @@ static int librist_read(URLContext *h, uint8_t 
*buf, int size)

    return AVERROR_EXTERNAL;
    }

+#if FF_LIBRIST_VERSION >= FF_LIBRIST_VERSION_42
+    if (data_block->flags & RIST_DATA_FLAGS_OVERFLOW == 
RIST_DATA_FLAGS_OVERFLOW) {

+    if (!s->overrun_nonfatal) {
+    av_log(h, AV_LOG_ERROR, "Fifo buffer overrun. "
+    "To avoid, increase fifo_size URL option. "
+    "To survive in such case, use overrun_nonfatal 
option\n");

+    size = AVERROR(EIO);
+    goto out_free;
+    } else {
+    av_log(h, AV_LOG_WARNING, "Fifo buffer buffer overrun. "
+    "Surviving due to overrun_nonfatal option\n");
+    }
+    }
+#endif
+
    size = data_block->payload_len;
    memcpy(buf, data_block->payload, size);
+out_free:
#if FF

Re: [FFmpeg-devel] [PATCH] avcodec/smcenc: Avoid dangling pointers in context

2021-11-20 Thread Andreas Rheinhardt
Paul B Mahol:
> Commit message is misleading, better to use: switch to stack for
> PutByteContext
> 
Will do. Although the commit message is correct: The pointers contained
in the PutByteContext become dangling after each encode call.

- Andreas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] avcodec/smcenc: Avoid dangling pointers in context

2021-11-20 Thread Paul B Mahol
Commit message is misleading, better to use: switch to stack for
PutByteContext
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".