Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2019-01-20 Thread Michael Niedermayer
On Sat, Jan 19, 2019 at 12:28:25AM +0100, Marton Balint wrote:
> 
> 
> On Thu, 17 Jan 2019, Michael Niedermayer wrote:
> 
> >On Wed, Jan 16, 2019 at 08:00:22PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Tue, 15 Jan 2019, Michael Niedermayer wrote:
> >>
> >>>On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
> 
> 
> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
> 
> >On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Wed, 26 Dec 2018, Paul B Mahol wrote:
> >>
> >>>On 12/26/18, Michael Niedermayer  wrote:
> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >On 12/25/18, Michael Niedermayer  wrote:
> >>Fixes: Timeout
> >>Fixes:
> >>11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>Before: Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 11294 ms
> >>After : Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 4249 ms
> >>
> >>Found-by: continuous fuzzing process
> >>https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >>Signed-off-by: Michael Niedermayer 
> >>---
> >>libavutil/imgutils.c | 6 ++
> >>1 file changed, 6 insertions(+)
> >>
> >>diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >>index 4938a7ef67..cc38f1e878 100644
> >>--- a/libavutil/imgutils.c
> >>+++ b/libavutil/imgutils.c
> >>@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
> >>dst_size,
> >>uint8_t *clear,
> >>   }
> >>   } else if (clear_size == 4) {
> >>   uint32_t val = AV_RN32(clear);
> >>+uint64_t val8 = val * 0x10001ULL;
> >>+for (; dst_size >= 32; dst_size -= 32) {
> >>+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >>+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >>+dst += 32;
> >>+}
> >>   for (; dst_size >= 4; dst_size -= 4) {
> >>   AV_WN32(dst, val);
> >>   dst += 4;
> >>--
> >>2.20.1
> >>
> >
> >NAK, implement special memset function instead.
> 
> I can move the added loop into a seperate function, if thats what you
> suggest ?
> >>>
> >>>No, don't do that.
> >>>
> All the code is already in a "special" memset though, this is
> memset_bytes()
> 
> >>>
> >>>I guess function is less useful if its static. So any duplicate should
> >>>be avoided in codebase.
> >>
> >>Isn't av_memcpy_backptr does almost exactly what is needed here? That 
> >>can
> >>also be optimized further if needed.
> >
> >av_memcpy_backptr() copies data with overlap, its more like a recursive
> >memmove().
> 
> So? As far as I see the memset_bytes function in imgutils.c can be 
> replaced
> with this:
> 
>    if (clear_size > dst_size)
>    clear_size = dst_size;
>    memcpy(dst, clear, clear_size);
>    av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
> 
> I am not against an av_memset_bytes API addition, but I believe it should
> share code with av_memcpy_backptr to avoid duplication.
> >>>
> >>>ive implemented this, it does not seem to be really faster in the testcase
> >>
> >>I guess it is not faster because you have not applied your original
> >>optimalization to fill32 in libavutil/mem.c. Could you compare speed after
> >>optimizing that the same way your original patch did it with imgutils
> >>memset_bytes?
> >
> >sure, that makes it faster:
> 
> Thanks, both patches LGTM.

will apply

thanks

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2019-01-18 Thread Marton Balint



On Thu, 17 Jan 2019, Michael Niedermayer wrote:


On Wed, Jan 16, 2019 at 08:00:22PM +0100, Marton Balint wrote:



On Tue, 15 Jan 2019, Michael Niedermayer wrote:


On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:



On Fri, 28 Dec 2018, Michael Niedermayer wrote:


On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:



On Wed, 26 Dec 2018, Paul B Mahol wrote:


On 12/26/18, Michael Niedermayer  wrote:

On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:

On 12/25/18, Michael Niedermayer  wrote:

Fixes: Timeout
Fixes:
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 11294 ms
After : Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 4249 ms

Found-by: continuous fuzzing process
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer 
---
libavutil/imgutils.c | 6 ++
1 file changed, 6 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..cc38f1e878 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
dst_size,
uint8_t *clear,
   }
   } else if (clear_size == 4) {
   uint32_t val = AV_RN32(clear);
+uint64_t val8 = val * 0x10001ULL;
+for (; dst_size >= 32; dst_size -= 32) {
+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
+dst += 32;
+}
   for (; dst_size >= 4; dst_size -= 4) {
   AV_WN32(dst, val);
   dst += 4;
--
2.20.1



NAK, implement special memset function instead.


I can move the added loop into a seperate function, if thats what you
suggest ?


No, don't do that.


All the code is already in a "special" memset though, this is
memset_bytes()



I guess function is less useful if its static. So any duplicate should
be avoided in codebase.


Isn't av_memcpy_backptr does almost exactly what is needed here? That can
also be optimized further if needed.


av_memcpy_backptr() copies data with overlap, its more like a recursive
memmove().


So? As far as I see the memset_bytes function in imgutils.c can be replaced
with this:

   if (clear_size > dst_size)
   clear_size = dst_size;
   memcpy(dst, clear, clear_size);
   av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);

I am not against an av_memset_bytes API addition, but I believe it should
share code with av_memcpy_backptr to avoid duplication.


ive implemented this, it does not seem to be really faster in the testcase


I guess it is not faster because you have not applied your original
optimalization to fill32 in libavutil/mem.c. Could you compare speed after
optimizing that the same way your original patch did it with imgutils
memset_bytes?


sure, that makes it faster:


Thanks, both patches LGTM.

Regards,
Marton
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2019-01-16 Thread Marton Balint



On Tue, 15 Jan 2019, Michael Niedermayer wrote:


On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:



On Fri, 28 Dec 2018, Michael Niedermayer wrote:


On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:



On Wed, 26 Dec 2018, Paul B Mahol wrote:


On 12/26/18, Michael Niedermayer  wrote:

On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:

On 12/25/18, Michael Niedermayer  wrote:

Fixes: Timeout
Fixes:
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 11294 ms
After : Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 4249 ms

Found-by: continuous fuzzing process
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer 
---
libavutil/imgutils.c | 6 ++
1 file changed, 6 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..cc38f1e878 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
dst_size,
uint8_t *clear,
}
} else if (clear_size == 4) {
uint32_t val = AV_RN32(clear);
+uint64_t val8 = val * 0x10001ULL;
+for (; dst_size >= 32; dst_size -= 32) {
+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
+dst += 32;
+}
for (; dst_size >= 4; dst_size -= 4) {
AV_WN32(dst, val);
dst += 4;
--
2.20.1



NAK, implement special memset function instead.


I can move the added loop into a seperate function, if thats what you
suggest ?


No, don't do that.


All the code is already in a "special" memset though, this is
memset_bytes()



I guess function is less useful if its static. So any duplicate should
be avoided in codebase.


Isn't av_memcpy_backptr does almost exactly what is needed here? That can
also be optimized further if needed.


av_memcpy_backptr() copies data with overlap, its more like a recursive
memmove().


So? As far as I see the memset_bytes function in imgutils.c can be replaced
with this:

if (clear_size > dst_size)
clear_size = dst_size;
memcpy(dst, clear, clear_size);
av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);

I am not against an av_memset_bytes API addition, but I believe it should
share code with av_memcpy_backptr to avoid duplication.


ive implemented this, it does not seem to be really faster in the testcase


I guess it is not faster because you have not applied your 
original optimalization to fill32 in libavutil/mem.c. Could you compare 
speed after optimizing that the same way your original patch did it with 
imgutils memset_bytes?


Thanks,
Marton
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2019-01-15 Thread Michael Niedermayer
On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
> 
> 
> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
> 
> >On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Wed, 26 Dec 2018, Paul B Mahol wrote:
> >>
> >>>On 12/26/18, Michael Niedermayer  wrote:
> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >On 12/25/18, Michael Niedermayer  wrote:
> >>Fixes: Timeout
> >>Fixes:
> >>11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>Before: Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 11294 ms
> >>After : Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 4249 ms
> >>
> >>Found-by: continuous fuzzing process
> >>https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >>Signed-off-by: Michael Niedermayer 
> >>---
> >> libavutil/imgutils.c | 6 ++
> >> 1 file changed, 6 insertions(+)
> >>
> >>diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >>index 4938a7ef67..cc38f1e878 100644
> >>--- a/libavutil/imgutils.c
> >>+++ b/libavutil/imgutils.c
> >>@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
> >>dst_size,
> >>uint8_t *clear,
> >> }
> >> } else if (clear_size == 4) {
> >> uint32_t val = AV_RN32(clear);
> >>+uint64_t val8 = val * 0x10001ULL;
> >>+for (; dst_size >= 32; dst_size -= 32) {
> >>+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >>+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >>+dst += 32;
> >>+}
> >> for (; dst_size >= 4; dst_size -= 4) {
> >> AV_WN32(dst, val);
> >> dst += 4;
> >>--
> >>2.20.1
> >>
> >
> >NAK, implement special memset function instead.
> 
> I can move the added loop into a seperate function, if thats what you
> suggest ?
> >>>
> >>>No, don't do that.
> >>>
> All the code is already in a "special" memset though, this is
> memset_bytes()
> 
> >>>
> >>>I guess function is less useful if its static. So any duplicate should
> >>>be avoided in codebase.
> >>
> >>Isn't av_memcpy_backptr does almost exactly what is needed here? That can
> >>also be optimized further if needed.
> >
> >av_memcpy_backptr() copies data with overlap, its more like a recursive
> >memmove().
> 
> So? As far as I see the memset_bytes function in imgutils.c can be replaced
> with this:
> 
> if (clear_size > dst_size)
> clear_size = dst_size;
> memcpy(dst, clear, clear_size);
> av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
> 
> I am not against an av_memset_bytes API addition, but I believe it should
> share code with av_memcpy_backptr to avoid duplication.

ive implemented this, it does not seem to be really faster in the testcase

patches below for reference:
From 30549c4e674abce48608d99ed3e7f8ccbd557ada Mon Sep 17 00:00:00 2001
From: Michael Niedermayer 
Date: Tue, 25 Dec 2018 23:15:20 +0100
Subject: [PATCH] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

Fixes: Timeout
Fixes: 
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed 
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 
in 11294 ms
After : Executed 
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 
in 4249 ms

Found-by: continuous fuzzing process 
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer 
---
 libavutil/imgutils.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..6c0d3950de 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -529,6 +529,14 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, 
uint8_t *clear,
 }
 } else if (clear_size == 4) {
 uint32_t val = AV_RN32(clear);
+#if HAVE_FAST_64BIT
+uint64_t val8 = val * 0x10001ULL;
+for (; dst_size >= 32; dst_size -= 32) {
+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
+dst += 32;
+}
+#endif
 for (; dst_size >= 4; dst_size -= 4) {
 AV_WN32(dst, val);
 dst += 4;
-- 
2.20.1


From 8e5140bf92d7e41090bfca1c6163f9c428402904 Mon Sep 17 00:00:00 2001
From: Michael Niedermayer 
Date: Tue, 25 Dec 2018 23:15:20 +0100
Subject: [PATCH] avutil/imgutils: Optimize memset_bytes() by using
 av_memcpy_backptr()

This is strongly based on code by Marton Balint

Fixes: Timeout
Fixes: 
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed 

Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-30 Thread Michael Niedermayer
On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
> 
> 
> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
> 
> >On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Wed, 26 Dec 2018, Paul B Mahol wrote:
> >>
> >>>On 12/26/18, Michael Niedermayer  wrote:
> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >On 12/25/18, Michael Niedermayer  wrote:
> >>Fixes: Timeout
> >>Fixes:
> >>11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>Before: Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 11294 ms
> >>After : Executed
> >>clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>in 4249 ms
> >>
> >>Found-by: continuous fuzzing process
> >>https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >>Signed-off-by: Michael Niedermayer 
> >>---
> >> libavutil/imgutils.c | 6 ++
> >> 1 file changed, 6 insertions(+)
> >>
> >>diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >>index 4938a7ef67..cc38f1e878 100644
> >>--- a/libavutil/imgutils.c
> >>+++ b/libavutil/imgutils.c
> >>@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
> >>dst_size,
> >>uint8_t *clear,
> >> }
> >> } else if (clear_size == 4) {
> >> uint32_t val = AV_RN32(clear);
> >>+uint64_t val8 = val * 0x10001ULL;
> >>+for (; dst_size >= 32; dst_size -= 32) {
> >>+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >>+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >>+dst += 32;
> >>+}
> >> for (; dst_size >= 4; dst_size -= 4) {
> >> AV_WN32(dst, val);
> >> dst += 4;
> >>--
> >>2.20.1
> >>
> >
> >NAK, implement special memset function instead.
> 
> I can move the added loop into a seperate function, if thats what you
> suggest ?
> >>>
> >>>No, don't do that.
> >>>
> All the code is already in a "special" memset though, this is
> memset_bytes()
> 
> >>>
> >>>I guess function is less useful if its static. So any duplicate should
> >>>be avoided in codebase.
> >>
> >>Isn't av_memcpy_backptr does almost exactly what is needed here? That can
> >>also be optimized further if needed.
> >
> >av_memcpy_backptr() copies data with overlap, its more like a recursive
> >memmove().
> 
> So? As far as I see the memset_bytes function in imgutils.c can be replaced
> with this:
> 
> if (clear_size > dst_size)
> clear_size = dst_size;
> memcpy(dst, clear, clear_size);
> av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
> 
> I am not against an av_memset_bytes API addition, but I believe it should
> share code with av_memcpy_backptr to avoid duplication.

Iam a bit concerned that combining them could result in lower speed 
but i can surely combine them if people prefer?

thx

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-30 Thread Marton Balint



On Fri, 28 Dec 2018, Michael Niedermayer wrote:


On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:



On Wed, 26 Dec 2018, Paul B Mahol wrote:


On 12/26/18, Michael Niedermayer  wrote:

On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:

On 12/25/18, Michael Niedermayer  wrote:

Fixes: Timeout
Fixes:
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 11294 ms
After : Executed
clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
in 4249 ms

Found-by: continuous fuzzing process
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer 
---
 libavutil/imgutils.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..cc38f1e878 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
dst_size,
uint8_t *clear,
 }
 } else if (clear_size == 4) {
 uint32_t val = AV_RN32(clear);
+uint64_t val8 = val * 0x10001ULL;
+for (; dst_size >= 32; dst_size -= 32) {
+AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
+AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
+dst += 32;
+}
 for (; dst_size >= 4; dst_size -= 4) {
 AV_WN32(dst, val);
 dst += 4;
--
2.20.1



NAK, implement special memset function instead.


I can move the added loop into a seperate function, if thats what you
suggest ?


No, don't do that.


All the code is already in a "special" memset though, this is
memset_bytes()



I guess function is less useful if its static. So any duplicate should
be avoided in codebase.


Isn't av_memcpy_backptr does almost exactly what is needed here? That can
also be optimized further if needed.


av_memcpy_backptr() copies data with overlap, its more like a recursive
memmove().


So? As far as I see the memset_bytes function in imgutils.c can be 
replaced with this:


if (clear_size > dst_size)
clear_size = dst_size;
memcpy(dst, clear, clear_size);
av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);

I am not against an av_memset_bytes API addition, but I believe it should 
share code with av_memcpy_backptr to avoid duplication.


Regards,
Marton
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-28 Thread Michael Niedermayer
On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> 
> 
> On Wed, 26 Dec 2018, Paul B Mahol wrote:
> 
> >On 12/26/18, Michael Niedermayer  wrote:
> >>On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >>>On 12/25/18, Michael Niedermayer  wrote:
>  Fixes: Timeout
>  Fixes:
>  11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>  Before: Executed
>  clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>  in 11294 ms
>  After : Executed
>  clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>  in 4249 ms
> 
>  Found-by: continuous fuzzing process
>  https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>  Signed-off-by: Michael Niedermayer 
>  ---
>   libavutil/imgutils.c | 6 ++
>   1 file changed, 6 insertions(+)
> 
>  diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
>  index 4938a7ef67..cc38f1e878 100644
>  --- a/libavutil/imgutils.c
>  +++ b/libavutil/imgutils.c
>  @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
>  dst_size,
>  uint8_t *clear,
>   }
>   } else if (clear_size == 4) {
>   uint32_t val = AV_RN32(clear);
>  +uint64_t val8 = val * 0x10001ULL;
>  +for (; dst_size >= 32; dst_size -= 32) {
>  +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
>  +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
>  +dst += 32;
>  +}
>   for (; dst_size >= 4; dst_size -= 4) {
>   AV_WN32(dst, val);
>   dst += 4;
>  --
>  2.20.1
> 
> >>>
> >>>NAK, implement special memset function instead.
> >>
> >>I can move the added loop into a seperate function, if thats what you
> >>suggest ?
> >
> >No, don't do that.
> >
> >>All the code is already in a "special" memset though, this is
> >>memset_bytes()
> >>
> >
> >I guess function is less useful if its static. So any duplicate should
> >be avoided in codebase.
> 
> Isn't av_memcpy_backptr does almost exactly what is needed here? That can
> also be optimized further if needed.

av_memcpy_backptr() copies data with overlap, its more like a recursive
memmove(). 

thx
[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-27 Thread Michael Niedermayer
On Wed, Dec 26, 2018 at 12:30:05AM +, Kieran Kunhya wrote:
> On Tue, 25 Dec 2018 at 22:17 Michael Niedermayer 
> wrote:
> 
> > Fixes: Timeout
> > Fixes:
> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > Before: Executed
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > in 11294 ms
> > After : Executed
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > in 4249 ms
> >
> 
> So basically you go from one arbitrary timeout to another. I assume your
> fuzzer timeout is 5 or 10 seconds and so you manage to make it pass?

I think the timeout used is 25sec. I assume the server that detected it
was maybe slower than the machine i used. Its even quite possible that
variation between runs, due to hardware, load or other issues to cause
timeouts to disappear.

But none of this really should matter in the context of this patchset.
Which simply makes the code faster.

Thanks

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many things microsoft did are stupid, but not doing something just because
microsoft did it is even more stupid. If everything ms did were stupid they
would be bankrupt already.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Paul B Mahol
On 12/26/18, Michael Niedermayer  wrote:
> On Wed, Dec 26, 2018 at 10:02:15PM +0100, Paul B Mahol wrote:
>> On 12/26/18, Michael Niedermayer  wrote:
>> > On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
>> >> On 12/25/18, Michael Niedermayer  wrote:
>> >> > Fixes: Timeout
>> >> > Fixes:
>> >> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> >> > Before: Executed
>> >> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> >> > in 11294 ms
>> >> > After : Executed
>> >> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> >> > in 4249 ms
>> >> >
>> >> > Found-by: continuous fuzzing process
>> >> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>> >> > Signed-off-by: Michael Niedermayer 
>> >> > ---
>> >> >  libavutil/imgutils.c | 6 ++
>> >> >  1 file changed, 6 insertions(+)
>> >> >
>> >> > diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
>> >> > index 4938a7ef67..cc38f1e878 100644
>> >> > --- a/libavutil/imgutils.c
>> >> > +++ b/libavutil/imgutils.c
>> >> > @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
>> >> > dst_size,
>> >> > uint8_t *clear,
>> >> >  }
>> >> >  } else if (clear_size == 4) {
>> >> >  uint32_t val = AV_RN32(clear);
>> >> > +uint64_t val8 = val * 0x10001ULL;
>> >> > +for (; dst_size >= 32; dst_size -= 32) {
>> >> > +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
>> >> > +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
>> >> > +dst += 32;
>> >> > +}
>> >> >  for (; dst_size >= 4; dst_size -= 4) {
>> >> >  AV_WN32(dst, val);
>> >> >  dst += 4;
>> >> > --
>> >> > 2.20.1
>> >> >
>> >>
>> >> NAK, implement special memset function instead.
>> >
>> > I can move the added loop into a seperate function, if thats what you
>> > suggest ?
>>
>> No, don't do that.
>>
>> > All the code is already in a "special" memset though, this is
>> > memset_bytes()
>> >
>>
>> I guess function is less useful if its static. So any duplicate should
>> be avoided in codebase.
>
> i can make it non static and export it if you want ?
>

Yes. Also make it used where there is already similar code doing same.

> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> If you fake or manipulate statistics in a paper in physics you will never
> get a job again.
> If you fake or manipulate statistics in a paper in medicin you will get
> a job for life at the pharma industry.
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Marton Balint



On Wed, 26 Dec 2018, Paul B Mahol wrote:


On 12/26/18, Michael Niedermayer  wrote:

On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:

On 12/25/18, Michael Niedermayer  wrote:
> Fixes: Timeout
> Fixes:
> 
11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> Before: Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 11294 ms
> After : Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 4249 ms
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavutil/imgutils.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> index 4938a7ef67..cc38f1e878 100644
> --- a/libavutil/imgutils.c
> +++ b/libavutil/imgutils.c
> @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
> dst_size,
> uint8_t *clear,
>  }
>  } else if (clear_size == 4) {
>  uint32_t val = AV_RN32(clear);
> +uint64_t val8 = val * 0x10001ULL;
> +for (; dst_size >= 32; dst_size -= 32) {
> +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> +dst += 32;
> +}
>  for (; dst_size >= 4; dst_size -= 4) {
>  AV_WN32(dst, val);
>  dst += 4;
> --
> 2.20.1
>

NAK, implement special memset function instead.


I can move the added loop into a seperate function, if thats what you
suggest ?


No, don't do that.


All the code is already in a "special" memset though, this is
memset_bytes()



I guess function is less useful if its static. So any duplicate should
be avoided in codebase.


Isn't av_memcpy_backptr does almost exactly what is needed here? That can 
also be optimized further if needed.


Thanks,
Marton
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Michael Niedermayer
On Wed, Dec 26, 2018 at 10:02:15PM +0100, Paul B Mahol wrote:
> On 12/26/18, Michael Niedermayer  wrote:
> > On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >> On 12/25/18, Michael Niedermayer  wrote:
> >> > Fixes: Timeout
> >> > Fixes:
> >> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >> > Before: Executed
> >> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >> > in 11294 ms
> >> > After : Executed
> >> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >> > in 4249 ms
> >> >
> >> > Found-by: continuous fuzzing process
> >> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >> > Signed-off-by: Michael Niedermayer 
> >> > ---
> >> >  libavutil/imgutils.c | 6 ++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >> > index 4938a7ef67..cc38f1e878 100644
> >> > --- a/libavutil/imgutils.c
> >> > +++ b/libavutil/imgutils.c
> >> > @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
> >> > dst_size,
> >> > uint8_t *clear,
> >> >  }
> >> >  } else if (clear_size == 4) {
> >> >  uint32_t val = AV_RN32(clear);
> >> > +uint64_t val8 = val * 0x10001ULL;
> >> > +for (; dst_size >= 32; dst_size -= 32) {
> >> > +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >> > +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >> > +dst += 32;
> >> > +}
> >> >  for (; dst_size >= 4; dst_size -= 4) {
> >> >  AV_WN32(dst, val);
> >> >  dst += 4;
> >> > --
> >> > 2.20.1
> >> >
> >>
> >> NAK, implement special memset function instead.
> >
> > I can move the added loop into a seperate function, if thats what you
> > suggest ?
> 
> No, don't do that.
> 
> > All the code is already in a "special" memset though, this is
> > memset_bytes()
> >
> 
> I guess function is less useful if its static. So any duplicate should
> be avoided in codebase.

i can make it non static and export it if you want ?

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicin you will get
a job for life at the pharma industry.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Paul B Mahol
On 12/26/18, Michael Niedermayer  wrote:
> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
>> On 12/25/18, Michael Niedermayer  wrote:
>> > Fixes: Timeout
>> > Fixes:
>> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> > Before: Executed
>> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> > in 11294 ms
>> > After : Executed
>> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>> > in 4249 ms
>> >
>> > Found-by: continuous fuzzing process
>> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>> > Signed-off-by: Michael Niedermayer 
>> > ---
>> >  libavutil/imgutils.c | 6 ++
>> >  1 file changed, 6 insertions(+)
>> >
>> > diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
>> > index 4938a7ef67..cc38f1e878 100644
>> > --- a/libavutil/imgutils.c
>> > +++ b/libavutil/imgutils.c
>> > @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t
>> > dst_size,
>> > uint8_t *clear,
>> >  }
>> >  } else if (clear_size == 4) {
>> >  uint32_t val = AV_RN32(clear);
>> > +uint64_t val8 = val * 0x10001ULL;
>> > +for (; dst_size >= 32; dst_size -= 32) {
>> > +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
>> > +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
>> > +dst += 32;
>> > +}
>> >  for (; dst_size >= 4; dst_size -= 4) {
>> >  AV_WN32(dst, val);
>> >  dst += 4;
>> > --
>> > 2.20.1
>> >
>>
>> NAK, implement special memset function instead.
>
> I can move the added loop into a seperate function, if thats what you
> suggest ?

No, don't do that.

> All the code is already in a "special" memset though, this is
> memset_bytes()
>

I guess function is less useful if its static. So any duplicate should
be avoided in codebase.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Michael Niedermayer
On Tue, Dec 25, 2018 at 10:12:13PM -0300, James Almer wrote:
> On 12/25/2018 7:15 PM, Michael Niedermayer wrote:
> > Fixes: Timeout
> > Fixes: 
> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > Before: Executed 
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >  in 11294 ms
> > After : Executed 
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >  in 4249 ms
> > 
> > Found-by: continuous fuzzing process 
> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer 
> > ---
> >  libavutil/imgutils.c | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> > index 4938a7ef67..cc38f1e878 100644
> > --- a/libavutil/imgutils.c
> > +++ b/libavutil/imgutils.c
> > @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t 
> > dst_size, uint8_t *clear,
> >  }
> >  } else if (clear_size == 4) {
> >  uint32_t val = AV_RN32(clear);
> > +uint64_t val8 = val * 0x10001ULL;
> > +for (; dst_size >= 32; dst_size -= 32) {
> > +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> > +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> > +dst += 32;
> > +}
> 
> This should be wrapped with a HAVE_FAST_64BIT preprocessor check.

will do so


> 
> Also, is it much slower if you also write one per loop like everywhere
> else in the function? I'd prefer if things are consistent.

as in the patch:
 3955 ms  3954 ms  3954 ms
 
with one write per iteration:
 5705 ms  5635 ms  5629 ms

 
> Similarly, you could add four and eight bytes loops to the clear_size ==
> 2 case above.

yes i can if you want me to?, but i have no testcase for that so it would be 
untested

thx


[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Michael Niedermayer
On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> On 12/25/18, Michael Niedermayer  wrote:
> > Fixes: Timeout
> > Fixes:
> > 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > Before: Executed
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > in 11294 ms
> > After : Executed
> > clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> > in 4249 ms
> >
> > Found-by: continuous fuzzing process
> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer 
> > ---
> >  libavutil/imgutils.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> > index 4938a7ef67..cc38f1e878 100644
> > --- a/libavutil/imgutils.c
> > +++ b/libavutil/imgutils.c
> > @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size,
> > uint8_t *clear,
> >  }
> >  } else if (clear_size == 4) {
> >  uint32_t val = AV_RN32(clear);
> > +uint64_t val8 = val * 0x10001ULL;
> > +for (; dst_size >= 32; dst_size -= 32) {
> > +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> > +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> > +dst += 32;
> > +}
> >  for (; dst_size >= 4; dst_size -= 4) {
> >  AV_WN32(dst, val);
> >  dst += 4;
> > --
> > 2.20.1
> >
> 
> NAK, implement special memset function instead.

I can move the added loop into a seperate function, if thats what you
suggest ?
All the code is already in a "special" memset though, this is
memset_bytes()

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

"I am not trying to be anyone's saviour, I'm trying to think about the
 future and not be sad" - Elon Musk



signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-26 Thread Paul B Mahol
On 12/25/18, Michael Niedermayer  wrote:
> Fixes: Timeout
> Fixes:
> 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> Before: Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 11294 ms
> After : Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 4249 ms
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavutil/imgutils.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> index 4938a7ef67..cc38f1e878 100644
> --- a/libavutil/imgutils.c
> +++ b/libavutil/imgutils.c
> @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size,
> uint8_t *clear,
>  }
>  } else if (clear_size == 4) {
>  uint32_t val = AV_RN32(clear);
> +uint64_t val8 = val * 0x10001ULL;
> +for (; dst_size >= 32; dst_size -= 32) {
> +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> +dst += 32;
> +}
>  for (; dst_size >= 4; dst_size -= 4) {
>  AV_WN32(dst, val);
>  dst += 4;
> --
> 2.20.1
>

NAK, implement special memset function instead.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-25 Thread James Almer
On 12/25/2018 7:15 PM, Michael Niedermayer wrote:
> Fixes: Timeout
> Fixes: 
> 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> Before: Executed 
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>  in 11294 ms
> After : Executed 
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>  in 4249 ms
> 
> Found-by: continuous fuzzing process 
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavutil/imgutils.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> index 4938a7ef67..cc38f1e878 100644
> --- a/libavutil/imgutils.c
> +++ b/libavutil/imgutils.c
> @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, 
> uint8_t *clear,
>  }
>  } else if (clear_size == 4) {
>  uint32_t val = AV_RN32(clear);
> +uint64_t val8 = val * 0x10001ULL;
> +for (; dst_size >= 32; dst_size -= 32) {
> +AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> +AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> +dst += 32;
> +}

This should be wrapped with a HAVE_FAST_64BIT preprocessor check.

Also, is it much slower if you also write one per loop like everywhere
else in the function? I'd prefer if things are consistent.
Similarly, you could add four and eight bytes loops to the clear_size ==
2 case above.

>  for (; dst_size >= 4; dst_size -= 4) {
>  AV_WN32(dst, val);
>  dst += 4;
> 

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

2018-12-25 Thread Kieran Kunhya
On Tue, 25 Dec 2018 at 22:17 Michael Niedermayer 
wrote:

> Fixes: Timeout
> Fixes:
> 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> Before: Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 11294 ms
> After : Executed
> clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> in 4249 ms
>

So basically you go from one arbitrary timeout to another. I assume your
fuzzer timeout is 5 or 10 seconds and so you manage to make it pass?

Kieran
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel