Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx
On Thu, 4 Aug 2022, Michael Niedermayer wrote:

> On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote:
> > On Wed, 13 Jul 2022, Martin Storsjö wrote:
> >
> > > The height is hardcoded in some of the me_cmp functions, but not
> > > in all of them. But in the case of all other functions, it's hardcoded
> > > in the same place in SIMD functions as in the C reference functions,
> > > while this one function differs from the behaviour of the C code.
> > >
> > > (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a
> > > couple other sad8_*_mmx functions with similar hardcoded height.)
> > > ---
> > >  libavcodec/x86/me_cmp_init.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> > > index 61e9396b8f..dcc2621276 100644
> > > --- a/libavcodec/x86/me_cmp_init.c
> > > +++ b/libavcodec/x86/me_cmp_init.c
> > > @@ -202,13 +202,12 @@ static inline int sum_mmx(void)
> > >  static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
> > >                              uint8_t *blk1, ptrdiff_t stride, int h) \
> > >  { \
> > > -    av_assert2(h == 8); \
> > >      __asm__ volatile ( \
> > >          "pxor %%mm7, %%mm7 \n\t" \
> > >          "pxor %%mm6, %%mm6 \n\t" \
> > >          ::); \
> > >  \
> > > -    sad8_4_ ## suf(blk1, blk2, stride, 8); \
> > > +    sad8_4_ ## suf(blk1, blk2, stride, h); \
> > >  \
> > >      return sum_ ## suf(); \
> > >  } \
> > > --
> > > 2.25.1
> >
> > Ping, does this seem reasonable? Michael indicated a desire to make the
> > me_cmp functions more general and flexible than what they are today, and
> > this would be a first step to making checkasm test such cases.
>
> LGTM assuming it doesn't have any problematic performance impact

Thanks - I didn't notice any significant change in the checkasm bench
numbers for it.

// Martin
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx
On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote:
> On Wed, 13 Jul 2022, Martin Storsjö wrote:
>
> > The height is hardcoded in some of the me_cmp functions, but not
> > in all of them. But in the case of all other functions, it's hardcoded
> > in the same place in SIMD functions as in the C reference functions,
> > while this one function differs from the behaviour of the C code.
> >
> > (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a
> > couple other sad8_*_mmx functions with similar hardcoded height.)
> > ---
> >  libavcodec/x86/me_cmp_init.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> > index 61e9396b8f..dcc2621276 100644
> > --- a/libavcodec/x86/me_cmp_init.c
> > +++ b/libavcodec/x86/me_cmp_init.c
> > @@ -202,13 +202,12 @@ static inline int sum_mmx(void)
> >  static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
> >                              uint8_t *blk1, ptrdiff_t stride, int h) \
> >  { \
> > -    av_assert2(h == 8); \
> >      __asm__ volatile ( \
> >          "pxor %%mm7, %%mm7 \n\t" \
> >          "pxor %%mm6, %%mm6 \n\t" \
> >          ::); \
> >  \
> > -    sad8_4_ ## suf(blk1, blk2, stride, 8); \
> > +    sad8_4_ ## suf(blk1, blk2, stride, h); \
> >  \
> >      return sum_ ## suf(); \
> >  } \
> > --
> > 2.25.1
>
> Ping, does this seem reasonable? Michael indicated a desire to make the
> me_cmp functions more general and flexible than what they are today, and
> this would be a first step to making checkasm test such cases.

LGTM assuming it doesn't have any problematic performance impact

thx

[...]
--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.
Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx
On Wed, 13 Jul 2022, Martin Storsjö wrote:

> The height is hardcoded in some of the me_cmp functions, but not
> in all of them. But in the case of all other functions, it's hardcoded
> in the same place in SIMD functions as in the C reference functions,
> while this one function differs from the behaviour of the C code.
>
> (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a
> couple other sad8_*_mmx functions with similar hardcoded height.)
> ---
>  libavcodec/x86/me_cmp_init.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> index 61e9396b8f..dcc2621276 100644
> --- a/libavcodec/x86/me_cmp_init.c
> +++ b/libavcodec/x86/me_cmp_init.c
> @@ -202,13 +202,12 @@ static inline int sum_mmx(void)
>  static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
>                              uint8_t *blk1, ptrdiff_t stride, int h) \
>  { \
> -    av_assert2(h == 8); \
>      __asm__ volatile ( \
>          "pxor %%mm7, %%mm7 \n\t" \
>          "pxor %%mm6, %%mm6 \n\t" \
>          ::); \
>  \
> -    sad8_4_ ## suf(blk1, blk2, stride, 8); \
> +    sad8_4_ ## suf(blk1, blk2, stride, h); \
>  \
>      return sum_ ## suf(); \
>  } \
> --
> 2.25.1

Ping, does this seem reasonable? Michael indicated a desire to make the
me_cmp functions more general and flexible than what they are today, and
this would be a first step to making checkasm test such cases.

// Martin
[FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx
The height is hardcoded in some of the me_cmp functions, but not
in all of them. But in the case of all other functions, it's hardcoded
in the same place in SIMD functions as in the C reference functions,
while this one function differs from the behaviour of the C code.

(Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a
couple other sad8_*_mmx functions with similar hardcoded height.)
---
 libavcodec/x86/me_cmp_init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 61e9396b8f..dcc2621276 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -202,13 +202,12 @@ static inline int sum_mmx(void)
 static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
                             uint8_t *blk1, ptrdiff_t stride, int h) \
 { \
-    av_assert2(h == 8); \
     __asm__ volatile ( \
         "pxor %%mm7, %%mm7 \n\t" \
         "pxor %%mm6, %%mm6 \n\t" \
         ::); \
 \
-    sad8_4_ ## suf(blk1, blk2, stride, 8); \
+    sad8_4_ ## suf(blk1, blk2, stride, h); \
 \
     return sum_ ## suf(); \
 } \
--
2.25.1
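
For readers who don't have the C code in front of them, here is a rough
standalone sketch of the mismatch the patch removes. It is not the FFmpeg
source: sad8_xy2_ref, sad8_xy2_fixed8 and avg4 are hypothetical names, and
the MpegEncContext argument is left out. It assumes the xy2 variant
compares each pixel against the rounded average of the four neighbouring
reference pixels, so the reference block is read h + 1 rows deep and 9
columns wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounded average of four pixels (half-pel interpolation in x and y). */
static inline int avg4(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

/* Hypothetical reference: SAD of an 8-wide, h-tall block against the
 * xy2-interpolated reference. Reads h + 1 rows and 9 columns of ref. */
static int sad8_xy2_ref(const uint8_t *cur, const uint8_t *ref,
                        ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h > 0);
    for (int y = 0; y < h; y++) {
        const uint8_t *r0 = ref + y * stride; /* current reference row */
        const uint8_t *r1 = r0 + stride;      /* row below it          */

        for (int x = 0; x < 8; x++)
            sum += abs(cur[y * stride + x] -
                       avg4(r0[x], r0[x + 1], r1[x], r1[x + 1]));
    }
    return sum;
}

/* Hypothetical stand-in for the pre-patch wrapper: h is ignored and
 * exactly 8 rows are always summed. */
static int sad8_xy2_fixed8(const uint8_t *cur, const uint8_t *ref,
                           ptrdiff_t stride, int h)
{
    (void)h;
    return sad8_xy2_ref(cur, ref, stride, 8);
}
```

With h == 8 both functions agree. For any other height the fixed variant
returns the sum over 8 rows only, which is the kind of difference a
checkasm test over several heights would flag.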