On Sun, Nov 02, 2014 at 07:55:35PM -0300, James Almer wrote:
> On 02/11/14 7:43 PM, Clément Bœsch wrote:
> > On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> >> Two to four times faster depending on instruction set, block size and
> >> channel count.
> >>
> >> Signed-off-by: James Almer <jamr...@gmail.com>
> >> ---
> >> TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels.
> >>       AVX2 and maybe MMX versions.
> >>       Planar?
> >>
> >>  libavcodec/arm/flacdsp_init_arm.c |   2 +-
> >>  libavcodec/flacdec.c              |   6 +-
> >>  libavcodec/flacdsp.c              |   6 +-
> >>  libavcodec/flacdsp.h              |   6 +-
> >>  libavcodec/flacenc.c              |   2 +-
> >>  libavcodec/x86/flacdsp.asm        | 206 ++++++++++++++++++++++++++++++++++++++
> >>  libavcodec/x86/flacdsp_init.c     |  48 ++++++++-
> >>  7 files changed, 264 insertions(+), 12 deletions(-)
> > [...]
> >> +    mova      m0, [in0q]
> >> +    mova      m1, [in0q+in1q]
> >> +%if %1 > 2
> >> +    mova      m2, [in0q+in2q]
> >> +    mova      m3, [in0q+in3q]
> >> +%if %1 > 4
> >> +    mova      m4, [in0q+in4q]
> >> +    mova      m5, [in0q+in5q]
> >> +%endif
> >> +%endif
> >> +    pslld     m0, m%2
> >> +    pslld     m1, m%2
> >> +%if %1 > 2
> >> +    pslld     m2, m%2
> >> +    pslld     m3, m%2
> >> +%if %1 > 4
> >> +    pslld     m4, m%2
> >> +    pslld     m5, m%2
> >> +%endif
> >> +%endif
> >
> > Can't you do something like this? (untested)
> >
> >     pslld m0, [in0q], m%2
> > %assign i 0
> > %rep %1
> >     pslld m%i, [in0q+in%iq], m%2
> > %assign i i+1
> > %endrep
>
> YASM    libavcodec/x86/flacdsp.o
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `m' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `i' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `in' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `iq' (first use)
> D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: (Each undefined symbol is reported only once.)
> make: *** [libavcodec/x86/flacdsp.o] Error 1
>
> A %rep like that is only four lines shorter. Do you consider it more readable
> than the alternative to justify trying to get it working?

Totally up to you; it just looked easier to maintain and more obvious than
several levels of nested ifdefery.

-- 
Clément B.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel