On Wed, May 11, 2016 at 9:04 PM, Reimar Döffinger <reimar.doeffin...@gmx.de> wrote:
>
> > On 11.05.2016, at 20:37, Michael Niedermayer <mich...@niedermayer.cc> wrote:
> >
> > On Wed, May 11, 2016 at 06:39:20PM +0200, Matthieu Bouron wrote:
> >> From: Matthieu Bouron <matthieu.bou...@stupeflix.com>
> >>
> >> ---
> >>
> >> Hello,
> >>
> >> Here are some benchmark on a rpi2 of the attached patch.
> >>
> >> ./ffmpeg -f lavfi -i sine=440,aformat=sample_fmts=fltp,asetnsamples=4096,abench=start,aresample=48000,abench=stop -t 1000 -f null -
> >>
> >> With patch:    avg=0.001159 speed=44,1x
> >> Without patch: avg=0.001297 speed=40,8x
> >>
> >> ./ffmpeg -f lavfi -i sine=440,aformat=sample_fmts=s16p,asetnsamples=4096,abench=start,aresample=48000,abench=stop -t 1000 -f null -
> >>
> >> With patch:    avg=0.001374 speed=45,6x
> >> Without patch: avg=0.000782 speed=64,6x
> >
> > so its slower ? or am i misreading this ?
>
> Yes, that seems weird.
> Also, what are common filter lengths?

Sorry, I inverted the two results for the s16p case. The NEON version is actually faster:

With*out* patch: avg=0.001374 speed=45,6x
With patch:      avg=0.000782 speed=64,6x

> Because for a length of 4 or 8 or 16 I'd think this would be much better
> fully unrolled.
> And for longer ones at least partially unrolled.

The common filter length seems to be 32, but it may depend on the resampler settings.

Regarding the small performance gain on the float version, it seems to be due to switching between VFP and NEON instructions (I'm not 100% sure).

Matthieu

[...]
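
To make the unrolling suggestion above concrete, here is a minimal sketch in plain C. It is not the code from the patch and not the libswresample implementation; the function names `dot_s16_simple` and `dot_s16_unroll4` are hypothetical, and the sketch assumes a filter length that is a multiple of 4 (such as the length of 32 mentioned above). It contrasts a one-tap-per-iteration multiply-accumulate loop with a 4-way partially unrolled loop using independent accumulators, which is roughly the shape a SIMD version takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference version: one multiply-accumulate per loop iteration. */
static int32_t dot_s16_simple(const int16_t *src, const int16_t *filter,
                              size_t len)
{
    int32_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc += (int32_t)src[i] * filter[i];
    return acc;
}

/*
 * Partially unrolled version: four taps per iteration, each with its own
 * accumulator so the additions do not form a single dependency chain.
 * Assumption: len is a multiple of 4, and the sums fit in int32_t.
 */
static int32_t dot_s16_unroll4(const int16_t *src, const int16_t *filter,
                               size_t len)
{
    int32_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;

    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        a0 += (int32_t)src[i + 0] * filter[i + 0];
        a1 += (int32_t)src[i + 1] * filter[i + 1];
        a2 += (int32_t)src[i + 2] * filter[i + 2];
        a3 += (int32_t)src[i + 3] * filter[i + 3];
    }
    return a0 + a1 + a2 + a3;
}
```

Both functions return the same result for integer input. Whether the unrolled form is actually faster depends on the compiler, the CPU, and the filter length, so it would need to be measured (for example with the `abench` commands quoted above) rather than assumed.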