On 04/07/2013 04:21 PM, Christophe Gisquet wrote:
> From 253 to 70(sse)/52(sse2) cycles on Arrandale and Win64.
> 61/55 cycles on SandyBridge.
> ---
>  libavcodec/x86/sbrdsp.asm    | 46 ++++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/x86/sbrdsp_init.c |  7 +++++++
>  2 files changed, 53 insertions(+)
[...]
> +%if cpuflag(sse2)
> +%define XOR pxor
> +%define SHUFFLE pshufd
> +%define MOVH movq
> +%else
> +%define XOR xorps
> +%define SHUFFLE shufps
> +%define MOVH movlps
> +%endif
> +
[...]
> +
> +INIT_XMM sse
> +SBR_QMF_PRE_SHUFFLE
> +
> +INIT_XMM sse2
> +SBR_QMF_PRE_SHUFFLE

So, what exactly is the point of all this extra code to optimize for
systems with SSE but not SSE2? Elsewhere we typically just use SSE2 if
it's faster.

-Justin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
