On 04/07/2013 04:21 PM, Christophe Gisquet wrote:
> From 253 to 70(sse)/52(sse2) cycles on Arrandale and Win64.
> 61/55 cycles on SandyBridge.
> ---
>  libavcodec/x86/sbrdsp.asm    | 46 ++++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/x86/sbrdsp_init.c |  7 +++++++
>  2 files changed, 53 insertions(+)
[...]
> +%if cpuflag(sse2)
> +%define XOR      pxor
> +%define SHUFFLE  pshufd
> +%define MOVH     movq
> +%else
> +%define XOR      xorps
> +%define SHUFFLE  shufps
> +%define MOVH     movlps
> +%endif
> +
[...]
> +
> +INIT_XMM sse
> +SBR_QMF_PRE_SHUFFLE
> +
> +INIT_XMM sse2
> +SBR_QMF_PRE_SHUFFLE

So, what exactly is the point of all this extra code to optimize for
systems with SSE but not SSE2? Elsewhere we typically just use SSE2 if
it's faster.

-Justin