On 09/09/14 9:52 AM, Pascal Massimino wrote:
> +    mova      m2, m_sum
> +%if mmsize == 16
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +%else
> +    psrlq     m2, 32
> +    paddd     m_sum, m2
> +%endif

The SSE2 version uses three more instructions than necessary here.
You could replace the code above with the HADDD macro, which expands
to a shorter SSE2 sequence.
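
To illustrate the difference, here is a rough C model using SSE2
intrinsics. It assumes HADDD's SSE2 path is movhlps + paddd + pshuflw +
paddd, which is my reading of x86util.asm, so please check the macro
itself. The function names hsum_shift and hsum_shuffle are made up for
this sketch.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Models the quoted patch: mova, then three psrldq/paddd pairs
 * (7 instructions before the final movd). */
static int32_t hsum_shift(__m128i sum)
{
    __m128i t = sum;                 /* mova   m2, m_sum */
    t   = _mm_srli_si128(t, 4);      /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, t);     /* paddd  m_sum, m2 */
    t   = _mm_srli_si128(t, 4);      /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, t);     /* paddd  m_sum, m2 */
    t   = _mm_srli_si128(t, 4);      /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, t);     /* paddd  m_sum, m2 */
    return _mm_cvtsi128_si32(sum);   /* low dword holds the total */
}

/* Models the assumed HADDD expansion for SSE2: two shuffle/add
 * steps (4 instructions before the final movd). */
static int32_t hsum_shuffle(__m128i sum)
{
    /* movhlps: copy the high 64 bits into the low 64 bits */
    __m128i t = _mm_castps_si128(
        _mm_movehl_ps(_mm_castsi128_ps(sum), _mm_castsi128_ps(sum)));
    sum = _mm_add_epi32(sum, t);     /* dwords 0,1 = d0+d2, d1+d3 */
    /* pshuflw q0032: move words 2,3 (dword 1) down to dword 0 */
    t   = _mm_shufflelo_epi16(sum, _MM_SHUFFLE(0, 0, 3, 2));
    sum = _mm_add_epi32(sum, t);     /* dword 0 = d0+d1+d2+d3 */
    return _mm_cvtsi128_si32(sum);
}
```

Both functions leave the same total in the low dword. Only that dword
is meaningful afterwards, which is all the following movd needs.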

And now that I look at the older code again, you could also use it in
the IDET_FILTER_LINE macro. That would save one instruction in the
mmxext version.