2014-03-29 2:10 GMT+01:00 James Almer <jamr...@gmail.com>:
> You're right that it's all float data, but both Christophe and I tested and
> xorps/shufps was a bit slower than pxor/pshufd (at least in my tests it was
> about five cycles slower), so I decided to use some ifdeffery to keep the
> SSE2 version intact.

I can confirm this: James initially did what you proposed, and I mentioned
that I had benchmarked it as slower. He observed the same thing, hence the
current code.

If this were always true, it would be nice to have something like
xorps/... as a macro that switches to either instruction depending on
the instruction set. Not sure x264 would benefit from this, of course.
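
To illustrate the idea, here is a rough sketch using C intrinsics rather
than the actual assembly. The names USE_FLOAT_DOMAIN, XOR_F32 and negate4
are made up for this example and are not from the patch, and a compiler is
free to pick either encoding regardless of the intrinsic used, so this only
shows the shape of such a switch, not a benchmarkable equivalent.

```c
#include <emmintrin.h> /* SSE2 intrinsics (pulls in the SSE float ones too) */

/* Hypothetical build-time switch: 1 selects the float-domain XOR (xorps),
 * 0 selects the integer-domain XOR (pxor). Not part of the real code. */
#ifndef USE_FLOAT_DOMAIN
#define USE_FLOAT_DOMAIN 0
#endif

#if USE_FLOAT_DOMAIN
/* Float-domain variant. */
#define XOR_F32(a, b) _mm_xor_ps((a), (b))
#else
/* Integer-domain variant: the casts only reinterpret the bits. */
#define XOR_F32(a, b) \
    _mm_castsi128_ps(_mm_xor_si128(_mm_castps_si128(a), _mm_castps_si128(b)))
#endif

/* Negate four floats by flipping their sign bits through the macro. */
static void negate4(float *dst, const float *src)
{
    const __m128 sign = _mm_set1_ps(-0.0f); /* only the sign bit is set */
    _mm_storeu_ps(dst, XOR_F32(_mm_loadu_ps(src), sign));
}
```

Callers only ever use XOR_F32, so the choice between the two forms stays
in one place and both variants produce identical results.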

-- 
Christophe
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel