Hi,

On Sat, Oct 15, 2011 at 2:53 AM, Loren Merritt <[email protected]> wrote:
> On Fri, 14 Oct 2011, Ronald S. Bultje wrote:
>
>> + packusdw m0, m1
>> + packusdw m2, m3
>
> sse4

Ah, that's why Kieran's assembly was marked sse4. I'll also make an sse2 version then, which additionally needs a pmaxsw x, zero.

> Are things usually unaligned?

No, I was a little too pessimistic in this patch. In fact, the src in this function is always aligned, so the src loads should be mova. I'm not sure about dest: in my tests it tends to be aligned, but I don't think the API guarantees that. I can test for alignment at function start and split the loop into two copies, one for aligned dest and one for unaligned dest.

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
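
For illustration only, the "test for alignment at function start and split the loop" idea could look roughly like the plain C sketch below. This is not the patch's code: the real function is x86 assembly, and every name here (is_aligned16, clip_uint16, store_row and the two loop copies) is hypothetical. The scalar clip stands in for what packusdw does per lane (saturating a signed 32-bit value to an unsigned 16-bit one), and the memcpy in the second loop stands in for an unaligned store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary,
 * which is what an aligned SSE load/store (mova) requires. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Scalar model of one packusdw lane: saturate int32 to [0, 65535]. */
static uint16_t clip_uint16(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 65535)
        return 65535;
    return (uint16_t)v;
}

/* Loop copy 1: dest is known to be aligned, so direct 16-bit stores
 * are fine (the assembly would use mova here). */
static void store_row_aligned(uint16_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = clip_uint16(src[i]);
}

/* Loop copy 2: dest may sit at any byte address, so each value is
 * written with memcpy (the assembly would use movu here). */
static void store_row_unaligned(uint8_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t v = clip_uint16(src[i]);
        memcpy(dst + 2 * i, &v, sizeof(v));
    }
}

/* Test alignment once at function start, then run one of the two
 * loop copies. Returns 1 if the aligned copy ran, 0 otherwise. */
static int store_row(void *dst, const int32_t *src, size_t n)
{
    if (is_aligned16(dst)) {
        store_row_aligned(dst, src, n);
        return 1;
    }
    store_row_unaligned(dst, src, n);
    return 0;
}
```

The point of checking once up front is that the per-iteration work stays branch-free in both copies; the cost is a duplicated loop body.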
