Only the last pixel depends on m1 (uninitialized); in any case, that particular pixel is not used anywhere in the code.
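To make the point concrete, here is a rough sketch in plain C that models the two byte shifts on 16-lane arrays. It is not the x265 code: the helper names (model_palignr, model_psrldq, compare_shift_variants) are made up for illustration, and the model assumes the two-operand SSE form of palignr, where the destination register supplies the high half of the concatenation. Lane 0 is the lowest byte, i.e. the rightmost entry in the patch comments.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 16 };

/* Model of two-operand `palignr dst, src, imm` (imm <= 16 in this sketch):
 * concatenate dst (high half) with src (low half), shift right by imm
 * bytes, and keep the low 16 bytes in dst. */
static void model_palignr(uint8_t dst[LANES], const uint8_t src[LANES], unsigned imm)
{
    uint8_t cat[2 * LANES];
    memcpy(cat, src, LANES);          /* low half  = source register     */
    memcpy(cat + LANES, dst, LANES);  /* high half = old destination     */
    memcpy(dst, cat + imm, LANES);
}

/* Model of `psrldq dst, imm`: in-place byte shift right, zero-filled. */
static void model_psrldq(uint8_t dst[LANES], unsigned imm)
{
    uint8_t out[LANES] = {0};
    if (imm < LANES)
        memcpy(out, dst + imm, LANES - imm);
    memcpy(dst, out, LANES);
}

/* Runs both variants on the same pixels. `garbage` stands in for whatever
 * the uninitialized m1 happens to hold. Returns 1 if the top lane differs
 * between the variants, 0 otherwise. */
static int compare_shift_variants(const uint8_t pixels[LANES], uint8_t garbage)
{
    uint8_t a[LANES], b[LANES];

    memset(a, garbage, LANES);        /* m1 before palignr (uninitialized) */
    model_palignr(a, pixels, 1);      /* palignr m1, m0, 1                 */

    memcpy(b, pixels, LANES);         /* the extra mov that psrldq needs   */
    model_psrldq(b, 1);               /* psrldq m1, 1                      */

    /* The low 15 lanes never depend on the old contents of m1. */
    assert(memcmp(a, b, LANES - 1) == 0);
    return a[LANES - 1] != b[LANES - 1];
}
```

In this model the two variants agree on lanes 0 through 14 for any input; only lane 15 (the "x" in the patch comment) picks up a byte from the old m1, which is harmless as long as that lane is never consumed.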
Moreover, psrldq has higher latency than palignr and also needs an additional mov instruction.

On Fri, Jan 17, 2014 at 12:08 PM, chen <chenm...@163.com> wrote:

> At 2014-01-17 14:00:55,"Jason Garrett-Glaser" <ja...@x264.com> wrote:
>
> >+    movu        m0,     [r2 + 1]    ; [16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
> >+    palignr     m1,     m0, 1       ; [x 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
> >
> >Shouldn't this be pslrdq or similar?  The dependency on uninitialized
> >registers is a bit weird too...
>
> This algorithm is suggest by me, the psrldq can't move register, we
> have to wasting some instruction to do it.
> Of course, we have a restrict use uninitialize value on other instruction.
_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel