Only the last pixel depends on m1 (uninitialized), and that particular
pixel is not used anywhere in the code.

Moreover, psrldq has higher latency than palignr, and because it shifts
its operand in place it also needs an additional mov instruction to copy
m0 first.
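
To make the byte layout concrete, here is a rough scalar C sketch of the
two alternatives for a 1-byte shift. It only emulates the byte semantics
of the 2-operand forms and says nothing about latency. The function names
(emu_palignr, emu_mov_psrldq, low15_match) are made up for illustration
and are not part of x265.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates 2-operand "palignr dst, src, imm": concatenate dst (high half)
 * and src (low half) into 32 bytes, shift right by imm bytes, and keep the
 * low 16 bytes in dst. */
void emu_palignr(uint8_t dst[16], const uint8_t src[16], unsigned imm)
{
    uint8_t cat[32];
    uint8_t out[16];
    memcpy(cat, src, 16);       /* low half: source operand */
    memcpy(cat + 16, dst, 16);  /* high half: previous dst contents */
    for (unsigned i = 0; i < 16; i++)
        out[i] = (i + imm < 32) ? cat[i + imm] : 0;
    memcpy(dst, out, 16);
}

/* Emulates a register copy followed by "psrldq dst, imm": copy src, then
 * shift right by imm bytes, filling the top with zeros. */
void emu_mov_psrldq(uint8_t dst[16], const uint8_t src[16], unsigned imm)
{
    uint8_t out[16];
    for (unsigned i = 0; i < 16; i++)
        out[i] = (i + imm < 16) ? src[i + imm] : 0;
    memcpy(dst, out, 16);
}

/* Returns 1 when both variants agree on the low 15 bytes and differ only
 * in the top byte: palignr keeps one byte of the old m1 (the "x" in the
 * patch comment), psrldq zero-fills it. */
int low15_match(uint8_t garbage)
{
    uint8_t m0[16], a[16], b[16];
    for (int i = 0; i < 16; i++)
        m0[i] = (uint8_t)(i + 1);   /* pixels 1..16, as in the patch comment */
    memset(a, garbage, 16);         /* stand-in for the uninitialized m1 */
    memset(b, garbage, 16);
    emu_palignr(a, m0, 1);
    emu_mov_psrldq(b, m0, 1);
    return memcmp(a, b, 15) == 0 && a[15] == garbage && b[15] == 0;
}
```

Whatever value m1 held before, only byte 15 of the palignr result is
affected, which is the pixel that is never read.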


On Fri, Jan 17, 2014 at 12:08 PM, chen <chenm...@163.com> wrote:

> At 2014-01-17 14:00:55,"Jason Garrett-Glaser" <ja...@x264.com> wrote:
>
> >+    movu        m0,        [r2 + 1]                   ; [16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
> >+    palignr     m1,        m0, 1                      ; [x 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
> >
> >Shouldn't this be psrldq or similar?  The dependency on uninitialized
> >registers is a bit weird too...
> This algorithm was suggested by me. psrldq can't copy between registers,
> so we would have to waste an instruction to do it.
> Of course, we have a restriction: the uninitialized value must not be
> used by other instructions.
>
>
> _______________________________________________
> x265-devel mailing list
> x265-devel@videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
>
>