@Yuvaraj, Jason means that all of m1 is uninitialized before the instruction executes;
you described its state after the instruction.
See the Intel documentation for palignr: the register m1 is both source and destination. It is a logical
problem, but it saves a mov instruction and works fine.
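
To make the point above concrete, here is a rough byte-level model of the two instructions, written in Python purely for illustration. It is not the x265 code: the helper names `palignr` and `psrldq` are hypothetical functions that mimic the documented SSE semantics on 16-element lists (index 0 is the lowest byte), and "x" stands in for whatever m1 happened to hold.

```python
def palignr(dst, src, imm):
    """Model of `palignr dst, src, imm`: dst is both an input and the output.

    The 32-byte value dst:src (dst in the high half) is shifted right by
    imm bytes and the low 16 bytes are returned as the new dst.
    """
    assert len(dst) == 16 and len(src) == 16
    concat = list(src) + list(dst)          # src = low half, dst = high half
    window = concat[imm:imm + 16]           # shift right by imm bytes
    return window + [0] * (16 - len(window))  # zero-fill for large imm


def psrldq(reg, imm):
    """Model of `psrldq reg, imm`: in-place byte shift right, zero-filled."""
    assert len(reg) == 16
    imm = min(imm, 16)                      # shifts above 15 give all zeros
    return (list(reg) + [0] * 16)[imm:imm + 16]


m0 = list(range(1, 17))   # pixels 1..16, as loaded by movu m0, [r2 + 1]
m1 = ["x"] * 16           # stand-in for the uninitialized register

shifted = palignr(m1, m0, 1)   # pixels 2..16, then one byte taken from old m1
zeroed = psrldq(m0, 1)         # pixels 2..16, then a zero; overwrites its input
```

In this model the two results differ only in the top byte: palignr pulls it from the old contents of m1, while psrldq fills it with zero but destroys its own operand, so keeping m0 intact would need a separate mov first.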
 
At 2014-01-17 14:42:31,"Yuvaraj Venkatesh" <yuva...@multicorewareinc.com> wrote:

Only the last pixel depends on m1 (uninitialized); in any case, that
particular pixel is not used anywhere in the code.



Moreover, psrldq has higher latency than palignr and also needs an
additional mov instruction.



On Fri, Jan 17, 2014 at 12:08 PM, chen <chenm...@163.com> wrote:

At 2014-01-17 14:00:55,"Jason Garrett-Glaser" <ja...@x264.com> wrote:

>+    movu        m0,        [r2 + 1]                   ; [16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
>+    palignr     m1,        m0, 1                      ; [x 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
>
>Shouldn't this be psrldq or similar?  The dependency on uninitialized
>registers is a bit weird too...

I suggested this algorithm. psrldq cannot move data between registers (it shifts
in place), so we would have to waste an extra instruction on a mov.
Of course, we must not use the uninitialized value in any other instruction.
 

_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel



