Hi,

one more, I forgot.

On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonec...@gmail.com> wrote:

> +pw_1: dw 1
>
[..]

> +    vpbroadcastw       m4, [pw_1]
>

We typically suggest to use vpbroadcastd, not w (and then pw_1: times 2 dw
1). agner shows that on e.g. Haswell, the former (d) is 1 uops with 5
cycles latency, whereas the latter (w) is 3 uops with 7 cycles latency, or
more generally d is faster then w.

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to