[llvm-bugs] [Bug 164200] [X86] 8-bit vector multiplication should use shift and add method for more constants

LLVM Bugs via llvm-bugs Sun, 19 Oct 2025 21:03:40 -0700

Issue	164200
Summary	[X86] 8-bit vector multiplication should use shift and add method for more constants
Labels	new issue
Assignees
Reporter	WalterKruger

Vector multiplication by most 8-bit constants is currently implemented by widening the elements to 16 bits:
```asm
multiplyBy10_clang:
        movdqa  xmm1, xmm0
        punpckhbw       xmm1, xmm1
        movdqa  xmm2, xmmword ptr [rip + .LCPI0_0]
        pmullw  xmm1, xmm2
        movdqa  xmm3, xmmword ptr [rip + .LCPI0_1]
        pand    xmm1, xmm3
        punpcklbw       xmm0, xmm0
        pmullw  xmm0, xmm2
        pand    xmm0, xmm3
        packuswb xmm0, xmm1
        ret
```


However, it is often more efficient, in terms of both code size and dependency-chain length, to instead perform a short sequence of shifts and adds. For example, `x * 10 = (x << 3) + (x << 1)`:
```asm
multiplyBy10_shiftAndAdd:
        movdqa  xmm1, xmm0
        paddb   xmm0, xmm0
        psllw   xmm1, 3
        pand    xmm1, xmmword ptr [rip + .LCPI0_0]
        paddb   xmm0, xmm1
        ret
```
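
As a sanity check on the sequence above, here is a small scalar model in C of what happens inside one 16-bit lane (two adjacent bytes). SSE2 has no 8-bit shift, so `psllw` shifts the whole word and bits from the low byte leak into the high byte; the `pand` clears them. The mask value `0xF8` per byte is my assumption about what `.LCPI0_0` contains, and the function names are made up for illustration:

```c
#include <stdint.h>

/* Models one 16-bit lane (bytes hi:lo) of the shift-and-add sequence. */
static uint16_t mul10_lane(uint16_t lane)
{
    uint8_t lo = (uint8_t)(lane & 0xFF);
    uint8_t hi = (uint8_t)(lane >> 8);

    /* paddb xmm0, xmm0: per-byte x + x, i.e. x << 1 with 8-bit wraparound */
    uint8_t lo2 = (uint8_t)(lo + lo);
    uint8_t hi2 = (uint8_t)(hi + hi);

    /* psllw xmm1, 3: shifts the full word, so lo's top 3 bits enter hi */
    uint16_t shifted = (uint16_t)(lane << 3);

    /* pand with 0xF8 in every byte (assumed mask): drops the leaked bits */
    shifted &= 0xF8F8;

    /* paddb xmm0, xmm1: per-byte add, again wrapping at 8 bits */
    uint8_t rlo = (uint8_t)(lo2 + (uint8_t)(shifted & 0xFF));
    uint8_t rhi = (uint8_t)(hi2 + (uint8_t)(shifted >> 8));
    return (uint16_t)((rhi << 8) | rlo);
}

/* Counts lanes where either byte differs from (uint8_t)(byte * 10). */
static int count_mismatches(void)
{
    int bad = 0;
    for (uint32_t lane = 0; lane <= 0xFFFF; ++lane) {
        uint16_t r = mul10_lane((uint16_t)lane);
        uint8_t want_lo = (uint8_t)((lane & 0xFF) * 10u);
        uint8_t want_hi = (uint8_t)((lane >> 8) * 10u);
        if ((uint8_t)(r & 0xFF) != want_lo || (uint8_t)(r >> 8) != want_hi)
            ++bad;
    }
    return bad;
}
```

`count_mismatches()` walks all 65536 possible lane values, so it covers every pairing of neighbouring bytes, including the ones where the word shift would corrupt the high byte without the mask.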

This method is already implemented, but only for constants that are almost powers of two. Notably, GCC always uses this method (although its sequences are often suboptimal).

https://godbolt.org/z/naKxr6z6a

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
