Issue 178792
Summary [X86] Missed fold: `vgf2p8affineqb` + 8-bit shifts
Labels new issue
Assignees
Reporter WalterKruger
An 8-bit shift applied to the result of `vgf2p8affineqb` can be folded into the instruction by shifting its matrix by 8 times the shift amount in the opposite direction. The position of each matrix byte determines a corresponding bit in the output, in reverse order (byte 7 produces bit 0), so shifting the matrix by whole bytes is equivalent to shifting the output bits. (Further reading: https://wunkolo.github.io/post/2020/11/gf2p8affineqb-int8-shifting/)

This applies to logical, rotational, and arithmetic shifts in either direction (the arithmetic case shifts in copies of the entire least significant matrix byte rather than a single bit). It only seems beneficial when the matrix and the shift amount are both constant.
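
To make the equivalence concrete, here is a minimal software sketch of one byte lane, modeled on Intel's published pseudocode for `gf2p8affineqb` (matrix byte `7 - i` produces output bit `i`). The helper names (`affine_byte`, `fold_shl`, `fold_lshr`, `fold_ashr`, `fold_rotl`) are made up for illustration and are not LLVM APIs. It assumes an immediate of 0, as in the example below; a non-zero immediate would need the same shift applied to it.

```python
MASK64 = (1 << 64) - 1

def affine_byte(matrix: int, x: int, imm: int = 0) -> int:
    """Model of one byte lane: output bit i = parity(matrix.byte[7-i] & x) ^ imm.bit[i]."""
    out = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF
        parity = bin(row & x).count("1") & 1
        out |= (parity ^ ((imm >> i) & 1)) << i
    return out

# Hypothetical helpers: each returns a matrix whose affine result equals
# the original affine result followed by the named 8-bit shift by n (0..7).

def fold_shl(matrix: int, n: int) -> int:
    # Output shifted left -> matrix shifted right by whole bytes.
    return matrix >> (8 * n)

def fold_lshr(matrix: int, n: int) -> int:
    # Output shifted right (logical) -> matrix shifted left by whole bytes.
    return (matrix << (8 * n)) & MASK64

def fold_ashr(matrix: int, n: int) -> int:
    # As fold_lshr, but the vacated low bytes are filled with copies of the
    # least significant matrix byte (the row that produces the sign bit).
    fill = int.from_bytes(bytes([matrix & 0xFF]) * n, "little")
    return ((matrix << (8 * n)) & MASK64) | fill

def fold_rotl(matrix: int, n: int) -> int:
    # Output rotated left -> matrix rotated right by whole bytes.
    n %= 8
    return ((matrix >> (8 * n)) | (matrix << (64 - 8 * n))) & MASK64

BIT_REVERSE = 0x8040201008040201  # well-known bit-reversal matrix

if __name__ == "__main__":
    # Folded constant for "bit-reverse, then shift left by 1".
    print(f"{fold_shl(BIT_REVERSE, 1):016x}")  # prints 0080402010080402
```

In this model, the `paddb xmm0, xmm0` in the example below (a left shift by 1) corresponds to replacing each qword of the matrix constant with `fold_shl(matrix, 1)`.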

```asm
bitRevSHL1_clang:
        gf2p8affineqb   xmm0, xmmword ptr [rip + .LCPI0_0], 0
        paddb   xmm0, xmm0
        ret
```

```asm
bitRevSHL1_tgt:
        gf2p8affineqb   xmm0, xmmword ptr [rip + .LCPI0_0], 0
        ret
```

https://godbolt.org/z/rd7frehEd