On 2024/7/31 at 6:25 PM, Xi Ruoyao wrote:
On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote:
On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote:
Per a gcc-help thread we are generating sub-optimal code for
__builtin_bswap{32,64}.  To fix it:

- Use a single revb.d instruction for bswapdi2.
- Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
     revb.2h + rotri.w for !TARGET_64BIT.
- Use a single revb.2h instruction for bswapsi2 (x) r>> 16, and a single
     revb.2w instruction for bswapdi2 (x) r>> 32.
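For reference, the source-level forms described in the list above can be sketched in C roughly as follows. The function names are hypothetical and only for illustration; which instruction each one actually compiles to depends on the compiler version and target flags, so the comments merely restate what the patch description says.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* bswapsi2: per the patch description, revb.2w on TARGET_64BIT. */
uint32_t swap32(uint32_t x)
{
  return __builtin_bswap32(x);
}

/* bswapdi2: per the patch description, a single revb.d. */
uint64_t swap64(uint64_t x)
{
  return __builtin_bswap64(x);
}

/* bswapsi2 (x) r>> 16: reverses the bytes within each 16-bit half. */
uint32_t swap32_rot16(uint32_t x)
{
  uint32_t y = __builtin_bswap32(x);
  return (y >> 16) | (y << 16);   /* rotate right by 16 */
}

/* bswapdi2 (x) r>> 32: reverses the bytes within each 32-bit half. */
uint64_t swap64_rot32(uint64_t x)
{
  uint64_t y = __builtin_bswap64(x);
  return (y >> 32) | (y << 32);   /* rotate right by 32 */
}
```

Compiling such a file with -O2 -S for a LoongArch target is one way to check which instructions are selected.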

Unfortunately I cannot figure out a way to make the compiler generate
revb.4h or revh.{2w,d} instructions.
This optimization is really ingenious, and I have no objections.

I also haven't figured out how to generate revb.4h or revh.{2w,d}.
I think we can merge this patch first.
Pushed r15-2433.
Ok. Thanks!

FWIW I tried a naive pattern for revh.2w:

(set (match_operand:DI 0 "register_operand" "=r")
      (ior:DI
        (and:DI
          (ashift:DI (match_operand:DI 1 "register_operand" "r")
                     (const_int 16))
          (const_int 18446462603027742720))
        (and:DI
          (lshiftrt:DI (match_dup 1)
                       (const_int 16))
          (const_int 281470681808895))))
But it seems too complex to be recognized.
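
For readers less used to RTL, the pattern above corresponds roughly to the following C expression, which swaps the two 16-bit halfwords inside each 32-bit word of a 64-bit value. The two masks are the decimal constants from the pattern written in hex (18446462603027742720 is 0xFFFF0000FFFF0000 and 281470681808895 is 0x0000FFFF0000FFFF). The function name is hypothetical, and this is only a sketch of the computation, not a claim about what the compiler emits for it.

```c
#include <stdint.h>

/* Hypothetical name: swap the halfwords within each 32-bit word. */
uint64_t swap_halfwords_in_words(uint64_t x)
{
  /* Low halfword of each word moves up by 16 bits. */
  uint64_t up = (x << 16) & UINT64_C(0xFFFF0000FFFF0000);
  /* High halfword of each word moves down by 16 bits. */
  uint64_t down = (x >> 16) & UINT64_C(0x0000FFFF0000FFFF);
  return up | down;
}
```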

I think it needs to be recognized as a bswap operation in the tree bswap pass,
but that seems a bit difficult to do.


