https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117000
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
             Status|UNCONFIRMED                 |ASSIGNED
            Version|unknown                     |13.3.0
   Last reconfirmed|                            |2024-10-08
     Ever confirmed|0                           |1
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
In particular, we miss the fact that

  _29 = .REDUC_IOR (vect_folded_10.32_23);
  _12 = _29 == 0;

could be optimized to

  _12 = vect_folded_10.32_23 == {0, 0, ... };

It's probably too late for RTL to realize this. A pattern in match.pd
could handle it, for example:

(for cmp (eq ne)
 (simplify
  (cmp (IFN_REDUC_IOR @0) integer_zerop)
  (cmp @0 { build_zero_cst (TREE_TYPE (@0)); })))

With this pattern, the result is:

_Z5test1RK4U256:
.LFB5:
        .cfi_startproc
        movdqu  (%rdi), %xmm0
        movdqu  16(%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret
_Z5test2RK4U256:
.LFB6:
        .cfi_startproc
        movdqu  16(%rdi), %xmm0
        movdqu  (%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret
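
For reference, the identity the suggested pattern relies on can be modeled in
scalar C++: an OR-reduction over all lanes is zero exactly when every lane is
zero. This is only an illustrative sketch. The U256 layout (four 64-bit limbs)
and the helper names reduc_ior_is_zero and all_lanes_zero are assumptions made
for the example; the actual testcase from the report is not shown in this
comment.

```cpp
#include <cstdint>

// Hypothetical stand-in for the U256 type named in the mangled symbols;
// the real definition is not part of this comment.
struct U256 {
    std::uint64_t limbs[4];
};

// Scalar model of `.REDUC_IOR (v) == 0`: OR every lane into an
// accumulator, then compare the accumulator against zero.
bool reduc_ior_is_zero(const U256 &x) {
    std::uint64_t acc = 0;
    for (std::uint64_t limb : x.limbs)
        acc |= limb;
    return acc == 0;
}

// Scalar model of the whole-vector comparison `v == {0, 0, ...}`:
// true only if every lane is zero.
bool all_lanes_zero(const U256 &x) {
    for (std::uint64_t limb : x.limbs)
        if (limb != 0)
            return false;
    return true;
}
```

Both helpers return the same value for every input, since a bitwise OR of
several values has no set bit only if none of the operands does. The same
argument covers the `ne` case of the pattern.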