[Bug middle-end/122598] [16 Regression] ICE: expand_insn, at optabs.cc:8293 with -mavx512f -mgfni r16-3364

2025-11-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
           Priority|P3                          |P1
         Resolution|---                         |FIXED

--- Comment #11 from Jakub Jelinek  ---
Fixed.

2025-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:1c0897caa516bc564258266860e3b75054b9e78e

commit r16-5486-g1c0897caa516bc564258266860e3b75054b9e78e
Author: Jakub Jelinek 
Date:   Fri Nov 21 14:06:05 2025 +0100

i386: Remove cond_{ashl,lshr,ashr}v{64,16,32}qi expanders [PR122598]

As mentioned in the PR, the COND_SH{L,R} internal fns are expanded without
a fallback, so their expansion must succeed.  Furthermore, they don't
differentiate between scalar and vector shift counts, so both have
to be supported.  That is the case for the {ashl,lshr,ashr}v*[hsd]i
patterns, which use the nonimmediate_or_const_vec_dup_operand predicate for
the shift count: if the argument isn't a const vec dup, it can always be
legitimized by loading it into a vector register.
This is not the case for the conditional vector shifts with QImode elements.
There is no fallback for those; when the shift is not conditional and the
shift count is not a constant, we emit individual element shifts instead.

So, I'm afraid we can't announce such an expander, because then the
vectorizer etc. count on it being fully available.

As I've tried to show in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598#c9
even without this pattern we can sometimes emit
vgf2p8affineqb  $0, .LC0(%rip), %ymm0, %ymm0{%k1}
etc. instructions.

2025-11-21  Jakub Jelinek  

PR target/122598
* config/i386/predicates.md (const_vec_dup_operand): Remove.
* config/i386/sse.md (cond_<insn><mode> with VI1_AVX512VL iterator):
Remove.

* gcc.target/i386/pr122598.c: New test.

2025-11-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

--- Comment #9 from Jakub Jelinek  ---
So, let's consider e.g.
typedef char V __attribute__ ((vector_size (32)));

V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? v : a;
}
for -O2 -mavx512{vl,dq,cd,bw} -mgfni and for -O2 -mavx512vl -mgfni and then
the same with s/32/64/.
For the first one before combine I see
(insn 9 8 10 2 (set (reg:V32QI 106 [ a_3 ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (reg:V32QI 107)
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)) "pr122598-5.C":6:5 10237 {vgf2p8affineqb_v32qi}
     (expr_list:REG_DEAD (reg:V32QI 107)
        (expr_list:REG_EQUAL (ashiftrt:V32QI (reg/v:V32QI 101 [ v ])
                (const_int 5 [0x5]))
            (nil))))
(insn 10 9 15 2 (set (reg:V32QI 103)
        (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
            (reg:V32QI 106 [ a_3 ])
            (reg:SI 105 [ _1 ]))) "pr122598-5.C":7:27 discrim 1 2583 {avx512vl_blendmv32qi}
     (expr_list:REG_DEAD (reg:V32QI 106 [ a_3 ])
        (expr_list:REG_DEAD (reg:SI 105 [ _1 ])
            (expr_list:REG_DEAD (reg/v:V32QI 101 [ v ])
                (nil)))))
and in the combine dump I see
Trying 9 -> 10:
    9: r106:V32QI=unspec[r101:V32QI,[`*.LC0'],0] 200
   10: r103:V32QI=vec_merge(r101:V32QI,r106:V32QI,r105:SI)
      REG_DEAD r106:V32QI
      REG_DEAD r105:SI
      REG_DEAD r101:V32QI
Failed to match this instruction:
(set (reg:V32QI 103)
    (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (mem/u/c:V32QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S32 A256])
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)
        (reg:SI 105 [ _1 ])))
vec_merge is not commutative.
Now, if I try
typedef char V __attribute__ ((vector_size (32)));

V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? a : v;
}
instead, it is handled as masked insn with -O2 -mavx512{vl,dq,cd,bw} -mgfni
with no changes:
vpxor   %xmm1, %xmm1, %xmm1
vpcmpb  $6, %ymm1, %ymm0, %k1
vgf2p8affineqb  $0, .LC0(%rip), %ymm0, %ymm0{%k1}
ret
Could we try to swap the VEC_MERGE arguments if it doesn't match and if that
matches invert the mask?  Yes, but it would be a general change, not related to
this particular insn.
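
To make the "swap the VEC_MERGE arguments and invert the mask" idea concrete, here is a minimal scalar sketch. It is not GCC code: vec_merge_model, swap_and_invert_agree and check_swap_and_invert are hypothetical names, and the model only assumes the documented RTL semantics of vec_merge (element i comes from the first operand when bit i of the mask is set, otherwise from the second).

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 32

/* Scalar model of RTL vec_merge (x, y, mask): element i is taken from x
   when bit i of mask is set, and from y otherwise.  */
static void
vec_merge_model (const int8_t *x, const int8_t *y, uint32_t mask, int8_t *out)
{
  for (int i = 0; i < NELTS; i++)
    out[i] = ((mask >> i) & 1) ? x[i] : y[i];
}

/* Return 1 if vec_merge (x, y, mask) and vec_merge (y, x, ~mask) select
   the same elements, 0 otherwise.  */
int
swap_and_invert_agree (const int8_t *x, const int8_t *y, uint32_t mask)
{
  int8_t direct[NELTS], swapped[NELTS];

  vec_merge_model (x, y, mask, direct);
  vec_merge_model (y, x, ~mask, swapped);
  for (int i = 0; i < NELTS; i++)
    if (direct[i] != swapped[i])
      return 0;
  return 1;
}

/* Hypothetical self-check with inputs that differ in every lane.  */
void
check_swap_and_invert (void)
{
  int8_t x[NELTS], y[NELTS];

  for (int i = 0; i < NELTS; i++)
    {
      x[i] = (int8_t) (i - 16);
      y[i] = (int8_t) (100 - i);
    }
  assert (swap_and_invert_agree (x, y, 0x0u));
  assert (swap_and_invert_agree (x, y, 0xdeadbeefu));
  assert (swap_and_invert_agree (x, y, 0xffffffffu));
}
```

Calling check_swap_and_invert () from main exercises the identity. Whether combine should try the swapped form is the general question raised above; the sketch only shows that the two forms compute the same value.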

2025-11-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

--- Comment #8 from Jakub Jelinek  ---
(In reply to Andi Kleen from comment #7)
> I was thinking of the same fix, but that means no conditional gfaffine
> shifts can be generated anymore, right?

It can.  The vectorizer will just emit the two steps separately: an
unconditional shift, followed by a conditional move between the result of
that shift and some other value (0 or some other value).
The combiner can then combine them, as it does in many other cases.
Except that right now you only handle it in define_expand,
so I guess we need a pattern that will handle it later on too.
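
As a source-level illustration of that two-step form (not what the vectorizer literally emits), the sketch below performs an unconditional shift and then a separate per-lane select, and checks the result against the scalar definition of a conditional shift. It assumes the GNU C vector extension; shift_all, select_lanes, shift_then_select and the two checking helpers are hypothetical names. The vector is 16 bytes wide so that it builds without AVX flags, whereas the test cases in this PR use 32 and 64 bytes.

```c
#include <assert.h>

/* GNU C vector extension (GCC/Clang); 16 lanes of signed char.  */
typedef signed char V __attribute__ ((vector_size (16)));

/* Step 1: shift every element, with no mask involved.  */
static V
shift_all (V v)
{
  return v >> 5;
}

/* Step 2: per-lane select; each mask lane is all-ones (-1) or zero.  */
static V
select_lanes (V mask, V if_set, V if_clear)
{
  return (if_set & mask) | (if_clear & ~mask);
}

/* Same value as `(V) {} < v ? a : v' with a = v >> 5, built from the
   two separate steps.  */
V
shift_then_select (V v)
{
  V a = shift_all (v);
  V mask = (V) ((V) { 0 } < v);   /* -1 in lanes where v > 0 */
  return select_lanes (mask, a, v);
}

/* Return 1 if every lane matches the scalar conditional shift.  */
int
matches_scalar_reference (V v)
{
  V r = shift_then_select (v);

  for (int i = 0; i < 16; i++)
    {
      signed char expected = v[i] > 0 ? (signed char) (v[i] >> 5) : v[i];
      if (r[i] != expected)
        return 0;
    }
  return 1;
}

/* Hypothetical self-check covering negative, zero and positive lanes.  */
void
check_shift_then_select (void)
{
  V v = { -128, -33, -1, 0, 1, 31, 32, 33, 64, 95, 96, 100, 126, 127, -64, 5 };

  assert (matches_scalar_reference (v));
}
```

Whether the two steps end up as a single masked vgf2p8affineqb then depends on combine matching the vec_merge, which is the part that needs a pattern.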

2025-11-20 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

--- Comment #7 from Andi Kleen  ---
I was thinking of the same fix, but that means no conditional gfaffine shifts
can be generated anymore, right?

Perhaps your other proposals to make it always succeed are better, but I didn't
fully understand them.

2025-11-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #6 from Jakub Jelinek  ---
Created attachment 62859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62859&action=edit
gcc16-pr122598.patch

Untested fix.

2025-11-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #5 from Jakub Jelinek  ---
I think the problem is that while for unconditional shifts we have separate
{ashl,ashr,lshr,rotl,rotr}_optab and v{ashl,ashr,lshr,rotl,rotr}_optab, where
the former are for scalar operations and vector operations with a scalar count
and the latter for vector operations with a vector count, for the conditional
ones there are just IFN_COND_SHL and IFN_COND_SHR, which cover everything, i.e.
both scalar and vector counts (and any scalar and any vector counts, not just a
subset of those).
typedef char V __attribute__ ((vector_size (64)));

V
bar (V v)
{
  V a = v >> 5;
  return (V) {} < v ? v : a;
}
ICEs too with -O -mavx512f -mgfni, this time not because the vgf2p8affineqb
insn couldn't handle it, but because it expects a vector with the same counts
like (V)
{5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5}
instead of a scalar 5.
Unfortunately, the cond_ashl/cond_ashr optabs can't be supported only
conditionally: the generic code doesn't have a fallback for them, so the
pattern must always succeed.
Expanding the conditional left/right shift, with a scalar or vector count, in
the V*[SD]Imode case isn't that hard, but handling arbitrary V*[QH]Imode by
V*[QH]Imode shifts would be a nightmare.
So perhaps it might be better to drop the cond_{ashr,ashl,lshr}v*[qh]i optabs
and let combine merge a conditional move with the shifts and for
cond_{ashr,ashl,lshr}v*[sd]i make sure they can expand any count?
Actually
;; ../../gcc/config/i386/sse.md: 29296
(define_expand ("cond_ashlv16si")
     [
        (set (match_operand:V16SI 0 ("register_operand") (""))
            (vec_merge:V16SI (ashift:V16SI (match_operand:V16SI 2 ("register_operand") (""))
                    (match_operand:V16SI 3 ("nonimmediate_or_const_vec_dup_operand") ("")))
                (match_operand:V16SI 4 ("nonimm_or_0_operand") (""))
                (match_operand:HI 1 ("register_operand") (""))))
    ] ("TARGET_AVX512F") ("{
most likely handles any kind, maybe_legitimize_operand can handle broadcasting
of a scalar to vector.
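
The scalar-count versus vector-count distinction, and why broadcasting the scalar is enough for the [SD]Imode case, can be sketched at the source level as follows. This is only an illustration under the assumption of the GNU C vector extension; shl_scalar_count, shl_vector_count, broadcast and the checking helpers are hypothetical names, and the code says nothing about how maybe_legitimize_operand is implemented.

```c
#include <assert.h>

/* Four 32-bit lanes; unsigned so that left shifts are well defined.  */
typedef unsigned int V4 __attribute__ ((vector_size (16)));

/* Vector shifted by one scalar count (what ashl_optab covers).  */
static V4
shl_scalar_count (V4 v, unsigned int n)
{
  return v << n;
}

/* Vector shifted by per-element counts (what vashl_optab covers).  */
static V4
shl_vector_count (V4 v, V4 n)
{
  return v << n;
}

/* Duplicate a scalar count into every lane.  */
static V4
broadcast (unsigned int n)
{
  return (V4) { n, n, n, n };
}

/* Return 1 if shifting by scalar n and shifting by the broadcast of n
   give the same vector.  N must be less than 32.  */
int
scalar_and_broadcast_agree (V4 v, unsigned int n)
{
  V4 a = shl_scalar_count (v, n);
  V4 b = shl_vector_count (v, broadcast (n));

  for (int i = 0; i < 4; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}

/* Hypothetical self-check.  */
void
check_broadcast (void)
{
  V4 v = { 1u, 2u, 3u, 0x80000000u };

  assert (scalar_and_broadcast_agree (v, 0u));
  assert (scalar_and_broadcast_agree (v, 5u));
  assert (scalar_and_broadcast_agree (v, 31u));
}
```

An expander that accepts only a vector count can therefore still serve scalar counts once the count is broadcast. The QImode problem is different: there is no native per-element byte shift to fall back on.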

2025-11-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
            Summary|[16 Regression] ICE:        |[16 Regression] ICE:
                   |expand_insn, at             |expand_insn, at
                   |optabs.cc:8293 with         |optabs.cc:8293 with
                   |-mavx512f -mgfni            |-mavx512f -mgfni r16-3364

--- Comment #4 from Jakub Jelinek  ---
Started with r16-3364-g001cd39749f94ece8276b63f91eb864babb81a5d