On Wed, 9 Jul 2025, Tamar Christina wrote:

> (on mobile so doing a top reply)
>
> > So it isn't as efficient to use cbranch_any (g != 0) here?  I think it
> > should be practically equivalent...
>
> Ah yeah, it can expand what we currently expand vector boolean to.
>
> I was initially confused because for SVE what we want here is an ORRS
> (flag-setting inclusive ORR).
>
> Using this optab we can get to that in an easier way too.
>
> So yeah, I agree, cbranch for vectors can be deprecated.
>
> Note that in my patch I named the new ones vec_cbranch_any/all to
> implicitly say they are vector-only.
>
> Do you want to fully deprecate cbranch for vectors?
Yes, I think that would remove confusion.

> This would mean though that all target checks need to be updated unless
> we update the support checks with a helper?
>
> Thanks,
> Tamar
>
> ________________________________
> From: Richard Biener <rguent...@suse.de>
> Sent: Wednesday, July 9, 2025 1:24 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <n...@arm.com>
> Subject: RE: [PATCH 1/3]middle-end: support vec_cbranch_any and
> vec_cbranch_all [PR118974]
>
> On Wed, 9 Jul 2025, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Richard Biener <rguent...@suse.de>
> > > Sent: Wednesday, July 9, 2025 12:36 PM
> > > To: Tamar Christina <tamar.christ...@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> > > Subject: RE: [PATCH 1/3]middle-end: support vec_cbranch_any and
> > > vec_cbranch_all [PR118974]
> > >
> > > On Wed, 9 Jul 2025, Tamar Christina wrote:
> > >
> > > > > > +Operand 0 is a comparison operator.  Operand 1 and operand 2 are the
> > > > > > +first and second operands of the comparison, respectively.  Operand 3
> > > > > > +is the @code{code_label} to jump to.
> > > > > > +
> > > > > > +@cindex @code{cbranch_all@var{mode}4} instruction pattern
> > > > > > +@item @samp{cbranch_all@var{mode}4}
> > > > > > +Conditional branch instruction combined with a compare instruction on vectors
> > > > > > +where it is required that all of the elementwise comparisons of the
> > > > > > +two input vectors are true.
> > > > >
> > > > > See above.
> > > > >
> > > > > When I look at the RTL for aarch64 I wonder whether the middle-end
> > > > > can still invert a jump (for BB reorder, for example)?  Without
> > > > > a cbranch_none expander we have to invert during RTL expansion?
> > > >
> > > > Isn't cbranch_none just cbranch_all x 0?  i.e. all values must be
> > > > zero.
> > > > I think all states are expressible with any and all and flipping
> > > > the branches, so it shouldn't be any more restrictive than cbranch
> > > > itself is today.
> > > >
> > > > cbranch also only supports eq and ne, so none would be cbranch (eq x 0)
> > > >
> > > > and FTR the RTL generated for AArch64 (both SVE and Adv.SIMD) will
> > > > be simplified to:
> > > >
> > > > (insn 23 22 24 5 (parallel [
> > > >             (set (reg:VNx4BI 128 [ mask_patt_14.15_57 ])
> > > >                 (unspec:VNx4BI [
> > > >                         (reg:VNx4BI 129)
> > > >                         (const_int 0 [0x0])
> > > >                         (gt:VNx4BI (reg:VNx4SI 114 [ vect__2.11 ])
> > > >                             (const_vector:VNx4SI repeat [
> > > >                                     (const_int 0 [0])
> > > >                                 ]))
> > > >                     ] UNSPEC_PRED_Z))
> > > >             (clobber (reg:CC_NZC 66 cc))
> > > >         ]) "cbranch.c":25:10 -1
> > > >
> > > > (jump_insn 27 26 28 5 (set (pc)
> > > >         (if_then_else (eq (reg:CC_Z 66 cc)
> > > >                 (const_int 0 [0]))
> > > >             (label_ref 33)
> > > >             (pc))) "cbranch.c":25:10 -1
> > > >     (int_list:REG_BR_PROB 1014686025 (nil))
> > > >
> > > > The thing is we can't get rid of the unspecs as there's no concept
> > > > of masking in RTL compares.  We could technically do an AND (and do
> > > > in some cases) but then you lose the predicate hint constant in the
> > > > RTL which tells you whether the mask is known to be all true or
> > > > not.  This hint is crucial to allow for further optimizations.
> > > >
> > > > That said, the condition code, branch and compares are fully
> > > > exposed.
> > > >
> > > > We expand to a larger sequence than I'd like, mostly because
> > > > there's no support for conditional cbranch optabs, or even
> > > > conditional vector comparisons.  So the comparisons must be
> > > > generated unpredicated by generating an all-true mask, and later
> > > > patterns merge in the AND.
> > > >
> > > > The new patterns allow us to clean up codegen for Adv.SIMD + SVE
> > > > (in a single loop) but not pure SVE.
> > > > For the latter I take a different approach to try to avoid
> > > > requiring a predicated version of these optabs.
> > > >
> > > > I don't want to push my luck, but would you be ok with a
> > > > conditional version of these optabs too? i.e. cond_cbranch_any and
> > > > cond_cbranch_all?  This would allow us to immediately expand to the
> > > > correct representation for both SVE and Adv.SIMD without having to
> > > > rely on various combine patterns and cc-fusion to optimize the
> > > > sequences later on (which has historically been a bit hit or miss
> > > > if someone adds a new CC pattern).
> > >
> > > Can you explain?  A conditional conditional branch makes my head hurt.
> > > It's really a cbranch_{all,any} where the (vector) compare has an
> > > additional mask applied?  So cbranch_cond_{all,any} would be a better
> > > fit?
> >
> > Yeah, so cbranch is itself in GIMPLE
> >
> >   c = vec_a `cmp` vec_b
> >   if (c {any,all} 0)
> >     ...
> >
> > where cbranch_{all,any} represents the gimple
> >
> >   if (vec_a `cmp` vec_b)
> >     ...
> >
> > cbranch_cond_{all,any} would represent
> >
> >   if ((vec_a `cmp` vec_b) & loop_mask)
> >     ...
> >
> > In GIMPLE we mask most operations by ANDing the mask with the result
> > of the operation.  But cbranch doesn't have an LHS, so we can't wrap
> > the & around anything.  And because of this we rely on backend
> > patterns to later push the mask from the & into the operation such
> > that we can generate the predicated compare.
>
> OK, so we could already do
>
>   c = .COND_`cmp` (vec_a, vec_b, loop_mask, { 0, 0... });
>   if (c {any,all} 0)
>
> but I can see how cond_cbranch is useful (besides choosing a proper
> name).
>
> > Because of this we end up requiring patterns such as
> >
> > ;; Predicated integer comparisons, formed by combining a PTRUE-predicated
> > ;; comparison with an AND in which only the flags result is interesting.
> > (define_insn_and_rewrite "*cmp<cmp_op><mode>_and_ptest"
> >   [(set (reg:CC_Z CC_REGNUM)
> >         (unspec:CC_Z
> >           [(match_operand:VNx16BI 1 "register_operand")
> >            (match_operand 4)
> >            (const_int SVE_KNOWN_PTRUE)
> >            (and:<VPRED>
> >              (unspec:<VPRED>
> >                [(match_operand 5)
> >                 (const_int SVE_KNOWN_PTRUE)
> >                 (SVE_INT_CMP:<VPRED>
> >                   (match_operand:SVE_I 2 "register_operand")
> >                   (match_operand:SVE_I 3 "aarch64_sve_cmp_<sve_imm_con>_operand"))]
> >                UNSPEC_PRED_Z)
> >              (match_operand:<VPRED> 6 "register_operand"))]
> >           UNSPEC_PTEST))
> >    (clobber (match_scratch:<VPRED> 0))]
> >   "TARGET_SVE"
> >   {@ [ cons: =0, 1  , 2, 3            ; attrs: pred_clobber ]
> >      [ &Upa    , Upl, w, <sve_imm_con>; yes                 ] cmp<cmp_op>\t%0.<Vetype>, %6/z, %2.<Vetype>, #%3
> >      [ ?Upl    , 0  , w, <sve_imm_con>; yes                 ] ^
> >      [ Upa     , Upl, w, <sve_imm_con>; no                  ] ^
> >      [ &Upa    , Upl, w, w            ; yes                 ] cmp<cmp_op>\t%0.<Vetype>, %6/z, %2.<Vetype>, %3.<Vetype>
> >      [ ?Upl    , 0  , w, w            ; yes                 ] ^
> >      [ Upa     , Upl, w, w            ; no                  ] ^
> >   }
> >   "&& !rtx_equal_p (operands[4], operands[5])"
> >   {
> >     operands[5] = copy_rtx (operands[4]);
> >   }
> > )
> >
> > and rely on combine to eventually connect enough pieces back together.
> > With the mask cbranch patterns we can just emit the simplified RTL
> >
> > (insn 23 22 24 5 (parallel [
> >             (set (reg:VNx4BI 128 [ mask_patt_14.15_57 ])
> >                 (unspec:VNx4BI [
> >                         (reg:VNx4BI 129)
> >                         (const_int 0 [0x0])
> >                         (gt:VNx4BI (reg:VNx4SI 114 [ vect__2.11 ])
> >                             (const_vector:VNx4SI repeat [
> >                                     (const_int 0 [0])
> >                                 ]))
> >                     ] UNSPEC_PRED_Z))
> >             (clobber (reg:CC_NZC 66 cc))
> >         ]) "cbranch.c":25:10 -1
> >
> > immediately out of expand.
>
> > > Though 'cond_' elsewhere suggests there is an else value, instead
> > > cbranch_mask_{all,any}?  Or would that not capture things exactly?
> >
> > Yeah, mask is fine too; I only suggested cond since that's what we
> > normally use.  But mask works too.
>
> cond_cbranch didn't seem to imply the cond_ is applied to the compare,
> but from a 2nd look it might be OK after all.
>
> > > cbranch is compare-and-branch, so masked-compare-and-branch aka
> > > mcbranch[_{all,any}]?
> > >
> > > And writing this and not trying to remember everything said, it
> > > appears that 'cbranch' itself (for vectors) becomes redundant?
> >
> > For the most part, but not when unrolling is involved, where the
> > compare is done on the reduction of masks.  i.e. for
> >
> >   a = b > c
> >   d = e > f
> >   g = d | a
> >   if (g != 0)
> >     ...
> >
> > we can't really use the new optabs as `g` itself is not a comparison.
>
> So it isn't as efficient to use cbranch_any (g != 0) here?  I think
> it should be practically equivalent...
>
> Richard.
>
> > For VLA though this happens less often due to the support for
> > widening loads and narrowing stores.
> >
> > Thanks,
> > Tamar
> >
> > > Richard.
> > >
> > > > And the reason for both is that for Adv.SIMD there's no mask at
> > > > the GIMPLE level and we have to make it during expand.
> > > >
> > > > Thanks,
> > > > Tamar

--
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)