Tamar Christina <tamar.christ...@arm.com> writes: >> -----Original Message----- >> From: Richard Sandiford <richard.sandif...@arm.com> >> Sent: Friday, September 30, 2022 9:29 AM >> To: Tamar Christina <tamar.christ...@arm.com> >> Cc: Richard Biener <rguent...@suse.de>; Tamar Christina via Gcc-patches >> <gcc-patches@gcc.gnu.org>; nd <n...@arm.com>; Jeff Law >> <jeffreya...@gmail.com> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional >> branches, give hint if argument is a truth type to backend >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> -----Original Message----- >> >> From: Gcc-patches <gcc-patches- >> >> bounces+tamar.christina=arm....@gcc.gnu.org> On Behalf Of Richard >> >> Biener via Gcc-patches >> >> Sent: Thursday, September 29, 2022 12:09 PM >> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org> >> >> Cc: Richard Sandiford <richard.sandif...@arm.com>; nd <n...@arm.com> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional >> >> branches, give hint if argument is a truth type to backend >> >> >> >> >> >> >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches >> >> > <gcc- >> >> patc...@gcc.gnu.org>: >> >> > >> >> > >> >> >> >> >> >> -----Original Message----- >> >> >> From: Richard Biener <rguent...@suse.de> >> >> >> Sent: Thursday, September 29, 2022 10:41 AM >> >> >> To: Richard Sandiford <richard.sandif...@arm.com> >> >> >> Cc: Jeff Law <jeffreya...@gmail.com>; Tamar Christina >> >> >> <tamar.christ...@arm.com>; gcc-patches@gcc.gnu.org; nd >> >> <n...@arm.com> >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of >> >> >> conditional branches, give hint if argument is a truth type to >> >> >> backend >> >> >> >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote: >> >> >>> >> >> >>> Jeff Law <jeffreya...@gmail.com> writes: >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote: >> >> >>>>> Tamar Christina <tamar.christ...@arm.com> writes: >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. >> >> Heh. >> >> >>>>>> But then I'd still need to change the expansion code. I >> >> >>>>>> suppose this could prevent the issue with changes to code on >> other targets. >> >> >>>>>> >> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should >> we >> >> >>>>>>>>> have aandcc >> >> >>>>>> pattern for this indicating support for andcc + jump as >> >> >>>>>> opposedto >> >> >> cmpcc + jump? >> >> >>>>>>>> This could work yeah. I didn't know these existed. >> >> >>>>>>> Ah, so they are conditional add, not add setting CC, so andcc >> >> >>>>>>> wouldn't be appropriate. >> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe >> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL is OK? >> >> >>>>>> I have a slight suspicion that Richard Sandiford would likely >> >> >>>>>> reject this though.. >> >> >>>>> Good guess :-P We shouldn't rely on something like that for >> >> >> correctness. >> >> >>>>> >> >> >>>>> Would it help if we promoted the test-and-branch instructions >> >> >>>>> to optabs, alongside cbranch? The jump expanders could then >> >> >>>>> target it >> >> >> directly. >> >> >>>>> >> >> >>>>> IMO that'd be a reasonable thing to do if it does help. It's a >> >> >>>>> relatively common operation, especially on CISCy targets. >> >> >>>> >> >> >>>> But don't we represent these single bit tests using zero_extract >> >> >>>> as the condition of the branch? I guess if we can generate them >> >> >>>> directly rather than waiting for combine to deduce that we're >> >> >>>> dealing with a single bit test and constructing the zero_extract >> >> >>>> form would be an improvement and might help aarch at the same >> time. >> >> >>> >> >> >>> Do you mean that the promote_mode stuff should use ext(z)v rather >> >> >>> than zero_extend to promote a bool, where available? If so, I >> >> >>> agree that might help. But it sounds like it would have downsides >> too. >> >> >>> Currently a bool memory can be zero-extended on the fly using a >> >> >>> load, but if we used the zero_extract form instead, we'd have to >> >> >>> extract the bit after the load. And (as an alternative) choosing >> >> >>> different behaviour based on whether expand sees a REG or a MEM >> >> >>> sounds like it could still cause problems, since REGs could be >> >> >>> replaced by MEMs (or vice versa) later in the RTL passes. >> >> >>> >> >> >>> ISTM that the original patch was inserting an extra operation in >> >> >>> the branch expansion in order to target a specific instruction. >> >> >>> Targeting the instruction in expand seems good, but IMO we should >> >> >>> do it directly, based on knowledge of whether the instruction >> >> >>> actually >> >> exists. >> >> >> >> >> >> Yes, I think a compare-and-branch pattern is the best fit here. >> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so >> >> >> even 8 bit precision bools only have 1 and 0 as meaningful values). >> >> >> So the 'compare-' bit in compare-and-branch would be interpreting >> >> >> a BOOLEAN_TYPE, not so much a general compare. >> >> > >> >> > Oh, I was thinking of adding a constant argument representing the >> >> > precision that is relevant for the compare in order to make this a >> >> > bit more >> >> general/future proof. >> >> > >> >> > Are you thinking I should instead just make the optab implicitly >> >> > only work for 1-bit precision comparisons? >> >> >> >> What’s the optab you propose (cite also the documentation part)? >> > >> > tbranchmode5 >> > Conditional branch instruction combined with a bit test instruction. >> Operand 0 is a comparison operator. >> > Operand 1 and Operand 2 are the first and second operands of the >> comparison, respectively. >> > Operand 3 is the number of low-order bits that are relevant for the >> comparison. >> > Operand 4 is the code_label to jump to. >> >> For the TB instructions (and for other similar instructions that I've seen on >> other architectures) it would be more useful to have a single-bit test, with >> operand 4 specifying the bit position. Arguably it might then be better to >> have separate eq and ne optabs, to avoid the awkward doubling of the >> operands (operand 1 contains operands 2 and 3). >> >> I guess a more general way of achieving the same thing would be to make >> operand 4 in the optab above a mask rather than a bit count. But that might >> be overly general, if there are no known architectures that have such an >> instruction. > > One of the reasons I wanted a range rather than a single bit is that I can > the use > this to generate cbz/cbnz early on as well.
We already have the opportunity to do that via cbranch<mode>4. But at the moment aarch64.md always forces the separate comparison instead. (Not sure why TBH. Does it enable more ifcvt opportunities?) If we change the body of cbranch<mode>4 to: if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE) || operands[2] != const0_rtx) { operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1], operands[2]); operands[2] = const0_rtx; } then we generate the cbz/cbnz directly. Thanks, Richard > This would mean we could use my earlier > patch that tried to drop the QI/HI promotions without needing the any_extend > additional > pass if we wanted to. > > We'd also no longer need to rely on seeing a paradoxical subreg for a tst. > > Tamar. > >> >> Thanks, >> Richard >> >> > Specifically this representation would allow us to emit all our >> > different conditional branching instructions without needing to rely >> > on combine. We have some cases that happen during optimization that >> > sometimes prevent the optimal sequence from being generated. This >> would also solve that as we would expand to what we want to start with. >> > >> > Tamar. >> > >> >> >> >> > >> >> > Thanks, >> >> > Tamar >> >> > >> >> >> >> >> >> Richard.