Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-02-03 Thread Richard Sandiford
Maxim Kuvyrkov  writes:
> Hi Richard,
>
> You patch below increases code-size on aarch64-linux-gnu with -Os on SPEC2k6 
> 400.perlbench and 453.povray -- by 1% and 2% respectively.
>
> 400.perlbench,perlbench_base.default, 101,939261,951221
> 453.povray,povray_base.default,   102,707807,721399
>
> Would you please check whether these can be avoided?  [Let me know if you 
> need help reproducing this problem.]

I reverted the patch on Wednesday, see:
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01962.html

Thanks,
Richard

>
> Thank you,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>> On Jan 27, 2020, at 7:41 PM, Richard Sandiford  
>> wrote:
>> 
>> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
>> 
>> Failed to match this instruction:
>> (set (reg:SI 95)
>>(ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>>(const_int 3 [0x3])
>>(const_int 0 [0])) 0)
>>(const_int 19 [0x13])))
>> 
>> If we perform the natural simplification to:
>> 
>> (set (reg:SI 95)
>>(ashift:SI (sign_extract:SI (reg:SI 97)
>>(const_int 3 [0x3])
>>(const_int 0 [0])) 0)
>>(const_int 19 [0x13])))
>> 
>> then the pattern matches.  And it turns out that we do have a
>> simplification like that already, but it would only kick in for
>> extractions from a reg, not a subreg.  E.g.:
>> 
>> (set (reg:SI 95)
>>(ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>>(const_int 3 [0x3])
>>(const_int 0 [0])) 0)
>>(const_int 19 [0x13])))
>> 
>> would simplify to:
>> 
>> (set (reg:SI 95)
>>(ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>>(const_int 3 [0x3])
>>(const_int 0 [0])) 0)
>>(const_int 19 [0x13])))
>> 
>> IMO the subreg case is even more obviously a simplification
>> than the bare reg case, since the net effect is to remove
>> either one or two subregs, rather than simply change the
>> position of a subreg/truncation.
>> 
>> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
>> for -m32 on x86_64-linux-gnu, because we could then simplify
>> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
>> pattern did already seem to want to handle QImode extractions:
>> 
>>  "ix86_match_ccmode (insn, CCNOmode)
>>   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>>   || GET_MODE (operands[2]) == SImode
>>   || GET_MODE (operands[2]) == HImode
>>   || GET_MODE (operands[2]) == QImode)
>> 
>> but I'm not sure how often the QI case would trigger in practice,
>> since the zero_extract mode was restricted to HI and above.  I checked
>> the other x86 patterns and couldn't see any other instances of this.
>> 
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
>> OK to install?
>> 
>> Richard
>> 
>> 
>> 2020-01-27  Richard Sandiford  
>> 
>> gcc/
>>  PR rtl-optimization/87763
>>  * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>>  simplification to handle subregs as well as bare regs.
>>  * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
>> ---
>> gcc/config/i386/i386.md | 2 +-
>> gcc/simplify-rtx.c  | 4 +++-
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index 6e9c9bd2fb6..a125ab350bb 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2"
>> (define_insn_and_split "*testqi_ext_3"
>>   [(set (match_operand 0 "flags_reg_operand")
>> (match_operator 1 "compare_operator"
>> -  [(zero_extract:SWI248
>> +  [(zero_extract:SWI
>>   (match_operand 2 "nonimmediate_operand" "rm")
>>   (match_operand 3 "const_int_operand" "n")
>>   (match_operand 4 "const_int_operand" "n"))
>> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
>> index eff1d07a253..db4f9339c15 100644
>> --- a/gcc/simplify-rtx.c
>> +++ b/gcc/simplify-rtx.c
>> @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op,
>>  (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
>>  changing len.  */
>>   if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
>> -  && REG_P (XEXP (op, 0))
>> +  && (REG_P (XEXP (op, 0))
>> +  || (SUBREG_P (XEXP (op, 0))
>> +  && REG_P (SUBREG_REG (XEXP (op, 0)
>>   && GET_MODE (XEXP (op, 0)) == GET_MODE (op)
>>   && CONST_INT_P (XEXP (op, 1))
>>   && CONST_INT_P (XEXP (op, 2)))
>> -- 
>> 2.17.1
>> 


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-02-03 Thread Maxim Kuvyrkov
Hi Richard,

You patch below increases code-size on aarch64-linux-gnu with -Os on SPEC2k6 
400.perlbench and 453.povray -- by 1% and 2% respectively.

400.perlbench,perlbench_base.default,   101,939261,951221
453.povray,povray_base.default, 102,707807,721399

Would you please check whether these can be avoided?  [Let me know if you need 
help reproducing this problem.]

Thank you,

--
Maxim Kuvyrkov
https://www.linaro.org

> On Jan 27, 2020, at 7:41 PM, Richard Sandiford  
> wrote:
> 
> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
> 
> Failed to match this instruction:
> (set (reg:SI 95)
>(ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>(const_int 3 [0x3])
>(const_int 0 [0])) 0)
>(const_int 19 [0x13])))
> 
> If we perform the natural simplification to:
> 
> (set (reg:SI 95)
>(ashift:SI (sign_extract:SI (reg:SI 97)
>(const_int 3 [0x3])
>(const_int 0 [0])) 0)
>(const_int 19 [0x13])))
> 
> then the pattern matches.  And it turns out that we do have a
> simplification like that already, but it would only kick in for
> extractions from a reg, not a subreg.  E.g.:
> 
> (set (reg:SI 95)
>(ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>(const_int 3 [0x3])
>(const_int 0 [0])) 0)
>(const_int 19 [0x13])))
> 
> would simplify to:
> 
> (set (reg:SI 95)
>(ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>(const_int 3 [0x3])
>(const_int 0 [0])) 0)
>(const_int 19 [0x13])))
> 
> IMO the subreg case is even more obviously a simplification
> than the bare reg case, since the net effect is to remove
> either one or two subregs, rather than simply change the
> position of a subreg/truncation.
> 
> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
> for -m32 on x86_64-linux-gnu, because we could then simplify
> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
> pattern did already seem to want to handle QImode extractions:
> 
>  "ix86_match_ccmode (insn, CCNOmode)
>   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>   || GET_MODE (operands[2]) == SImode
>   || GET_MODE (operands[2]) == HImode
>   || GET_MODE (operands[2]) == QImode)
> 
> but I'm not sure how often the QI case would trigger in practice,
> since the zero_extract mode was restricted to HI and above.  I checked
> the other x86 patterns and couldn't see any other instances of this.
> 
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> OK to install?
> 
> Richard
> 
> 
> 2020-01-27  Richard Sandiford  
> 
> gcc/
>   PR rtl-optimization/87763
>   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>   simplification to handle subregs as well as bare regs.
>   * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
> ---
> gcc/config/i386/i386.md | 2 +-
> gcc/simplify-rtx.c  | 4 +++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6e9c9bd2fb6..a125ab350bb 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2"
> (define_insn_and_split "*testqi_ext_3"
>   [(set (match_operand 0 "flags_reg_operand")
> (match_operator 1 "compare_operator"
> -   [(zero_extract:SWI248
> +   [(zero_extract:SWI
>(match_operand 2 "nonimmediate_operand" "rm")
>(match_operand 3 "const_int_operand" "n")
>(match_operand 4 "const_int_operand" "n"))
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index eff1d07a253..db4f9339c15 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op,
>  (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
>  changing len.  */
>   if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
> -  && REG_P (XEXP (op, 0))
> +  && (REG_P (XEXP (op, 0))
> +   || (SUBREG_P (XEXP (op, 0))
> +   && REG_P (SUBREG_REG (XEXP (op, 0)
>   && GET_MODE (XEXP (op, 0)) == GET_MODE (op)
>   && CONST_INT_P (XEXP (op, 1))
>   && CONST_INT_P (XEXP (op, 2)))
> -- 
> 2.17.1
> 



Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-31 Thread Jeff Law
On Thu, 2020-01-30 at 17:54 +, Richard Sandiford wrote:
> Jeff Law  writes:
> > On Wed, 2020-01-29 at 19:18 +, Richard Sandiford wrote:
> > > Andreas Schwab  writes:
> > > > On Jan 27 2020, Richard Sandiford wrote:
> > > > 
> > > > >   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
> > > > >   simplification to handle subregs as well as bare regs.
> > > > 
> > > > That breaks gcc.target/m68k/pr39726.c
> > > 
> > > Gah.  Jeff pointed out off-list that it also broke
> > > gcc.target/sh/pr64345-1.c on sh3-linux-gnu.  It didn't look like either
> > > of them would be fixable in an acceptably non-invasive and unhacky way,
> > > so I've reverted the patch "for now".
> > I would have considered letting those two targets regress those tests
> > to move forward on 87763.  aarch64 is (IMHO) more important than the sh
> > and m68k combined ;-)  It also seems to me that your patch generates
> > better RTL and that we could claim that a port that regresses on code
> > quality needs its port maintainer to step in and fix the port.
> > 
> > WRT the m68k issue I'd think it could be fixed by twiddling
> > cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the
> > zero_extract and making some minor adjustments in its output code. 
> > Something like the attached.  I haven't tested it in any real way and
> > haven't really thought about whether or not it does the right thing for
> > a MEM operand.
> 
> It just feels like this is breaking the contract with extv and extzv,
> where all *_extracts are supposed to be in the mode that those patterns
> accept.  My i386 patch was doing that too TBH, I was just being
> optimistic given that QImode was already handled by that pattern. :-)
> 
> So IMO my patch has too many knock-on effects for it to be suitable for
> stage 4.  While we have make_extraction, we probably have to leave this
> kind of expression untouched.
> 
> For AArch64 I was planning on adding a new pattern to match the
> (subreg:SI (zero_extract:DI ...)) -- as one of the comments in the
> PR suggested -- but with the subreg matched via subreg_lowpart_operator,
> to make it endian-safe.  This is similar in spirit to the new i686
> popcount patterns.
Fair enough on the simplify-rtx change :-)  One might argue that we
should be loosening the requirements, but memory operands are
particularly troublesome.  I think HP and I had a light discussion on
that a year or so ago.

WRT adding patterns to aarch64.  Yea, I went that route last year, but
never was able to go from "hey, this seems to work pretty well" to
"it's OK for the trunk".

Jeff



Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-30 Thread Richard Sandiford
Jeff Law  writes:
> On Wed, 2020-01-29 at 19:18 +, Richard Sandiford wrote:
>> Andreas Schwab  writes:
>> > On Jan 27 2020, Richard Sandiford wrote:
>> > 
>> > >  * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>> > >  simplification to handle subregs as well as bare regs.
>> > 
>> > That breaks gcc.target/m68k/pr39726.c
>> 
>> Gah.  Jeff pointed out off-list that it also broke
>> gcc.target/sh/pr64345-1.c on sh3-linux-gnu.  It didn't look like either
>> of them would be fixable in an acceptably non-invasive and unhacky way,
>> so I've reverted the patch "for now".
> I would have considered letting those two targets regress those tests
> to move forward on 87763.  aarch64 is (IMHO) more important than the sh
> and m68k combined ;-)  It also seems to me that your patch generates
> better RTL and that we could claim that a port that regresses on code
> quality needs its port maintainer to step in and fix the port.
>
> WRT the m68k issue I'd think it could be fixed by twiddling
> cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the
> zero_extract and making some minor adjustments in its output code. 
> Something like the attached.  I haven't tested it in any real way and
> haven't really thought about whether or not it does the right thing for
> a MEM operand.

It just feels like this is breaking the contract with extv and extzv,
where all *_extracts are supposed to be in the mode that those patterns
accept.  My i386 patch was doing that too TBH, I was just being
optimistic given that QImode was already handled by that pattern. :-)

So IMO my patch has too many knock-on effects for it to be suitable for
stage 4.  While we have make_extraction, we probably have to leave this
kind of expression untouched.

For AArch64 I was planning on adding a new pattern to match the
(subreg:SI (zero_extract:DI ...)) -- as one of the comments in the
PR suggested -- but with the subreg matched via subreg_lowpart_operator,
to make it endian-safe.  This is similar in spirit to the new i686
popcount patterns.

Thanks,
Richard

>
> I'd be surprised if the SH fix wasn't similar, but I know almost
> nothing about the SH implementation.
>
> Jeff


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-30 Thread Jeff Law
On Thu, 2020-01-30 at 18:27 +0100, Jakub Jelinek wrote:
> On Thu, Jan 30, 2020 at 10:23:35AM -0700, Jeff Law wrote:
> > diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
> > index 8e35357ea23..78c4cbe4753 100644
> > --- a/gcc/config/m68k/m68k.md
> > +++ b/gcc/config/m68k/m68k.md
> > @@ -644,12 +644,12 @@
> >return m68k_output_branch_integer (code);
> >  })
> >  
> > -(define_insn "cbranchsi4_btst_reg_insn_1"
> > +(define_insn "cbranch4_btst_reg_insn_1"
> >[(set (pc)
> > (if_then_else (match_operator 0 "equality_comparison_operator"
> > -  [(zero_extract:SI (match_operand:SI 1 
> > "nonimmediate_operand" "do,dQ")
> > -(const_int 1)
> > -(match_operand:SI 2 
> > "const_int_operand" "n,n"))
> > +  [(zero_extract:I (match_operand:I 1 
> > "nonimmediate_operand" "do,dQ")
> > +   (const_int 1)
> > +   (match_operand:I 2 "const_int_operand" 
> > "n,n"))
> > (const_int 0)])
> >   (label_ref (match_operand 3 ""))
> >   (pc)))]
> > @@ -665,8 +665,9 @@
> >  }
> >else
> >  {
> > -  operands[2] = GEN_INT (31 - INTVAL (operands[2]));
> > -  code = m68k_output_btst (operands[2], operands[1], code, 31);
> > +  operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1]))
> > +- INTVAL (operands[2]) - 1);
> > +  code = m68k_output_btst (operands[2], operands[1], code, 
> > GET_MODE_BITSIZE (GET_MODE (operands[1])) - 1);
> 
> s/GET_MODE (operands[1])/mode/g ?
> Also, the last line is too long.
It's not meant for inclusion as-is, but to show how we might fix this
and allow us to move forward on 87763.
Jeff



Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-30 Thread Jakub Jelinek
On Thu, Jan 30, 2020 at 10:23:35AM -0700, Jeff Law wrote:
> diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
> index 8e35357ea23..78c4cbe4753 100644
> --- a/gcc/config/m68k/m68k.md
> +++ b/gcc/config/m68k/m68k.md
> @@ -644,12 +644,12 @@
>return m68k_output_branch_integer (code);
>  })
>  
> -(define_insn "cbranchsi4_btst_reg_insn_1"
> +(define_insn "cbranch4_btst_reg_insn_1"
>[(set (pc)
>   (if_then_else (match_operator 0 "equality_comparison_operator"
> -[(zero_extract:SI (match_operand:SI 1 
> "nonimmediate_operand" "do,dQ")
> -  (const_int 1)
> -  (match_operand:SI 2 
> "const_int_operand" "n,n"))
> +[(zero_extract:I (match_operand:I 1 
> "nonimmediate_operand" "do,dQ")
> + (const_int 1)
> + (match_operand:I 2 "const_int_operand" 
> "n,n"))
>   (const_int 0)])
> (label_ref (match_operand 3 ""))
> (pc)))]
> @@ -665,8 +665,9 @@
>  }
>else
>  {
> -  operands[2] = GEN_INT (31 - INTVAL (operands[2]));
> -  code = m68k_output_btst (operands[2], operands[1], code, 31);
> +  operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1]))
> +  - INTVAL (operands[2]) - 1);
> +  code = m68k_output_btst (operands[2], operands[1], code, 
> GET_MODE_BITSIZE (GET_MODE (operands[1])) - 1);

s/GET_MODE (operands[1])/mode/g ?
Also, the last line is too long.

Jakub



Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-30 Thread Jeff Law
On Wed, 2020-01-29 at 19:18 +, Richard Sandiford wrote:
> Andreas Schwab  writes:
> > On Jan 27 2020, Richard Sandiford wrote:
> > 
> > >   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
> > >   simplification to handle subregs as well as bare regs.
> > 
> > That breaks gcc.target/m68k/pr39726.c
> 
> Gah.  Jeff pointed out off-list that it also broke
> gcc.target/sh/pr64345-1.c on sh3-linux-gnu.  It didn't look like either
> of them would be fixable in an acceptably non-invasive and unhacky way,
> so I've reverted the patch "for now".
I would have considered letting those two targets regress those tests
to move forward on 87763.  aarch64 is (IMHO) more important than the sh
and m68k combined ;-)  It also seems to me that your patch generates
better RTL and that we could claim that a port that regresses on code
quality needs its port maintainer to step in and fix the port.

WRT the m68k issue I'd think it could be fixed by twiddling
cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the
zero_extract and making some minor adjustments in its output code. 
Something like the attached.  I haven't tested it in any real way and
haven't really thought about whether or not it does the right thing for
a MEM operand.

I'd be surprised if the SH fix wasn't similar, but I know almost
nothing about the SH implementation.

Jeff
diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 8e35357ea23..78c4cbe4753 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -644,12 +644,12 @@
   return m68k_output_branch_integer (code);
 })
 
-(define_insn "cbranchsi4_btst_reg_insn_1"
+(define_insn "cbranch4_btst_reg_insn_1"
   [(set (pc)
 	(if_then_else (match_operator 0 "equality_comparison_operator"
-		   [(zero_extract:SI (match_operand:SI 1 "nonimmediate_operand" "do,dQ")
-	 (const_int 1)
-	 (match_operand:SI 2 "const_int_operand" "n,n"))
+		   [(zero_extract:I (match_operand:I 1 "nonimmediate_operand" "do,dQ")
+	(const_int 1)
+	(match_operand:I 2 "const_int_operand" "n,n"))
 			(const_int 0)])
 		  (label_ref (match_operand 3 ""))
 		  (pc)))]
@@ -665,8 +665,9 @@
 }
   else
 {
-  operands[2] = GEN_INT (31 - INTVAL (operands[2]));
-  code = m68k_output_btst (operands[2], operands[1], code, 31);
+  operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1]))
+			 - INTVAL (operands[2]) - 1);
+  code = m68k_output_btst (operands[2], operands[1], code, GET_MODE_BITSIZE (GET_MODE (operands[1])) - 1);
 }
   return m68k_output_branch_integer (code);
 }


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-29 Thread Richard Sandiford
Andreas Schwab  writes:
> On Jan 27 2020, Richard Sandiford wrote:
>
>>  * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>>  simplification to handle subregs as well as bare regs.
>
> That breaks gcc.target/m68k/pr39726.c

Gah.  Jeff pointed out off-list that it also broke
gcc.target/sh/pr64345-1.c on sh3-linux-gnu.  It didn't look like either
of them would be fixable in an acceptably non-invasive and unhacky way,
so I've reverted the patch "for now".

Thanks,
Richard


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-29 Thread Andreas Schwab
On Jan 27 2020, Richard Sandiford wrote:

>   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>   simplification to handle subregs as well as bare regs.

That breaks gcc.target/m68k/pr39726.c

@@ -67,39 +67,39 @@ Disassembly of section .text:
 0088 :
   88:  302f 0006   movew %sp@(6),%d0
   8c:  906f 000a   subw %sp@(10),%d0
-  90:  4a00tstb %d0
-  92:  6a08bpls 9c 
-  94:  13fc 0001   moveb #1,0 
-  9a:   
-   98: R_68K_32v
-  9c:  4e75rts
+  90:  0240 0080   andiw #128,%d0
+  94:  6708beqs 9e 
+  96:  13fc 0001   moveb #1,0 
+  9c:   
+   9a: R_68K_32v
+  9e:  4e75rts
 
-009e :
-  9e:  302f 0006   movew %sp@(6),%d0
-  a2:  d06f 000a   addw %sp@(10),%d0
-  a6:  4a00tstb %d0
-  a8:  6a08bpls b2 
-  aa:  13fc 0001   moveb #1,0 
-  b0:   
-   ae: R_68K_32v
-  b2:  4e75rts
+00a0 :
+  a0:  302f 0006   movew %sp@(6),%d0
+  a4:  d06f 000a   addw %sp@(10),%d0
+  a8:  0240 0080   andiw #128,%d0
+  ac:  6708beqs b6 
+  ae:  13fc 0001   moveb #1,0 
+  b4:   
+   b2: R_68K_32v
+  b6:  4e75rts
 
-00b4 :
-  b4:  302f 0006   movew %sp@(6),%d0
-  b8:  906f 000a   subw %sp@(10),%d0
-  bc:  0240 8421   andiw #-31711,%d0
-  c0:  6708beqs ca 
-  c2:  13fc 0001   moveb #1,0 
-  c8:   
-   c6: R_68K_32v
-  ca:  4e75rts
+00b8 :
+  b8:  302f 0006   movew %sp@(6),%d0
+  bc:  906f 000a   subw %sp@(10),%d0
+  c0:  0240 8421   andiw #-31711,%d0
+  c4:  6708beqs ce 
+  c6:  13fc 0001   moveb #1,0 
+  cc:   
+   ca: R_68K_32v
+  ce:  4e75rts
 
-00cc :
-  cc:  302f 0006   movew %sp@(6),%d0
-  d0:  d06f 000a   addw %sp@(10),%d0
-  d4:  0240 8421   andiw #-31711,%d0
-  d8:  6708beqs e2 
-  da:  13fc 0001   moveb #1,0 
-  e0:   
-   de: R_68K_32v
-  e2:  4e75rts
+00d0 :
+  d0:  302f 0006   movew %sp@(6),%d0
+  d4:  d06f 000a   addw %sp@(10),%d0
+  d8:  0240 8421   andiw #-31711,%d0
+  dc:  6708beqs e6 
+  de:  13fc 0001   moveb #1,0 
+  e4:   
+   e2: R_68K_32v
+  e6:  4e75rts

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-28 Thread Richard Sandiford
Jeff Law  writes:
> On Mon, 2020-01-27 at 16:41 +, Richard Sandiford wrote:
>> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
>> 
>> Failed to match this instruction:
>> (set (reg:SI 95)
>> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> If we perform the natural simplification to:
>> 
>> (set (reg:SI 95)
>> (ashift:SI (sign_extract:SI (reg:SI 97)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> then the pattern matches.  And it turns out that we do have a
>> simplification like that already, but it would only kick in for
>> extractions from a reg, not a subreg.  E.g.:
> Yea.  I ran into similar problems with the extract/extend bits in
> combine.  And we know it's a fairly general problem that we don't
> handle SUBREGs anywhere near as consistently as REGs.
>
>
>> 
>> (set (reg:SI 95)
>> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> would simplify to:
>> 
>> (set (reg:SI 95)
>> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>> (const_int 3 [0x3])
>> (const_int 0 [0])) 0)
>> (const_int 19 [0x13])))
>> 
>> IMO the subreg case is even more obviously a simplification
>> than the bare reg case, since the net effect is to remove
>> either one or two subregs, rather than simply change the
>> position of a subreg/truncation.
>> 
>> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
>> for -m32 on x86_64-linux-gnu, because we could then simplify
>> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
>> pattern did already seem to want to handle QImode extractions:
>> 
>>   "ix86_match_ccmode (insn, CCNOmode)
>>&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>>|| GET_MODE (operands[2]) == SImode
>>|| GET_MODE (operands[2]) == HImode
>>|| GET_MODE (operands[2]) == QImode)
>> 
>> but I'm not sure how often the QI case would trigger in practice,
>> since the zero_extract mode was restricted to HI and above.  I checked
>> the other x86 patterns and couldn't see any other instances of this.
>> 
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
>> OK to install?
>> 
>> Richard
>> 
>> 
>> 2020-01-27  Richard Sandiford  
>> 
>> gcc/
>>  PR rtl-optimization/87763
>>  * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>>  simplification to handle subregs as well as bare regs.
>>  * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
> Do you need to check for and reject paradoxicals here?  If not, OK as-
> is.  If you need to check, then that's pre-approved as well.

Thanks, pushed.

On the paradoxical thing: the outer subreg is always non-paradoxical
for simplify_truncation, so we don't need to check that specifically.
For the inner subreg we need to handle the paradoxical case for the PR.

I wondered at one point about punting for non-paradoxical inner subregs,
but there didn't seem to be a good reason to keep an expression like
(truncate (op (truncate...))) over (op (truncate ...)).  If it turns
out there is a good reason though, changing SUBREG_P to
paradoxical_subreg_p would be fine in terms of this PR.

Richard


Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-27 Thread Jeff Law
On Mon, 2020-01-27 at 16:41 +, Richard Sandiford wrote:
> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
> 
> Failed to match this instruction:
> (set (reg:SI 95)
> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> If we perform the natural simplification to:
> 
> (set (reg:SI 95)
> (ashift:SI (sign_extract:SI (reg:SI 97)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> then the pattern matches.  And it turns out that we do have a
> simplification like that already, but it would only kick in for
> extractions from a reg, not a subreg.  E.g.:
Yea.  I ran into similar problems with the extract/extend bits in
combine.  And we know it's a fairly general problem that we don't
handle SUBREGs anywhere near as consistently as REGs.


> 
> (set (reg:SI 95)
> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> would simplify to:
> 
> (set (reg:SI 95)
> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> IMO the subreg case is even more obviously a simplification
> than the bare reg case, since the net effect is to remove
> either one or two subregs, rather than simply change the
> position of a subreg/truncation.
> 
> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
> for -m32 on x86_64-linux-gnu, because we could then simplify
> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
> pattern did already seem to want to handle QImode extractions:
> 
>   "ix86_match_ccmode (insn, CCNOmode)
>&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>|| GET_MODE (operands[2]) == SImode
>|| GET_MODE (operands[2]) == HImode
>|| GET_MODE (operands[2]) == QImode)
> 
> but I'm not sure how often the QI case would trigger in practice,
> since the zero_extract mode was restricted to HI and above.  I checked
> the other x86 patterns and couldn't see any other instances of this.
> 
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> OK to install?
> 
> Richard
> 
> 
> 2020-01-27  Richard Sandiford  
> 
> gcc/
>   PR rtl-optimization/87763
>   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>   simplification to handle subregs as well as bare regs.
>   * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
Do you need to check for and reject paradoxicals here?  If not, OK as-
is.  If you need to check, then that's pre-approved as well.

jeff
>