Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-20 Thread Richard Henderson
On 04/19/2011 02:52 AM, Denis Chertykov wrote:
> 2011/4/19 Georg-Johann Lay :
>> How can add, sub etc. be split? This would need an explicit
>> representation of carry.
> 
> Yes.
> 
> Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html

Well, sort-of, but not really.

It gets a tad ugly, but have a look at the adddi3* patterns in
the mn10300 and rx ports.

In particular note how both the inputs and outputs to the insn
(not the expander) are all SImode, allowing for lower_subreg to
do its job.  The patterns are split post-reload -- some of that
is for scheduling, some of that simply makes computing the 
individual insn lengths significantly easier -- but you wouldn't
really have to do that for AVR.

For AVR things would become even trickier.  You might consider

(define_predicate "concat_operator"
  (match_code "concat"))

(define_insn "addsi3_"
  [(set (match_operand:QI 0 "register_operand" "=r")
(truncate:QI
  (plus:SI
(match_operator:SI 12 "concat_operator"
   [(match_operand:QI  4 "register_operand" "0")
(match_operand:QI  5 "register_operand" "1")
(match_operand:QI  6 "register_operand" "2")
(match_operand:QI  7 "register_operand" "3")])
(match_operator:SI 13 "concat_operator"
   [(match_operand:QI  8 "reg_or_0_operand" "rL")
(match_operand:QI  9 "reg_or_0_operand" "rL")
(match_operand:QI 10 "reg_or_0_operand" "rL")
(match_operand:QI 11 "reg_or_0_operand" "rL")])))
   (set (match_operand:QI 1 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 8
   (set (match_operand:QI 2 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 16
   (set (match_operand:QI 3 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 24]
  ""
  "add %0,%Z8\;adc %1,%Z9\;adc %2,%Z10\;adc %3,%Z11"
  [(set_attr "length" "4")]
)

This may require a little bit of code inside combine to handle
CONCAT in a reasonable way, but that should be fairly minimal.

It may also want some more special case patterns and/or peep2s
to more efficiently handle constants, particularly considering
adiw and subic.  But I think it's at least worth investigating.


r~


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-20 Thread Denis Chertykov
2011/4/20 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/4/17 Denis Chertykov :
>>> 2011/4/15 Georg-Johann Lay :
 Finally, I exposed alternative #3 of the insns to the register
 allocator, because it is not possible to distinguish between
 overlapping or non-overlapping regs, and #3 does not need a scratch.

 Ran C-testsuite with no regressions.
>>> Are you encountered any difference in code size ?
>>
>> I'm ask about code size because the IRA pass isn't work with
>> `scratch:MODE' at all.
>> This lead to bad/wrong register allocation in IRA pass.
>> The reload pass will correct such a wrong allocation, but reload can't
>> generate optimal code. (reload generate correct code).
>> Because of that, may be you right and may be better to have
>> (clobber (match_operand)) instead of (clobber (match_scratch...)).
>
> So the conclusion is not to commit this patch and that the one-liner
> already installed is sufficient to fix the ICE?

Yes.

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-20 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/17 Denis Chertykov :
>> 2011/4/15 Georg-Johann Lay :
>>> Finally, I exposed alternative #3 of the insns to the register
>>> allocator, because it is not possible to distinguish between
>>> overlapping or non-overlapping regs, and #3 does not need a scratch.
>>>
>>> Ran C-testsuite with no regressions.
>> Are you encountered any difference in code size ?
> 
> I'm ask about code size because the IRA pass isn't work with
> `scratch:MODE' at all.
> This lead to bad/wrong register allocation in IRA pass.
> The reload pass will correct such a wrong allocation, but reload can't
> generate optimal code. (reload generate correct code).
> Because of that, may be you right and may be better to have
> (clobber (match_operand)) instead of (clobber (match_scratch...)).

So the conclusion is not to commit this patch and that the one-liner
already installed is sufficient to fix the ICE?

> 
> Denis.
> 



Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-19 Thread Richard Earnshaw

On Tue, 2011-04-19 at 15:17 +0400, Denis Chertykov wrote:
> 2011/4/19 Georg-Johann Lay :
> > Denis Chertykov schrieb:
> >> 2011/4/19 Georg-Johann Lay :
> >>> How can add, sub etc. be split? This would need an explicit
> >>> representation of carry.
> >>
> >> Yes.
> >>
> >> Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html
> >
> > Just skimmed the conversation. I thought about making AVR ISA's
> > effects on SREG explicit several times, but I always got stuck at some
> > point.
> >
> > - It's not only about scheduling (which does not happen for avr) but
> >  also about moving instructions across jumps.
> >
> > - Many transformations would happen before reload, but at these stages
> > the effect on SREG is not yet known in many cases. There is
> > sophisticated instruction output for many patterns, and their impact
> > on SREG/CC is not known before reload.
> >
> > - Making CC explicit would render many single_set insns to PARALLELs
> > making the optimizers' live much harder or impossible. Imagine
> > instructions that could be combined. Explicit CC would clutter up
> > insns and combine won't try to transform the bulky patterns.
> >
> > - Backend would be much more complicated, harder to maintain and
> > understand. Almost any insn would have to be changed.
> 
> Generally, I'm agree with you, the AVR port uses CC0 because of that.

Thumb-1 support in the ARM compiler has similar flag-setting properties
(ie most instructions set the condition codes).  It doesn't use CC0.  It
works because it doesn't model the condition code register at all, but
treats compare/branch sequences as indivisible operations.

R.




Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-19 Thread Denis Chertykov
2011/4/19 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/4/19 Georg-Johann Lay :
>>> How can add, sub etc. be split? This would need an explicit
>>> representation of carry.
>>
>> Yes.
>>
>> Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html
>
> Just skimmed the conversation. I thought about making AVR ISA's
> effects on SREG explicit several times, but I always got stuck at some
> point.
>
> - It's not only about scheduling (which does not happen for avr) but
>  also about moving instructions across jumps.
>
> - Many transformations would happen before reload, but at these stages
> the effect on SREG is not yet known in many cases. There is
> sophisticated instruction output for many patterns, and their impact
> on SREG/CC is not known before reload.
>
> - Making CC explicit would render many single_set insns to PARALLELs
> making the optimizers' live much harder or impossible. Imagine
> instructions that could be combined. Explicit CC would clutter up
> insns and combine won't try to transform the bulky patterns.
>
> - Backend would be much more complicated, harder to maintain and
> understand. Almost any insn would have to be changed.

Generally, I'm agree with you, the AVR port uses CC0 because of that.

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-19 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/19 Georg-Johann Lay :
>> How can add, sub etc. be split? This would need an explicit
>> representation of carry.
> 
> Yes.
> 
> Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html

Just skimmed the conversation. I thought about making AVR ISA's
effects on SREG explicit several times, but I always got stuck at some
point.

- It's not only about scheduling (which does not happen for avr) but
  also about moving instructions across jumps.

- Many transformations would happen before reload, but at these stages
the effect on SREG is not yet known in many cases. There is
sophisticated instruction output for many patterns, and their impact
on SREG/CC is not known before reload.

- Making CC explicit would render many single_set insns to PARALLELs
making the optimizers' live much harder or impossible. Imagine
instructions that could be combined. Explicit CC would clutter up
insns and combine won't try to transform the bulky patterns.

- Backend would be much more complicated, harder to maintain and
understand. Almost any insn would have to be changed.

Johann

> 
> Denis.
> 



Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-19 Thread Denis Chertykov
2011/4/19 Georg-Johann Lay :
> How can add, sub etc. be split? This would need an explicit
> representation of carry.

Yes.

Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-19 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/18 Georg-Johann Lay :
>> Denis Chertykov schrieb:
>>> 2011/4/17 Denis Chertykov :
 2011/4/15 Georg-Johann Lay :
> Finally, I exposed alternative #3 of the insns to the register
> allocator, because it is not possible to distinguish between
> overlapping or non-overlapping regs, and #3 does not need a scratch.
>
> Ran C-testsuite with no regressions.
 Are you encountered any difference in code size ?
>>> I'm ask about code size because the IRA pass isn't work with
>>> `scratch:MODE' at all.
>> I wonder what the difference is; IRA could treat scratch:MODE just the
>> same way as pseudo:MODE. Anyway, thanks for that insight. It's merely
>> impossible to get the big picture of IRA/reload from digging sources.
>>
>> So that means IRA generates garbage for scratch and reload has to fix
>> it, generate spill code if needed, etc?
> 
> No garbage, IRA just ignore scratch.
> The reload pass will allocate scratch registers if it's possible.
> If it's impossible, then reload pass will spill and generate reload for
> insns with scratches.
> 
>> Does this mean that if IRA sees a pseudo together with constraint "X"
>> it will allocate the pseudo, anyway?
> 
> I think so, but better to test IRA.
> 
>>> This lead to bad/wrong register allocation in IRA pass.
>>> The reload pass will correct such a wrong allocation, but reload can't
>>> generate optimal code. (reload generate correct code).
>>> Because of that, may be you right and may be better to have
>>> (clobber (match_operand)) instead of (clobber (match_scratch...)).
>> I don't see effects on code size, at least for the binaries left by
>> testsuite.
>>
>> Is there a policy how to drive size-performance tests for avr?
>> Anatoly posted a result on code size some weeks ago (concerning move
>> costs), it would be interesting to know his setup and what
>> sources/options he uses. Using testsuite seems not appropriate to me,
>> because it runs -O3, -finline etc. and most code is no real-world
>> code, at least not for avr.
> 
> IMHO libgcc is a good test case.

I don't see differences there, either. This is presumably because
rotate pattern is not used resp. used only very seldom.

>> FYI, what I observe is a remarkable dependency on subreg lowering that
>> can be generalized as this:
>> * code size decreases for mode <= SImode
>> * code size increases for mode > SImode
> 
> I think that DImode move instructions must be supported for representative
> test results for mode > SImode.
> 
> 
> Personally I'm don't want to support DImode for AVR.
> IMHO DImode is an overkill for 8bits controller.

Some people use 64bit on AVR. But to support a movdi would be very
painful, I think it would cause more harm like spill failures than good.

> Another one note:
> Few years ago I have played with early splitting of anything possible
> (move,add,sub,and,...). The results was very bad.
> It's happened because flow of splitted insns (8bits insns) becomes
> unreadable for most of GCC optimisation passes.
> May be splitting is appropriate before register allocation or even after
> reload pass.

The -f[no-]split-wide-types happens before reload, in .subreg1 (prior
to combine) resp. .subreg2 (after combine).

For 64bit, without subreg lowering, there is an insn like

(clobber (reg:DI 43 [  ]))

which causes bunch of moves and finally the setup of a frame pointer
even though nothing lives in the frame.

How can add, sub etc. be split? This would need an explicit
representation of carry.

> Denis.
> PS: about code size tests: try to email directly to Anatoly.
> PPS: I can try to call him.

Johann



Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-18 Thread Denis Chertykov
2011/4/18 Denis Chertykov :
> 2011/4/18 Georg-Johann Lay :
> Few years ago I have played with early splitting of anything possible
> (move,add,sub,and,...). The results was very bad.
> It's happened because flow of splitted insns (8bits insns) becomes
> unreadable for most of GCC optimisation passes.
> May be splitting is appropriate before register allocation or even after
> reload pass.

May be something like this (early splitting) happened with the subreg
lowering.

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-18 Thread Denis Chertykov
2011/4/18 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/4/17 Denis Chertykov :
>>> 2011/4/15 Georg-Johann Lay :
 Finally, I exposed alternative #3 of the insns to the register
 allocator, because it is not possible to distinguish between
 overlapping or non-overlapping regs, and #3 does not need a scratch.

 Ran C-testsuite with no regressions.
>>> Are you encountered any difference in code size ?
>>
>> I'm ask about code size because the IRA pass isn't work with
>> `scratch:MODE' at all.
>
> I wonder what the difference is; IRA could treat scratch:MODE just the
> same way as pseudo:MODE. Anyway, thanks for that insight. It's merely
> impossible to get the big picture of IRA/reload from digging sources.
>
> So that means IRA generates garbage for scratch and reload has to fix
> it, generate spill code if needed, etc?

No garbage, IRA just ignore scratch.
The reload pass will allocate scratch registers if it's possible.
If it's impossible, then reload pass will spill and generate reload for
insns with scratches.

> Does this mean that if IRA sees a pseudo together with constraint "X"
> it will allocate the pseudo, anyway?

I think so, but better to test IRA.

>> This lead to bad/wrong register allocation in IRA pass.
>> The reload pass will correct such a wrong allocation, but reload can't
>> generate optimal code. (reload generate correct code).
>> Because of that, may be you right and may be better to have
>> (clobber (match_operand)) instead of (clobber (match_scratch...)).
>
> I don't see effects on code size, at least for the binaries left by
> testsuite.
>
> Is there a policy how to drive size-performance tests for avr?
> Anatoly posted a result on code size some weeks ago (concerning move
> costs), it would be interesting to know his setup and what
> sources/options he uses. Using testsuite seems not appropriate to me,
> because it runs -O3, -finline etc. and most code is no real-world
> code, at least not for avr.

IMHO libgcc is a good test case.

> FYI, what I observe is a remarkable dependency on subreg lowering that
> can be generalized as this:
> * code size decreases for mode <= SImode
> * code size increases for mode > SImode

I think that DImode move instructions must be supported for representative
test results for mode > SImode.


Personally I'm don't want to support DImode for AVR.
IMHO DImode is an overkill for 8bits controller.


Another one note:
Few years ago I have played with early splitting of anything possible
(move,add,sub,and,...). The results was very bad.
It's happened because flow of splitted insns (8bits insns) becomes
unreadable for most of GCC optimisation passes.
May be splitting is appropriate before register allocation or even after
reload pass.

Denis.
PS: about code size tests: try to email directly to Anatoly.
PPS: I can try to call him.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-18 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/17 Denis Chertykov :
>> 2011/4/15 Georg-Johann Lay :
>>> Finally, I exposed alternative #3 of the insns to the register
>>> allocator, because it is not possible to distinguish between
>>> overlapping or non-overlapping regs, and #3 does not need a scratch.
>>>
>>> Ran C-testsuite with no regressions.
>> Are you encountered any difference in code size ?
> 
> I'm ask about code size because the IRA pass isn't work with
> `scratch:MODE' at all.

I wonder what the difference is; IRA could treat scratch:MODE just the
same way as pseudo:MODE. Anyway, thanks for that insight. It's merely
impossible to get the big picture of IRA/reload from digging sources.

So that means IRA generates garbage for scratch and reload has to fix
it, generate spill code if needed, etc?

Does this mean that if IRA sees a pseudo together with constraint "X"
it will allocate the pseudo, anyway?

> This lead to bad/wrong register allocation in IRA pass.
> The reload pass will correct such a wrong allocation, but reload can't
> generate optimal code. (reload generate correct code).
> Because of that, may be you right and may be better to have
> (clobber (match_operand)) instead of (clobber (match_scratch...)).

I don't see effects on code size, at least for the binaries left by
testsuite.

Is there a policy how to drive size-performance tests for avr?
Anatoly posted a result on code size some weeks ago (concerning move
costs), it would be interesting to know his setup and what
sources/options he uses. Using testsuite seems not appropriate to me,
because it runs -O3, -finline etc. and most code is no real-world
code, at least not for avr.

FYI, what I observe is a remarkable dependency on subreg lowering that
can be generalized as this:
* code size decreases for mode <= SImode
* code size increases for mode > SImode

An example for gcc.target/avr/torture/pr41885.c -Os

-fno-split-wide-types

 0008 T rotl_16a
0008 0006 T rotl_16b
0014 0012 T rotl_32a
0026 0010 T rotl_32b
0036 0006 T rotl_32c
0042 0010 T rotl_32d
0052 0010 T rotl_32e
0062 0088 T rotl_64
0150 0106 T rotl_64a
0256 0082 T rotl_64b
0338 0096 T rotl_64c
0434 0098 T rotl_64d
0532 0100 T rotl_64e
0632 0102 T rotl_64f
0734 0104 T rotl_64g
0838 0110 T rotl_64h

-fsplit-wide-types

 0008 T rotl_16a
0008 0006 T rotl_16b
0014 0024 T rotl_32a
0038 0020 T rotl_32b
0058 0024 T rotl_32c
0082 0022 T rotl_32d
0104 0022 T rotl_32e
0126 0036 T rotl_64
0162 0050 T rotl_64a
0212 0026 T rotl_64b
0238 0036 T rotl_64c
0274 0038 T rotl_64d
0312 0040 T rotl_64e
0352 0042 T rotl_64f
0394 0044 T rotl_64g
0438 0038 T rotl_64h

Johann

> 
> Denis.
> 



Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-17 Thread Denis Chertykov
2011/4/17 Denis Chertykov :
> 2011/4/15 Georg-Johann Lay :
>> Finally, I exposed alternative #3 of the insns to the register
>> allocator, because it is not possible to distinguish between
>> overlapping or non-overlapping regs, and #3 does not need a scratch.
>>
>> Ran C-testsuite with no regressions.
>
> Are you encountered any difference in code size ?

I'm ask about code size because the IRA pass isn't work with
`scratch:MODE' at all.
This lead to bad/wrong register allocation in IRA pass.
The reload pass will correct such a wrong allocation, but reload can't
generate optimal code. (reload generate correct code).
Because of that, may be you right and may be better to have
(clobber (match_operand)) instead of (clobber (match_scratch...)).

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-17 Thread Denis Chertykov
2011/4/15 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/4/14 Georg-Johann Lay :
>>> Denis Chertykov schrieb:
 2011/4/14 Georg-Johann Lay :
> The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
> using 4 operands instead of 3. This runs in ICE in top of
> optabs.c:maybe_gen_insn.
>
> The right way to do this is to use match_dup, not match_operand. So
> the fix is obvious.
>
> Regenerated avr-gcc and run against avr testsuite:
>
> gcc.target/avr/torture/pr41885.c generates these patterns
>
> Johann
>
> 2011-04-14  Georg-Johann Lay  
>
>        * config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
>        expanders operand 3 from match_operand to match_dup.
 May be better to use (clobber (match_scratch ...)) here and in
 `*rotw' and `*rotb'.
>>> These are splitters that might split before reload, producing strange,
>>> non-matching insns like
>>>  (set (scratch:QI) ...
>>> or crashing already in avr_rotate_bytes becase the split condition reads
>>> "&& (reload_completed || mode == DImode)"
>>
>> Generally I'm agree with you, change match_operand to match_dup is easy.
>
> I already submitted the easy patch to avoid the ICE.
>
>> But IMHO the right way is:
>>  - the splitters and avr_rotate_bytes are wrong and must be fixed too.
>>  - operands[3] is a scratch register and right way is to use match_scratch.
>>
>> I can approve the patch.
>>
>> Denis.
>
> Ok, here is the right-way patch.
>
> The expander generates a SCRATCH instead of a pseudo, and in case of a
> pre-reload split the SCRATCH is replaced with a pseudo because it is
> not known if or if not the scratch will actually be needed.
>
> avr_rotate_bytes now skips operations on non-REG 'scratch'.
> Furthermore, I added an assertion that ensures that 'scratch' is
> actually a REG when it is needed.
>
> Besides that, I fixed indentation. I know it is not optimal to fix
> indentation alongside with functionality... did it anyway :-)
>
> Finally, I exposed alternative #3 of the insns to the register
> allocator, because it is not possible to distinguish between
> overlapping or non-overlapping regs, and #3 does not need a scratch.
>
> Ran C-testsuite with no regressions.

Are you encountered any difference in code size ?

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-15 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/14 Georg-Johann Lay :
>> Denis Chertykov schrieb:
>>> 2011/4/14 Georg-Johann Lay :
 The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
 using 4 operands instead of 3. This runs in ICE in top of
 optabs.c:maybe_gen_insn.

 The right way to do this is to use match_dup, not match_operand. So
 the fix is obvious.

 Regenerated avr-gcc and run against avr testsuite:

 gcc.target/avr/torture/pr41885.c generates these patterns

 Johann

 2011-04-14  Georg-Johann Lay  

* config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
expanders operand 3 from match_operand to match_dup.
>>> May be better to use (clobber (match_scratch ...)) here and in
>>> `*rotw' and `*rotb'.
>> These are splitters that might split before reload, producing strange,
>> non-matching insns like
>>  (set (scratch:QI) ...
>> or crashing already in avr_rotate_bytes becase the split condition reads
>> "&& (reload_completed || mode == DImode)"
> 
> Generally I'm agree with you, change match_operand to match_dup is easy.

I already submitted the easy patch to avoid the ICE.

> But IMHO the right way is:
>  - the splitters and avr_rotate_bytes are wrong and must be fixed too.
>  - operands[3] is a scratch register and right way is to use match_scratch.
> 
> I can approve the patch.
> 
> Denis.

Ok, here is the right-way patch.

The expander generates a SCRATCH instead of a pseudo, and in case of a
pre-reload split the SCRATCH is replaced with a pseudo because it is
not known if or if not the scratch will actually be needed.

avr_rotate_bytes now skips operations on non-REG 'scratch'.
Furthermore, I added an assertion that ensures that 'scratch' is
actually a REG when it is needed.

Besides that, I fixed indentation. I know it is not optimal to fix
indentation alongside with functionality... did it anyway :-)

Finally, I exposed alternative #3 of the insns to the register
allocator, because it is not possible to distinguish between
overlapping or non-overlapping regs, and #3 does not need a scratch.

Ran C-testsuite with no regressions.

Johann

2011-04-15  Georg-Johann Lay  

* config/avr/avr.md ("rotl3"): Use SCRATCH instead
of REG in operand 3. Fix indentation. Unquote C snippet.
("*rotw","*rotbw"): Ditto. Replace scratch
with pseudo for pre-reload splits. Use const_int_operand for
operand 2. Expose alternative 3 to register allocator.
* config/avr/avr.c (avr_rotate_bytes): Handle SCRATCH in
operands[3]. Use GNU indentation style.

Index: config/avr/avr.md
===
--- config/avr/avr.md	(Revision 172493)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -1519,22 +1519,22 @@ (define_mode_attr rotsmode [(DI "QI") (S
 
 (define_expand "rotl3"
   [(parallel [(set (match_operand:HIDI 0 "register_operand" "")
-		   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
-(match_operand:VOID 2 "const_int_operand" "")))
-		(clobber (match_dup 3))])]
+   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
+(match_operand:VOID 2 "const_int_operand" "")))
+  (clobber (match_dup 3))])]
   ""
-  "
-{
-  if (CONST_INT_P (operands[2]) && 0 == (INTVAL (operands[2]) % 8))
   {
-  if (AVR_HAVE_MOVW && 0 == INTVAL (operands[2]) % 16)
-operands[3] = gen_reg_rtx (mode);
-  else
-operands[3] = gen_reg_rtx (QImode);
-  }
-  else
-FAIL;
-}")
+if (CONST_INT_P (operands[2])
+&& 0 == INTVAL (operands[2]) % 8)
+  {
+if (AVR_HAVE_MOVW && 0 == INTVAL (operands[2]) % 16)
+  operands[3] = gen_rtx_SCRATCH (mode);
+else
+  operands[3] = gen_rtx_SCRATCH (QImode);
+  }
+else
+  FAIL;
+  })
 
 
 ;; Overlapping non-HImode registers often (but not always) need a scratch.
@@ -1545,35 +1545,49 @@ (define_expand "rotl3"
 
 ; Split word aligned rotates using scratch that is mode dependent.
 (define_insn_and_split "*rotw"
-  [(set (match_operand:HIDI 0 "register_operand" "=r,r,#&r")
-	(rotate:HIDI (match_operand:HIDI 1 "register_operand" "0,r,r")
-		 (match_operand 2 "immediate_operand" "n,n,n")))
-   (clobber (match_operand: 3 "register_operand"  "=" ))]
-  "(CONST_INT_P (operands[2]) &&
- (0 == (INTVAL (operands[2]) % 16) && AVR_HAVE_MOVW))"
+  [(set (match_operand:HIDI 0 "register_operand"  "=r,r,&r")
+(rotate:HIDI (match_operand:HIDI 1 "register_operand"  "0,r,r")
+ (match_operand 2 "const_int_operand"  "n,n,n")))
+   (clobber (match_scratch: 3   "="))]
+  "AVR_HAVE_MOVW
+   && 0 == INTVAL (operands[2]) % 16"
   "#"
   "&& (reload_completed || mode == DImode)"
   [(const_int 0)]
-  "avr_rotate_bytes (operands);
-  DONE;"
-)
+  {
+/* Split happens in .split1: Cook up a pseudo */
+if (SCRATCH == GET_CODE (operands[3])
+&& !re

Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-14 Thread Denis Chertykov
2011/4/14 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/4/14 Georg-Johann Lay :
>>> The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
>>> using 4 operands instead of 3. This runs in ICE in top of
>>> optabs.c:maybe_gen_insn.
>>>
>>> The right way to do this is to use match_dup, not match_operand. So
>>> the fix is obvious.
>>>
>>> Regenerated avr-gcc and run against avr testsuite:
>>>
>>> gcc.target/avr/torture/pr41885.c generates these patterns
>>>
>>> Johann
>>>
>>> 2011-04-14  Georg-Johann Lay  
>>>
>>>        * config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
>>>        expanders operand 3 from match_operand to match_dup.
>>
>> May be better to use (clobber (match_scratch ...)) here and in
>> `*rotw' and `*rotb'.
>
> These are splitters that might split before reload, producing strange,
> non-matching insns like
>  (set (scratch:QI) ...
> or crashing already in avr_rotate_bytes becase the split condition reads
> "&& (reload_completed || mode == DImode)"

Generally I'm agree with you, change match_operand to match_dup is easy.

But IMHO the right way is:
 - the splitters and avr_rotate_bytes are wrong and must be fixed too.
 - operands[3] is a scratch register and right way is to use match_scratch.

I can approve the patch.

Denis.


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-14 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/14 Georg-Johann Lay :
>> The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
>> using 4 operands instead of 3. This runs in ICE in top of
>> optabs.c:maybe_gen_insn.
>>
>> The right way to do this is to use match_dup, not match_operand. So
>> the fix is obvious.
>>
>> Regenerated avr-gcc and run against avr testsuite:
>>
>> gcc.target/avr/torture/pr41885.c generates these patterns
>>
>> Johann
>>
>> 2011-04-14  Georg-Johann Lay  
>>
>>* config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
>>expanders operand 3 from match_operand to match_dup.
> 
> May be better to use (clobber (match_scratch ...)) here and in
> `*rotw' and `*rotb'.

These are splitters that might split before reload, producing strange,
non-matching insns like
 (set (scratch:QI) ...
or crashing already in avr_rotate_bytes becase the split condition reads
"&& (reload_completed || mode == DImode)"

By the way: this and other patterns implement DImode insns, but there
is *no* movdi in avr backend. That is strongly discouraged, i.e.
implementing insns for FOOmode if there in no movfoo insn that can do
reloading for that mode. It's tempting to omit movdi (because movdi
would be the most complicated DI insn) but it's unhealthy.

IMO the DI stuff should be removed / deactivated as long as there is
no proper support of DImode.

> 
> Denis.
> 



Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-14 Thread Denis Chertykov
2011/4/14 Georg-Johann Lay :
> The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
> using 4 operands instead of 3. This runs in ICE in top of
> optabs.c:maybe_gen_insn.
>
> The right way to do this is to use match_dup, not match_operand. So
> the fix is obvious.
>
> Regenerated avr-gcc and run against avr testsuite:
>
> gcc.target/avr/torture/pr41885.c generates these patterns
>
> Johann
>
> 2011-04-14  Georg-Johann Lay  
>
>        * config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
>        expanders operand 3 from match_operand to match_dup.

May be better to use (clobber (match_scratch ...)) here and in
`*rotw' and `*rotb'.

Denis.


[Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-14 Thread Georg-Johann Lay
The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
using 4 operands instead of 3. This runs in ICE in top of
optabs.c:maybe_gen_insn.

The right way to do this is to use match_dup, not match_operand. So
the fix is obvious.

Regenerated avr-gcc and run against avr testsuite:

gcc.target/avr/torture/pr41885.c generates these patterns

Johann

2011-04-14  Georg-Johann Lay  

* config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
expanders operand 3 from match_operand to match_dup.


Index: config/avr/avr.md
===
--- config/avr/avr.md	(Revision 172416)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -1521,7 +1521,7 @@ (define_expand "rotl3"
   [(parallel [(set (match_operand:HIDI 0 "register_operand" "")
 		   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
 (match_operand:VOID 2 "const_int_operand" "")))
-		(clobber (match_operand 3 ""))])]
+		(clobber (match_dup 3))])]
   ""
   "
 {