Bernd Edlinger wrote:
> On 11/29/16 16:06, Wilco Dijkstra wrote:
> > Bernd Edlinger wrote:
> >
> > -  "TARGET_32BIT && reload_completed
> > +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
> >     && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
> >
> > This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
> > already excluding NEON.
>
> Aehm, no.  This would split the addi_neon insn before it is clear
> if the reload pass will assign a VFP register.

Hmm that's strange... This instruction shouldn't be used to also split some
random Neon pattern - for example arm_subdi3 doesn't do the same. To
understand and reason about any of these complex patterns they should all
work in the same way...
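
To make the difference between the two split conditions quoted above
concrete, here is a small standalone C sketch. It is not GCC code: the
target flags are modelled as plain ints, TARGET_32BIT is assumed true, and
the function names are made up for illustration. It enumerates all flag
combinations and reports where the two conditions disagree.

```c
#include <stdio.h>

/* Condition from the patch (TARGET_32BIT assumed true).  */
int split_cond_patch (int neon, int iwmmxt, int reload_completed,
                      int is_vfp_reg)
{
  return ((!neon && !iwmmxt) || reload_completed)
         && !(neon && is_vfp_reg);
}

/* Suggested simplification (TARGET_32BIT assumed true).  */
int split_cond_suggested (int neon, int iwmmxt, int reload_completed,
                          int is_vfp_reg)
{
  return (!iwmmxt || reload_completed)
         && !(neon && is_vfp_reg);
}

/* Walk all 16 flag combinations; return how many give different answers.
   If VERBOSE is nonzero, print each disagreeing combination.  */
int count_mismatches (int verbose)
{
  int count = 0;
  for (int bits = 0; bits < 16; bits++)
    {
      int neon = bits & 1;
      int iwmmxt = (bits >> 1) & 1;
      int reload_completed = (bits >> 2) & 1;
      int is_vfp_reg = (bits >> 3) & 1;
      int a = split_cond_patch (neon, iwmmxt, reload_completed, is_vfp_reg);
      int b = split_cond_suggested (neon, iwmmxt, reload_completed,
                                    is_vfp_reg);
      if (a != b)
        {
          count++;
          if (verbose)
            printf ("neon=%d iwmmxt=%d reload_completed=%d vfp=%d: "
                    "patch=%d suggested=%d\n",
                    neon, iwmmxt, reload_completed, is_vfp_reg, a, b);
        }
    }
  return count;
}
```

Calling count_mismatches (1) from a test main should report a single
disagreement: NEON enabled, before reload, destination not (yet) a VFP
register. In that case the simplified condition allows the split while the
patch's condition does not, which is the early split described in the
quoted reply.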

> But when I make *arm_cmpdi_insn split early, it ICEs:

(insn 4870 4869 1636 87 (set (scratch:SI)
         (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
                 (subreg:SI (reg:DI 473 [ X$14 ]) 4))
             (ltu:SI (reg:CC_C 100 cc)
                 (const_int 0 [0])))) "pr77308-2.c":140 -1
      (nil))

That's easy, we don't have a sbcs <scratch>, r1, r2 pattern. A quick
workaround is to create a temporary for operands[2] (if before reload) so
it will match the standard sbcs pattern, and then the split works fine.

> So it is certainly possible, but not really simple to improve the
> stack size even further.  But I would prefer to do that in a
> separate patch.

Yes, separate patches would be fine. However there is a lot of scope to
improve this further. For example after your patch shifts and logical
operations are expanded in expand, add/sub are in split1 after combine runs
and everything else is split after reload. It doesn't make sense to split
different operations at different times - it means you're still going to
get the bad DImode subregs and miss lots of optimization opportunities due
to the mix of partly split and partly not-yet-split operations.

> BTW: there are also negd2_compare, *negdi_extendsidi,
> *negdi_zero_extendsidi, *thumb2_negdi2.

I have a patch to merge thumb2_negdi2 into arm_negdi2. For extends, if we
split them at expand time, then none of the combined alu+extend patterns
will be needed, and that will be a huge simplification.

> I think it would be a precondition to have test cases that exercise
> each of these patterns before we try to split these instructions.

Agreed.

Wilco
