Bernd Edlinger wrote:
> On 11/29/16 16:06, Wilco Dijkstra wrote:
> > Bernd Edlinger wrote:
> >
> > -  "TARGET_32BIT && reload_completed
> > +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
> >     && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
> >
> > This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
> > already excluding NEON.
>
> Aehm, no.  This would split the addi_neon insn before it is clear
> if the reload pass will assign a VFP register.
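
To make the disagreement above concrete, here is a minimal sketch that enumerates both split conditions as plain booleans. It is not GCC code: `struct split_state` and the function names are hypothetical stand-ins for the target macros, TARGET_32BIT is assumed true, and `op0_is_vfp` stands in for IS_VFP_REGNUM (REGNO (operands[0])).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags and predicates used in the
   split condition; in arm.md these are macros, not struct fields.  */
struct split_state {
    bool target_neon;
    bool target_iwmmxt;
    bool reload_completed;
    bool op0_is_vfp;   /* stands in for IS_VFP_REGNUM (REGNO (operands[0])) */
};

/* Condition as written in the patch (TARGET_32BIT assumed true).  */
static bool patch_cond(struct split_state s)
{
    return ((!s.target_neon && !s.target_iwmmxt) || s.reload_completed)
           && !(s.target_neon && s.op0_is_vfp);
}

/* Proposed simplification with the !TARGET_NEON term dropped.  */
static bool simplified_cond(struct split_state s)
{
    return (!s.target_iwmmxt || s.reload_completed)
           && !(s.target_neon && s.op0_is_vfp);
}

/* Count the flag combinations on which the two conditions disagree.  */
static int count_disagreements(void)
{
    int n = 0;
    for (int bits = 0; bits < 16; bits++) {
        struct split_state s = {
            (bits & 1) != 0, (bits & 2) != 0,
            (bits & 4) != 0, (bits & 8) != 0
        };
        if (patch_cond(s) != simplified_cond(s))
            n++;
    }
    return n;
}

/* NEON enabled, before reload, operand 0 not (yet) a VFP register:
   the patch condition refuses to split, the simplified one splits.  */
static void check_neon_before_reload(void)
{
    struct split_state s = { true, false, false, false };
    assert(!patch_cond(s));
    assert(simplified_cond(s));
}
```

Under these assumptions the two conditions differ in exactly one combination: NEON enabled, reload not yet completed, and the destination not known to be a VFP register. That is the situation described in the reply above, where the split would happen before reload has decided on the register class.
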

Hmm that's strange... This instruction shouldn't be used to also split some random
Neon pattern - for example arm_subdi3 doesn't do the same. To understand and reason
about any of these complex patterns they should all work in the same way...

> But when I make *arm_cmpdi_insn split early, it ICEs:

(insn 4870 4869 1636 87 (set (scratch:SI)
        (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
                (subreg:SI (reg:DI 473 [ X$14 ]) 4))
            (ltu:SI (reg:CC_C 100 cc)
                (const_int 0 [0])))) "pr77308-2.c":140 -1
     (nil))

That's easy, we don't have a sbcs <scratch>, r1, r2 pattern. A quick workaround is
to create a temporary for operand[2] (if before reload) so it will match the standard
sbcs pattern, and then the split works fine.

> So it is certainly possible, but not really simple to improve the
> stack size even further.  But I would prefer to do that in a
> separate patch.

Yes separate patches would be fine. However there is a lot of scope to improve this
further. For example after your patch shifts and logical operations are expanded in
expand, add/sub are in split1 after combine runs and everything else is split after
reload. It doesn't make sense to split different operations at different times - it
means you're still going to get the bad DImode subregs and miss lots of optimization
opportunities due to the mix of partly split and partly not-yet-split operations.

> BTW: there are also negd2_compare, *negdi_extendsidi,
> *negdi_zero_extendsidi, *thumb2_negdi2.

I have a patch to merge thumb2_negdi2 into arm_negdi2. For extends, if we split them
at expand time, then none of the combined alu+extend patterns will be needed, and
that will be a huge simplification.

> I think it would be a precondition to have test cases that exercise
> each of these patterns before we try to split these instructions.

Agreed.

Wilco
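
For reference, what the split *arm_cmpdi_insn computes for the unsigned case can be sketched in portable C. This is an illustration only, not the arm.md pattern: the function names are hypothetical, and the borrow is modelled as a plain boolean rather than the ARM carry flag. The low words are subtracted first, then the high words are subtracted with the borrow, and the high-word difference is dead, which is why the RTL above writes it to a scratch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned a < b computed from 32-bit halves, mirroring a
   "compare low words, then subtract high words with borrow" sequence.
   Hypothetical helper for illustration; not taken from GCC.  */
static bool ult64_by_halves(uint64_t a, uint64_t b)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    /* Borrow out of the low-word subtraction.  */
    bool borrow_lo = alo < blo;

    /* High-word subtraction with borrow in, done in 64 bits so the
       borrow out shows up in the top bit.  The 32-bit difference itself
       is never used; only the borrow matters for the comparison.  */
    uint64_t wide = (uint64_t)ahi - (uint64_t)bhi - (uint64_t)borrow_lo;
    bool borrow_hi = (wide >> 63) != 0;

    return borrow_hi;
}

/* Values that differ only across the 32-bit word boundary.  */
static void check_word_boundary(void)
{
    assert(ult64_by_halves(0xFFFFFFFFull, 0x100000000ull));
    assert(!ult64_by_halves(0x100000000ull, 0xFFFFFFFFull));
}
```

The second step is the one that has no matching pattern when its result goes to a scratch before reload; giving it a real temporary destination, as suggested above, lets it match the ordinary subtract-with-borrow pattern.
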