> From: Hans-Peter Nilsson <h...@axis.com>
> Date: Thu, 11 May 2023 17:05:40 +0200

> Next, I'll turn around completely, and try defaulting to
> -fsplit-wide-types-early, which sounds more promising. :)
> I don't like throwing defaults around randomly, but trying
> out a promising idea this way is easy.

Absolutely nothing changed compared to the default (not
counting that "subreg2" now runs and generates a dump file).
Besides coremark and local micro-benchmarks, I inspected
arith-rand-ll.c compiled with -O2 and briefly stepped through
the passes with gdb: the costs guiding the splits are fine
and properly enable the splits, but not all DImode registers
are naturally "splittable"; it looks like the ones used in
non-decomposable operations remain.  It seems all splittable
opportunities are dealt with by the first pass ("subreg1").
I guess this pass has the most impact for targets that have
few or no DImode operations at all.

But why is the option called -fsplit-wide-types-early when
what it does is enable a "subreg2" pass, with "subreg1" and
"subreg3" already enabled by -fsplit-wide-types?  It should
rather be called -fsplit-wide-types-second! :)

Looking at its placement in passes.def makes me wonder what
magic properties targets have that benefit from it.

Anyway, Roger mentioned that the clobbers emitted by the
lower-subreg passes were apparently damaging, so I'll try
this out "for fun", on the assumption that they're actually
unnecessary.  I don't think actually removing them has been
attempted?

The patch below seems to substantially lower register
pressure for arith-rand-ll for CRIS, but I've only inspected
the assembly source (not even compared the result to the
reload version).  Quoting it for reference only, and if it
"works" (passes regtest for cris-elf and x86-64-linux) I
think I'll resubmit as a proper patch:

--- lower-subreg.cc.orig        2023-04-29 02:53:39.000000000 +0200
+++ lower-subreg.cc     2023-05-12 15:35:25.574668930 +0200
@@ -1086,9 +1086,6 @@ resolve_simple_move (rtx set, rtx_insn *
     {
       unsigned int i;
 
-      if (REG_P (dest) && !HARD_REGISTER_NUM_P (REGNO (dest)))
-       emit_clobber (dest);
-
       for (i = 0; i < words; ++i)
        {
          rtx t = simplify_gen_subreg_concatn (word_mode, dest,

brgds, H-P
