Hi H-P,
This patch should now already be on trunk:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd
Many thanks to Jeff for the review/approval.
There have been no reported adverse effects so far.
Please let me/us know if this has helped CRIS.

Cheers,
Roger
--

-----Original Message-----
From: Hans-Peter Nilsson <h...@axis.com> 
Sent: 12 May 2023 14:54
To: Hans-Peter Nilsson <h...@axis.com>
Cc: ro...@nextmovesoftware.com; jeffreya...@gmail.com;
gcc-patches@gcc.gnu.org; seg...@kernel.crashing.org
Subject: Re: [committed] Convert xstormy16 to LRA

> From: Hans-Peter Nilsson <h...@axis.com>
> Date: Thu, 11 May 2023 17:05:40 +0200

> Next, I'll turn around completely, and try defaulting to 
> -fsplit-wide-types-early, which sounds more promising. :) I don't like 
> throwing defaults around randomly, but trying out a promising idea 
> this way is easy.

Absolutely nothing changed (not counting now running "subreg2" and
generating a dump-file), compared to the default.  Besides coremark and
local micro-benchmarks I inspected running arith-rand-ll.c with -O2 and
briefly stepped through the passes with gdb: the costs guiding the splits
are fine, properly enabling the splits, but not all DImode registers are
naturally "splittable"; looks like the ones used in non-decomposable
operations remain.  It seems all splittable opportunities are dealt with by
the first pass ("subreg1").  I guess this pass has the most impact for
targets that have few or no DImode operations at all.
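
As a purely conceptual illustration of the "splittable" versus
"non-decomposable" distinction above, here is a minimal C sketch. It is not
taken from lower-subreg.cc and does not describe what the pass actually
checks; the function names are hypothetical and a 32-bit word size is
assumed. A 64-bit AND splits into two independent word operations, whereas
a 64-bit add needs a carry from the low word into the high word, so its
halves cannot be treated as unrelated registers.

```c
#include <stdint.h>

/* Word-wise independent: each 32-bit half of the result depends only on
   the matching halves of the inputs, so the 64-bit value can be handled
   as two unrelated word-sized values.  */
uint64_t
and64_by_words (uint64_t a, uint64_t b)
{
  uint32_t lo = (uint32_t) a & (uint32_t) b;
  uint32_t hi = (uint32_t) (a >> 32) & (uint32_t) (b >> 32);
  return ((uint64_t) hi << 32) | lo;
}

/* Not word-wise independent: the high half needs the carry out of the
   low half, so the two halves have to be computed together.  */
uint64_t
add64_by_words (uint64_t a, uint64_t b)
{
  uint32_t lo = (uint32_t) a + (uint32_t) b;
  uint32_t carry = lo < (uint32_t) a;   /* unsigned wrap-around => carry */
  uint32_t hi = (uint32_t) (a >> 32) + (uint32_t) (b >> 32) + carry;
  return ((uint64_t) hi << 32) | lo;
}
```

Whether a particular target keeps such an operation in DImode or expands
it into word_mode pieces depends on its machine description, so this only
sketches the arithmetic, not the behaviour of any specific port.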

But why is the option called -fsplit-wide-types-early when what it does is
enabling a "subreg2" pass, there being "subreg1" and "subreg3" enabled with
-fsplit-wide-types?  It should rather be called -fsplit-wide-types-second!
:)

Looking at its placement in passes.def makes me wonder what magic properties
targets have that benefit from it.

Anyway, Roger mentioned that the clobbers emitted by the lower-subreg passes
were apparently damaging, so I'll try this out "for fun", on the assumption
that they're actually unnecessary.  I don't think actually removing them has
been attempted?

The patch below seems to substantially lower register pressure for
arith-rand-ll for CRIS, but I've only inspected the assembly source (not
even compared the result to the reload version).  Quoting it for reference
only, and if it "works" (passes regtest for cris-elf and x86-64-linux) I
think I'll resubmit as a proper patch:

--- lower-subreg.cc.orig        2023-04-29 02:53:39.000000000 +0200
+++ lower-subreg.cc     2023-05-12 15:35:25.574668930 +0200
@@ -1086,9 +1086,6 @@ resolve_simple_move (rtx set, rtx_insn *
     {
       unsigned int i;
 
-      if (REG_P (dest) && !HARD_REGISTER_NUM_P (REGNO (dest)))
-       emit_clobber (dest);
-
       for (i = 0; i < words; ++i)
        {
          rtx t = simplify_gen_subreg_concatn (word_mode, dest,

brgds, H-P
