On Thu, Oct 19, 2017 at 7:03 PM, Sandra Loosemore
<san...@codesourcery.com> wrote:
> This is the set of nios2 optimization patches that I've previously
> mentioned in these threads:
>
> https://gcc.gnu.org/ml/gcc/2017-10/msg00016.html
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00957.html
>
> To give an overview of what this is for....
>
> The nios2 backend currently generates quite bad code for memory
> accesses with addresses involving symbolic constants.  Like a typical
> RISC machine, nios2 requires splitting such 32-bit constants into
> HIGH/LO_SUM pairs.  Currently this happens in expand, and address
> expressions involving such constants are always converted to use a
> register indirect form.
>
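
As a worked illustration of the HIGH/LO_SUM split described above, here is a minimal C sketch of the arithmetic. It assumes the low half is a sign-extended 16-bit immediate, so the high half has to be adjusted by one whenever bit 15 of the constant is set. The helper names (hiadj16, lo16, recombine) are hypothetical and are not taken from the patches.

```c
#include <stdint.h>

/* Upper 16 bits of x, bumped by one when bit 15 is set, to compensate
   for the sign extension of the low half when it is added back.
   (Hypothetical helper, for illustration only.) */
uint32_t hiadj16(uint32_t x)
{
    return ((x >> 16) + ((x >> 15) & 1u)) & 0xffffu;
}

/* Low 16 bits of x, interpreted as a signed 16-bit immediate. */
int32_t lo16(uint32_t x)
{
    int32_t lo = (int32_t)(x & 0xffffu);
    if (lo >= 0x8000)
        lo -= 0x10000;
    return lo;
}

/* Rebuild the constant the way a high/low instruction pair would:
   (high << 16) + sign-extended low, modulo 2^32. */
uint32_t recombine(uint32_t x)
{
    return (hiadj16(x) << 16) + (uint32_t)lo16(x);
}
```

For any 32-bit value x, recombine(x) should give back x. The point of a LO_SUM address is that the final addition can be folded into the load or store's own offset field instead of needing a separate instruction.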
> One part of the problem is that the backend currently doesn't
> recognize that LO_SUM is a legitimate address form (it's register
> indirect with a constant offset using the %lo relocation).  That's
> fixed in these patches.
>
> A harder problem is that doing the high/lo_sum splitting in expand
> inhibits subsequent optimizations.  One such problem arises when you
> have accesses to multiple fields in a static structure object.  Expand
> sees this as many (symbol + offset) expressions involving the same
> symbol with different constant offsets.  What we should be doing in
> that case is CSE'ing the symbol address computation rather than
> splitting every such expression individually.
>
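
For illustration, a minimal C sketch of the kind of source that produces this pattern. The object and function names are hypothetical, not from the customer test case, and the offsets in the comments assume a 4-byte int.

```c
/* Hypothetical static object.  Each field access below is seen as a
   (symbol + constant offset) address: state+0, state+4, state+8. */
static struct {
    int a;
    int b;
    int c;
} state = { 1, 2, 3 };

/* Three loads that share one base symbol.  Ideally the address of
   `state` is computed once and each load uses a small constant
   offset from it, rather than splitting all three addresses
   separately. */
int sum_fields(void)
{
    return state.a + state.b + state.c;
}

/* Stores to the same object follow the same addressing pattern. */
void set_fields(int a, int b, int c)
{
    state.a = a;
    state.b = b;
    state.c = c;
}
```

Comparing the assembly for sum_fields before and after the series (for example with -O2 -S) would be one way to see whether the base address is shared.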
> This patch series attacks that problem by deferring splitting to the
> split1 pass, which happens after cse and fwprop optimizations.
> Deferring the splitting also requires that TARGET_LEGITIMATE_ADDRESS_P
> accept these symbolic constant expressions until the splitting takes
> place, and that code that might generate 32-bit constants in other
> places (e.g., the movsi expander) must not do so after they are
> supposed to have been split.
>
> This patch series also includes general improvements to the cost model
> to get better CSE results -- in particular, the nios2 backend has been
> completely missing an implementation for TARGET_ADDRESS_COST.  I also found
> that making TARGET_LEGITIMIZE_ADDRESS smarter resulted in better
> address cost modeling by the ivopts pass.
>
> All together, this resulted in about a 7% code size improvement on the
> customer-provided test case I was using for tuning purposes.

I remember the Sony version of the SPU back end doing something
similar and getting similar improvements, but I don't remember the
exact details.  It might have been because the SPU only had 128-bit
loads, so expanding the loads too early missed optimizations.  I do
remember not upstreaming that code, and I was always disappointed
that it never went upstream.

Thanks,
Andrew

>
> Patches in this set are broken down as follows:
>
> 1: Switch to LRA.
> 2: Detect when splitting has been completed.
> 3: Add splitters and recognize the new address modes.
> 4: Cost model improvements.
> 5: Test cases.
>
> Part 2 is the piece that relates to the discussion linked above.  As
> implemented, it works fine, but it's maybe not the best design.  I'll
> hold off on committing the entire set for at least a few days in case
> somebody wants to suggest a better solution.
>
> -Sandra
>
