https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824

Hongyu Wang <wwwhhhyyy333 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wwwhhhyyy333 at gmail dot com

--- Comment #9 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
(In reply to Hongtao Liu from comment #4)
> there're 2 reasons.

> 2. There's still spills for (subreg:DF (reg: V8DF) since
> ix86_modes_tieable_p return false for DF and V8DF.
> 

There may also be an issue in SRA: the aggregates are not fully scalarized
because of the size limit.

SRA computes the maximum aggregate size as move_ratio * UNITS_PER_WORD, but
here the aggregate Dual<Dual<double, 8l>, 2l> actually contains several V8DF
components, each of which can be handled in a zmm register under avx512f.

Adding --param sra-max-scalarization-size-Ospeed=2048 eliminates those spills.

So for SRA we could consider using MOVE_MAX * move_ratio as the size limit for
Ospeed, which better reflects the real number of backend instructions.
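
To make the difference between the two limits concrete, here is a rough sketch
of the arithmetic. The helper names are hypothetical (they are not GCC
functions), and the numeric values are assumptions chosen for illustration: a
move ratio of 17, an 8-byte word, and a 64-byte MOVE_MAX for a target with
512-bit vector moves. The real values depend on the target and tuning.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not part of GCC.
// All sizes are in bytes.

// Current default limit as described above: move_ratio * UNITS_PER_WORD.
constexpr std::size_t sra_limit_word(std::size_t move_ratio,
                                     std::size_t units_per_word) {
  return move_ratio * units_per_word;
}

// Proposed default limit: MOVE_MAX * move_ratio.
constexpr std::size_t sra_limit_move_max(std::size_t move_ratio,
                                         std::size_t move_max) {
  return move_max * move_ratio;
}

// Assumed example values, not taken from the bug report.
constexpr std::size_t kMoveRatio = 17;    // assumed move ratio when optimizing for speed
constexpr std::size_t kUnitsPerWord = 8;  // 64-bit word
constexpr std::size_t kMoveMax = 64;      // one 512-bit (zmm) move

// With these assumptions the word-based limit is 136 bytes, barely more than
// two 64-byte V8DF components, while the MOVE_MAX-based limit is 1088 bytes.
static_assert(sra_limit_word(kMoveRatio, kUnitsPerWord) == 136, "word-based limit");
static_assert(sra_limit_move_max(kMoveRatio, kMoveMax) == 1088, "MOVE_MAX-based limit");
```

Under these assumed values, the word-based limit scales with the number of
8-byte moves, whereas the MOVE_MAX-based limit scales with the number of
full-width vector moves the backend can actually emit.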
