https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
Hongyu Wang <wwwhhhyyy333 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wwwhhhyyy333 at gmail dot com

--- Comment #9 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
(In reply to Hongtao Liu from comment #4)
> there're 2 reasons.
> 2. There's still spills for (subreg:DF (reg: V8DF) since
> ix86_modes_tieable_p return false for DF and V8DF.
>

There may be an issue in SRA: the aggregates are not properly scalarized
because of the size limit. SRA computes the maximum aggregate size as
move_ratio * UNITS_PER_WORD, but here the aggregate
Dual<Dual<double, 8l>, 2l> actually contains several V8DF components, each of
which can be handled in a zmm register under avx512f.

Adding --param sra-max-scalarization-size-Ospeed=2048 eliminates those spills.

So for SRA we could consider using MOVE_MAX * move_ratio as the size limit for
Ospeed, since that reflects the real backend instruction count.
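
To make the difference between the two limits concrete, here is a rough sketch
of the arithmetic. It is not the tree-sra code. The names sra_limit and
fits_limit are made up for illustration, and the numbers in the note after the
block (move_ratio = 17, UNITS_PER_WORD = 8, MOVE_MAX = 64, a 384-byte
aggregate) are assumptions for an x86-64 target with avx512f, not values taken
from this PR. The "<=" comparison is also an assumption about how the limit is
applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: size limit in bytes, computed as
   move_ratio * unit.  Pass UNITS_PER_WORD as the unit for the limit
   described above, or MOVE_MAX for the proposed one.  */
unsigned
sra_limit (unsigned move_ratio, unsigned unit)
{
  /* A zero factor would make the limit meaningless in this sketch.  */
  assert (move_ratio != 0 && unit != 0);
  return move_ratio * unit;
}

/* Hypothetical helper: true if an aggregate of AGGREGATE_SIZE bytes is
   within LIMIT.  The inclusive comparison is an assumption.  */
bool
fits_limit (unsigned aggregate_size, unsigned limit)
{
  return aggregate_size <= limit;
}
```

With the assumed values, sra_limit (17, 8) gives 136 bytes and
sra_limit (17, 64) gives 1088 bytes, so a 384-byte aggregate is rejected under
the word-based limit and accepted under the MOVE_MAX-based one.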