https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97784
--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> There is targetm.sched.reassociation_width which specifies how re-association
> should make such sequence "wide".

Ah cool, thank you :-)

> Andrew is correct that we don't do this
> for any types that are TYPE_OVERFLOW_UNDEFINED.

Yes; but I see the sub-optimal behaviour for unsigned, too.

> And powerpc has
>
> static int
> rs6000_reassociation_width (unsigned int opc ATTRIBUTE_UNUSED,
>                             machine_mode mode)
> {
>   switch (rs6000_tune)
>     {
>     case PROCESSOR_POWER8:
>     case PROCESSOR_POWER9:
>     case PROCESSOR_POWER10:
>       if (DECIMAL_FLOAT_MODE_P (mode))
>         return 1;
>       if (VECTOR_MODE_P (mode))
>         return 4;
>       if (INTEGRAL_MODE_P (mode))
>         return 1;

Yeah this last 1 is the problem :-)

> thus you get width 1 which means a linear chain (even if the user wrote
> a tree).

Yup.

> Note RTL doesn't do any such thing like re-association (I guess in principle
> scheduling could, and that's the only place where it would make sense
> on RTL).

RTL unrolling can, actually!  "Variable expansion" is its horrible name (and
it makes a lot of sense there: it allows breaking a big linear chain into
pieces).
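
To make the "linear chain" versus "tree" distinction concrete, here is a minimal source-level sketch. It is not GCC code and not taken from the bug report: the function names sum8_chain, sum8_tree and sum_split2 are hypothetical, chosen only for illustration. The operands are unsigned, so overflow wraps and all three forms are guaranteed to give the same result; that is why reassociating them is valid at all. The third function hand-writes roughly the kind of accumulator splitting that variable expansion aims for in an unrolled loop, again only as an assumption about the shape of the result, not as what the pass actually emits.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only; none of these names
   come from GCC or from the bug report.  */

/* Width 1: a linear chain.  Each add needs the result of the
   previous one, so the seven adds are serialised.  */
unsigned
sum8_chain (const unsigned *a)
{
  return ((((((a[0] + a[1]) + a[2]) + a[3]) + a[4]) + a[5]) + a[6]) + a[7];
}

/* Wider: a balanced tree.  Still seven adds, but the longest
   dependency path is only three adds deep.  */
unsigned
sum8_tree (const unsigned *a)
{
  return ((a[0] + a[1]) + (a[2] + a[3])) + ((a[4] + a[5]) + (a[6] + a[7]));
}

/* A reduction loop split over two accumulators, so that the single
   long chain through one sum becomes two shorter, independent chains
   that are combined at the end.  */
unsigned
sum_split2 (const unsigned *a, size_t n)
{
  unsigned s0 = 0, s1 = 0;
  size_t i;

  for (i = 0; i + 1 < n; i += 2)
    {
      s0 += a[i];
      s1 += a[i + 1];
    }
  if (i < n)	/* Leftover element when n is odd.  */
    s0 += a[i];
  return s0 + s1;
}
```

Whether the tree form is actually faster depends on how many independent adds the target can issue per cycle, which is what the reassociation width is meant to describe.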