https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393
--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 25 Nov 2021, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393
>
> --- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Hongtao.liu from comment #5)
> > (In reply to Richard Biener from comment #3)
> > > (In reply to H.J. Lu from comment #2)
> > > > (In reply to Richard Biener from comment #1)
> > > > > It isn't the vectorizer but memmove inline expansion.  I'm not sure
> > > > > it's really a bug, but there isn't a way to disable %ymm use besides
> > > > > disabling AVX entirely.
> > > > > HJ?
> > > >
> > > > YMM move is generated by loop distribution which doesn't check
> > > > TARGET_PREFER_AVX128.
> > >
> > > I think it's generated by gimple_fold_builtin_memory_op which since
> > > Richards changes accepts bigger now, up to MOVE_MAX * MOVE_RATIO and
> > > that ends up picking an integer mode via
> > >
> > >       scalar_int_mode mode;
> > >       if (int_mode_for_size (ilen * 8, 0).exists (&mode)
> > >           && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
> > >           && have_insn_for (SET, mode)
> > >           /* If the destination pointer is not aligned we must be able
> > >              to emit an unaligned store.  */
> > >           && (dest_align >= GET_MODE_ALIGNMENT (mode)
> > >               || !targetm.slow_unaligned_access (mode, dest_align)
> > >               || (optab_handler (movmisalign_optab, mode)
> > >                   != CODE_FOR_nothing)))
> > >
> > > not sure if there's another way to validate things.
> >
> > For one single set operation, shouldn't the total size be less than
> > MOVE_MAX instead of MOVE_MAX * MOVE_RATIO?
>
> r12-3482 change MOVE_MAX to MOVE_MAX * MOVE_RATIO

Yes, IIRC it was specifically to allow vector register moves on
aarch64/arm which doesn't seem to have a MOVE_MAX that exceeds
WORD_SIZE.  It looks like x86 carefully tries to have a MOVE_MAX
that honors -mprefer-xxx as to not exceed a single move size.

Both seem to be in conflict here.

Richard - why could arm/aarch64 not increase MOVE_MAX here?
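
To make the conflict concrete, here is a toy model of the two size limits
discussed above.  It is only a sketch and not GCC code: the TOY_* constants
and both helper functions are hypothetical names, and the values (16 and 4)
are illustrative assumptions, not taken from any target header.

```c
#include <stdbool.h>

/* Illustrative assumptions only: pretend the target caps a single move
   at 16 bytes (e.g. because 128-bit vectors are preferred) and allows
   up to 4 moves before falling back to a library call.  */
enum { TOY_MOVE_MAX = 16, TOY_MOVE_RATIO = 4 };

/* Limit suggested in comment #5: one set must fit in one move.  */
static bool
fits_single_move (unsigned long ilen)
{
  return ilen <= TOY_MOVE_MAX;
}

/* Limit as described in the thread after r12-3482: the length is
   checked against the budget for several moves, although the fold
   then emits a single load/store in one integer mode.  */
static bool
fits_move_budget (unsigned long ilen)
{
  return ilen <= TOY_MOVE_MAX * TOY_MOVE_RATIO;
}
```

Under these assumed values a 32-byte copy fails fits_single_move but passes
fits_move_budget, which is the kind of gap that would let a single move wider
than the preferred width through.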