On Wednesday 24 January 2018 06:29 PM, Siddhesh Poyarekar wrote:
>>> +      /* Avoid register indexing for 128-bit stores when the
>>> +         AARCH64_EXTRA_TUNE_SLOW_REGOFFSET_QUADWORD_STORE option is set.  */
>>> +      if (!optimize_size
>>> +          && type == ADDR_QUERY_STR
>>> +          && (aarch64_tune_params.extra_tuning_flags
>>> +              & AARCH64_EXTRA_TUNE_SLOW_REGOFFSET_QUADWORD_STORE)
>>> +          && (mode == TImode || mode == TFmode
>>> +              || aarch64_vector_data_mode_p (mode)))
>>> +        allow_reg_index_p = false;
>>
>> The aarch64_classify_vector_mode code has been reworked recently for SVE
>> so I'm not entirely up to date with its logic, but I believe that
>> "aarch64_classify_vector_mode (mode)" will allow 64-bit vector modes,
>> which would not be using the 128-bit Q register, so you may be disabling
>> register indexing for D-register memory stores.
>
> I check this and fix the condition if necessary.

Looking back at the patch I remember why I used
aarch64_vector_data_mode_p; this is to catch the pattern
aarch64_simd_mov<VQ:mode> which optimizes a 64-bit store pair into a
single quad word store.  It should not avoid register indexing for any
other vector modes since their patterns won't pass ADDR_QUERY_STR.

In any case, I will be doing the CPU2017 run without -mcpu=falkor, so
I'll report results from that.

Siddhesh