Re: [committed] [RISC-V] Allow uarchs to set TARGET_OVERLAP_OP_BY_PIECES_P

Jeff Law Tue, 07 May 2024 14:30:43 -0700



On 5/7/24 3:24 PM, Palmer Dabbelt wrote:

@@ -529,6 +536,7 @@ static const struct riscv_tune_param generic_ooo_tune_info = {
    4,                                          /* fmv_cost */
    false,                                      /* slow_unaligned_access */
    false,                                      /* use_divmod_expansion */
+  false,                                       /* overlap_op_by_pieces */


IMO we should turn this on for the generic OOO tuning -- the benchmarks
say it's not faster for the T-Head OOO cores, but we were all so
surprised to find that I don't think we even fully trust the benchmarks.
I'd assume OOO cores are faster with the overlapping stores, so we
should just lean into it and let vendors say something if that's the
wrong assumption.

Several factors likely come into play (branch prediction, OOO properties, write combining, etc.).

But sure, I don't think that'd be terribly controversial. I can go ahead and make that change now given its triviality.
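
For anyone following along, here is a rough sketch of what "overlapping" means for a by-pieces copy. It is only an illustration in plain C, not the code GCC emits: the function names copy7_pieces and copy7_overlap are made up for this example, and the 7-byte size is an arbitrary choice. It assumes the target handles unaligned 4-byte accesses cheaply, which is the usual precondition for this kind of expansion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 7 bytes without overlap, as 4 + 2 + 1.
   Three loads and three stores. */
void copy7_pieces(unsigned char *dst, const unsigned char *src)
{
  uint32_t w;
  uint16_t h;
  unsigned char b;

  memcpy(&w, src, 4);      /* bytes 0..3 */
  memcpy(&h, src + 4, 2);  /* bytes 4..5 */
  b = src[6];              /* byte 6 */

  memcpy(dst, &w, 4);
  memcpy(dst + 4, &h, 2);
  dst[6] = b;
}

/* Hypothetical helper: copy 7 bytes with two 4-byte accesses.
   The second access starts at offset 3, so byte 3 is read and
   written twice. Two loads and two stores, the tail is unaligned. */
void copy7_overlap(unsigned char *dst, const unsigned char *src)
{
  uint32_t lo, hi;

  memcpy(&lo, src, 4);      /* bytes 0..3 */
  memcpy(&hi, src + 3, 4);  /* bytes 3..6, overlaps byte 3 */

  memcpy(dst, &lo, 4);
  memcpy(dst + 3, &hi, 4);
}
```

Both functions produce the same 7 bytes in dst. Whether the shorter overlapping sequence is actually faster on a given core is the open question in this thread.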


Jeff
