On 2023/12/14 9:16 AM, chenglulu wrote:

On 2023/12/13 9:20 PM, Xi Ruoyao wrote:
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote:

On 2023/12/10 1:03 AM, Xi Ruoyao wrote:
Replace the instruction costs in loongarch_rtx_cost_data constructor
based on micro-benchmark results on LA464 and LA664.

This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl
and slli.
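
For illustration only (not part of the patch), the two reductions named above can be modelled in portable C. The helpers alsl_w and slli_w are hypothetical names that mimic the low 32 bits of alsl.w ((rj << sa) + rk, with sa in 1..4) and slli.w; the sign extension that the real instructions perform on a 64-bit target is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "alsl.w rd, rj, rk, sa": (rj << sa) + rk, sa in 1..4.
   Unsigned arithmetic is used so that wrap-around is well defined in C. */
uint32_t alsl_w(uint32_t rj, uint32_t rk, unsigned sa)
{
    assert(sa >= 1 && sa <= 4);
    return (rj << sa) + rk;
}

/* Hypothetical model of "slli.w rd, rj, imm": logical left shift by imm. */
uint32_t slli_w(uint32_t rj, unsigned imm)
{
    return rj << (imm & 31);
}

/* x * 17 == (x << 4) + x, so a single alsl.w is enough. */
uint32_t mul17(uint32_t x)
{
    return alsl_w(x, x, 4);
}

/* x * 68 == (x * 17) * 4 == ((x << 4) + x) << 2: alsl.w followed by slli.w. */
uint32_t mul68(uint32_t x)
{
    return slli_w(alsl_w(x, x, 4), 2);
}
```

With a multiply cost above that of one or two simple ALU instructions, the compiler can prefer these sequences over mul.w.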

gcc/ChangeLog:

      PR target/112936
      * config/loongarch/loongarch-def.cc
      (loongarch_rtx_cost_data::loongarch_rtx_cost_data): Update
      instruction costs per micro-benchmark results.
      (loongarch_rtx_cost_optimize_size): Set all instruction costs
      to (COSTS_N_INSNS (1) + 1).
      * config/loongarch/loongarch.cc (loongarch_rtx_costs): Remove
      special case for multiplication when optimizing for size.
      Adjust division cost when TARGET_64BIT && !TARGET_DIV32.
      Account the extra cost when TARGET_CHECK_ZERO_DIV and
      optimizing for speed.

gcc/testsuite/ChangeLog

      PR target/112936
      * gcc.target/loongarch/mul-const-reduction.c: New test.
---
    gcc/config/loongarch/loongarch-def.cc         | 39 ++++++++++---------
    gcc/config/loongarch/loongarch.cc             | 22 +++++------
    .../loongarch/mul-const-reduction.c           | 11 ++++++
    3 files changed, 43 insertions(+), 29 deletions(-)
    create mode 100644 gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c

Well, I'm curious about how the value of this cost is obtained.

I just wrote a loop containing 1000 mul.w instructions, ran it
1000000 times, and compared the elapsed time against another loop
containing 1000 addi.w instructions, also iterated 1000000 times.
Likewise for the other instructions...
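
A minimal sketch of such a measurement follows. It is an assumption about how the loops could be written, not the benchmark actually used in this thread. REP1000, run_mul, run_add and seconds_for are hypothetical names; the inline asm is GNU C, and on non-LoongArch hosts a plain C fallback with an empty asm barrier stands in for the real instructions.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Repeat a statement 1000 times at compile time. */
#define REP10(x)   x x x x x x x x x x
#define REP1000(x) REP10(REP10(REP10(x)))

#if defined(__loongarch__)
/* Each instruction depends on the previous result, so latency is measured. */
#define MUL_INSN __asm__ volatile("mul.w %0, %0, %1" : "+r"(acc) : "r"(k));
#define ADD_INSN __asm__ volatile("addi.w %0, %0, 1" : "+r"(acc));
#else
/* Fallback for other hosts: the empty asm stops the compiler folding the chain. */
#define MUL_INSN acc *= k; __asm__ volatile("" : "+r"(acc));
#define ADD_INSN acc += 1; __asm__ volatile("" : "+r"(acc));
#endif

/* Loop body with 1000 dependent multiplies, run iters times. */
uint32_t run_mul(long iters)
{
    uint32_t acc = 1, k = 3;
    for (long i = 0; i < iters; i++) {
        REP1000(MUL_INSN)
    }
    return acc;
}

/* Baseline loop body with 1000 dependent add-immediates, run iters times. */
uint32_t run_add(long iters)
{
    uint32_t acc = 0;
    for (long i = 0; i < iters; i++) {
        REP1000(ADD_INSN)
    }
    return acc;
}

/* Wall-clock seconds taken by fn(iters). */
double seconds_for(uint32_t (*fn)(long), long iters)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    volatile uint32_t sink = fn(iters);   /* keep the result alive */
    (void)sink;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```

The ratio seconds_for(run_mul, 1000000) / seconds_for(run_add, 1000000) then approximates the cost of mul.w relative to a simple ALU instruction. Since GCC defines COSTS_N_INSNS (N) as N * 4, multiplying that ratio by 4 gives a value on the same scale as the cost table; how to round it is a judgment call.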

OK. I need to run a SPEC performance comparison here. The results will probably be available tomorrow.

Thanks!

Sorry, there is a problem with my test environment, so the results may not be available until tomorrow.
