This FAIL was introduced from r14-6908. The reason is that when merging
constant vector permutation implementations, the 128-bit matching situation
was not fully considered. In fact, the expansion of 128-bit vectors after
merging only supports value-based 4 elements set shuffle, so this time is a
We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r
failed to vectorize effectively. For this reason, we adjust the cost of
128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit
vectorization.
The experimental results show that after the modification,
We found that in the spec17 521.wrf program, some loop invariant code generated
from single-precision floating-point approximate division calculation failed to
propose a loop. This is because the pseudo-register that stores the
intermediate temporary calculation results is rewritten in the
We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r
failed to vectorize effectively. For this reason, we adjust the cost of
128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit
vectorization.
The experimental results show that after the modification,
Eliminate the redundant sign extension that exists after the conditional
move when the target register is SImode.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_expand_conditional_move):
Adjust.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/sign-extend-2.c:
We found that the current combine optimization pass in gcc cannot handle
the following redundant sign extension situations:
(insn 77 76 78 5 (set (reg:SI 143)
(plus:SI (subreg/s/u:SI (reg/v:DI 104 [ len ]) 0)
(const_int 1 [0x1]))) {addsi3}
(expr_list:REG_DEAD (reg/v:DI 104
There are currently two versions of the implementations of constant
vector permutation: loongarch_expand_vec_perm_const_1 and
loongarch_expand_vec_perm_const_2. The implementations of the two
versions are different. Currently, only the implementation of
loongarch_expand_vec_perm_const_1 is used
There are currently two versions of the implementations of constant
vector permutation: loongarch_expand_vec_perm_const_1 and
loongarch_expand_vec_perm_const_2. The implementations of the two
versions are different. Currently, only the implementation of
loongarch_expand_vec_perm_const_1 is used
We found that using the latest compiled gcc will cause a miscompare error
when running spec2006 400.perlbench test with -flto turned on. After testing,
it was found that only the LoongArch architecture will report errors.
The first error commit was located through the git bisect command as
In the r14-5547 commit, C[LT]Z_DEFINED_VALUE_AT_ZERO were defined at
the same time, but in fact, CLZ_DEFINED_VALUE_AT_ZERO has already been
defined, so remove the duplicate definition.
gcc/ChangeLog:
* config/loongarch/loongarch.h (CTZ_DEFINED_VALUE_AT_ZERO): Add
description.
For vector constant extract-{even/odd} permutation replace the default
[x]vshuf instruction combination with [x]vilv{l/h} instruction, which
can reduce instructions and improves performance.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_is_odd_extraction):
In LoongArch, the vector popcount has corresponding instructions, while
the scalar does not. Currently, the scalar popcount is calculated
through a loop, and the value of a non-power of two needs to be iterated
several times, so the vector popcount instruction is considered for
optimization.
The LoongArch has defined ctz and clz on the backend, but if we want GCC
do CTZ transformation optimization in forwprop2 pass, GCC need to know
the value of c[lt]z at zero, which may be beneficial for some test cases
(like spec2017 deepsjeng_r).
After implementing the macro, we test dynamic
The LoongArch has defined ctz and clz on the backend, but if we want GCC
do CTZ transformation optimization in forwprop2 pass, GCC need to know
the value of c[lt]z at zero, which may be beneficial for some test cases
(like spec2017 deepsjeng_r).
After implementing the macro, we test dynamic
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_builtin_vectorization_cost):
---
gcc/config/loongarch/loongarch.cc | 21 ++---
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/gcc/config/loongarch/loongarch.cc
b/gcc/config/loongarch/loongarch.cc
15 matches
Mail list logo