Re: [PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-28 Thread chenglulu
On 2023/12/29 at 12:11 AM, Xi Ruoyao wrote: The problem with peephole2 is it uses a naive sliding-window algorithm and misses many cases. For example: float a[1]; float t() { return a[0] + a[8000]; } is compiled to: la.local $r13,a la.local $r12,a+32768 fld.s

Re: [PATCH] RISC-V: Fix misaligned stack offset for interrupt function

2023-12-28 Thread Fei Gao
On 2023-12-25 16:45  Kito Cheng wrote: >+++ b/gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c >@@ -0,0 +1,29 @@ >+/* { dg-do compile } */ >+/* { dg-options "-O2 -march=rv64gc -mabi=lp64d -fno-schedule-insns >-fno-schedule-insns2" } */ >+/* { dg-skip-if "" { *-*-* } { "-flto

[PATCH v1] LoongArch: testsuite:Add loongarch to gcc.dg/vect/slp-26.c.

2023-12-28 Thread chenxiaolong
In the LoongArch architecture, GCC supports the vectorization function tested by vect/slp-26.c, but there is no detection of loongarch in dg-finals. Add loongarch to the appropriate dg-finals. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-26.c: Add loongarch. ---

[PATCH v1] LoongArch: testsuite:Add loongarch to gcc.dg/vect/slp-21.c.

2023-12-28 Thread chenxiaolong
In the GCC code of LoongArch architecture, IFN_STORE_LANES optimization operation is not supported, and four SLP statements are used for vectorization in slp-21.c. So add loongarch*-*-* to the corresponding dg-finals. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-21.c: Add loongarch. ---

Re: [PATCH v1 1/8] LoongArch: testsuite:Add detection procedures supported by the target.

2023-12-28 Thread Chenghua Xu
chenxiaolong writes: > In order to improve and check the function of vector quantization in > LoongArch architecture, tests on vector instruction set are provided > in target-support.exp. > > gcc/testsuite/ChangeLog: > > * lib/target-supports.exp:Add LoongArch to the list of supported >

RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-28 Thread Li, Pan2
Thanks Jeff. I think I locate where aarch64 performs the trick here. 1. In the .final we have rtl like (insn:TI 6 8 29 (set (reg:SF 32 v0) (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64} (nil)) 2.

[PATCH] Fix gen-vect-26.c testcase after loops with multiple exits [PR113167]

2023-12-28 Thread Andrew Pinski
This fixes the gcc.dg/tree-ssa/gen-vect-26.c testcase by adding `#pragma GCC novector` in front of the loop that is doing the checking of the result. We only want to test the first loop to see if it can be vectorized. Committed as obvious after testing on x86_64-linux-gnu with -m32.
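A minimal sketch of the idea, not the actual testcase: the first loop is left for the vectorizer, and the checking loop is marked with `#pragma GCC novector` so that it is not itself vectorized. The names `fill`, `check`, and `N` are made up for illustration, and the pragma is assumed to be supported by the compiler in use (compilers that do not know it ignore it, possibly with a warning).

```c
#include <assert.h>
#include <stddef.h>

#define N 16

/* Loop under test: the vectorizer is free to transform this one. */
void fill(unsigned char *out)
{
  assert(out != NULL);
  for (size_t i = 0; i < N; i++)
    out[i] = (unsigned char) i;
}

/* Checking loop: counts mismatches and is kept scalar on purpose. */
int check(const unsigned char *in)
{
  int bad = 0;
  assert(in != NULL);
#pragma GCC novector
  for (size_t i = 0; i < N; i++)
    if (in[i] != (unsigned char) i)
      bad++;
  return bad;
}
```

With the checking loop excluded, a dump scan that counts vectorized loops only sees the loop the test is meant to exercise.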

[PATCH v4 6/6] RISC-V: Add support for xtheadvector-specific intrinsics.

2023-12-28 Thread Jun Sha (Joshua)
This patch only involves the generation of xtheadvector special load/store instructions and vext instructions. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class th_loadstore_width): Define new builtin bases. (BASE): Define new builtin bases. *

[PATCH v4] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread Jun Sha (Joshua)
This patch is to handle the differences in instruction generation between Vector and XTheadVector. In this version, we only support partial xtheadvector instructions that leverage directly from current RVV1.0 with simple adding "th." prefix. For different name xtheadvector instructions but share

[PATCH v4] RISC-V: Adds the prefix "th." for the instructions of XTheadVector.

2023-12-28 Thread Jun Sha (Joshua)
This patch adds th. prefix to all XTheadVector instructions by implementing new assembly output functions. We only check the prefix is 'v', so that no extra attribute is needed. gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_asm_output_opcode): New function to add assembler

[PATCH v4] RISC-V: Introduce XTheadVector as a subset of V1.0.0

2023-12-28 Thread Jun Sha (Joshua)
This patch is to introduce basic XTheadVector support (march string parsing and a test for __riscv_xtheadvector) according to https://github.com/T-head-Semi/thead-extension-spec/ gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_subset_list::parse): Add new vendor

[PATCH v4] RISC-V: Change csr_operand into vector_length_operand for vsetvl patterns.

2023-12-28 Thread Jun Sha (Joshua)
This patch uses vector_length_operand instead of csr_operand for vsetvl patterns, so that changes for vector will not affect scalar patterns using csr_operand in riscv.md. gcc/ChangeLog: * config/riscv/vector.md: Use vector_length_operand for vsetvl patterns. Co-authored-by: Jin

[PATCH v4] RISC-V: Change csr_operand into

2023-12-28 Thread Jun Sha (Joshua)
This patch uses vector_length_operand instead of csr_operand for vsetvl patterns, so that changes for vector will not affect scalar patterns using csr_operand in riscv.md. gcc/ChangeLog: * config/riscv/vector.md: Use vector_length_operand for vsetvl patterns. Co-authored-by: Jin

[PATCH v4] RISC-V: Refactor riscv-vector-builtins-bases.cc

2023-12-28 Thread Jun Sha (Joshua)
This patch moves the definition of the enums lst_type and frm_op_type into riscv-vector-builtins-bases.h and removes the static visibility of fold_fault_load(), so these can be used in other compile units. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (enum lst_type):

[PATCH v4] RISC-V: Support XTheadVector extension

2023-12-28 Thread Jun Sha (Joshua)
This patch series presents gcc implementation of the XTheadVector extension [1]. [1] https://github.com/T-head-Semi/thead-extension-spec/ For some vector patterns that cannot be avoided, we use "!TARGET_XTHEADVECTOR" to disable them in order not to generate instructions that xtheadvector does

[PATCH v1] LoongArch: testsuite:Add the "-ffast-math" compilation option for the file vect-fmin-3.c.

2023-12-28 Thread chenxiaolong
After the detection of maximum reduction is enabled on LoongArch architecture, the regression test of GCC finds that vect-fmin-3.c fails. Currently, in the target-supports.exp file, only aarch64,arm,riscv, and LoongArch architectures are supported. Through analysis, the "-ffast-math" compilation

Re:[PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread joshua
Hi Juzhe, These vsetvl patterns were written by you with csr_operand initially. Are you sure it can be replaced by vector_length_operand? Joshua -- From: juzhe.zh...@rivai.ai Sent: Friday, December 29, 2023 10:25 To: "cooper.joshua";

Re: [PATCH] MIPS: Implement TARGET_INSN_COSTS

2023-12-28 Thread YunQiang Su
On Fri, Dec 29, 2023 at 00:54, Roger Sayle wrote: > > > > The current (default) behavior is that when the target doesn’t define > > TARGET_INSN_COST the middle-end uses the backend’s > > TARGET_RTX_COSTS, so multiplications are slower than additions, > > but about the same size when optimizing for size (with

Re:Re:[PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread joshua
We do not have vector_length_operand in vsetvl patterns. (define_insn "@vsetvl" [(set (match_operand:P 0 "register_operand" "=r") (unspec:P [(match_operand:P 1 "vector_csr_operand" "rK") (match_operand 2 "const_int_operand" "i") (match_operand 3

Re:[PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread joshua
Hi Juzhe, For vector_csr_operand, please refer to https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641124.html. Joshua -- From: juzhe.zh...@rivai.ai Sent: Friday, December 29, 2023 10:14 To: "cooper.joshua"; "gcc-patches" Cc: Jim

[PATCH v1 8/8] LoongArch: testsuite:Modify the result check in the FMA file.

2023-12-28 Thread chenxiaolong
When gcc enabled the vectorization of the common layer, some FAIL items appeared in GCC regression tests, such as gcc.dg/fma-{3,4,6,7}.c. On LoongArch architecture, for example, the result of fmsub.s instruction is a*b-c, and there is a problem of positive and negative zero inequality between the

[PATCH v1 4/8] LoongArch: testsuite:Fix FAIL in file bind_c_array_params_2.f90.

2023-12-28 Thread chenxiaolong
In the GCC regression test result, it is found that the bind_c_array_params_2.f90 test fails. After analysis, it is found that the reason why the test fails is that the regular expression in the test result cannot correctly detect the correct assembly code (such as bl %plt(myBindC)) generated on

[PATCH v1 3/8] LoongArch: testsuite:Added test support for vect-{82, 83}.c.

2023-12-28 Thread chenxiaolong
When gcc enables the file test under gcc.dg/vect, it is found that vect-{82, 83}.c does not support the test. Through analysis, LoongArch architecture supports the detection function of this test case. Therefore, the detection of LoongArch architecture is added to the test rules to solve the

[PATCH v1 6/8] LoongArch: testsuite:Added additional vectorization "-mlasx" compilation option.

2023-12-28 Thread chenxiaolong
After the detection procedure under the gcc.dg/vect directory was added to GCC, FAIL entries of vector multiplication transformations of different types appeared in the gcc regression test results. After debugging analysis, the main problem is that the 128-bit vector of LoongArch architecture does

[PATCH v1 7/8] LoongArch: testsuite:Added additional vectorization "-mlsx" compilation option.

2023-12-28 Thread chenxiaolong
When GCC is able to detect vectorized test cases in the common layer, FAIL entries appear in some test cases after regression testing. The cause of the error is that the vectorization option was not set when testing the program, and the vectorization code could not be generated, so additional

[PATCH v1 5/8] LoongArch: testsuite:Modify the test behavior in file pr60510.f.

2023-12-28 Thread chenxiaolong
When using binutils that does not support vectorization and gcc compiler toolchain that supports vectorization, regression tests found that pr60510.f had a FAIL entry. The reason is that the default setting of the program is the execution state, which will cause problems in the assembly stage when

[PATCH v1 2/8] LoongArch: testsuite:Modify the test behavior of the vect-bic-bitmask-{12, 23}.c file.

2023-12-28 Thread chenxiaolong
When the toolchain is built using binutils that does not support vectorization and gcc that supports vectorization, the regression test results of GCC show that the vect-bic-bitmask-{12,23}.c file fails. The reason is that it carries out two stages of compilation and assembly test, in the

[PATCH v1 1/8] LoongArch: testsuite:Add detection procedures supported by the target.

2023-12-28 Thread chenxiaolong
In order to improve and check the function of vector quantization in LoongArch architecture, tests on vector instruction set are provided in target-supports.exp. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add LoongArch to the list of supported targets. ---

Re:[PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread joshua
Hi Juzhe, This patch "RISC-V: Handle differences between XTheadvector and Vector" is addressing some code generation issues for RVV1.0 instructions that xtheadvector does not have, not with intrinsics. BTW, what about the following patch "RISC-V: Add support for xtheadvector-specific

Re: [PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread joshua
Hi Juzhe, This patch "RISC-V: Handle differences between XTheadvector and Vector" is addressing some code generation issues for RVV1.0 instructions that xtheadvector does not have, not with intrinsics. BTW, what about the following patch "RISC-V: Add support for xtheadvector-specific

[PATCH v1 0/8] LoongArch:Enable testing for common

2023-12-28 Thread chenxiaolong
When using binutils, which does not support vectorization, and the gcc compiler toolchain, which does support vectorization, the following two types of errors occur in gcc regression testing. 1. Failure of common tests in the gcc.dg/vect directory: Regression testing of GCC has found

Re: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread YunQiang Su
In general, I agree with this change. When gcc12 on RV64, more than one `sext.w` will be produced with our test. (Note, use -O1). > > There are two things that help here. The first is that the most significant > bit never appears in the middle of a field, so we don't have to worry about >

Re: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread YunQiang Su
On Fri, Dec 29, 2023 at 02:23, Jeff Law wrote: > > > > On 12/28/23 07:59, Roger Sayle wrote: > > > > This patch fixes PR rtl-optimization/104914 by tweaking/improving the way > > that fields are written into a pseudo register that needs to be kept sign > > extended. > Well, I think "fixes" is a bit of a

[PATCH v4 6/6] RISC-V: Add support for xtheadvector-specific intrinsics.

2023-12-28 Thread Jun Sha (Joshua)
This patch only involves the generation of xtheadvector special load/store instructions and vext instructions. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class th_loadstore_width): Define new builtin bases. (BASE): Define new builtin bases. *

[PATCH v4 5/6] RISC-V: Handle differences between XTheadvector and Vector

2023-12-28 Thread Jun Sha (Joshua)
This patch is to handle the differences in instruction generation between Vector and XTheadVector. In this version, we only support partial xtheadvector instructions that leverage directly from current RVV1.0 with simple adding "th." prefix. For different name xtheadvector instructions but share

[PATCH v1] LoongArch: testsuite:Fix FAIL in lasx-xvstelm.c file.

2023-12-28 Thread chenxiaolong
After implementing the cost model on the LoongArch architecture, the GCC compiler code has this feature turned on by default, which causes the lasx-xvstelm.c file test to fail. Through analysis, this test case can generate vectorization instructions required for detection only after disabling the

Re: [PATCH v3 1/6] RISC-V: Refactor riscv-vector-builtins-bases.cc

2023-12-28 Thread joshua
Hi Jeff, Perhaps fold_fault_load cannot be moved to riscv-protos.h since gimple_folder is declared in riscv-vector-builtins.h. It's not reasonable to include riscv-vector-builtins.h in riscv-protos.h. In fact, fold_fault_load is defined specially for some builtin functions, and it would be

[Committed] RISC-V: Robostify testcase pr113112-1.c

2023-12-28 Thread Juzhe-Zhong
The redundant dump check is fragile, easily broken by unrelated changes, and not necessary. Tested on both RV32 and RV64 with no regressions. Removed it and committed. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr113112-1.c: Remove redundant checks. ---

[PATCH] RISC-V: Count pointer type SSA into RVV regs liveness for dynamic LMUL cost model

2023-12-28 Thread Juzhe-Zhong
This patch fixes the following case, where an unexpectedly large LMUL is chosen and causes register spills. Before this patch, choosing LMUL = 4: addi sp,sp,-160 addiw t1,a2,-1 li a5,7 bleu t1,a5,.L16 vsetivli zero,8,e64,m4,ta,ma vmv.v.x v4,a0

Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-28 Thread Jeff Law
On 12/28/23 17:42, Li, Pan2 wrote: Thanks Jeff for comments, and Happy new year! Interesting. So I'd actually peel one more layer off this onion. Why do the aarch64 and riscv targets generate different constants (0.0 vs -0.0)? Yeah, it surprised me too when debugging the foo function.

RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-28 Thread Li, Pan2
Thanks Jeff for comments, and Happy new year! > Interesting. So I'd actually peel one more layer off this onion. Why > do the aarch64 and riscv targets generate different constants (0.0 vs > -0.0)? Yeah, it surprised me too when debugging the foo function. But didn't dig into it in previous

Re: Fortran: Use non conflicting file extensions for intermediates [PR81615]

2023-12-28 Thread Harald Anlauf
Hi Rimvydas! On 28.12.23 at 08:09, Rimvydas Jasinskas wrote: On Wed, Dec 27, 2023 at 10:34 PM Harald Anlauf wrote: The patch is almost fine, except for a strange wording here: +@smallexample +gfortran -save-temps -c foo.F90 +@end smallexample + +preprocesses to in @file{foo.fii}, compiles

RE: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
Hi Jeff, Thanks for the speedy review. > On 12/28/23 07:59, Roger Sayle wrote: > > This patch fixes PR rtl-optimization/104914 by tweaking/improving the > > way that fields are written into a pseudo register that needs to be > > kept sign extended. > Well, I think "fixes" is a bit of a stretch.

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-28 Thread Jeff Law
On 12/24/23 05:24, Roger Sayle wrote: What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't actually a truncation! The output precision is first, the input precision is second. The docs explicitly state the output precision should be smaller than the input precision (which makes

Re: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Jeff Law
On 12/28/23 07:59, Roger Sayle wrote: This patch fixes PR rtl-optimization/104914 by tweaking/improving the way that fields are written into a pseudo register that needs to be kept sign extended. Well, I think "fixes" is a bit of a stretch. We're avoiding the issue by changing the early RTL

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-28 Thread Jeff Law
On 12/24/23 01:11, YunQiang Su wrote: Yes. I also guess so. Any new idea? Well, I see multiple intertwined issues and I think MIPS has largely mucked this up. At a high level DI -> SI truncation is not a nop on MIPS64. We must explicitly sign extend the value from SI->DI to preserve the

[PATCH] MIPS: Implement TARGET_INSN_COSTS

2023-12-28 Thread Roger Sayle
The current (default) behavior is that when the target doesn't define TARGET_INSN_COST the middle-end uses the backend's TARGET_RTX_COSTS, so multiplications are slower than additions, but about the same size when optimizing for size (with -Os or -Oz). All of this gets disabled with your

Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-28 Thread Jeff Law
On 12/26/23 02:34, pan2...@intel.com wrote: From: Pan Li This patch would like to XFAIL the test case pr30957-1.c for the RVV when build the elf with some configurations (list at the end of the log) It will be vectorized during vect_transform_loop with a variable factor. It won't benefit

Re: [PATCH V2] RISC-V: Disallow transformation into VLMAX AVL for cond_len_xxx when length is in range [0,31]

2023-12-28 Thread Jeff Law
On 12/26/23 19:38, Juzhe-Zhong wrote: Notice we have this following situation: vsetivli zero,4,e32,m1,ta,ma vlseg4e32.v v4,(a5) vlseg4e32.v v12,(a3) vsetvli a5,zero,e32,m1,tu,ma ---> This is redundant since VLMAX AVL = 4 when it

[PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-28 Thread Xi Ruoyao
The problem with peephole2 is it uses a naive sliding-window algorithm and misses many cases. For example: float a[1]; float t() { return a[0] + a[8000]; } is compiled to: la.local $r13,a la.local $r12,a+32768 fld.s $f1,$r13,0 fld.s $f0,$r12,-768
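The offsets in the quoted output can be checked by hand: a[8000] lies 8000 * 4 = 32000 bytes past `a`, which the output expresses as the address a+32768 plus a load offset of -768. The sketch below reproduces only that arithmetic, assuming the low part must fit a signed 12-bit immediate. `split_offset` and `split_si12` are hypothetical names for illustration, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair: a 4 KiB-aligned high part plus a signed 12-bit low part. */
typedef struct {
  int64_t hi;
  int64_t lo;
} split_offset;

/* Round the byte offset to the nearest multiple of 4096 and keep the
   signed remainder, so that hi + lo == off and -2048 <= lo <= 2047.  */
split_offset split_si12(int64_t off)
{
  split_offset s;
  s.hi = (off + 0x800) & ~(int64_t) 0xfff;
  s.lo = off - s.hi;
  assert(s.lo >= -2048 && s.lo <= 2047);
  return s;
}
```

For the example in the message, `split_si12(8000 * 4)` yields a high part of 32768 and a low part of -768, matching the two operands shown in the assembly.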

Re: [ARC PATCH] Table-driven ashlsi implementation for better code/rtx_costs.

2023-12-28 Thread Jeff Law
On 12/23/23 16:37, Roger Sayle wrote: One of the cool features of the H8 backend is its use of tables to select optimal shift implementations for different CPU variants. This patch borrows (plagiarizes) that idiom for SImode left shifts in the ARC backend (for CPUs without a

Re: [PATCH] RISC-V: Fix misaligned stack offset for interrupt function

2023-12-28 Thread Jeff Law
On 12/25/23 01:45, Kito Cheng wrote: The `interrupt` function backs up the fcsr register, but the save is fixed to SImode. That is not a big issue since fcsr only uses 8 bits so far; however, the offset should still use UNITS_PER_WORD to prevent the stack offset from becoming non-8-byte aligned, it will cause problem

Re: [PATCH] RISC-V: Add crypto machine descriptions

2023-12-28 Thread Jeff Law
On 12/26/23 19:47, Kito Cheng wrote: Thanks Feng, the patch is LGTM from my side, I am happy to accept vector crypto stuffs for GCC 14, it's mostly intrinsic stuff, and the only few non-intrinsic stuff also low risk enough (e.g. vrol, vctz) I won't object. I'm disappointed that we're in a

Re: Re: [PATCH v3 2/6] RISC-V: Split csr_operand in predicates.md for vector patterns.

2023-12-28 Thread Jeff Law
On 12/26/23 19:49, joshua wrote: Hi Jeff, Yes, I will change something in vector_csr_operand in the following patches. Constraints will be added that the AVL cannot be encoded as an immediate for xtheadvector vsetvl. Ah. Thanks. Makes sense. jeff

[middle-end PATCH] Only call targetm.truly_noop_truncation for truncations.

2023-12-28 Thread Roger Sayle
The truly_noop_truncation target hook is documented, in target.def, as "true if it is safe to convert a value of inprec bits to one of outprec bits (where outprec is smaller than inprec) by merely operating on it as if it had only outprec bits", i.e. the middle-end can use a SUBREG instead of a
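A minimal sketch of the guard the subject line describes, assuming only what the quoted documentation states: the hook is defined for the case where outprec is smaller than inprec, so a caller should consult it only then. Everything here is hypothetical illustration rather than GCC code: the function-pointer type, `can_use_subreg`, and `example_hook` (which simply treats 64-to-32-bit truncation as not a no-op). Returning false for non-truncations is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target hook: (outprec, inprec) -> bool. */
typedef bool (*noop_truncation_hook)(unsigned outprec, unsigned inprec);

/* Example hook: every truncation is a no-op except 64 -> 32 bits. */
bool example_hook(unsigned outprec, unsigned inprec)
{
  return !(outprec == 32 && inprec == 64);
}

/* Consult the hook only for real truncations (outprec < inprec).
   For anything else the hook's documented contract does not apply,
   so this sketch conservatively answers false without calling it.  */
bool can_use_subreg(noop_truncation_hook hook, unsigned outprec, unsigned inprec)
{
  assert(hook != NULL);
  if (outprec >= inprec)
    return false;
  return hook(outprec, inprec);
}
```

The point of the guard is that the argument order (output precision first, input precision second) is easy to swap, and a swapped call silently asks the hook a question it was never specified to answer.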

[PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
This patch fixes PR rtl-optimization/104914 by tweaking/improving the way that fields are written into a pseudo register that needs to be kept sign extended. The motivating example from the bugzilla PR is: extern void ext(int); void foo(const unsigned char *buf) { int val; ((unsigned

[PATCH] MIPS: Implement TARGET_INSN_COSTS

2023-12-28 Thread YunQiang Su
MIPS backend had some information about INSN, including length, count etc. And since some instructions are more costly, let's add a new attr `perf_ratio`. Its default value is (const_int 1). The return value of mips_insn_cost is insn_count * perf_ratio * 4. The magic `4` here, is due to
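The cost formula stated above can be written out as a one-line function. This is only a sketch of the arithmetic, not the patch's mips_insn_cost: `sketch_insn_cost` is a hypothetical name, and the two parameters stand in for the values the backend would read from the insn's attributes.

```c
#include <assert.h>

/* Hypothetical stand-in for the described formula:
   cost = insn_count * perf_ratio * 4, with perf_ratio defaulting to 1.  */
int sketch_insn_cost(int insn_count, int perf_ratio)
{
  assert(insn_count >= 0 && perf_ratio >= 1);
  return insn_count * perf_ratio * 4;
}
```

Under this formula a single instruction with the default ratio costs 4, and an insn that expands to two instructions with a ratio of 3 costs 24.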

[PATCH] aarch64: fortran: Adjust vect-8.f90 for libmvec

2023-12-28 Thread Szabolcs Nagy
With new glibc one more loop can be vectorized via simd exp in libmvec. Found by the Linaro TCWG CI. gcc/testsuite/ChangeLog: * gfortran/vect/vect-8.f90: Accept more vectorized loops. --- gcc/testsuite/gfortran.dg/vect/vect-8.f90 | 4 ++-- 1 file changed, 2 insertions(+), 2

Re: Re: [PATCH v1] LoongArch: Merge constant vector permuatation implementations.

2023-12-28 Thread 李威
I also have the same doubts about vector instructions. Sorry, I can't prove it, so I used simplify_gen_subreg instead to make sure there won't be problems (I submitted the v2 version); my oversight. > -Original Message- > From: "Xi Ruoyao" > Sent: 2023-12-28 18:55:01 (Thursday) > To: "Li Wei" ,

[PATCH v2] LoongArch: Merge constant vector permuatation implementations.

2023-12-28 Thread Li Wei
There are currently two versions of the implementations of constant vector permutation: loongarch_expand_vec_perm_const_1 and loongarch_expand_vec_perm_const_2. The implementations of the two versions are different. Currently, only the implementation of loongarch_expand_vec_perm_const_1 is used

[committed] i386: Cleanup ix86_expand_{unary|binary}_operator issues

2023-12-28 Thread Uros Bizjak
Move ix86_expand_unary_operator from i386.cc to i386-expand.cc, re-arrange prototypes and do some cosmetic changes with the usage of TARGET_APX_NDD. No functional changes. gcc/ChangeLog: * config/i386/i386.cc (ix86_unary_operator_ok): Move from here... * config/i386/i386-expand.cc

Re: [PATCH v1] LoongArch: Merge constant vector permuatation implementations.

2023-12-28 Thread Xi Ruoyao
On Thu, 2023-12-28 at 14:59 +0800, Li Wei wrote: > There are currently two versions of the implementations of constant > vector permutation: loongarch_expand_vec_perm_const_1 and > loongarch_expand_vec_perm_const_2.  The implementations of the two > versions are different. Currently, only the

Re: [x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-28 Thread Uros Bizjak
On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle wrote: > > > This patch resolves the failure of pr43644-2.c in the testsuite, a code > quality test I added back in July, that started failing as the code GCC > generates for 128-bit values (and their parameter passing) has been in > flux. After a few

[PATCH v1] LoongArch: Merge constant vector permuatation implementations.

2023-12-28 Thread Li Wei
There are currently two versions of the implementations of constant vector permutation: loongarch_expand_vec_perm_const_1 and loongarch_expand_vec_perm_const_2. The implementations of the two versions are different. Currently, only the implementation of loongarch_expand_vec_perm_const_1 is used