Re: [PATCH v4 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-10-19 Thread chenglulu
On 2023/8/20 4:25 PM, Xi Ruoyao wrote: On Thu, 2023-08-17 at 15:20 +0800, Chenghui Pan wrote: It seems ARMv8-A only guarantees to preserve the low 64 bits of each NEON/floating-point register. I'm not sure that I modified the testcase in the right way, and maybe we need more investigation. Any ideas

Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin

2023-10-21 Thread chenglulu
/* snip */ +/* If -mexplicit-relocs=auto, we use machine operations with reloc hints + for cases where the linker is unable to relax so we can schedule the + machine operations, otherwise use an assembler pseudo-op so the + assembler will generate R_LARCH_RELAX. */ + +bool

Re:[pushed] [PATCH] LoongArch: Define macro CLEAR_INSN_CACHE.

2023-10-22 Thread chenglulu
Pushed to r14-4836. On 2023/10/20 3:15 PM, Lulu Cheng wrote: LoongArch's microarchitecture ensures cache consistency in hardware. Due to out-of-order execution, ibar is required to ensure the visibility of the store (invalidated icache) executed by this CPU before ibar (to the instance). ibar will not
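For readers unfamiliar with the macro: as far as I understand the GCC internals, CLEAR_INSN_CACHE is the target macro behind __builtin___clear_cache, which code generators call after writing instructions into memory. The sketch below shows only that calling pattern. publish_code is a hypothetical helper name, and the claim that the builtin reduces to ibar on LoongArch comes from the patch description above and is not verified here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy freshly generated machine code into place and
   ask the compiler to make it visible to instruction fetch.  On targets
   whose caches are kept coherent by hardware, the builtin may reduce to a
   barrier or to nothing at all.  */
static void publish_code(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

A JIT or trampoline generator would call publish_code once per emitted region, before jumping into it.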

Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin

2023-10-22 Thread chenglulu
On 2023/10/21 4:42 PM, Xi Ruoyao wrote: On Sat, 2023-10-21 at 15:32 +0800, chenglulu wrote: +  /* If we are performing LTO for a final link, and we have the linker + plugin so we know the resolution of the symbols, then all GOT + references are binding to external symbols or preemptable

Re: [PATCH] Loongarch: Fix plugin header missing install.

2023-08-17 Thread chenglulu
LGTM! On 2023/8/16 9:48 AM, Guo Jie wrote: gcc/ChangeLog: * config/loongarch/t-loongarch: Add loongarch-driver.h into TM_H. Add loongarch-def.h and loongarch-tune.h into OPTIONS_H_EXTRA. Co-authored-by: Lulu Cheng --- gcc/config/loongarch/t-loongarch | 4 1 file

Re: [pushed][PATCH v2] libffi: Backport of LoongArch support for libffi.

2023-08-23 Thread chenglulu
Pushed to r14-3405. On 2023/8/23 10:56 AM, Lulu Cheng wrote: v1 -> v2: Modify the changelog information and add PR libffi/108682. This is a backport of , and contains modifications to commit 5a4774cd4d, as well as the LoongArch schema

Re: [pushed][PATCH v2] LoongArch: Remove redundant sign extension instructions caused by SLT instructions.

2023-08-27 Thread chenglulu
Pushed to r14-3511. On 2023/8/25 5:31 PM, Lulu Cheng wrote: v1 -> v2: 1. Modify description information Since the SLT instruction does not distinguish between 64-bit operations and 32-bit operations under the 64-bit LoongArch architecture, if the operand of slt is SImode, the sign
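The point about SLT comparing full 64-bit registers can be illustrated with a small C model. This is only a sketch of the arithmetic, not of the GCC patch: slt64 is a hypothetical stand-in for the instruction, and the two wrappers show why 32-bit operands have to be held in sign-extended form for the result to match a 32-bit signed comparison.

```c
#include <stdint.h>

/* Hypothetical model of slt: a signed comparison of two 64-bit registers. */
static int slt64(int64_t rj, int64_t rk)
{
    return rj < rk;
}

/* 32-bit operands kept sign-extended in 64-bit registers: the result
   agrees with the 32-bit signed comparison a < b.  */
static int lt_si_sign_extended(int32_t a, int32_t b)
{
    return slt64((int64_t)a, (int64_t)b);
}

/* The same operands zero-extended instead: negative values now look like
   large positive ones, so the comparison can give the wrong answer.  */
static int lt_si_zero_extended(int32_t a, int32_t b)
{
    return slt64((int64_t)(uint32_t)a, (int64_t)(uint32_t)b);
}
```

With a = -1 and b = 0 the sign-extended form reports "less than" while the zero-extended form does not, which is why the operands must already be sign-extended and why an extra extension on an already-extended value is redundant.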

Re: [PATCH v1] LoongArch: Remove the symbolic extension instruction due to the SLT directive.

2023-08-25 Thread chenglulu
On 2023/8/25 12:16 PM, WANG Xuerui wrote: On 8/25/23 12:01, Lulu Cheng wrote: Since the slt instruction does not distinguish between 32-bit and 64-bit operations under the LoongArch 64-bit architecture, if the operands of slt are of SImode, sign extension is required before operation.

Re: [pushed][PATCH v2] LoongArch: Enable '-free' starting at -O2.

2023-08-28 Thread chenglulu
Pushed to r14-3533. On 2023/8/28 5:21 PM, Xi Ruoyao wrote: On Mon, 2023-08-28 at 11:46 +0800, Lulu Cheng wrote: v1 -> v2: 1. Modify Changelog information format. gcc/ChangeLog: * common/config/loongarch/loongarch-common.cc: Enable '-free' on O2 and above. *

Re: [pushed][PATCH v1] LoongArch: Fix instruction name typo in lsx_vreplgr2vr_ template

2023-11-10 Thread chenglulu
Pushed to r14-5314. On 2023/11/3 5:01 PM, Chenghui Pan wrote: gcc/ChangeLog: * config/loongarch/lsx.md: Fix instruction name typo in lsx_vreplgr2vr_ template. --- gcc/config/loongarch/lsx.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [PATCH v3] LoongArch: Libvtv add loongarch support.

2022-10-28 Thread chenglulu
On 2022/10/28 17:38, WANG Xuerui wrote: Hi, The code change seems good but a few grammatical nits. Patch subject should be a verb phrase, something like "libvtv: add LoongArch support" could be better. Ok, thank you. I'll make the changes. On 2022/10/28 16:01, Lulu Cheng wrote: After

Re: [pushed][PATCH v3] LoongArch: Add prefetch instructions.

2022-11-22 Thread chenglulu
Pushed r13-4259. On 2022/11/16 10:10, Lulu Cheng wrote: v2 -> v3: 1. Remove preldx support. --- Enable sw prefetching at -O3 and higher. Co-Authored-By: xujiahao gcc/ChangeLog: * config/loongarch/constraints.md (ZD): New constraint. *

Re: [PATCH v4] LoongArch: Optimize immediate load.

2022-11-22 Thread chenglulu
On 2022/11/23 00:44, Xi Ruoyao wrote: While I still can't fully understand the immediate load issue and how this patch fixes it, I've tested this patch (alongside the prefetch instruction patch) with bootstrap-ubsan. And the compiled result of imm-load1.c seems OK. And it's doing the correct thing for

Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread chenglulu
On 2022/11/23 16:59, Xi Ruoyao wrote: On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote:     'A' Print a _DB suffix if the memory model requires a release.     'b' Print the address of a memory operand, without offset. +   'c'  print an integer. Nit: 'c' Print an integer. to match
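For context, '%c' is GCC's generic operand modifier that prints a constant operand without any target-specific immediate punctuation. A minimal sketch of its use in extended inline assembly follows. The names ANSWER and emit_constant are hypothetical, and the template only emits an assembler comment (assuming an assembler that accepts C-style comments, as GNU as does), so it does not depend on any LoongArch instruction.

```c
enum { ANSWER = 42 };

/* Emit an assembler comment containing operand 0 printed through the 'c'
   modifier, then return the same constant so callers can check it.  */
static int emit_constant(void)
{
    __asm__ volatile ("/* ANSWER = %c0 */" : : "i" (ANSWER));
    return ANSWER;
}
```

Compiling with -S and searching the output for "ANSWER =" shows how the operand was printed.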

Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread chenglulu
On 2022/11/23 17:25, Xi Ruoyao wrote: On Wed, 2022-11-23 at 17:14 +0800, chenglulu wrote: On 2022/11/23 16:59, Xi Ruoyao wrote: On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote: 'A' Print a _DB suffix if the memory model requires a release. 'b' Print the address of a memory operand

Re: [PATCH v6] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-19 Thread chenglulu
On 2023/1/18 5:14 PM, Richard Sandiford wrote: Lulu Cheng writes: Co-authored-by: Yang Yujie gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_classify_address): Add processing for CONST_INT. (loongarch_print_operand_reloc): Operand modifier 'c' is supported.

Re: [pushed][PATCH v6] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-23 Thread chenglulu
Pushed r13-5319. On 2023/1/18 5:14 PM, Richard Sandiford wrote: Lulu Cheng writes: Co-authored-by: Yang Yujie gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_classify_address): Add processing for CONST_INT. (loongarch_print_operand_reloc): Operand modifier 'c'

Re: [PATCH] LoongArch: Allow using --with-arch=native if host CPU is LoongArch

2023-07-22 Thread chenglulu
On 2023/7/20 9:28 PM, Xi Ruoyao wrote: If the host triple and the target triple are different but the host is LoongArch, in some cases --with-arch=native can be useful. For example, if we are bootstrapping a loongarch64-linux-musl toolchain on a Glibc-based system and we don't intend to use the

[PATCH] LoongArch: Fix bug in loongarch_emit_stack_tie [PR110484].

2023-06-29 Thread chenglulu
From: Lulu Cheng Which may result in implicit references to $fp when frame_pointer_needed is false, causing regs_ever_live[$fp] to be true when $fp is not explicitly used, resulting in $fp being used as the target replacement register in the rnreg pass. The bug originates from SPEC2017

Re: [PATCH] LoongArch: Enable shrink wrapping

2023-05-06 Thread chenglulu
On 2023/5/7 1:07 AM, Xi Ruoyao wrote: On Wed, 2023-04-26 at 18:21 +0800, WANG Xuerui wrote: On 2023/4/26 18:14, Lulu Cheng wrote: On 2023/4/26 6:02 PM, WANG Xuerui wrote: On 2023/4/26 17:53, Lulu Cheng wrote: Hi, Ruoyao: The spec2006 performance run has finished. The fixed-point 400.perlbench

Re: Pushed: [PATCH v2] LoongArch: Enable shrink wrapping

2023-05-06 Thread chenglulu
On 2023/5/7 1:05 AM, Xi Ruoyao wrote: On Wed, 2023-04-26 at 21:29 +0800, Xi Ruoyao via Gcc-patches wrote: Do you have any questions about the test cases mentioned by Guo Jie? If there is no problem, modify the test case, I think the code can be merged into the main branch. I'll rewrite

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-19 Thread chenglulu
On 2024/1/19 1:46 PM, Xi Ruoyao wrote: On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote: Virtual register 1479 will be used in insn 2744, but register 1479 was assigned the REG_UNUSED attribute in the previous instruction. The attached file is the wrong file. The compilation command

Re: [PATCH v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

2024-01-19 Thread chenglulu
Hi, Jiahao: This patch will introduce redundant FAIL, and the reason needs to be explained. +FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional combines static and invariant" 1 +FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate bb" 2

Re: [PATCH] LoongArch: Disable explicit reloc for TLS LD/GD with -mexplicit-relocs=auto

2024-01-22 Thread chenglulu
LGTM! Thanks! On 2024/1/23 2:42 AM, Xi Ruoyao wrote: Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler macro. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-08 Thread chenglulu
On 2024/2/7 12:23 AM, Xi Ruoyao wrote: Hi Lulu, I'm proposing to backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13. The reasons: 1. Strictly speaking, the old ASM_OUTPUT_ALIGN_WITH_NOP macro may cause a correctness issue.

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-19 Thread chenglulu
On 2024/2/9 4:08 PM, Xi Ruoyao wrote: On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote: On 2024/2/7 12:23 AM, Xi Ruoyao wrote: Hi Lulu, I'm proposing to backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13.  The r

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread chenglulu
On 2024/2/20 7:31 PM, Xi Ruoyao wrote: On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: So I think that without worrying about performance and ensuring that there is no problem with binutils, I think we can make the following modifications

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread chenglulu
On 2024/2/20 7:54 PM, Xi Ruoyao wrote: On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote: On 2024/2/20 7:31 PM, Xi Ruoyao wrote: On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: So I think that without worrying about performance and ensuring

Re: [PATCH v1 0/4] Fix a series of problems caused by

2024-02-20 Thread chenglulu
Sorry, this title is incomplete and has been resent. On 2024/2/21 11:08 AM, Lulu Cheng wrote: Binutils 2.42 corrects the implementation of ".align [abs-expr,[abs-expr[,abs-expr]]]"; the macro ASM_OUTPUT_ALIGN_WITH_NOP in GCC uses this assembler directive, and an error occurs. See link below

Re: [pushed][PATCH v1 0/4] Fix a series of problems caused by ASM_OUTPUT_ALIGN_WITH_NOP (release/gcc-12).

2024-02-21 Thread chenglulu
Pushed to r12-10169...r12-10172. On 2024/2/21 11:10 AM, Lulu Cheng wrote: Binutils 2.42 corrects the implementation of ".align [abs-expr,[abs-expr[,abs-expr]]]"; the macro ASM_OUTPUT_ALIGN_WITH_NOP in GCC uses this assembler directive, and an error occurs. See link below for detailed

Re:[pushed] [PATCH v1 0/4] Fix a series of problems caused by

2024-02-21 Thread chenglulu
Pushed to r13-8349...r13-8352. On 2024/2/21 11:04 AM, Lulu Cheng wrote: Binutils 2.42 corrects the implementation of ".align [abs-expr,[abs-expr[,abs-expr]]]"; the macro ASM_OUTPUT_ALIGN_WITH_NOP in GCC uses this assembler directive, and an error occurs. See link below for detailed

Re: [PATCH v2] LoongArch: Split loongarch_option_override_internal into smaller procedures

2024-02-21 Thread chenglulu
)     (unspec:SF [     (reg/v:SF 82 [ b ])     ] UNSPEC_RECIPE)) "recip.c":4:12 -1 (nil)) during RTL pass: vregs recip.c:5:1: internal compiler error: in extract_insn, at recog.cc:2812 0x135d1d4 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /home/chenglulu/wor

Re:[pushed] [PATCH 1/2] LoongArch: Fix wrong return value type of __iocsrrd_h.

2024-02-17 Thread chenglulu
Pushed to r14-9053. On 2024/2/6 10:10 AM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/larchintrin.h (__iocsrrd_h): Modify the function return value type to unsigned short. --- gcc/config/loongarch/larchintrin.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

Re:[pushed] [PATCH 2/2] LoongArch: Remove redundant symbol type conversions in larchintrin.h.

2024-02-17 Thread chenglulu
Pushed to r14-9054. On 2024/2/6 10:10 AM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/larchintrin.h (__movgr2fcsr): Remove redundant symbol type conversions. (__cacop_d): Likewise. (__cpucfg): Likewise. (__asrtle_d): Likewise. (__asrtgt_d):

Re: [pushed][PATCH v3 0/2] LoongArch D support

2023-12-17 Thread chenglulu
Pushed to r14-6648 r14-6649 and r14-6650. Thanks. On 2023/12/8 6:09 PM, Yang Yujie wrote: This patchset is based on Zixing Liu's initial support patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631260.html Updates v1 -> v2: Rebased onto the dmd/druntime upstream state. v2 -> v3:

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread chenglulu
On 2023/12/13 9:20 PM, Xi Ruoyao wrote: On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17"

Re: [PATCH 0/2] LoongArch: Fix PR113033 and clean up code

2023-12-18 Thread chenglulu
We will read and test these patches as soon as possible. Thanks! On 2023/12/19 2:59 PM, Xi Ruoyao wrote: Supersedes https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640871.html. Per Jakub's response, vec_init patterns do not have a predicate on the input operand so the operand can be

Re: [PATCH v2] extend.texi: Fix typos in LSX intrinsics

2023-12-18 Thread chenglulu
LGTM! Thanks for the revision. :-) On 2023/12/13 11:26 PM, Jiajie Chen wrote: Several typos have been found and fixed: missing semicolons, using variable name instead of type, duplicate functions and wrong types. gcc/ChangeLog: * doc/extend.texi(__lsx_vabsd_di): remove extra `i' in name.

Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-22 Thread chenglulu
Hi, This patch will cause the following tests to fail: +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn, at recog.cc:2812) +FAIL: gcc.dg/vect/pr97081-2.c (test for excess errors) +FAIL: gcc.dg/vect/pr97081-2.c -flto -ffat-lto-objects (internal compiler error: in

Re: [PATCH] LoongArch: Add sign_extend pattern for 32-bit rotate shift

2023-12-22 Thread chenglulu
LGTM! Thanks! On 2023/12/17 11:16 PM, Xi Ruoyao wrote: Remove a redundant sign extension. gcc/ChangeLog: * config/loongarch/loongarch.md (rotrsi3_extend): New define_insn. gcc/testsuite/ChangeLog: * gcc.target/loongarch/rotrw.c: New test. --- Bootstrapped and regtested

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-22 Thread chenglulu
On 2023/12/22 3:21 PM, chenglulu wrote: On 2023/12/22 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-22 Thread chenglulu
On 2023/12/23 10:26 AM, chenglulu wrote: On 2023/12/22 3:21 PM, chenglulu wrote: On 2023/12/22 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-24 Thread chenglulu
On 2023/12/24 8:59 PM, Xi Ruoyao wrote: On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: The performance drop has nothing to do with this patch. I found that the h264 performance compiled

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-16 Thread chenglulu
On 2023/12/15 3:56 PM, chenglulu wrote: On 2023/12/14 9:16 AM, chenglulu wrote: On 2023/12/13 9:20 PM, Xi Ruoyao wrote: On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread chenglulu
On 2023/12/13 2:27 AM, Xi Ruoyao wrote: fld.s $f1,$r4,0 fld.s $f0,$r4,4 fld.s $f3,$r4,8 fld.s $f2,$r4,12 fcmp.slt.s $fcc1,$f0,$f3 fcmp.sgt.s $fcc0,$f1,$f2 movcf2gr $r13,$fcc1 movcf2gr $r12,$fcc0

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread chenglulu
On 2023/12/13 2:27 AM, Xi Ruoyao wrote: On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote: fld.s $f1,$r4,0 fld.s $f0,$r4,4 fld.s $f3,$r4,8 fld.s $f2,$r4,12 fcmp.slt.s $fcc1,$f0,$f3 fcmp.sgt.s $fcc0,$f1,$f2 movcf2gr

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread chenglulu
On 2023/12/10 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli. gcc/ChangeLog: PR target/112936 *
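The two strength reductions mentioned in the commit message can be checked with a small C model. This is a sketch of the arithmetic only: alsl_w below is a hypothetical model of alsl.w as "shift rj left by sa and add rk" on the low 32 bits (my reading of the instruction, ignoring the sign extension of the result into the 64-bit register), and mul17/mul68 are illustrative names.

```c
#include <stdint.h>

/* Hypothetical model of alsl.w: (rj << sa) + rk, low 32 bits only. */
static uint32_t alsl_w(uint32_t rj, uint32_t rk, unsigned sa)
{
    return (rj << sa) + rk;
}

/* x * 17 == (x << 4) + x: a single alsl. */
static uint32_t mul17(uint32_t x)
{
    return alsl_w(x, x, 4);
}

/* x * 68 == ((x << 4) + x) << 2: alsl followed by slli. */
static uint32_t mul68(uint32_t x)
{
    return alsl_w(x, x, 4) << 2;
}
```

Both identities hold modulo 2^32, so they stay valid when the multiplication wraps.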

Re: [PATCH 1/3] LoongArch: Include rtl.h for COSTS_N_INSNS instead of hard coding our own

2023-12-13 Thread chenglulu
LGTM! Thanks. On 2023/12/10 1:03 AM, Xi Ruoyao wrote: With loongarch-def.cc switched from C to C++, we can include rtl.h for COSTS_N_INSNS, instead of hard coding our own. This is a non-functional change for now, but it will make the code more future-proof in case COSTS_N_INSNS in rtl.h would be

Re: [PATCH 3/3] LoongArch: Add alslsi3_extend

2023-12-13 Thread chenglulu
LGTM! Thanks! On 2023/12/10 1:03 AM, Xi Ruoyao wrote: Following the instruction cost fix, we are generating alsl.w $a0, $a0, $a0, 4 instead of li.w $t0, 17 mul.w $a0, $t0 for "x * 17", because alsl.w is 4 times faster than mul.w. But we didn't have a sign-extending pattern for

Re: [PATCH] LoongArch: Fix infinite secondary reloading of FCCmode [PR113148]

2023-12-26 Thread chenglulu
On 2023/12/27 6:37 AM, Xi Ruoyao wrote: The GCC internal doc says: X might be a pseudo-register or a 'subreg' of a pseudo-register, which could either be in a hard register or in memory. Use 'true_regnum' to find out; it will return -1 if the pseudo is in memory and the

Re: [PATCH v2] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-26 Thread chenglulu
LGTM! Thanks! On 2023/12/24 8:33 PM, Xi Ruoyao wrote: gcc/ChangeLog: * config/loongarch/loongarch.md (rotl3): New define_expand. * config/loongarch/simd.md (vrotl3): Likewise. (rotl3): Likewise. gcc/testsuite/ChangeLog: *
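The identity behind the patch subject, a left rotate expressed as a right rotate by the negated amount, can be sketched in C as follows. The function names are hypothetical and this models only the scalar 32-bit case, not the define_expand itself.

```c
#include <stdint.h>

/* Rotate right by n modulo 32; the masks avoid undefined shifts by 32. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Rotate left by n, written as a right rotate by -n (taken modulo 32). */
static uint32_t rotl32_via_rotr(uint32_t x, unsigned n)
{
    return rotr32(x, 0u - n);
}
```

Because only the low five bits of the rotate amount matter, negating the amount is enough; no explicit "32 - n" is needed.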

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-26 Thread chenglulu
On 2023/12/23 6:44 PM, Xi Ruoyao wrote: On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: The performance drop has nothing to do with this patch. I found that the h264 performance compiled by r14-6787 compared to r14-6421 dropped by 6.4%. Then I guess we should create a bug report... The code

Re: [pushed][PATCH v1] LoongArch: Fixed bug in *bstrins__for_ior_mask template.

2023-12-26 Thread chenglulu
Pushed to r14-6847. On 2023/12/25 11:20 AM, Li Wei wrote: We found that using the latest compiled gcc will cause a miscompare error when running spec2006 400.perlbench test with -flto turned on. After testing, it was found that only the LoongArch architecture will report errors. The first error

Re: [pushed][PATCH v1] LoongArch: Fix ICE when passing two same vector argument consecutively

2023-12-26 Thread chenglulu
Pushed to r14-6849. On 2023/12/22 4:18 PM, Chenghui Pan wrote: The following code causes an ICE on the LoongArch target: #include extern void bar (__m128i, __m128i); __m128i a; void foo () { bar (a, a); } It is caused by a missing constraint definition in mov_lsx. This patch

Re: [pushed ][PATCH v1] LoongArch: Fix insn output of vec_concat templates for LASX.

2023-12-26 Thread chenglulu
Pushed to r14-6848. On 2023/12/22 4:22 PM, Chenghui Pan wrote: When investigating the failure of gcc.dg/vect/slp-reduc-sad.c, the following instruction block is generated by vec_concatv32qi (which is generated by vec_initv32qiv16qi) at the entrance of the foo() function: vldx $vr3,$r5,$r6 vld

Re: [PATCH 0/2] When cmodel=extreme, add macro support and only

2023-12-27 Thread chenglulu
On 2023/12/27 4:46 PM, Lulu Cheng wrote: When cmodel=extreme, since the symbol address is obtained through four instructions, errors may occur in some cases during linking. Therefore, in order to ensure that the instructions for obtaining the symbol address are together, macro instructions are

Re: [PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-28 Thread chenglulu
On 2023/12/29 12:11 AM, Xi Ruoyao wrote: The problem with peephole2 is it uses a naive sliding-window algorithm and misses many cases. For example: float a[1]; float t() { return a[0] + a[8000]; } is compiled to: la.local $r13,a la.local $r12,a+32768 fld.s

Re:[pushed] [PATCH v2] LoongArch: Add asm modifiers to the LSX and LASX directives in the doc.

2023-12-22 Thread chenglulu
Pushed to r14-6800. On 2023/12/5 2:44 PM, chenxiaolong wrote: gcc/ChangeLog: * doc/extend.texi: Add modifiers to the vector of asm in the doc. * doc/md.texi: Refine the description of the modifier 'f' in the doc. --- gcc/doc/extend.texi | 47

Re: [pushed][PATCH v1] LoongArch: Fix builtin function prototypes for LASX in doc.

2023-12-21 Thread chenglulu
Pushed to r14-6776. On 2023/12/19 4:43 PM, chenxiaolong wrote: gcc/ChangeLog: * doc/extend.texi: According to the documents submitted earlier, two problems with function return types and using the actual types of parameters instead of variable names were found and fixed. ---

Re:[pushed] [PATCH v2] LoongArch: Modify the check type of the vector builtin function.

2023-12-21 Thread chenglulu
Pushed to r14-6774. On 2023/12/13 9:31 AM, chenxiaolong wrote: On LoongArch architecture, using the latest gcc14 in regression test, it is found that the vector test cases in vector directory appear FAIL entries with unmatched pointer types. In order to solve this kind of problem, the type of the

Re:[pushed] [PATCH v2] LoongArch: Fix incorrect code generation for sad pattern

2023-12-21 Thread chenglulu
Pushed to r14-6773. On 2023/12/14 8:49 PM, Jiahao Xu wrote: When I attempt to enable vect_usad_char effective target for LoongArch, slp-reduc-sad.c and vect-reduc-sad*.c tests fail. These tests fail because the sad pattern generates bad code. This patch fixes them: for sad patterns, use zero

Re:[pushed] [PATCH v2] extend.texi: Fix typos in LSX intrinsics

2023-12-21 Thread chenglulu
Pushed to r14-6775. Thank you so much! On 2023/12/13 11:26 PM, Jiajie Chen wrote: Several typos have been found and fixed: missing semicolons, using variable name instead of type, duplicate functions and wrong types. gcc/ChangeLog: * doc/extend.texi(__lsx_vabsd_di): remove extra `i' in

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu
On 2023/12/22 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the spec. :-) Hi, Ruoyao: After

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu
On 2023/12/21 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the spec. :-) Hi, Ruoyao: After applying this patch, spec2006 464.h264 ref will have a 6.4% performance drop. So I'm going

Re: [PATCH] LoongArch: Fix warnings building libgcc

2023-12-11 Thread chenglulu
On 2023/12/10 12:38 AM, Xi Ruoyao wrote: We are excluding loongarch-opts.h from target libraries, but now struct loongarch_target and gcc_options are not declared in the target libraries, causing: In file included from ../.././gcc/options.h:8, from ../.././gcc/tm.h:49,

Re:[pushed] [PATCH v5] LoongArch: Fix eh_return epilogue for normal returns.

2023-12-11 Thread chenglulu
Pushed to r14-6440. On 2023/12/8 6:01 PM, Yang Yujie wrote: On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be overwritten on normal return

Re: [PATCH] LoongArch: Fix warnings building libgcc

2023-12-11 Thread chenglulu
On 2023/12/12 9:58 AM, chenglulu wrote: On 2023/12/10 12:38 AM, Xi Ruoyao wrote: We are excluding loongarch-opts.h from target libraries, but now struct loongarch_target and gcc_options are not declared in the target libraries, causing: In file included from ../.././gcc/options.h:8

Re: [PATCH v1] LoongArch: testsuite:Add the "-ffast-math" compilation option for the file vect-fmin-3.c.

2023-12-30 Thread chenglulu
On 2023/12/30 8:25 PM, Xi Ruoyao wrote: On Sat, 2023-12-30 at 12:15 +, Richard Sandiford wrote: This shouldn't be necessary. The test does: for (int i = 0; i < n; i += 2) { x0 = __builtin_fmin (x0, ptr[i + 0]); x1 = __builtin_fmin (x1, ptr[i + 1]); } res[0] =

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-14 Thread chenglulu
On 2023/12/14 9:16 AM, chenglulu wrote: On 2023/12/13 9:20 PM, Xi Ruoyao wrote: On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows

Re: [pushed][PATCH] LoongArch: Added TLS Le Relax support.

2024-01-01 Thread chenglulu
Pushed to r14-6879 and modified this issue. On 2023/12/19 8:37 PM, Xi Ruoyao wrote: On Tue, 2023-12-19 at 19:04 +0800, Lulu Cheng wrote: +(define_insn "@add_tls_le_relax" +  [(set (match_operand:P 0 "register_operand" "=r") +   (unspec:P [(match_operand:P 1 "register_operand" "r") +

Re: [PATCH] LoongArch: Provide fmin/fmax RTL pattern for vectors

2024-01-03 Thread chenglulu
LGTM! Thanks! On 2024/1/1 3:15 AM, Xi Ruoyao wrote: We already had smin/smax RTL pattern using vfmin/vfmax instructions. But for smin/smax, it's unspecified what will happen if either operand contains any NaN operands. So we would not vectorize the loop with -fno-finite-math-only (the default for
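The NaN issue can be made concrete with a small C comparison. This is a sketch only: fmin_like is a hand-written model of the IEEE-style rule that fmin follows (a quiet NaN operand is ignored in favour of the other operand), and naive_min is the plain comparison that a generic "smaller of two" operation might lower to. Neither name comes from GCC.

```c
#include <math.h>

/* Hand-written model of fmin semantics: if exactly one operand is NaN,
   return the other one.  (x != x) is true only for NaN.  */
static double fmin_like(double a, double b)
{
    if (a != a)
        return b;
    if (b != b)
        return a;
    return a < b ? a : b;
}

/* Plain comparison: the result depends on operand order when a NaN is
   involved, because every ordered comparison with NaN is false.  */
static double naive_min(double a, double b)
{
    return a < b ? a : b;
}
```

For the pair (1.0, NaN), fmin_like returns 1.0 while naive_min returns NaN, which is the kind of difference that prevents treating the two operations as interchangeable unless -ffinite-math-only is in effect.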

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:  bool  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)  { +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent + so that the linker can infer the PC of pcalau12i

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:  bool  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)  { +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent + so that the linker can infer the PC of pcalau12i

Re: [pushed][PATCH] LoongArch: Fixed the problem of incorrect judgment of the immediate field of the [x]vld/[x]vst instruction.

2024-01-05 Thread chenglulu
Pushed to r14-6955. On 2024/1/4 10:37 AM, Lulu Cheng wrote: The [x]vld/[x]vst directive is defined as follows: [x]vld/[x]vst {x/v}d, rj, si12 When not modified, the immediate field of [x]vld/[x]vst is between 10 and 14 bits depending on the type. However, in loongarch_valid_offset_p, the
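As background for the si12 field mentioned above, a signed N-bit immediate covers the range from -2^(N-1) to 2^(N-1) - 1. The helper below is a generic sketch of such a range check; fits_signed_imm is a hypothetical name and this is not the logic of loongarch_valid_offset_p, whose details are cut off in the snippet.

```c
#include <stdint.h>

/* Return 1 if value is representable as a signed immediate of the given
   width (assumed to be between 1 and 63 bits), 0 otherwise.  */
static int fits_signed_imm(int64_t value, unsigned bits)
{
    int64_t lo = -((int64_t)1 << (bits - 1));
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;
    return value >= lo && value <= hi;
}
```

For a 12-bit field this accepts offsets from -2048 to 2047 inclusive.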

Re: [pushed][PATCH v2 0/7] LoongArch:Enable testing for common

2024-01-05 Thread chenglulu
Pushed 2-7 to r14-6955...r14-6961. On 2024/1/5 11:43 AM, chenxiaolong wrote: v1->v2: On the basis of v1, the reason of the analysis problem is described in detail. When using binutils, which does not support vectorization, and the gcc compiler toolchain, which does support vectorization, the

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 7:55 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: On 2024/1/5 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:   bool   loongarch_explicit_relocs_p (enum loongarch_symbol_type

Re: [pushed][PATCH v3] LoongArch: testsuite:Added support for vector object detection.

2024-01-05 Thread chenglulu
Pushed to r14-6954. On 2024/1/5 2:05 PM, chenxiaolong wrote: - Change the default vectorization "-mlasx" option to "-mlsx" because there are many non-aligned memory accesses when using 256-bit vectorization. - The following detection procedure is added to the target-supports.exp file:

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread chenglulu
On 2024/1/4 11:51 AM, Xi Ruoyao wrote: On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote: +(define_insn "movdi_pcrel64" + [(set (match_operand:DI 0 "register_operand" "=") +   (match_operand:DI 1 "symbolic_pcrel64_operand")) +  (unspec:DI [(const_int 0)] +    UNSPEC_MOV_PCREL64) +  (use

Re: [pushed][PATCH] LoongArch: Improve lasx_xvpermi_q_ insn pattern

2024-01-05 Thread chenglulu
Pushed to r14-6968. On 2024/1/5 3:37 PM, Jiahao Xu wrote: For instruction xvpermi.q, unused bits in operands[3] need be set to 0 to avoid causing undefined behavior on LA464. gcc/ChangeLog: * config/loongarch/lasx.md: Set the unused bits in operand[3] to 0. gcc/testsuite/ChangeLog:

Re: [PATCH] LoongArch: Optimize zero_extendqisi2 and zero_extendqidi2 patterns

2024-01-06 Thread chenglulu
Hi, Jiahao: The instruction latencies of the two instructions I tested here are the same on 3a5000 and 3a6000. This issue needs to be confirmed again. On 2024/1/5 3:37 PM, Jiahao Xu wrote: For zero_extendqisi2 and zero_extendqidi2, use andi instead of bstrpick.w, because andi is 6 times faster
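Independently of the latency question raised in the reply, the two instructions compute the same value for a byte zero-extension. The C model below sketches that equivalence; zext_andi and bstrpick_w are hypothetical helper names modelling "andi rd, rj, 0xff" and "bstrpick.w rd, rj, msb, lsb" on 32-bit values, based on my reading of the instruction semantics.

```c
#include <stdint.h>

/* Model of "andi rd, rj, 0xff": keep the low eight bits. */
static uint32_t zext_andi(uint32_t x)
{
    return x & 0xffu;
}

/* Model of "bstrpick.w rd, rj, msb, lsb": extract bits msb..lsb and
   zero-extend them.  Assumes 0 <= lsb <= msb <= 31.  */
static uint32_t bstrpick_w(uint32_t x, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (x >> lsb) & mask;
}
```

For msb = 7 and lsb = 0 the two helpers agree on every input, so the choice between them is purely a matter of cost.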

Re:[pushed] [PATCH v1] LoongArch: testsuite:Fixed a bug that added a target check error.

2024-01-10 Thread chenglulu
Pushed to r14-7096. On 2024/1/10 3:24 PM, chenxiaolong wrote: After the code is committed in r14-6948, GCC regression testing on some architectures will produce the following error: "error executing dg-final: unknown effective target keyword `loongarch*-*-*'" gcc/testsuite/ChangeLog: *

Re:[pushed] [PATCH v2] LoongArch: testsuite:Added support for loongarch.

2024-01-10 Thread chenglulu
Pushed to r14-7097. On 2024/1/10 3:25 PM, chenxiaolong wrote: The function of this test is to check that the compiler supports vectorization using SLP and vec_{load/store/*}_lanes. However, vec_{load/store/*}_lanes are not supported on LoongArch, such as the corresponding "st4/ld4" directives on

Re:[pushed] [PATCH] LoongArch: Implenment vec_init where N is a LSX vector mode

2024-01-08 Thread chenglulu
Pushed to r14-7022. On 2024/1/5 at 3:38 PM, Jiahao Xu wrote: This patch implements more vec_init optabs that can handle two LSX vectors producing a LASX vector by concatenating them. When an LSX vector is concatenated with an LSX const_vector of zeroes, the vec_concatz pattern can be used

Re: [pushed][PATCH 1/3] LoongArch: Optimized some of the symbolic expansion instructions generated during bitwise operations.

2024-01-10 Thread chenglulu
Pushed to r14-7125. On 2024/1/6 at 4:54 PM, Lulu Cheng wrote: There are two mode iterators defined in the loongarch.md: (define_mode_iterator GPR [SI (DI "TARGET_64BIT")]) and (define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")]) Replace the mode in the bit arithmetic

Re:[pushed] [PATCH v2] LoongArch: Implement option save/restore

2024-01-11 Thread chenglulu
Pushed to r14-7134. On 2024/1/11 at 9:07 AM, Yang Yujie wrote: LTO option streaming and target attributes both require per-function target configuration, which is achieved via option save/restore. We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target context in addition to other

Re:[pushed] [PATCH v2 1/2] LoongArch: Redundant sign extension elimination optimization.

2024-01-11 Thread chenglulu
Pushed to r14-7160 and r14-7161. On 2024/1/11 at 7:36 PM, Li Wei wrote: We found that the current combine optimization pass in gcc cannot handle the following redundant sign extension situations: (insn 77 76 78 5 (set (reg:SI 143) (plus:SI (subreg/s/u:SI (reg/v:DI 104 [ len ]) 0)

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-11 Thread chenglulu
I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS: we need a target hook to tell the generic code that UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll see millions of lines of messages like ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-14 Thread chenglulu
On 2024/1/15 at 2:42 PM, Xi Ruoyao wrote: On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote: On Mon, 2024-01-15 at 12:11, Xi Ruoyao wrote: On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote: At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote: At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong

Re: Ping: [PATCH] LoongArch: Remove constraint z from movsi_internal

2024-01-15 Thread chenglulu
On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote: Ping. On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote: We don't allow SImode in FCC, so constraint z is never really used here. gcc/ChangeLog: * config/loongarch/loongarch.md (movsi_internal): Remove constraint z. --- Bootstrapped and

Re: Ping: [PATCH] LoongArch: Remove constraint z from movsi_internal

2024-01-15 Thread chenglulu
On 2024/1/16 at 2:20 PM, Xi Ruoyao wrote: On Tue, 2024-01-16 at 14:16 +0800, chenglulu wrote: On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote: Ping. On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote: We don't allow SImode in FCC, so constraint z is never really used here. gcc/ChangeLog: * config

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-17 Thread chenglulu
On 2024/1/17 at 5:50 PM, Xi Ruoyao wrote: On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote: On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote: On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote: On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote: On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote: I found an issue bootstrapping GCC with -mcmodel

Re: [pushed][PATCH v2] LoongArch: testsuite:Fix fail in gen-vect-{2,25}.c file.

2024-01-17 Thread chenglulu
Pushed to r14-8204. On 2024/1/13 at 3:28 PM, chenxiaolong wrote: 1. Added dg-do compile on LoongArch. When binutils does not support vector instruction sets, an error occurs because the assembler does not recognize vector instructions. 2. Added "-mlsx" option for vectorization on LoongArch.

Re: [pushed][PATCH] LoongArch: Assign the '/u' attribute to the mem to which the global offset table belongs.

2024-01-17 Thread chenglulu
Pushed to r14-8203. On 2024/1/13 at 2:37 PM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_split_symbol): Assign the '/u' attribute to the mem. gcc/testsuite/ChangeLog: * g++.target/loongarch/got-load.C: New test. ---

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-17 Thread chenglulu
gcc.dg/tree-ssa/scev-16.c is OK to move gcc.dg/pr104992.c should simply add -fno-tree-vectorize to the used options and remove the vect_* stuff Hi Richard: I have a question. I don't understand the purpose of adding '-fno-tree-vectorize' here. Thanks!

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-18 Thread chenglulu
On 2024/1/18 at 3:44 PM, Xi Ruoyao wrote: On Thu, 2024-01-18 at 15:15 +0800, chenglulu wrote: gcc.dg/tree-ssa/scev-16.c is OK to move gcc.dg/pr104992.c should simply add -fno-tree-vectorize to the used options and remove the vect_* stuff Hi Richard: I have a question. I don't understand

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-18 Thread chenglulu
On 2024/1/18 at 4:49 PM, chenglulu wrote: On 2024/1/18 at 3:44 PM, Xi Ruoyao wrote: On Thu, 2024-01-18 at 15:15 +0800, chenglulu wrote: gcc.dg/tree-ssa/scev-16.c is OK to move gcc.dg/pr104992.c should simply add -fno-tree-vectorize to the used options and remove the vect_* stuff Hi Richard: I have

Re: [PATCH v3] LoongArch: testsuite:Added support for vector object detection.

2024-01-09 Thread chenglulu
On 2024/1/10 at 3:51 AM, Andreas Schwab wrote: gcc: gcc.dg/vect/vect-outer-4a-big-array.c -flto -ffat-lto-objects: error executing dg-final: unknown effective target keyword `loongarch*-*-*' gcc: gcc.dg/vect/vect-outer-4a-big-array.c: error executing dg-final: unknown effective target keyword

Re:[pushed] [PATCH v1] LoongArch: testsuite:Fix FAIL in lasx-xvstelm.c file.

2024-01-03 Thread chenglulu
Pushed to r14-6909. On 2023/12/29 at 9:45 AM, chenxiaolong wrote: After the cost model was implemented for the LoongArch architecture, GCC has this feature turned on by default, which causes the lasx-xvstelm.c test to fail. Through analysis, this test case can generate vectorization

Re:[pushed] [PATCH v1] LoongArch: testsuite:Add loongarch to gcc.dg/vect/slp-26.c.

2024-01-03 Thread chenglulu
Pushed to r14-6911. On 2023/12/29 at 3:48 PM, chenxiaolong wrote: In the LoongArch architecture, GCC supports the vectorization function tested by vect/slp-26.c, but there is no detection of loongarch in dg-finals. Add loongarch to the appropriate dg-finals. gcc/testsuite/ChangeLog: *

Re:[pushed] [PATCH v2] LoongArch: Merge constant vector permuatation implementations.

2024-01-03 Thread chenglulu
Pushed to r14-6908. On 2023/12/28 at 8:26 PM, Li Wei wrote: There are currently two implementations of constant vector permutation: loongarch_expand_vec_perm_const_1 and loongarch_expand_vec_perm_const_2. The implementations of the two versions are different. Currently, only the
