Re: [PATCH 47/52] loongarch: New hook implementation loongarch_c_mode_for_floating_type

2024-06-03 Thread Lulu Cheng
Ok! Thanks! Lulu Cheng On 2024/6/3 at 11:01 AM, Kewen Lin wrote: This is to add new port specific hook implementation loongarch_c_mode_for_floating_type, remove macro defines for FLOAT_TYPE_SIZE and DOUBLE_TYPE_SIZE, and rename LONG_DOUBLE_TYPE_SIZE to LA_LONG_DOUBLE_TYPE_SIZE as we poison

Re: [PATCH] LoongArch: Guard REGNO with REG_P in loongarch_expand_conditional_move [PR115169]

2024-05-23 Thread Lulu Cheng
LGTM! Thanks! On 2024/5/22 at 7:24 PM, Xi Ruoyao wrote: gcc/ChangeLog: PR target/115169 * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Guard REGNO with REG_P. --- Bootstrapped with --enable-checking=all. Ok for trunk and 14?
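The pattern named in the ChangeLog entry above, checking that an operand is a register before reading its register number, can be sketched outside GCC. This is only an illustration: `mock_rtx`, `mock_reg_p` and `mock_regno` are hypothetical stand-ins invented here, not GCC's real `rtx`, `REG_P` or `REGNO`, and the patch itself may look quite different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's rtx, REG_P and REGNO (assumptions of
   this sketch, not the real definitions).  */
enum mock_code { MOCK_REG, MOCK_CONST_INT };

struct mock_rtx {
  enum mock_code code;
  union {
    unsigned int regno; /* meaningful only when code == MOCK_REG */
    long value;         /* meaningful only when code == MOCK_CONST_INT */
  } u;
};

static bool mock_reg_p(const struct mock_rtx *x)
{
  return x->code == MOCK_REG;
}

/* Checked accessor: asking a non-register for its number aborts, loosely
   modelling what a checking-enabled build would report.  */
static unsigned int mock_regno(const struct mock_rtx *x)
{
  assert(mock_reg_p(x));
  return x->u.regno;
}

/* Guarded comparison: the register number is read only after the type
   check has succeeded for both operands.  */
static bool same_register(const struct mock_rtx *a, const struct mock_rtx *b)
{
  return mock_reg_p(a) && mock_reg_p(b) && mock_regno(a) == mock_regno(b);
}
```

With the guard in place, comparing a register against a constant operand simply yields false instead of tripping the accessor's check.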

Re: [pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-05-07 Thread Lulu Cheng
will open lsx by default. On Tue, 2024-04-23 at 11:31 +0800, Lulu Cheng wrote: Pushed to r14-10083. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: These ISA versions are defined as -march= parameters and are recommended for building binaries for distribution. Detailed description of these definition

Re: [pushed][PATCH][gcc-13] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
Pushed to r13-8661. On 2024/4/29 at 4:09 PM, Lulu Cheng wrote: From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value

Re: [pushed][PATCH][gcc-12] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
Pushed to r12-10403. On 2024/4/29 at 4:09 PM, Lulu Cheng wrote: From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value

[PATCH][gcc-12] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be overwritten on normal return paths and breaks a rare case of

[PATCH][gcc-13] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be overwritten on normal return paths and breaks a rare case of

Re: [PATCH] LoongArch: Add constraints for bit string operation define_insn_and_split's [PR114861]

2024-04-26 Thread Lulu Cheng
LGTM! Thanks. On 2024/4/26 at 9:52 PM, Xi Ruoyao wrote: Without the constraints, the compiler attempts to use a stack slot as the target, causing an ICE building the kernel with -Os: drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:3144:1: error: could not split insn (insn:TI 1764 67 1745

Re: [pushed][PATCH] wwwdocs: gcc-14/changes.html: Add Loongarch changes.

2024-04-24 Thread Lulu Cheng
On 2024/4/23 at 11:43 AM, Lulu Cheng wrote: --- htdocs/gcc-14/changes.html | 156 + 1 file changed, 156 insertions(+) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index 9509487c..f0f0efe0 100644 --- a/htdocs/gcc-14/changes.html +++ b

[PATCH] wwwdocs: gcc-14/changes.html: Add Loongarch changes.

2024-04-22 Thread Lulu Cheng
--- htdocs/gcc-14/changes.html | 156 + 1 file changed, 156 insertions(+) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index 9509487c..f0f0efe0 100644 --- a/htdocs/gcc-14/changes.html +++ b/htdocs/gcc-14/changes.html @@ -877,6 +877,162

Re:[pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-04-22 Thread Lulu Cheng
Pushed to r14-10083. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: These ISA versions are defined as -march= parameters and are recommended for building binaries for distribution. Detailed description of these definitions can be found at https://github.com/loongson/la-toolchain-conventions, which the

Re: [pushed][PATCH v4 2/2] LoongArch: Define builtin macros for ISA evolutions

2024-04-22 Thread Lulu Cheng
Pushed to r14-10084. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: Detailed description of these definitions can be found at https://github.com/loongson/la-toolchain-conventions, which the LoongArch GCC port aims to conform to. gcc/ChangeLog: * config.gcc: Add loongarch-evolution.o. *

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Lulu Cheng
On 2024/4/19 at 10:27 PM, Xi Ruoyao wrote: On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote: @table @samp @item native -This selects the CPU to generate code for at compilation time by determining -the processor type of the compiling machine. Using @option{-march=native} -enables all

[PATCH] gcc-13/changes.html (LoongArch): Fix link.

2024-04-19 Thread Lulu Cheng
--- htdocs/gcc-13/changes.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index 4384c329..15a309d6 100644 --- a/htdocs/gcc-13/changes.html +++ b/htdocs/gcc-13/changes.html @@ -625,7 +625,7 @@ You may also want to

Re: [pushed][PATCH] LoongArch: Add indexes for some compilation options.

2024-04-15 Thread Lulu Cheng
Pushed to r14-9984. On 2024/4/9 at 4:19 PM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.opt.urls: Regenerate. * config/mn10300/mn10300.opt.urls: Likewise. * config/msp430/msp430.opt.urls: Likewise. * config/nds32/nds32-elf.opt.urls: Likewise

Re:[pushed] [PATCH v2] LoongArch: Enable switchable target

2024-04-09 Thread Lulu Cheng
Pushed to r14-9866. On 2024/4/8 at 4:45 PM, Yang Yujie wrote: This patch fixes the back-end context switching in cases where functions should be built with their own target contexts instead of the global one, such as LTO linking and functions with target attributes (TBD). PR target/113233

[PATCH] LoongArch: Add indexes for some compilation options.

2024-04-09 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.opt.urls: Regenerate. * config/mn10300/mn10300.opt.urls: Likewise. * config/msp430/msp430.opt.urls: Likewise. * config/nds32/nds32-elf.opt.urls: Likewise. * config/nds32/nds32-linux.opt.urls: Likewise. *

Re:[pushed] [PATCH v1] LoongArch: Set default alignment for functions jumps and loops [PR112919].

2024-04-07 Thread Lulu Cheng
On 2024/4/6 at 5:53 PM, Xi Ruoyao wrote: On Tue, 2024-04-02 at 15:03 +0800, Lulu Cheng wrote: +/* Alignment for functions loops and jumps for best performance. For new + uarchs the value should be measured via benchmarking. See the documentation + for -falign-functions -falign-loops and -falign

Re:[pushed] [PATCH] LoongArch: Remove unused code

2024-04-02 Thread Lulu Cheng
Pushed to r14-9766. On 2024/4/2 at 2:33 PM, Jiahao Xu wrote: For machines that satisfy ISA_HAS_LSX && !TARGET_64BIT, we will not support them now and in the future, so this patch removes these unused code. gcc/ChangeLog: * config/loongarch/lasx.md: Remove unused code. *

[PATCH v1] LoongArch: Set default alignment for functions jumps and loops [PR112919].

2024-04-02 Thread Lulu Cheng
Xi Ruoyao set the alignment rules under LA464 in commit r14-1839, but the macro ASM_OUTPUT_ALIGN_WITH_NOP was removed in r14-4674, which affected the alignment rules. So I set different alignments on LA464 and LA664 again to test the performance of spec2006, and modify the alignment based on the test

[PATCH] Regenerate loongarch.opt.urls.

2024-03-31 Thread Lulu Cheng
Fixes: d28ea8e5a704 ("LoongArch: Split loongarch_option_override_internal into smaller procedures") gcc/ChangeLog: * config/loongarch/loongarch.opt.urls: Regenerate. --- gcc/config/loongarch/loongarch.opt.urls | 19 +-- 1 file changed, 17

[PATCH] LoongArch: Add descriptions of the compilation options.

2024-03-30 Thread Lulu Cheng
Add descriptions for the compilation options '-mfrecipe' '-mdiv32' '-mlam-bh' '-mlamcas' and '-mld-seq-sa'. gcc/ChangeLog: * doc/invoke.texi: Add descriptions for the compilation options. --- gcc/doc/invoke.texi | 45 +++-- 1 file changed,

[PATCH] LoongArch: gcc13: Implement option save/restore.

2024-03-16 Thread Lulu Cheng
LTO option streaming and target attributes both require per-function target configuration, which is achieved via option save/restore. We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target context in addition to other automatically maintained option states (via the "Save" option

[PATCH] LoongArch: gcc12: Implement option save/restore.

2024-03-16 Thread Lulu Cheng
LTO option streaming and target attributes both require per-function target configuration, which is achieved via option save/restore. We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target context in addition to other automatically maintained option states (via the "Save" option

[PATCH] LoongArch: testsuite: Add compilation options to the regname-fp-s9.c.

2024-03-06 Thread Lulu Cheng
When the value of the macro DEFAULT_CFLAGS is set to '-ansi -pedantic-errors', the test regname-fp-s9.c fails. To solve this problem, add the compilation option '-Wno-pedantic -std=gnu90' to this test case. gcc/testsuite/ChangeLog: * gcc.target/loongarch/regname-fp-s9.c: Add

[PATCH v1] LoongArch: Fixed an issue with the implementation of the template atomic_compare_and_swapsi.

2024-03-06 Thread Lulu Cheng
If the hardware does not support LAMCAS, atomic_compare_and_swapsi needs to be implemented through "ll.w+sc.w". In the implementation of the instruction sequence, it is necessary to determine whether the two registers are equal. Since LoongArch's comparison instructions do not distinguish between

[PATCH v1] LoongArch: When checking whether the assembler supports conditional branch relaxation, add compilation parameter "--fatal-warnings" to the assembler.

2024-02-20 Thread Lulu Cheng
In binutils 2.40 and earlier versions, only a warning will be reported when a relocation immediate value is out of bounds. As a result, the value of the macro HAVE_AS_COND_BRANCH_RELAXATION will also be defined as 1 when the assembler does not support conditional branch relaxation. Therefore, add

[PATCH v1 4/4] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure building a cross compiler if the cross assembler is not installed yet. gcc/ChangeLog: PR target/112299 * config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0 if not defined yet.
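The fallback described in the ChangeLog entry above is the usual "define to 0 if not defined yet" preprocessor idiom. A minimal sketch, assuming the configure-generated header left `HAVE_AS_TLS` undefined (as the message says can happen when the cross assembler is not installed); the helper `assembler_has_tls` is a hypothetical name used only here:

```c
#include <assert.h>

/* A configure-generated header would normally define HAVE_AS_TLS when the
   assembler check succeeds.  This sketch assumes it was left undefined.  */
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0
#endif

/* Hypothetical helper: with the fallback in place the macro can be used in
   ordinary C conditions instead of only in #ifdef tests.  */
static int assembler_has_tls(void)
{
  /* The fallback guarantees the macro expands to an integer constant.  */
  assert(HAVE_AS_TLS == 0 || HAVE_AS_TLS == 1);
  return HAVE_AS_TLS;
}
```

The point of the idiom is that code referring to the macro as an expression still compiles when the configure check never ran.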

[PATCH v1 3/4] LoongArch: Disable relaxation if the assembler don't support conditional branch relaxation [PR112330]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao As the commit message of r14-4674 has indicated, if the assembler does not support conditional branch relaxation, a relocation overflow may happen on conditional branches when relaxation is enabled because the number of NOP instructions inserted by the assembler will be more than

[PATCH v1 2/4] LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

2024-02-20 Thread Lulu Cheng
gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/genopts/loongarch.opt.in: Add compilation option mrelax. And set the initial value of explicit-relocs according to the detection status. * config/loongarch/gnu-user.h: When compiling with

[PATCH v1 1/4] LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

2024-02-20 Thread Lulu Cheng
There are two reasons for removing this macro definition: 1. The default in the assembler is to use the nop instruction for filling. 2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]] The third expression is the maximum number of bytes that should be skipped by this
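The effect of the third `.align` expression mentioned above can be modelled with a few lines of arithmetic. This is a rough sketch under stated assumptions, not assembler code: the alignment is taken as a power of two in bytes, a `max_skip` of 0 is treated as "no limit", and the function name `align_with_max_skip` is made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an alignment request with a maximum skip.
   Assumptions of this sketch: `alignment` is a power of two in bytes and
   `max_skip` == 0 means "no limit".  */
static uint64_t align_with_max_skip(uint64_t addr, uint64_t alignment,
                                    uint64_t max_skip)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  /* Bytes needed to reach the next multiple of `alignment` (0 if aligned).  */
  uint64_t padding = (alignment - (addr & (alignment - 1))) & (alignment - 1);

  if (max_skip != 0 && padding > max_skip)
    return addr;          /* too much padding needed: no alignment is done */
  return addr + padding;  /* padding would be filled, e.g. with nops */
}
```

For example, at address 0x1004 a 16-byte alignment needs 12 bytes of padding, so a maximum skip of 8 leaves the address unchanged, while at 0x100c only 4 bytes are needed and the address advances to 0x1010.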

[PATCH v1 0/4] Fix a series of problems caused by ASM_OUTPUT_ALIGN_WITH_NOP (release/gcc-12).

2024-02-20 Thread Lulu Cheng
nal branch relaxation. (cherry pick r14-5434) PR112299 is also fixed here. Lulu Cheng (2): LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP. LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default. Xi Ruoyao (2):

[PATCH v1 3/4] LoongArch: Disable relaxation if the assembler don't support conditional branch relaxation [PR112330]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao As the commit message of r14-4674 has indicated, if the assembler does not support conditional branch relaxation, a relocation overflow may happen on conditional branches when relaxation is enabled because the number of NOP instructions inserted by the assembler will be more than

[PATCH v1 2/4] LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

2024-02-20 Thread Lulu Cheng
gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/genopts/loongarch.opt.in: Add compilation option mrelax. And set the initial value of explicit-relocs according to the detection status. * config/loongarch/gnu-user.h: When compiling with

[PATCH v1 4/4] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure building a cross compiler if the cross assembler is not installed yet. gcc/ChangeLog: PR target/112299 * config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0 if not defined yet.

[PATCH v1 1/4] LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

2024-02-20 Thread Lulu Cheng
There are two reasons for removing this macro definition: 1. The default in the assembler is to use the nop instruction for filling. 2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]] The third expression is the maximum number of bytes that should be skipped by this

[PATCH v1 0/4] Fix a series of problems caused by

2024-02-20 Thread Lulu Cheng
nal branch relaxation. (cherry pick r14-5434) PR112299 is also fixed here. Lulu Cheng (2): LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP. LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default. Xi Ruoyao (2):

[PATCH v1 2/4] LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

2024-02-20 Thread Lulu Cheng
gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/genopts/loongarch.opt.in: Add compilation option mrelax. And set the initial value of explicit-relocs according to the detection status. * config/loongarch/gnu-user.h: When compiling with

[PATCH v1 3/4] LoongArch: Disable relaxation if the assembler don't support conditional branch relaxation [PR112330]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao As the commit message of r14-4674 has indicated, if the assembler does not support conditional branch relaxation, a relocation overflow may happen on conditional branches when relaxation is enabled because the number of NOP instructions inserted by the assembler will be more than

[PATCH v1 0/4] Fix a series of problems caused by

2024-02-20 Thread Lulu Cheng
nal branch relaxation. (cherry pick r14-5434) PR112299 is also fixed here. Lulu Cheng (2): LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP. LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default. Xi Ruoyao (2):

[PATCH v1 1/4] LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

2024-02-20 Thread Lulu Cheng
There are two reasons for removing this macro definition: 1. The default in the assembler is to use the nop instruction for filling. 2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]] The third expression is the maximum number of bytes that should be skipped by this

[PATCH v1 4/4] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]

2024-02-20 Thread Lulu Cheng
From: Xi Ruoyao Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure building a cross compiler if the cross assembler is not installed yet. gcc/ChangeLog: PR target/112299 * config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0 if not defined yet.

[PATCH 2/2] LoongArch: Remove redundant symbol type conversions in larchintrin.h.

2024-02-05 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/larchintrin.h (__movgr2fcsr): Remove redundant symbol type conversions. (__cacop_d): Likewise. (__cpucfg): Likewise. (__asrtle_d): Likewise. (__asrtgt_d): Likewise. (__lddir_d): Likewise.

[PATCH 1/2] LoongArch: Fix wrong return value type of __iocsrrd_h.

2024-02-05 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/larchintrin.h (__iocsrrd_h): Modify the function return value type to unsigned short. --- gcc/config/loongarch/larchintrin.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/loongarch/larchintrin.h
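Why the return type of a 16-bit read matters can be shown without any LoongArch hardware. The preview above does not say what the previous return type was, so the sketch below simply assumes, for illustration only, a type narrower than 16 bits; `read_h_as_u8` and `read_h_as_u16` are hypothetical stand-ins and do not call the real intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 16-bit register read; `raw` plays the role
   of the value the hardware would deliver.  */

/* Too-narrow return type (an assumption of this sketch): the upper byte
   of the halfword is lost.  */
static unsigned char read_h_as_u8(uint16_t raw)
{
  return (unsigned char) raw;
}

/* Return type matching the 16-bit access: the value survives intact.  */
static unsigned short read_h_as_u16(uint16_t raw)
{
  return raw;
}

/* Small self-check showing the difference on one sample value.  */
static void demo_halfword_read(void)
{
  assert(read_h_as_u8(0x1234) == 0x34);
  assert(read_h_as_u16(0x1234) == 0x1234);
}
```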

[PATCH v2] LoongArch: libsanitizer: Enable Lsan and Tsan for loongarch64.

2024-02-03 Thread Lulu Cheng
From: chenguoqi libsanitizer/ChangeLog: * configure.tgt: Enable tsan and lsan for loongarch64. * tsan/Makefile.am (EXTRA_libtsan_la_SOURCES): Add tsan_rtl_loongarch64.S. * tsan/Makefile.in: Regenerate. --- libsanitizer/configure.tgt| 5 +

[PATCH v2] LoongArch: Modify the address calculation logic for obtaining array element values through fp.

2024-01-29 Thread Lulu Cheng
Modify the address calculation logic from (((a x C) + fp) + offset) to ((fp + offset) + a x C), thereby changing the register dependencies and optimizing the code. The value of C is 2, 4 or 8. The following is the assembly code before and after a loop modification in spec2006 401.bzip:
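The two associations described above compute the same address; what changes is which partial sum depends on the index. A small sketch in plain C, with hypothetical function names and unsigned (wrapping) arithmetic standing in for the address computation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original association: the scaled index is combined with fp first, so
   every step of the sum depends on the index `a`.  */
static uintptr_t addr_before(uintptr_t fp, ptrdiff_t offset, size_t a, size_t c)
{
  return ((a * c) + fp) + (uintptr_t) offset;
}

/* Reassociated form: fp + offset does not depend on `a`, so it can be
   computed once and reused across iterations of a loop over `a`.  */
static uintptr_t addr_after(uintptr_t fp, ptrdiff_t offset, size_t a, size_t c)
{
  uintptr_t base = fp + (uintptr_t) offset;
  return base + a * c;
}

/* Self-check over the scale factors named in the message (2, 4 and 8).  */
static void demo_reassociation(void)
{
  static const size_t scales[] = { 2, 4, 8 };
  for (size_t i = 0; i < 3; i++)
    for (size_t a = 0; a < 16; a++)
      assert(addr_before(0x1000, -64, a, scales[i])
             == addr_after(0x1000, -64, a, scales[i]));
}
```

Whether the compiler actually hoists the invariant part depends on the surrounding code; the sketch only shows that the rewrite is value-preserving.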

[PATCH] LoongArch: Modify the address calculation logic for obtaining array element values through fp.

2024-01-29 Thread Lulu Cheng
Modify the address calculation logic from (((a x C) + fp) + offset) to ((fp + offset) + a x C), thereby changing the register dependencies and optimizing the code. The value of C is 2, 4 or 8. The following is the assembly code before and after a loop modification in spec2006 401.bzip:

[PATCH] LoongArch: libsanitizer: Enable build lsan and tsan for loongarch64.

2024-01-29 Thread Lulu Cheng
From: chenguoqi libsanitizer/ChangeLog: * configure.tgt: Enable tsan and lsan for loongarch64. * tsan/Makefile.am: Add tsan_rtl_loongarch64.S to EXTRA_libtsan_la_SOURCES. * tsan/Makefile.in: Regenerate. --- libsanitizer/configure.tgt| 5 +

[PATCH v5 1/5] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-29 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_load_tls): Load all types of tls symbols through one function. (loongarch_got_load_tls_gd): Delete. (loongarch_got_load_tls_ld): Delete. (loongarch_got_load_tls_ie): Delete.

[PATCH v5 5/5] LoongArch: Don't split the instructions containing relocs for extreme code model.

2024-01-29 Thread Lulu Cheng
From: Xi Ruoyao The ABI mandates the pcalau12i/addi.d/lu32i.d/lu52i.d instructions for addressing a symbol to be adjacent. So model them as "one large instruction", i.e. define_insn, with two output registers. The real address is the sum of these two registers. The advantage of this approach

[PATCH v5 3/5] LoongArch: Enable explicit reloc for extreme TLS GD/LD with -mexplicit-relocs=auto.

2024-01-29 Thread Lulu Cheng
Binutils does not support relaxation using four instructions to obtain symbol addresses. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): When the code model of the symbol is extreme and -mexplicit-relocs=auto, the macro instruction loading

[PATCH v5 2/5] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-29 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_symbol_extreme_p): Add function declaration. * config/loongarch/loongarch.cc (loongarch_symbolic_constant_p): For SYMBOL_PCREL64, non-zero addend of "la.local $rd,$rt,sym+addend" is not allowed

[PATCH v5 0/5] When cmodel=extreme, add macro implementation and fix problems with explicit relos implementation.

2024-01-29 Thread Lulu Cheng
reloc for extreme TLS GD/LD with -mexplicit-relocs=auto. v2 -> v3: 1. Modify the detection rules of a test case. v1 -> v2: 1. Use the temporarily allocated registers as intermediate registers to implement the extreme macro. 2. Fixed bugs in v1 test cases. Lulu Cheng (4): Loon

[PATCH v5 4/5] LoongArch: Added support for loading __get_tls_addr symbol address using call36.

2024-01-29 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_call_tls_get_addr): Add support for call36. gcc/testsuite/ChangeLog: * gcc.target/loongarch/explicit-relocs-medium-call36-auto-tls-ld-gd.c: New test. --- gcc/config/loongarch/loongarch.cc | 22

[PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-25 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_load_tls): Load all types of tls symbols through one function. (loongarch_got_load_tls_gd): Delete. (loongarch_got_load_tls_ld): Delete. (loongarch_got_load_tls_ie): Delete.

[PATCH v4 2/4] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-25 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_symbol_extreme_p): Add function declaration. * config/loongarch/loongarch.cc (loongarch_symbolic_constant_p): For SYMBOL_PCREL64, non-zero addend of "la.local $rd,$rt,sym+addend" is not allowed

[PATCH v4 4/4] LoongArch: Added support for loading __get_tls_addr symbol address using call36.

2024-01-25 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_call_tls_get_addr): Add support for call36. gcc/testsuite/ChangeLog: * gcc.target/loongarch/explicit-relocs-medium-call36-auto-tls-ld-gd.c: New test. --- gcc/config/loongarch/loongarch.cc | 20

[PATCH v4 3/4] LoongArch: Enable explicit reloc for extreme TLS GD/LD with -mexplicit-relocs=auto.

2024-01-25 Thread Lulu Cheng
Binutils does not support relaxation using four instructions to obtain symbol addresses. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): When the code model of the symbol is extreme and -mexplicit-relocs=auto, the macro instruction loading

[PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-25 Thread Lulu Cheng
rules of a test case. v1 -> v2: 1. Use the temporarily allocated registers as intermediate registers to implement the extreme macro. 2. Fixed bugs in v1 test cases. Lulu Cheng (4): LoongArch: Merge template got_load_tls_{ld/gd/le/ie}. LoongArch: Add the macro implementation of mcmode

[PATCH] LoongArch: Disable TLS type symbols from generating non-zero offsets.

2024-01-22 Thread Lulu Cheng
TLS gd, ld and ie type symbols will generate corresponding GOT entries, so non-zero offsets cannot be generated. The address of TLS le type symbol+addend is not implemented in binutils, so non-zero offset is not generated here for the time being. gcc/ChangeLog: *

[PATCH] LoongArch: Assign the '/u' attribute to the mem to which the global offset table belongs.

2024-01-12 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_split_symbol): Assign the '/u' attribute to the mem. gcc/testsuite/ChangeLog: * g++.target/loongarch/got-load.C: New test. --- gcc/config/loongarch/loongarch.cc | 5 +

[PATCH 3/3] LoongArch: Redundant sign extension elimination optimization 2.

2024-01-06 Thread Lulu Cheng
From: liwei Eliminate the redundant sign extension that exists after the conditional move when the target register is SImode. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Adjust. gcc/testsuite/ChangeLog: *

[PATCH 1/3] LoongArch: Optimized some of the symbolic expansion instructions generated during bitwise operations.

2024-01-06 Thread Lulu Cheng
There are two mode iterators defined in the loongarch.md: (define_mode_iterator GPR [SI (DI "TARGET_64BIT")]) and (define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")]) Replace the mode in the bit arithmetic from GPR to X. Since the bitwise operation instruction

[PATCH 2/3] LoongArch: Redundant sign extension elimination optimization.

2024-01-06 Thread Lulu Cheng
From: liwei We found that the current combine optimization pass in gcc cannot handle the following redundant sign extension situations: (insn 77 76 78 5 (set (reg:SI 143) (plus:SI (subreg/s/u:SI (reg/v:DI 104 [ len ]) 0) (const_int 1 [0x1]))) {addsi3} (expr_list:REG_DEAD

[PATCH v3 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-04 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_symbolic_constant_p): Remove the sym+addend form from the SYMBOL_PCREL64 type symbol. (loongarch_output_mi_thunk): Add code model extreme support. (loongarch_option_override_internal): Supports option

[PATCH v3 0/2] When cmodel=extreme, add macro support and only support macros.

2024-01-04 Thread Lulu Cheng
a test case. Lulu Cheng (2): LoongArch: Add the macro implementation of mcmodel=extreme. LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs. gcc/config/loongarch/loongarch.cc

[PATCH v3 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-04 Thread Lulu Cheng
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that the linker can infer the PC of pcalau12i to apply relocations to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if these four instructions are not in the same 4KiB page. See the link for details:

[PATCH v2 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-04 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_symbolic_constant_p): Remove the sym+addend form from the SYMBOL_PCREL64 type symbol. (loongarch_output_mi_thunk): Add code model extreme support. (loongarch_option_override_internal): Supports option

[PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-04 Thread Lulu Cheng
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that the linker can infer the PC of pcalau12i to apply relocations to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if these four instructions are not in the same 4KiB page. See the link for details:

[PATCH v2 0/2] When cmodel=extreme, add macro support and only support macros.

2024-01-04 Thread Lulu Cheng
cmodel=extreme. https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model v1 -> v2: 1. Use the temporarily allocated registers as intermediate registers to implement the extreme macro. 2. Fixed bugs in v1 test cases. Lulu Cheng (2): LoongArch: Add the ma

[PATCH] LoongArch: Fixed the problem of incorrect judgment of the immediate field of the [x]vld/[x]vst instruction.

2024-01-03 Thread Lulu Cheng
The [x]vld/[x]vst directive is defined as follows: [x]vld/[x]vst {x/v}d, rj, si12 When not modified, the immediate field of [x]vld/[x]vst is between 10 and 14 bits depending on the type. However, in loongarch_valid_offset_p, the immediate field is restricted first, so there is no error.

[PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2023-12-27 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_symbolic_constant_p): Remove the sym+addend form from the SYMBOL_PCREL64 type symbol. (loongarch_option_override_internal): Supports option combinations of -cmodel=extreme and -mexplicit-relocs=none.

[PATCH 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2023-12-27 Thread Lulu Cheng
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that the linker can infer the PC of pcalau12i to apply relocations to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if these four instructions are not in the same 4KiB page. See the link for details:

[PATCH 0/2] When cmodel=extreme, add macro support and only

2023-12-27 Thread Lulu Cheng
cmodel=extreme. https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model Lulu Cheng (2): LoongArch: Add the macro implementation of mcmodel=extreme. LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless

[PATCH] LoongArch: Added TLS Le Relax support.

2023-12-19 Thread Lulu Cheng
Check whether the assembler supports tls le relax. If it supports it, the assembly instruction sequence of tls le relax will be generated by default. The original way to obtain the tls le symbol address: lu12i.w $rd, %le_hi20(sym) ori $rd, $rd, %le_lo12(sym) add.{w/d} $rd, $rd, $tp

[PATCH v2 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-04 Thread Lulu Cheng
From: Xi Ruoyao We'll use HOST_WIDE_INT in LoongArch static properties in following patches. To keep the same readability as C99 designated initializers, create a std::array like data structure with position setter function, and add field setter functions for structs used in loongarch-def.cc.

[PATCH v2 2/2] LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code.

2023-12-04 Thread Lulu Cheng
The instructions defined in LoongArch Reference Manual v1.1 are not the instruction set v1.1 version. The CPU defined later may only support some instructions in LoongArch Reference Manual v1.1. Therefore, the macro ISA_BASE_LA64V110 and related definitions are removed here. gcc/ChangeLog:

[PATCH v2 0/2] Delete ISA_BASE_LA64V110 related definitions.

2023-12-04 Thread Lulu Cheng
e version number of the Loongson architecture. So delete the ISA_BASE_LA64V110 related definitions here. *** BLURB HERE *** Lulu Cheng (1): LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code. Xi Ruoyao (1): LoongArch: Switch loongarch-def from C to C++ to make it possible. .../l

[PATCH v1 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-02 Thread Lulu Cheng
From: Xi Ruoyao We'll use HOST_WIDE_INT in LoongArch static properties in following patches. Switch loongarch-def from C to C++ to make it possible. To keep the same readability as C99 designated initializers, create a std::array like data structure with position setter function, and add field

[PATCH v1 0/2] Delete ISA_BASE_LA64V110 related definitions.

2023-12-02 Thread Lulu Cheng
. It is recommended that the software determines the running process based on this information rather than the version number of the Loongson architecture. So delete the ISA_BASE_LA64V110 related definitions here. Lulu Cheng (1): LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code

[PATCH v1 2/2] LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code.

2023-12-02 Thread Lulu Cheng
The instructions defined in LoongArch Reference Manual v1.1 are not the instruction set v1.1 version. The CPU defined later may only support some instructions in LoongArch Reference Manual v1.1. Therefore, the macro ISA_BASE_LA64V110 and related definitions are removed here. gcc/ChangeLog:

[PATCH v2] LoongArch: Add intrinsic function descriptions for LSX and LASX instructions to doc.

2023-11-29 Thread Lulu Cheng
From: chenxiaolong gcc/ChangeLog: * doc/extend.texi: Add information about the intrinsic function of the vector instruction. Change-Id: I0117d6f5d68731f1596b6c3016fd82f3d5e2a98d --- gcc/doc/extend.texi | 1662 +++ 1 file changed, 1662

[PATCH] LoongArch: Modify MUSL_DYNAMIC_LINKER.

2023-11-17 Thread Lulu Cheng
Use no suffix at all in the musl dynamic linker name for hard float ABI. Use -sf and -sp suffixes in musl dynamic linker name for soft float and single precision ABIs. The following table outlines the musl interpreter names for the LoongArch64 ABI names. musl interpreter| LoongArch64

[PATCH v1 2/3] LoongArch: Implement atomic operations using LoongArch1.1 instructions.

2023-11-17 Thread Lulu Cheng
1. short and char type calls for __atomic_add_fetch and __atomic_fetch_add are implemented using amadd{_db}.{b/h}. 2. Use amcas{_db}.{b/h/w/d} to implement __atomic_compare_exchange_n and __atomic_compare_exchange. 3. The short and char types of the functions __atomic_exchange and
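
For reference, these are the kinds of source-level calls the items above concern, written with GCC's `__atomic` builtins. Which instructions the compiler actually selects depends on the target and options; the wrapper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Fetch-and-add on an 8-bit object; returns the value held before the add.
inline std::int8_t fetch_add_byte (std::int8_t *p, std::int8_t v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}

// Add-and-fetch on a 16-bit object; returns the value held after the add.
inline std::int16_t add_fetch_half (std::int16_t *p, std::int16_t v)
{
  return __atomic_add_fetch (p, v, __ATOMIC_SEQ_CST);
}

// Strong compare-and-swap on a 32-bit object; returns true on success.
inline bool cas_word (std::int32_t *p, std::int32_t expected,
                      std::int32_t desired)
{
  return __atomic_compare_exchange_n (p, &expected, desired,
                                      /* weak = */ false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```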

[PATCH v1 1/3] LoongArch: Add LA664 support.

2023-11-17 Thread Lulu Cheng
Define ISA_BASE_LA64V110, which represents the base instruction set defined in LoongArch1.1. Support the configure setting --with-arch=la664, and support -march=la664 and -mtune=la664. gcc/ChangeLog: * config.gcc: Support LA664. * config/loongarch/genopts/loongarch-strings:

[PATCH v1 0/3] Add LoongarchV1.1 instructions support.

2023-11-17 Thread Lulu Cheng
Lulu Cheng (3): LoongArch: Add LA664 support. LoongArch: Implement atomic operations using LoongArch1.1 instructions. LoongArch: atomic_load and atomic_store are implemented using dbar grading. gcc/config.gcc| 10 +- .../loongarch/genopts

[PATCH v1 3/3] LoongArch: atomic_load and atomic_store are implemented using dbar grading.

2023-11-17 Thread Lulu Cheng
The la464 memory model allows loads from the same address to execute out of order, so in the following test example the load on line 23 may be executed before the load on line 21, resulting in an error. So when memmodel is MEMMODEL_RELAXED, the load instruction will be followed by "dbar
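
The test example mentioned in the message is not part of this excerpt. As a rough illustration of the pattern at issue, and not a reconstruction of that test, the sketch below issues two relaxed loads from the same address. If another thread only ever increases the value, the second load must not observe an older value than the first, which is the property that same-address load reordering would break. The function name is hypothetical.

```cpp
#include <cassert>

// Two relaxed loads from the same location. With a writer that only
// increases *counter, coherence requires second >= first even though
// both loads use __ATOMIC_RELAXED.
inline bool loads_are_coherent (const int *counter)
{
  int first = __atomic_load_n (counter, __ATOMIC_RELAXED);
  int second = __atomic_load_n (counter, __ATOMIC_RELAXED);
  return second >= first;
}
```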

[PATCH v2] LoongArch: Add code generation support for call36 function calls.

2023-11-15 Thread Lulu Cheng
When compiling with '-mcmodel=medium', the function call is made through 'pcaddu18i+jirl' if binutils supports call36, otherwise the native implementation 'pcalau12i+jirl' is used. gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/loongarch-opts.h
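
To make the 'pcaddu18i+jirl' pairing concrete, the sketch below shows the arithmetic by which a PC-relative offset can be split across the two instructions. It assumes pcaddu18i adds a sign-extended 20-bit immediate shifted left by 18 and jirl adds a sign-extended 16-bit immediate shifted left by 2. The struct and function names are hypothetical, and this is an illustration rather than GCC's or binutils' code.

```cpp
#include <cassert>
#include <cstdint>

struct call36_parts
{
  std::int64_t hi20; // immediate for pcaddu18i (units of 2^18 bytes)
  std::int64_t lo16; // immediate for jirl (units of 4 bytes)
};

// Split a 4-byte-aligned PC-relative offset into the two immediates.
// Relies on arithmetic right shift of negative values, which GCC
// provides and C++20 guarantees.
inline call36_parts split_call36 (std::int64_t offset)
{
  assert (offset % 4 == 0);
  // Round to the nearest multiple of 2^18 so that the remainder fits
  // in a signed 18-bit field.
  std::int64_t hi20 = (offset + 0x20000) >> 18;
  std::int64_t lo18 = offset - hi20 * (std::int64_t{1} << 18);
  return { hi20, lo18 / 4 };
}

// Recombine the immediates the way the instruction pair would.
inline std::int64_t join_call36 (call36_parts p)
{
  return p.hi20 * (std::int64_t{1} << 18) + p.lo16 * 4;
}
```

The rounding step matters because the low part is signed: without adding 0x20000 first, offsets whose low 18 bits are 0x20000 or above would not fit in the jirl immediate.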

[PATCH v1] LoongArch: Added code generation support for call36 function calls.

2023-11-14 Thread Lulu Cheng
When compiling with '-mcmodel=medium', the function call is made through 'pcaddu18i+jirl' if binutils supports call36, otherwise the native implementation 'pcalau12i+jirl' is used. gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/loongarch-opts.h

[PATCH] LoongArch: Define macro CLEAR_INSN_CACHE.

2023-10-20 Thread Lulu Cheng
LoongArch's microarchitecture ensures cache consistency in hardware. Due to out-of-order execution, ibar is required to ensure the visibility of the store (invalidated icache) executed by this CPU before ibar (to the instance). ibar will not invalidate the icache, so the start and end parameters are

[PATCH v2] LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

2023-10-12 Thread Lulu Cheng
There are two reasons for removing this macro definition: 1. The default in the assembler is to use the nop instruction for filling. 2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]] The third expression is the maximum number of bytes that should be skipped by this

[PATCH] LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP.

2023-09-15 Thread Lulu Cheng
There are two reasons for removing this macro definition: 1. The default in the assembler is to use the nop instruction for filling. 2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]] The third expression is the maximum number of bytes that should be skipped by this

[PATCH v1] LoongArch: Add floating point conditional move support.

2023-09-14 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_expand_conditional_move): Modify the return value type of a function. * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Added floating point conditional transfer

[PATCH v1] LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.

2023-09-14 Thread Lulu Cheng
gcc/ChangeLog: * config.in: Regenerate. * config/loongarch/genopts/loongarch.opt.in: Add compilation option mrelax. And set the initial value of explicit-relocs according to the detection status. * config/loongarch/gnu-user.h: When compiling with

[PATCH] LoongArch: gcc: Modify gas uleb128 support test.

2023-09-14 Thread Lulu Cheng
From: mengqinggang Add an "ld conftest.o -o conftest" step so that the "objdump -dr" output is correct, because gas writes zero to the object file and generates an R_LARCH_ADD_ULEB128/R_LARCH_SUB_ULEB128 reloc pair to calculate the uleb128-format symbol subtraction after ld relaxation. gcc/ChangeLog:

[PATCH] LoongArch: Change the value of branch_cost from 2 to 6.

2023-09-12 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-def.c: Modify the default value of branch_cost. gcc/testsuite/ChangeLog: * gcc.target/loongarch/cmov_ii.c: New test. --- gcc/config/loongarch/loongarch-def.c | 4 ++-- gcc/testsuite/gcc.target/loongarch/cmov_ii.c | 16

[PATCH v2] LoongArch: Fix bug of 'di3_fake'.

2023-09-12 Thread Lulu Cheng
PR 111334 gcc/ChangeLog: * config/loongarch/loongarch.md: Fix bug of 'di3_fake'. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr111334.c: New test. --- v1 -> v2: Modify the template "*3", the SI type division operation is not supported under

[PATCH v1] LoongArch: Fix bug of 'di3_fake'.

2023-09-09 Thread Lulu Cheng
PR 111334 gcc/ChangeLog: * config/loongarch/loongarch.md: Fix bug of di3_fake. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr111334.c: New test. --- gcc/config/loongarch/loongarch.md | 14 +-- gcc/testsuite/gcc.target/loongarch/pr111334.c | 39

[PATCH v1] LoongArch: Optimized multiply instruction generation.

2023-09-05 Thread Lulu Cheng
1. Can generate mulh.w[u] instruction. 2. Can generate mulw.d.wu instruction. gcc/ChangeLog: * config/loongarch/loongarch.md (mulsidi3_64bit): (muldi3_highpart): Modify template name. (mulsi3_highpart): Likewise. (mulsidi3_64bit): Field unsigned
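
The source patterns that such instructions would typically serve are high-part and widening 32-bit multiplies. The functions below are a sketch of those patterns with hypothetical names. Whether the compiler actually emits mulh.w, mulh.wu or mulw.d.wu for them depends on the target and optimization level, and is not verified here.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of a signed 32x32 -> 64 product (a mulh.w-style pattern).
// Relies on arithmetic right shift of negative values, which GCC
// provides and C++20 guarantees.
inline std::int32_t mul_high_signed (std::int32_t a, std::int32_t b)
{
  return static_cast<std::int32_t> ((static_cast<std::int64_t> (a) * b) >> 32);
}

// High 32 bits of an unsigned 32x32 -> 64 product (a mulh.wu-style pattern).
inline std::uint32_t mul_high_unsigned (std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint32_t> ((static_cast<std::uint64_t> (a) * b) >> 32);
}

// Full unsigned widening multiply (a mulw.d.wu-style pattern).
inline std::uint64_t mul_wide_unsigned (std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint64_t> (a) * b;
}
```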

[PATCH 1/2] LoongArch: Optimize switch with sign-extended index.

2023-09-02 Thread Lulu Cheng
This patch follows the RISC-V commit 7bbce9b50302959286381d9177818642bceaf301. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_extend_comparands): In unsigned QImode test, check for sign extended subreg and/or constant operands, and do a sign extend in
