[PATCH] LoongArch: Guard REGNO with REG_P in loongarch_expand_conditional_move [PR115169]

2024-05-22 Thread Xi Ruoyao
gcc/ChangeLog: PR target/115169 * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Guard REGNO with REG_P. --- Bootstrapped with --enable-checking=all. Ok for trunk and 14? gcc/config/loongarch/loongarch.cc | 17 - 1 file changed, 12

Re: [COMMITTED] Revert "Revert: "Enable prange support.""

2024-05-16 Thread Xi Ruoyao
t "[PATCH v2 1/3] RISC-V: movmem for RISCV with V extension" > >     This reverts commit df15eb15b5f820321c81efc75f0af13ff8c0dd5b. Revert is OK, but revert revert is not. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651144.html -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Ping: [PATCH 0/2] Fix two test failures with --enable-default-pie [PR70150]

2024-05-15 Thread Xi Ruoyao
Ping. On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote: > In GCC 14.1-rc1, there are two new (comparing to GCC 13) failures if > the build is configured --enable-default-pie.  Let's fix them. > > Tested on x86_64-linux-gnu.  Ok for trunk and releases/gcc-14? > > Xi Ru

Pushed: [PATCH] driver: Move -fdiagnostics-urls= early like -fdiagnostics-color= [PR114980]

2024-05-09 Thread Xi Ruoyao
On Thu, 2024-05-09 at 20:21 +, Joseph Myers wrote: > On Wed, 8 May 2024, Xi Ruoyao wrote: > > > In GCC 14 we started to emit URLs for "command-line option is > > valid for but not " and "-Werror= argument > > '-Werror=' is not valid for " warnings

[PATCH] driver: Move -fdiagnostics-urls= early like -fdiagnostics-color= [PR114980]

2024-05-07 Thread Xi Ruoyao
In GCC 14 we started to emit URLs for "command-line option is valid for but not " and "-Werror= argument '-Werror=' is not valid for " warnings. So we should have moved -fdiagnostics-urls= early like -fdiagnostics-color=, or -fdiagnostics-urls= wouldn't be able to control URLs in these

Re: [pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-05-07 Thread Xi Ruoyao
On Tue, 2024-05-07 at 18:01 +0800, Lulu Cheng wrote: > > 在 2024/5/7 下午5:42, Xi Ruoyao 写道: > > On Tue, 2024-05-07 at 17:07 +0800, Xi Ruoyao wrote: > > > Hmm, after this change the default (-march=la64v1.0) is enabling LSX: > > > > > > $ e

Re: [pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-05-07 Thread Xi Ruoyao
On Tue, 2024-05-07 at 17:07 +0800, Xi Ruoyao wrote: > Hmm, after this change the default (-march=la64v1.0) is enabling LSX: > > $ echo "int dummy;" | cc -c -v |& tail -n1 > COLLECT_GCC_OPTIONS='-c' '-v' '-mabi=lp64d' '-march=la64v1.0' '- > mfpu=64' > '-msimd=lsx' '

Re: [pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-05-07 Thread Xi Ruoyao
LOONGARCH64: > > +    case TUNE_LA464: > > +    case TUNE_LA664: > >     /* Vector part.  */ > >     if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P > > (mode)) > >    { > > @@ -10980,9

[PATCH 0/2] Fix two test failures with --enable-default-pie [PR70150]

2024-05-05 Thread Xi Ruoyao
In GCC 14.1-rc1, there are two new (comparing to GCC 13) failures if the build is configured --enable-default-pie. Let's fix them. Tested on x86_64-linux-gnu. Ok for trunk and releases/gcc-14? Xi Ruoyao (2): i386: testsuite: Add -no-pie for pr113689-1.c [PR70150] i386: testsuite: Adapt

[PATCH 2/2] i386: testsuite: Adapt fentryname3.c for r14-811 change [PR70150]

2024-05-05 Thread Xi Ruoyao
After r14-811 "call *nop@GOTPCREL(%rip)" is only generated with -mno-direct-extern-access even if --enable-default-pie. So the r13-1614 change to this file is not valid anymore. gcc/testsuite/ChangeLog: PR testsuite/70150 * gcc.target/i386/fentryname3.c (dg-final): Revert

[PATCH 1/2] i386: testsuite: Add -no-pie for pr113689-1.c [PR70150]

2024-05-05 Thread Xi Ruoyao
For a --enable-default-pie build, using -fno-pic (for compiler) but not -no-pie (for linker) triggers some linker warnings counted as excess errors: /usr/bin/ld: /tmp/cc8MgxiR.o: warning: relocation in read-only section `.text.startup' /usr/bin/ld: warning: creating DT_TEXTREL in a

Pushed: [PATCH] LoongArch: Add constraints for bit string operation define_insn_and_split's [PR114861]

2024-04-26 Thread Xi Ruoyao
On Sat, 2024-04-27 at 11:04 +0800, Lulu Cheng wrote: > LGTM! > > Thanks. Pushed r15-11 and r14-10142. > 在 2024/4/26 下午9:52, Xi Ruoyao 写道: > > Without the constrants, the compiler attempts to use a stack slot as the > > target, causing an ICE building the kernel with -O

[PATCH] LoongArch: Add constraints for bit string operation define_insn_and_split's [PR114861]

2024-04-26 Thread Xi Ruoyao
Without the constrants, the compiler attempts to use a stack slot as the target, causing an ICE building the kernel with -Os: drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:3144:1: error: could not split insn (insn:TI 1764 67 1745 (set (mem/c:DI (reg/f:DI 3 $r3) [707 %sfp+-80 S8 A64])

Re: [PATCH v2 1/2] LoongArch: Define ISA versions

2024-04-22 Thread Xi Ruoyao
does not support HPTW, which **is** a v1.1 feature. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-20 Thread Xi Ruoyao
uot;+Generate instructions for the machine type > @var{arch-type}.", > >  so is there no need to write it like this here? Then maybe just say "all LoongArch v1.1 instructions" instead of "features" here as well? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Xi Ruoyao
A version 1.1. IMO it's better to use a wording like LA664, i.e. "a CPU implementing all LoongArch v1.1 unprivileged features" (emphasising "all", as the v1.1 manual allows to only implement a subset of v1.1 features). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Xi Ruoyao
ventions, which > the LoongArch GCC port aims to conform to. The links seems broken. Do you mean la-softdev-convention? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Xi Ruoyao
I do LTO, so I don't really rely on this patch though. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao
this line. > (loongarch_expand_builtin): Turn assertion of builtin > availability >     into a test. and this line. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] ICF: Make ICF and SRA agree on padding

2024-04-07 Thread Xi Ruoyao
.length (); > +  if (l != p2.m_padding.length ()) > +    return false; > +  for (unsigned i = 0; i < l; i++) > +    if (p1.m_padding[i].first != p2.m_padding[i].first > + || p1.m_padding[i].second != p2.m_padding[i].second) > +  return false; > + > +  return true; > +} &

Re: [PATCH] ICF: Make ICF and SRA agree on padding

2024-04-07 Thread Xi Ruoyao
farm.  If that goes well, I intend to commit the patch and then start > working on backports. I've tried these two patches out on my own 24-core AArch64 machine. Bootstrapped (but no LTO or PGO) and regtested fine. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao
On Sun, 2024-04-07 at 16:23 +0800, Yang Yujie wrote: > On Sun, Apr 07, 2024 at 04:23:53PM +0800, Xi Ruoyao wrote: > > On Sun, 2024-04-07 at 15:47 +0800, Yang Yujie wrote: > > > This patch fixes the back-end context switching in cases where functions > > > should be

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao
R target/113233 Oops, so this PR isn't fixed with r14-7134 "LoongArch: Implement option save/restore"? Should I reopen it? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Set default alignment for functions jumps and loops [PR112919].

2024-04-06 Thread Xi Ruoyao
invoke.texi for > the ^ ^ Better have two commas here. Otherwise it should be OK. > +   format.  */ -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v5] LoongArch: Add support for TLS descriptors

2024-04-01 Thread Xi Ruoyao
cc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test. > * gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test. > * gcc.target/loongarch/explicit-relocs-tls-desc.c: New test. > > Co-authored-by: Lulu Cheng > Co-authored-by: Xi Ruoyao -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Increase division costs

2024-03-31 Thread Xi Ruoyao
On Mon, 2024-04-01 at 10:22 +0800, chenglulu wrote: > > 在 2024/4/1 上午9:29, Xi Ruoyao 写道: > > On Fri, 2024-03-29 at 09:23 +0800, chenglulu wrote: > > > > > I tested spec2006. In the floating-point program, the test items with > > > large > > >

Re: [PATCH] LoongArch: Increase division costs

2024-03-31 Thread Xi Ruoyao
some workloads like competitive programming. However "adapting with different modulos" is not possible w/o refactoring generic code so it must be deferred to at least GCC 15. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Ping: [PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-28 Thread Xi Ruoyao
Ping. On Wed, 2024-03-20 at 15:10 +0800, Xi Ruoyao wrote: > We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named > arguments and there is nothing to advance, but that is not the case > for (...) functions returning by hidden reference which have one such > artific

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao
On Wed, 2024-03-27 at 18:39 +0800, Xi Ruoyao wrote: > On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote: > > > > 在 2024/3/26 下午5:48, Xi Ruoyao 写道: > > > The latency of LA464 and LA664 division instructions depends on the > > > input.  When I updated the costs i

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao
On Wed, 2024-03-27 at 08:54 +0100, Richard Biener wrote: > On Tue, Mar 26, 2024 at 10:52 AM Xi Ruoyao wrote: > > > > The latency of LA464 and LA664 division instructions depends on the > > input.  When I updated the costs in r14-6642, I unintentionally set the > > divi

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao
On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote: > > 在 2024/3/26 下午5:48, Xi Ruoyao 写道: > > The latency of LA464 and LA664 division instructions depends on the > > input.  When I updated the costs in r14-6642, I unintentionally set the > > division costs to the best-case

Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-26 Thread Xi Ruoyao
On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote: /* snip */ > With -ffinite-math-only -fno-signed-zeros, it does work with >     x >= y ? x : y > while without `-ffinite-math-only -fno-signed-zeros`, it cannot. > @Xi Ruoyao Is it expected by IEEE? When y is (quiet) NaN and x

[PATCH] LoongArch: Increase division costs

2024-03-26 Thread Xi Ruoyao
The latency of LA464 and LA664 division instructions depends on the input. When I updated the costs in r14-6642, I unintentionally set the division costs to the best-case latency (when the first operand is 0). Per a recent discussion [1] we should use "something sensible" instead of it. Use the

TARGET_RTX_COSTS and pipeline latency vs. variable-latency instructions (was Re: [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.)

2024-03-25 Thread Xi Ruoyao
on may spend different number of cycles for different inputs: on LoongArch LA664 I've observed 5 cycles for some inputs and 39 cycles for other inputs. So should we use the minimal value, the maximum value, or something in- between for TARGET_RTX_COSTS and pipeline descriptions? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-21 Thread Xi Ruoyao
-if "code quality test" { *-*-* } { "-O0" } { "" } } */ -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Pushed: [PATCH] LoongArch: Fix a typo [PR 114407]

2024-03-20 Thread Xi Ruoyao
gcc/ChangeLog: PR target/114407 * config/loongarch/loongarch-opts.cc (loongarch_config_target): Fix typo in diagnostic message, enabing -> enabling. --- Pushed r14-9582 as obvious. gcc/config/loongarch/loongarch-opts.cc | 2 +- 1 file changed, 1 insertion(+), 1

[PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-20 Thread Xi Ruoyao
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there is nothing to advance, but that is not the case for (...) functions returning by hidden reference which have one such artificial argument. This is causing gcc.dg/c23-stdarg-{6,8,9}.c to fail. Fix the issue by

Pushed: [PATCH v2] LoongArch: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-19 Thread Xi Ruoyao
On Tue, 2024-03-19 at 11:19 +0800, chenglulu wrote: > > 在 2024/3/18 下午5:34, Xi Ruoyao 写道: > > We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named > > arguments and there is nothing to advance, but that is not the case > > for (...) functions returning by hid

[PATCH] LoongArch: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-18 Thread Xi Ruoyao
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there is nothing to advance, but that is not the case for (...) functions returning by hidden reference which have one such artificial argument. This is causing gcc.dg/c23-stdarg-6.c and gcc.dg/c23-stdarg-8.c to fail.

[PATCH] LoongArch: Remove unused and incorrect "sge_" define_insn

2024-03-13 Thread Xi Ruoyao
If this insn is really used, we'll have something like slti $r4,$r0,$r5 in the code. The assembler will reject it because slti wants 2 register operands and 1 immediate operand. But we've not got any bug report for this, indicating this define_insn is unused at all. Note that

Re: [PATCH v1] LoongArch: Remove masking process for operand 3 of xvpermi.q.

2024-03-13 Thread Xi Ruoyao
uite/ChangeLog: > > * gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c: >   Reposition operand 3's value into instruction's defined accept range. ^^ Remove these two white spaces. Should be OK with these ChangeLog style issues fixed. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] testsuite: Fix vfprintf-chk-1.c with -fhardened

2024-03-13 Thread Xi Ruoyao
which will taint the test suite with -fhardened. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-13 Thread Xi Ruoyao
On Wed, 2024-03-13 at 10:24 +0800, Xi Ruoyao wrote: >    return TARGET_EXPLICIT_RELOCS > -    ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\ > -  \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\ > -  \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\ > -  \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-13 Thread Xi Ruoyao
On Wed, 2024-03-13 at 11:06 +0800, mengqinggang wrote: > > 在 2024/3/13 上午6:15, Xi Ruoyao 写道: > > On Tue, 2024-03-12 at 17:20 +0800, mengqinggang wrote: > > > +(define_insn "@got_load_tls_desc" > > > +  [(set (match_operand:P 0 &q

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao
On Wed, 2024-03-13 at 06:56 +0800, Xi Ruoyao wrote: > On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote: > > > +(define_insn "@got_load_tls_desc" > > > +  [(set (match_operand:P 0 "register_operand" "=r") > > Hmm, and it looks like we shou

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao
On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote: > > +(define_insn "@got_load_tls_desc" > > +  [(set (match_operand:P 0 "register_operand" "=r") Hmm, and it looks like we should use (reg:P 4) instead of match_operand here, because the instruct

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao
+    ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\ > +  \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\ > +  \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\ > +  \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n\ > +  \tadd.d\t$r4,$r4,%2\n\ > +  \tld.d\t$r1,$r4,%%desc_ld(%1)\n\ > +  \tjirl\t$

Re: [PATCH v1] LoongArch: Fixed an issue with the implementation of the template atomic_compare_and_swapsi.

2024-03-07 Thread Xi Ruoyao
On Thu, 2024-03-07 at 21:07 +0800, chenglulu wrote: > > 在 2024/3/7 下午8:52, Xi Ruoyao 写道: > > It should be better to extend the expected value before the ll/sc loop > > (like what LLVM does), instead of repeating the extending in each > > iteration.  Something l

Re: [PATCH v1] LoongArch: Fixed an issue with the implementation of the template atomic_compare_and_swapsi.

2024-03-07 Thread Xi Ruoyao
[4], + operands[6])); +} rtx compare = operands[1]; if (operands[3] != const0_rtx) It produces: slli.w $r4,$r4,0 1: ll.w$r14,$r3,0 bne $r14,$r4,2f or $r15,$zero,$r12 sc.w$r15,$r3,0 beqz$r15,1b b 3f 2: dbar0b10100 3: for the test case and the compiled test case runs successfully. I've not done a full bootstrap yet though. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-03-06 Thread Xi Ruoyao
ns "-O2 -mcmodel=normal -mexplicit-relocs -mno-relax" } */ > > +/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } } > > } */ i.e. -mno-relax is used compiling this test case, and the compiled assembly code should not contain R_LARCH_RELAX. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] LoongArch: testsuite: Rewrite {x, }vfcmp-{d, f}.c to avoid named registers

2024-03-05 Thread Xi Ruoyao
Loops on named vector register are not vectorized (see comment 11 of PR113622), so the these test cases have been failing for a while. Rewrite them using check-function-bodies to remove hard coding register names. A barrier is needed to always load the first operand before the second operand.

[PATCH v2] LoongArch: Allow s9 as a register alias

2024-03-05 Thread Xi Ruoyao
The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. --- v1 -> v2: Add a test case. Ok for trunk? gcc/config/loongarch/loongarch.h | 1 +

[PATCH v3] testsuite: Add a test case for negating FP vectors containing zeros

2024-03-05 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch

Re: [PATCH v2] LoongArch: Fix inconsistent description in *sge_

2024-03-05 Thread Xi Ruoyao
int 1)))] >    "" > -  "slti\t%0,%.,%1" > +  "slt\t%0,%.,%1" >    [(set_attr "type" "slt") >     (set_attr "mode" "")]) Hmm, this define_insn seems never really used or it would generate something like "sltu

Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Xi Ruoyao
So allowing const_imm12_operand here makes no benefit. >    "" > -  "slti\t%0,%.,%1" > +  "slt%i1\t%0,%.,%1" >    [(set_attr "type" "slt") >     (set_attr "mode" "")]) >   -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-29 Thread Xi Ruoyao
On Thu, 2024-02-29 at 15:09 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementation which > caused wrong sign bits in zeros in targets (r14-8786 and r14-8801).  To > prevent a similar issue from happening again, add a test case. > > Tested on x86_64

[PATCH] LoongArch: Allow s9 as a register alias

2024-02-28 Thread Xi Ruoyao
The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. --- Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? gcc/config/loongarch/loongarch.h | 1 + 1 file changed, 1

[PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-02-28 Thread Xi Ruoyao
In Binutils we need to make IE to LE relaxation only allowed when there is an R_LARCH_RELAX after R_LARCH_TLE_IE_PC_{HI20,LO12} so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX

[PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-28 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch

[PATCH v2] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-02-28 Thread Xi Ruoyao
The vect_int_mod target selector is evaluated with the options in DEFAULT_VECTCFLAGS in effect, but these options are not automatically passed to tests out of the vect directories. So this test fails on targets where integer vector modulo operation is supported but requiring an option to enable,

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote: > > +  "TARGET_TLS_DESC" > > +  "la.tls.desc\t%0,%1" > > With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead > of la.tls.desc.  As we don't want to add too many code we can just ha

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
ELOCS_ALWAS ? ".." : "la.tls.desc\t%0,%1"; } > +  [(set_attr "got" "load") > +   (set_attr "mode" "")]) We need (set_attr "length" "16") in this list as this actually expands into 16 bytes. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH 2/2] LoongArch: Remove unneeded sign extension after crc/crcc instructions

2024-02-25 Thread Xi Ruoyao
The specification of crc/crcc instructions is clear that the output is sign-extended to GRLEN. Add a define_insn to tell the compiler this fact and allow it to remove the unneeded sign extension on crc/crcc output. As crc/crcc instructions are usually used in a tight loop, this should produce a

[PATCH 1/2] LoongArch: NFC: Deduplicate crc instruction defines

2024-02-25 Thread Xi Ruoyao
Introduce an iterator for UNSPEC_CRC and UNSPEC_CRCC to make the next change easier. gcc/ChangeLog: * config/loongarch/loongarch.md (CRC): New define_int_iterator. (crc): New define_int_attr. (loongarch_crc_w__w, loongarch_crcc_w__w): Unify into ...

Pushed: [GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-23 Thread Xi Ruoyao
On Thu, 2024-02-22 at 19:09 +0800, chenglulu wrote: > > 在 2024/2/22 下午6:20, Xi Ruoyao 写道: > > To improve Binutils compatibility we've had to backported relaxation > > support.  But if a user just updates to GCC 13.3 and sticks with > > Binutils 2.41, there is no reason to

Pushed: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-23 Thread Xi Ruoyao
On Fri, 2024-02-23 at 11:37 +0800, chenglulu wrote: > > 在 2024/2/23 上午11:27, Xi Ruoyao 写道: > > On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > > 在 2024/2/22 下午5:17, Xi Ruoyao 写道: > > > > The gold linker has never been ported to LoongArch (and it se

Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > 在 2024/2/22 下午5:17, Xi Ruoyao 写道: > > The gold linker has never been ported to LoongArch (and it seems > > unlikely to be ported in the future as the new architectures are > > focusing on lld and/or mold for fast link

[GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-22 Thread Xi Ruoyao
To improve Binutils compatibility we've had to backported relaxation support. But if a user just updates to GCC 13.3 and sticks with Binutils 2.41, there is no reason to use -mno-explicit-relocs as the default because we are turning off relaxation for Binutils 2.41 (it lacks conditional branch

[PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
The gold linker has never been ported to LoongArch (and it seems unlikely to be ported in the future as the new architectures are focusing on lld and/or mold for fast linkers). ChangeLog: * configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target list. * configure:

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote: > > 在 2024/2/20 下午7:31, Xi Ruoyao 写道: > > On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > > > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > > > > > So I think that witho

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > So I think that without worrying about performance and ensuring that > > there is no problem > > > > with binutils, I think we can ma

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
test failures due to "excessive errors" if running the GCC test suite with these earlier GAS versions. Maybe we'll have to add some autoconf-based probing for the linker anyway? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-09 Thread Xi Ruoyao
On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote: > > 在 2024/2/7 上午12:23, Xi Ruoyao 写道: > > Hi Lulu, > > > > I'm proposing to backport r14-4674 "LoongArch: Delete macro definition > > ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13

Re: [PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao
On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementation which > caused wrong sign bits in zeros in targets (r14-8786 and r14-8801).  To > prevent a similar issue from happening again, add a test case. > > Tested on x86_64

LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-06 Thread Xi Ruoyao
eases/gcc-12 and releases/gcc-13 then? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch

Pushed: [PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-05 Thread Xi Ruoyao
On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote: > Xi Ruoyao 于2024年2月5日周一 02:01写道: > > > > We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is > > wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to > > fail when Pytho

[PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-04 Thread Xi Ruoyao
We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with MSA enabled. Use the bnegi.df instructions to simply reverse the sign bit instead. gcc/ChangeLog: *

Pushed: [PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-04 Thread Xi Ruoyao
On Sun, 2024-02-04 at 11:19 +0800, chenglulu wrote: > > 在 2024/2/2 下午5:55, Xi Ruoyao 写道: > > We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. > > But in loongarch_symbol_insns: > > > > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE

Pushed: [PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-04 Thread Xi Ruoyao
On Sun, 2024-02-04 at 11:20 +0800, chenglulu wrote: > > 在 2024/2/3 下午4:58, Xi Ruoyao 写道: > > We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is > > wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to > > fail when Python is built with L

Pushed: [PATCH] LoongArch: Fix an ODR violation

2024-02-03 Thread Xi Ruoyao
On Fri, 2024-02-02 at 10:42 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed r14-8773. > 在 2024/2/2 上午5:54, Xi Ruoyao 写道: > > When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR > > violation is detected: > > > > ../../gcc/config/loo

[PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-03 Thread Xi Ruoyao
We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with LSX enabled. Use the vbitrevi.{d/w} instructions to simply reverse the sign bit instead. We are already doing this for LASX and

[PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-02 Thread Xi Ruoyao
We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. But in loongarch_symbol_insns: if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode)) return 0; And LSX_SUPPORTED_MODE_P is defined as: #define LSX_SUPPORTED_MODE_P(MODE) \ (ISA_HAS_LSX \

[PATCH] LoongArch: Fix an ODR violation

2024-02-01 Thread Xi Ruoyao
When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR violation is detected: ../../gcc/config/loongarch/loongarch-opts.cc:57: warning: 'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr] 57 | abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES];

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao
On Thu, 2024-02-01 at 14:55 +0100, Jakub Jelinek wrote: > On Thu, Feb 01, 2024 at 01:42:03PM +, Jonathan Yong wrote: > > On 2/1/24 13:06, Xi Ruoyao wrote: > > > On Thu, 2024-02-01 at 14:01 +0100, Jakub Jelinek wrote: > > > > On Thu, Feb 01, 2024 at 12:45:3

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao
quot;)\n", Should use HOST_WIDE_INT_PRINT_UNSIGNED instead of PRIu64. >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) >    * allocated_words_num), >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) >

Re: [PATCH] LoongArch: Fix soft-float builds of libffi

2024-01-31 Thread Xi Ruoyao
at. You need to wait until the PR is accepted by the libffi maintainers. Frankly I don't know what libffi maintainers are busy on and I'm frustrated as well (having a MIPS patch unreviewed there for a month) but this is the procedure :(. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-27 Thread Xi Ruoyao
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote: > On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote: > > > > 在 2024/1/26 下午6:57, Xi Ruoyao 写道: > > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > > > 在 2024/1/26 下午4:49, Xi Ruoyao 写道: > > >

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-27 Thread Xi Ruoyao
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote: > > 在 2024/1/26 下午6:57, Xi Ruoyao 写道: > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > > 在 2024/1/26 下午4:49, Xi Ruoyao 写道: > > > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > > >

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > 在 2024/1/26 下午4:49, Xi Ruoyao 写道: > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > > > v3 -> v4: > > >    1. Add macro support for TLS symbols > > >    2. Added support for loading __get_t

Re: [PATCH v4 2/4] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-26 Thread Xi Ruoyao
\t%0,%2,%1"; > +    case SYMBOL_TLSLDM: > +  return "la.tls.ld\t%0,%2,%1"; > + > +    default: > +  gcc_unreachable (); > +  } > +} > + "&& REG_P (operands[1]) && find_reg_note (insn, REG_UNUSED, operands[2]) != > 0" > + [(set (match_dup 0) (match_dup 1))] > + "" > + [(set_attr "mode" "DI") > +  (set_attr "length" "5")]) Should be 20, in bytes. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-26 Thread Xi Ruoyao
uot;la.tls.le\t%0,%1"; > +    case SYMBOL_TLS_IE: > +  return "la.tls.ie\t%0,%1"; > +    case SYMBOL_TLSLDM: > +  return "la.tls.ld\t%0,%1"; > +    case SYMBOL_TLSGD: > +  return "la.tls.gd\t%0,%1"; /* snip */ > +    default: > +  g

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
extreme TLS GD/LD with -mexplicit-relocs=auto. I've rebased and attached the patch to fix the bad split in -mexplicit- relocs={always,auto} -mcmodel=extreme on top of this series. I've not tested it seriously though (only tested the added and modified test cases). -- Xi Ruoyao School of Aerospace Scie

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-24 Thread Xi Ruoyao
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote: > > 在 2024/1/24 上午3:36, Xi Ruoyao 写道: > > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote: > > > > > The failure of this test case was because the compiler believes that > > > > > two > > &g

Re: [PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-24 Thread Xi Ruoyao
On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote: > At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote: > > On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote: > > > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote: > > > > The vect_int_mo

Re: [PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-24 Thread Xi Ruoyao
On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote: > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote: > > The vect_int_mod target selector is evaluated with the options in > > DEFAULT_VECTCFLAGS in effect, but these options are not automatically > > passed to

Re: [PATCH] LoongArch: Fix incorrect return type for frecipe/frsqrte intrinsic functions

2024-01-24 Thread Xi Ruoyao
n __inline float >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) >  __frecipe_s (float _1) >  { > -  __builtin_loongarch_frecipe_s ((float) _1); > +  return (float) __builtin_loongarch_frecipe_s ((float) _1); I don't think the (float) conversion is needed. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-23 Thread Xi Ruoyao
y papers over the same issue caused spec2006 failure. I tried a bootstrap with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS commented out, and there is no more spurious "note: non-delegitimized UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things. I feel that this hook is still written in a buggy way, so maybe removing it will solve the spec2017 issue. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-23 Thread Xi Ruoyao
The vect_int_mod target selector is evaluated with the options in DEFAULT_VECTCFLAGS in effect, but these options are not automatically passed to tests out of the vect directories. So this test fails on targets where integer vector modulo operation is supported but requiring an option to enable,

[PATCH] LoongArch: testsuite: Disable stack protector for got-load.C

2024-01-23 Thread Xi Ruoyao
When building GCC with --enable-default-ssp, the stack protector is enabled for got-load.C, causing additional GOT loads for __stack_chk_guard. So mem/u will be matched more than 2 times and the test will fail. Disable stack protector to fix this issue. gcc/testsuite: *

  1   2   3   4   5   6   7   8   9   10   >