On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote:
>
> On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote:
> > Per a gcc-help thread we are generating sub-optimal code for
> > __builtin_bswap{32,64}. To fix it:
> >
> > - Use a single revb.d instruction for bswapdi2.
> >
In r15-1207 I was too stupid to realize we just need to relax
ins_zero_bitmask_operand to allow using bstrins for aligning, instead of
adding a new split. And, "> 12" in ins_zero_bitmask_operand also makes
no sense: it rejects bstrins for things like "x & ~4l" with no good
reason.
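To make the "x & ~4l" case concrete, here is a small C model of what a bstrins.d bit-field insert computes. It is only a sketch: bstrins_d and clear_bit2 are hypothetical helper names introduced for illustration, not anything in GCC, and the model says nothing about what ins_zero_bitmask_operand actually accepts.

```c
#include <stdint.h>

/* Hypothetical model of "bstrins.d rd, rj, msb, lsb": bits [msb:lsb] of rd
   are replaced with the low (msb - lsb + 1) bits of rj.  */
static uint64_t bstrins_d(uint64_t rd, uint64_t rj, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    /* Avoid an undefined shift by 64 when the field covers the register.  */
    uint64_t field = width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
    return (rd & ~(field << lsb)) | ((rj & field) << lsb);
}

/* "x & ~4l" clears only bit 2, which is the same as inserting a zero
   into the one-bit field [2:2].  */
static uint64_t clear_bit2(uint64_t x)
{
    return bstrins_d(x, 0, 2, 2);
}
```

For any x, clear_bit2 (x) equals x & ~4, so a single insert from the zero register would cover this mask.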
So fix my
Per a gcc-help thread we are generating sub-optimal code for
__builtin_bswap{32,64}. To fix it:
- Use a single revb.d instruction for bswapdi2.
- Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
revb.2h + rotri.w for !TARGET_64BIT.
- Use a single revb.2h instruction for bswapsi2
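For reference, the byte reversal that these builtins compute can be written portably in C. This is an illustrative sketch only: bswap32_portable and bswap64_portable are made-up helper names, and the snippet does not show which instructions a given compiler emits for them.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit value.  */
static uint32_t bswap32_portable(uint32_t x)
{
    return ((x & UINT32_C(0x000000ff)) << 24)
         | ((x & UINT32_C(0x0000ff00)) << 8)
         | ((x & UINT32_C(0x00ff0000)) >> 8)
         | ((x & UINT32_C(0xff000000)) >> 24);
}

/* Hypothetical helper: reverse the eight bytes of a 64-bit value by
   swapping each half and exchanging the halves.  */
static uint64_t bswap64_portable(uint64_t x)
{
    uint64_t lo = bswap32_portable((uint32_t) x);
    uint64_t hi = bswap32_portable((uint32_t) (x >> 32));
    return (lo << 32) | hi;
}
```

Both helpers should agree with __builtin_bswap32 and __builtin_bswap64 for every input.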
+1668,7 @@ (define_insn "*norsi3_internal"
> [(set_attr "type" "logical")
> (set_attr "mode" "SI")])
>
> -(define_insn "n"
> +(define_insn "n3"
> [(set (match_operand:X 0 "register_operand" "=r"
We already had "si3_extend" insns and we hoped the fwprop or combine
passes can use them to remove unnecessary sign extensions. But this
does not always work: for cases like x << 1 | y, the compiler
tends to do
(sign_extend:DI
(ior:SI (ashift:SI (reg:SI $r4)
On Sun, 2024-07-21 at 22:46 -0700, Andrew Pinski wrote:
> On Sun, Jul 21, 2024 at 3:57 AM Xi Ruoyao wrote:
> >
> > On Mon, 2024-07-15 at 15:53 +0800, Lulu Cheng wrote:
> > > Hi,
> > >
> > > g++.dg/opt/pr107569.C and range-sincos.c vrp-fl
Ping^6 https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650763.html
I'm quite frustrated that there is no response to such simple test case fixes.
Are people on vacation or something?
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failu
ction is fixed.
Oops https://gcc.gnu.org/pipermail/gcc-patches/2024-July/656937.html
won't be enough for pr107569.C. For pr107569.C I guess we need to add
range ops for __builtin_isfinite but the patch only handles
__builtin_isinf.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2024-07-20 at 07:16 +0100, Sam James wrote:
> Xi Ruoyao writes:
>
> > On Sat, 2024-07-20 at 06:52 +0100, Sam James wrote:
> > > Some distributions like Gentoo make -Wformat and -Wformat-security
> > > enabled by default. Pass -Wno-format to the test
uess-branch-probability
> -fno-tree-fre -fno-tree-ch -Wno-format" } */
>
> int printf(const char *, ...);
> int a[6], b, c;
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-07-18 at 20:41 +0800, Xi Ruoyao wrote:
> On Thu, 2024-07-18 at 19:54 +0800, Xi Ruoyao wrote:
> > On Tue, 2024-07-09 at 22:29 -0600, Jeff Law wrote:
> > >
> > >
> > > On 7/9/24 8:35 PM, Xi Ruoyao wrote:
> > > > On Mon, 2024-07-08 at 1
On Thu, 2024-07-18 at 19:54 +0800, Xi Ruoyao wrote:
> On Tue, 2024-07-09 at 22:29 -0600, Jeff Law wrote:
> >
> >
> > On 7/9/24 8:35 PM, Xi Ruoyao wrote:
> > > On Mon, 2024-07-08 at 15:03 -0600, Jeff Law wrote:
> > > > So I would use
On Tue, 2024-07-09 at 22:29 -0600, Jeff Law wrote:
>
>
> On 7/9/24 8:35 PM, Xi Ruoyao wrote:
> > On Mon, 2024-07-08 at 15:03 -0600, Jeff Law wrote:
> > > So I would use tmp (or another word_mode pseudo register) for the
> > > destination of that
This is per the request from the kernel developers. For generating the
ORC unwind info, the objtool program needs to analyze the control flow
of a .o file. If a jump table is used, objtool has to correlate the
jump instruction with the table.
On x86 (where objtool was initially developed) it's
Doing so can avoid loading FP constants from the memory. It also
partially fixes PR 66462 as fclass does not signal on sNaN.
gcc/ChangeLog:
* config/loongarch/loongarch.md (extendsidi2): Add ("=r", "f")
alternative and use movfr2gr.s for it. The spec clearly states
This is per the request from the kernel developers. For generating the
ORC unwind info, the objtool program needs to analyze the control flow
of a .o file. If a jump table is used, objtool has to correlate the
jump instruction with the table.
On x86 (where objtool was initially developed) it's
On Wed, 2024-07-10 at 21:54 +0800, Xi Ruoyao wrote:
> On Mon, 2024-07-01 at 09:11 +0800, HAO CHEN GUI wrote:
> > Hi,
> > Gently ping it.
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653096.html
>
> I guess you can add PR114678 into the subject and the Ch
On Mon, 2024-07-01 at 09:11 +0800, HAO CHEN GUI wrote:
> Hi,
> Gently ping it.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653096.html
I guess you can add PR114678 into the subject and the ChangeLog, and
also mention the patch in the bugzilla.
--
Xi Ruoyao
School of
lic:
> private:
> auto_vec> m_stack;
> auto_vec m_replacements;
> - const std::pair m_marker = std::make_pair (NULL, NULL);
> + const std::pair m_marker = std::make_pair(nullptr, nullptr);
> };
AFAIK we prefer NULL_TREE for this.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2024-07-09 at 20:21 -0600, Jeff Law wrote:
>
>
> On 7/9/24 8:14 PM, Xi Ruoyao wrote:
> > On Tue, 2024-07-09 at 16:10 -0700, Vineet Gupta wrote:
> > > On 7/3/24 21:35, Xi Ruoyao wrote:
> > > > On Sun, 2024-06-30 at 17:47 -0700, Vineet Gupta wro
$f0,$f0
movfr2gr.s $r4,$f0
andi $r12,$r4,136
andi $r13,$r4,952
sltu $r12,$r0,$r12
sltu $r13,$r0,$r13
slli.w $r13,$r13,2
andi $r4,$r4,68
slli.w $r12,$r12,1
or $r12,$r12,$r13
sltu $r4,$r0,$r4
or $r4,$r4,$r12
andi $r4,$r4,7 # < Why do we need this?!
jr $r1
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2024-07-09 at 16:10 -0700, Vineet Gupta wrote:
> On 7/3/24 21:35, Xi Ruoyao wrote:
> > On Sun, 2024-06-30 at 17:47 -0700, Vineet Gupta wrote:
> > > - Don't hardcode SI in patterns, try to keep X to avoid potential
> > > sign extension pitfalls. Implementa
On Tue, 2024-07-09 at 16:33 -0700, Vineet Gupta wrote:
>
>
> On 7/9/24 16:23, Jeff Law wrote:
> >
> > On 7/9/24 5:08 PM, Vineet Gupta wrote:
> > > On 7/3/24 12:08, Xi Ruoyao wrote:
> > > > On Fri, 2024-06-28 at 17:53 -0700, Vineet Gupta wrote:
>
Ping^5 https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650763.html
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie. Let's fix them.
>
> Tested on x86_64-li
is required. At least the doc should be updated
to say "operand 0 has an integer mode" or something if doing so is
intentionally allowed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
anyway: https://godbolt.org/z/bnnGf3a38 and the
standards only require a non-zero return value if the input is infinite
(positive or negative).
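A minimal sketch of a check that relies only on what the standards guarantee, namely a non-zero result for an infinite input. The wrapper name is_infinite is hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical wrapper: treat any non-zero result of isinf as "infinite"
   instead of comparing against a specific value such as 1 or -1.  */
static bool is_infinite(double x)
{
    return isinf(x) != 0;
}
```

Code written this way does not depend on whether a particular implementation returns 1, -1, or some other non-zero value.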
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
h_cost->movgr2cf;
Then we don't need to check TARGET_uARCH_LA464.
> + }
> + }
> + return cost;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
= gen_rtx_NE (SImode, tmp, const0_rtx);
emit_insn (gen_cstoresi4 (operands[0], cmp, tmp, const0_rtx));
DONE;
})
and remove the necessity of UNSPEC_ISFINITE.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-06-28 at 20:34 +0800, chenglulu wrote:
>
> On 2024/6/28 at 8:25 PM, Xi Ruoyao wrote:
> > Hi Richard,
> >
> > The late combine pass has triggered some FAILs on LoongArch and I'm
> > investigating. One of them is movcf2gr-via-fr.c. In
> > 315r.postr
Could you suggest how to fix this issue?
On Thu, 2024-06-20 at 14:34 +0100, Richard Sandiford wrote:
> This series is a resubmission of the late-combine work. I've fixed
> some bugs that Jeff's cross-target CI found last time and some others
> that I hit since then.
/* snip */
--
Ping.
On Sat, 2024-06-15 at 21:47 +0800, Xi Ruoyao wrote:
> The first form has a lower latency (due to the special handling of
> "move" in LA464 and LA664) despite being longer.
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md
Ping.
On Sun, 2024-06-16 at 01:50 +0800, Xi Ruoyao wrote:
> Consider
>
> c &= 0xfff;
> a &= ~0xfff;
> b &= ~0xfff;
> a |= c;
> b |= c;
>
> This can be done with 2 bstrins instructions. But we need to
> recognize
> it in loongarch
gcc/ChangeLog:
* doc/rtl.texi (jump_table_data): Fix typos.
---
Pushed as obvious.
gcc/doc/rtl.texi | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index c1717ab5f6b..a1ede418c21 100644
--- a/gcc/doc/rtl.texi
+++
Ping^4 https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650763.html
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie. Let's fix them.
>
> Tested on x86_64-li
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_print_operand_reloc):
Dedup and sort the comment describing modifiers.
---
It's a non-functional change thus I've not tested it. Ok for trunk?
gcc/config/loongarch/loongarch.cc | 10 +-
1 file changed, 1
Consider
c &= 0xfff;
a &= ~0xfff;
b &= ~0xfff;
a |= c;
b |= c;
This can be done with 2 bstrins instructions. But we need to recognize
it in loongarch_rtx_costs or the compiler will not propagate "c & 0xfff"
forward.
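The equivalence claimed above can be checked with a small C sketch. The function names merge_low12 and insert_low12 are hypothetical; insert_low12 models one bit-field insert of bits [11:0] and is not a statement about the code GCC generates.

```c
#include <stdint.h>

/* The sequence from the example, applied to two destinations.  */
static void merge_low12(uint64_t *a, uint64_t *b, uint64_t c)
{
    c &= 0xfff;
    *a &= ~UINT64_C(0xfff);
    *b &= ~UINT64_C(0xfff);
    *a |= c;
    *b |= c;
}

/* Hypothetical model of a single bit-field insert: bits [11:0] of src
   replace bits [11:0] of dst.  No separate "src & 0xfff" temporary is
   needed, which is why two inserts are enough for a and b.  */
static uint64_t insert_low12(uint64_t dst, uint64_t src)
{
    const uint64_t field = UINT64_C(0xfff);
    return (dst & ~field) | (src & field);
}
```

After merge_low12 (&a, &b, c), a and b should equal insert_low12 applied to their old values and the unmasked c.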
gcc/ChangeLog:
* config/loongarch/loongarch.cc:
On Sat, 2024-06-15 at 21:44 +0800, Xi Ruoyao wrote:
> + for (int i = 0; i < 2; i++)
> + *total += set_src_cost (XEXP (op0, i), mode,
> speed);
Oops this is wrong. I need to fix this and regtest again.
--
Xi Ruoyao
School of Aerospace Science an
The first form has a lower latency (due to the special handling of
"move" in LA464 and LA664) despite being longer.
gcc/ChangeLog:
* config/loongarch/loongarch.md (define_peephole2): Require
optimize_insn_for_size_p () for move/move/bstrins =>
srai/bstrins transform.
---
Consider
c &= 0xfff;
a &= ~0xfff;
b &= ~0xfff;
a |= c;
b |= c;
This can be done with 2 bstrins instructions. But we need to recognize
it in loongarch_rtx_costs or the compiler will not propagate "c & 0xfff"
forward.
gcc/ChangeLog:
* config/loongarch/loongarch.cc:
I know riscv doesn't implement any of the legacy optabs. But less
> maintained vector targets might need adjustments.
No new test failures on LoongArch.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
<< 31) - 1 + (((off_t) 1 << 31) <<
> 31))
> int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
> && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
This shouldn't happen. Please regenerate using *vanilla* autoconf-2.69.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
es, and your commit has
removed the spaces. But those spaces are completely missing in the
patch sent to gcc-patches. Maybe your mail client has eaten them?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
We were comparing a mode size with word_mode, but word_mode is an enum
value thus this does not really make any sense. (Un)luckily E_DImode
happens to be 8 so this seemed to work, but let's make it correct so it
won't blow up when we add LA32 support or add another machine mode...
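To illustrate why this kind of comparison only works by accident, here is a toy C sketch. Everything in it is a hypothetical stand-in: the toy_mode enum, its numeric values (apart from mirroring the "DImode happens to be 8" coincidence) and the helper names are made up and are not GCC's real definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a machine-mode enum.  Only the value 8 for
   the 8-byte mode mirrors the coincidence described above; the other
   values are arbitrary.  */
enum toy_mode { TOY_HImode = 6, TOY_DImode = 8, TOY_SImode = 12 };

/* Size in bytes of each toy mode.  */
static unsigned toy_mode_size(enum toy_mode m)
{
    switch (m) {
    case TOY_HImode: return 2;
    case TOY_SImode: return 4;
    case TOY_DImode: return 8;
    }
    return 0;
}

/* Buggy: compares a size in bytes against an enum value.  */
static bool fits_in_word_buggy(enum toy_mode m, enum toy_mode word)
{
    return toy_mode_size(m) <= (unsigned) word;
}

/* Fixed: compares a size against the size of the word mode.  */
static bool fits_in_word(enum toy_mode m, enum toy_mode word)
{
    return toy_mode_size(m) <= toy_mode_size(word);
}
```

With the 8-byte mode as the word mode both versions agree, but with the toy 4-byte mode as the word mode the buggy version claims an 8-byte value fits in a word.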
gcc/ChangeLog:
On Sat, 2024-05-11 at 17:16 +0200, FX Coudert wrote:
> * libgccjit.h: Include
Per the C standard size_t should be provided by stddef.h.
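A minimal illustration: a translation unit that only needs size_t can get it from stddef.h alone. The function count_nonzero is a hypothetical example and has nothing to do with libgccjit.

```c
#include <stddef.h>  /* provides size_t per the C standard */

/* Hypothetical example: count the non-zero elements of an array.  */
static size_t count_nonzero(const int *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] != 0)
            count++;
    return count;
}
```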
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Ping https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650763.html
again, adding more reviewers into CC...
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie. Let's fi
A move/bstrins pair is as fast as a (addi.w|lu12i.w|lu32i.d|lu52i.d)/and
pair, and twice as fast as a srli/slli pair. When the src reg and the dst
reg happen to be the same, the move instruction can be optimized away.
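Assuming (from the predicate name high_bitmask_operand) that the operation in question is clearing the low n bits of a register, the formulations being compared compute the same value. The C sketch below only shows that equivalence; the helper names are hypothetical and it makes no claim about latency.

```c
#include <stdint.h>

/* Hypothetical helper: clear the low n bits with an AND against a
   materialized mask (the "load constant, then and" form).  n must be
   in the range 0..63.  */
static uint64_t clear_low_by_mask(uint64_t x, unsigned n)
{
    return x & ~((UINT64_C(1) << n) - 1);
}

/* Hypothetical helper: clear the low n bits with a right shift followed
   by a left shift (the "srli/slli" form).  n must be in the range 0..63.  */
static uint64_t clear_low_by_shifts(uint64_t x, unsigned n)
{
    return (x >> n) << n;
}
```

Inserting n zero bits at position 0 of a copy of x gives the same result, which is what the move/bstrins pair does.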
gcc/ChangeLog:
* config/loongarch/predicates.md (high_bitmask_operand):
---
Pushed as obvious.
htdocs/gcc-14/changes.html | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 6447898e..7a5eb449 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -1218,9 +1218,9
Ping again.
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie. Let's fix them.
>
> Tested on x86_64-linux-gnu. Ok for trunk and releases/gcc-14?
>
> Xi Ru
gcc/ChangeLog:
PR target/115169
* config/loongarch/loongarch.cc
(loongarch_expand_conditional_move): Guard REGNO with REG_P.
---
Bootstrapped with --enable-checking=all. Ok for trunk and 14?
gcc/config/loongarch/loongarch.cc | 17 -
1 file changed, 12
t "[PATCH v2 1/3] RISC-V: movmem for RISCV with V extension"
>
> This reverts commit df15eb15b5f820321c81efc75f0af13ff8c0dd5b.
Revert is OK, but revert revert is not.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651144.html
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Ping.
On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie. Let's fix them.
>
> Tested on x86_64-linux-gnu. Ok for trunk and releases/gcc-14?
>
> Xi Ru
On Thu, 2024-05-09 at 20:21 +0000, Joseph Myers wrote:
> On Wed, 8 May 2024, Xi Ruoyao wrote:
>
> > In GCC 14 we started to emit URLs for "command-line option <option> is
> > valid for <language> but not <language>" and "-Werror= argument
> > '-Werror=<option>' is not valid for <language>" warnings
In GCC 14 we started to emit URLs for "command-line option <option> is
valid for <language> but not <language>" and "-Werror= argument
'-Werror=<option>' is not valid for <language>" warnings. So we should
have moved -fdiagnostics-urls= early like -fdiagnostics-color=, or
-fdiagnostics-urls= wouldn't be able to control URLs in these
On Tue, 2024-05-07 at 18:01 +0800, Lulu Cheng wrote:
>
> On 2024/5/7 at 5:42 PM, Xi Ruoyao wrote:
> > On Tue, 2024-05-07 at 17:07 +0800, Xi Ruoyao wrote:
> > > Hmm, after this change the default (-march=la64v1.0) is enabling LSX:
> > >
> > > $ e
On Tue, 2024-05-07 at 17:07 +0800, Xi Ruoyao wrote:
> Hmm, after this change the default (-march=la64v1.0) is enabling LSX:
>
> $ echo "int dummy;" | cc -c -v |& tail -n1
> COLLECT_GCC_OPTIONS='-c' '-v' '-mabi=lp64d' '-march=la64v1.0' '-
> mfpu=64'
> '-msimd=lsx' '
LOONGARCH64:
> > + case TUNE_LA464:
> > + case TUNE_LA664:
> > /* Vector part. */
> > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P
> > (mode))
> > {
> > @@ -10980,9
In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
the build is configured --enable-default-pie. Let's fix them.
Tested on x86_64-linux-gnu. Ok for trunk and releases/gcc-14?
Xi Ruoyao (2):
i386: testsuite: Add -no-pie for pr113689-1.c [PR70150]
i386: testsuite: Adapt
After r14-811 "call *nop@GOTPCREL(%rip)" is only generated with
-mno-direct-extern-access even if --enable-default-pie. So the r13-1614
change to this file is not valid anymore.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.target/i386/fentryname3.c (dg-final): Revert
For a --enable-default-pie build, using -fno-pic (for compiler) but
not -no-pie (for linker) triggers some linker warnings counted as
excess errors:
/usr/bin/ld: /tmp/cc8MgxiR.o: warning: relocation in read-only
section `.text.startup'
/usr/bin/ld: warning: creating DT_TEXTREL in a
On Sat, 2024-04-27 at 11:04 +0800, Lulu Cheng wrote:
> LGTM!
>
> Thanks.
Pushed r15-11 and r14-10142.
> On 2024/4/26 at 9:52 PM, Xi Ruoyao wrote:
> > Without the constraints, the compiler attempts to use a stack slot as the
> > target, causing an ICE building the kernel with -O
Without the constraints, the compiler attempts to use a stack slot as the
target, causing an ICE building the kernel with -Os:
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:3144:1:
error: could not split insn
(insn:TI 1764 67 1745
(set (mem/c:DI (reg/f:DI 3 $r3) [707 %sfp+-80 S8 A64])
does not support HPTW, which
**is** a v1.1 feature.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"+Generate instructions for the machine type
> @var{arch-type}.",
>
> so is there no need to write it like this here?
Then maybe just say "all LoongArch v1.1 instructions" instead of
"features" here as well?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
A version 1.1.
IMO it's better to use a wording like LA664, i.e. "a CPU implementing
all LoongArch v1.1 unprivileged features" (emphasising "all", as the
v1.1 manual allows implementing only a subset of v1.1 features).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
ventions, which
> the LoongArch GCC port aims to conform to.
The link seems broken. Do you mean la-softdev-convention?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
I do LTO,
so I don't really rely on this patch though.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
this line.
> (loongarch_expand_builtin): Turn assertion of builtin
> availability
> into a test.
and this line.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
.length ();
> + if (l != p2.m_padding.length ())
> + return false;
> + for (unsigned i = 0; i < l; i++)
> + if (p1.m_padding[i].first != p2.m_padding[i].first
> + || p1.m_padding[i].second != p2.m_padding[i].second)
> + return false;
> +
> + return true;
> +}
farm. If that goes well, I intend to commit the patch and then start
> working on backports.
I've tried these two patches out on my own 24-core AArch64 machine.
Bootstrapped (but no LTO or PGO) and regtested fine.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sun, 2024-04-07 at 16:23 +0800, Yang Yujie wrote:
> On Sun, Apr 07, 2024 at 04:23:53PM +0800, Xi Ruoyao wrote:
> > On Sun, 2024-04-07 at 15:47 +0800, Yang Yujie wrote:
> > > This patch fixes the back-end context switching in cases where functions
> > > should be
R target/113233
Oops, so this PR isn't fixed with r14-7134 "LoongArch: Implement option
save/restore"? Should I reopen it?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
invoke.texi for
> the
^ ^
Better have two commas here.
Otherwise it should be OK.
> + format. */
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
cc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test.
> * gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test.
> * gcc.target/loongarch/explicit-relocs-tls-desc.c: New test.
>
> Co-authored-by: Lulu Cheng
> Co-authored-by: Xi Ruoyao
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2024-04-01 at 10:22 +0800, chenglulu wrote:
>
> On 2024/4/1 at 9:29 AM, Xi Ruoyao wrote:
> > On Fri, 2024-03-29 at 09:23 +0800, chenglulu wrote:
> >
> > > I tested spec2006. In the floating-point program, the test items with
> > > large
> > >
some workloads like competitive programming. However "adapting with
different modulos" is not possible w/o refactoring generic code so it
must be deferred to at least GCC 15.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Ping.
On Wed, 2024-03-20 at 15:10 +0800, Xi Ruoyao wrote:
> We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any named
> arguments and there is nothing to advance, but that is not the case
> for (...) functions returning by hidden reference which have one such
> artific
On Wed, 2024-03-27 at 18:39 +0800, Xi Ruoyao wrote:
> On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote:
> >
> > On 2024/3/26 at 5:48 PM, Xi Ruoyao wrote:
> > > The latency of LA464 and LA664 division instructions depends on the
> > > input. When I updated the costs i
On Wed, 2024-03-27 at 08:54 +0100, Richard Biener wrote:
> On Tue, Mar 26, 2024 at 10:52 AM Xi Ruoyao wrote:
> >
> > The latency of LA464 and LA664 division instructions depends on the
> > input. When I updated the costs in r14-6642, I unintentionally set the
> > divi
On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote:
>
> On 2024/3/26 at 5:48 PM, Xi Ruoyao wrote:
> > The latency of LA464 and LA664 division instructions depends on the
> > input. When I updated the costs in r14-6642, I unintentionally set the
> > division costs to the best-case
On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote:
/* snip */
> With -ffinite-math-only -fno-signed-zeros, it does work with
> x >= y ? x : y
> while without `-ffinite-math-only -fno-signed-zeros`, it cannot.
> @Xi Ruoyao Is it expected by IEEE?
When y is (quiet) NaN and x
The latency of LA464 and LA664 division instructions depends on the
input. When I updated the costs in r14-6642, I unintentionally set the
division costs to the best-case latency (when the first operand is 0).
Per a recent discussion [1] we should use "something sensible" instead
of it.
Use the
on may spend different number of cycles
for different inputs: on LoongArch LA664 I've observed 5 cycles for some
inputs and 39 cycles for other inputs.
So should we use the minimal value, the maximum value, or something in-
between for TARGET_RTX_COSTS and pipeline descriptions?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
-if "code quality test" { *-*-* } { "-O0" } { "" } } */
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
gcc/ChangeLog:
PR target/114407
* config/loongarch/loongarch-opts.cc (loongarch_config_target):
Fix typo in diagnostic message, enabing -> enabling.
---
Pushed r14-9582 as obvious.
gcc/config/loongarch/loongarch-opts.cc | 2 +-
1 file changed, 1 insertion(+), 1
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any named
arguments and there is nothing to advance, but that is not the case
for (...) functions returning by hidden reference which have one such
artificial argument. This is causing gcc.dg/c23-stdarg-{6,8,9}.c to
fail.
Fix the issue by
On Tue, 2024-03-19 at 11:19 +0800, chenglulu wrote:
>
> On 2024/3/18 at 5:34 PM, Xi Ruoyao wrote:
> > We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any named
> > arguments and there is nothing to advance, but that is not the case
> > for (...) functions returning by hid
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any named
arguments and there is nothing to advance, but that is not the case
for (...) functions returning by hidden reference which have one such
artificial argument. This is causing gcc.dg/c23-stdarg-6.c and
gcc.dg/c23-stdarg-8.c to fail.
If this insn is really used, we'll have something like
slti $r4,$r0,$r5
in the code. The assembler will reject it because slti wants 2
register operands and 1 immediate operand. But we've not got any bug
report for this, indicating this define_insn is not used at all.
Note that
uite/ChangeLog:
>
> * gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
> Reposition operand 3's value into instruction's defined accept range.
^^
Remove these two white spaces.
Should be OK with these ChangeLog style issues fixed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
which will taint the test suite with -fhardened.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-03-13 at 10:24 +0800, Xi Ruoyao wrote:
> return TARGET_EXPLICIT_RELOCS
> - ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\
> - \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\
> - \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\
> - \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n
On Wed, 2024-03-13 at 11:06 +0800, mengqinggang wrote:
>
> On 2024/3/13 at 6:15 AM, Xi Ruoyao wrote:
> > On Tue, 2024-03-12 at 17:20 +0800, mengqinggang wrote:
> > > +(define_insn "@got_load_tls_desc"
> > > + [(set (match_operand:P 0 "
On Wed, 2024-03-13 at 06:56 +0800, Xi Ruoyao wrote:
> On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote:
> > > +(define_insn "@got_load_tls_desc"
> > > + [(set (match_operand:P 0 "register_operand" "=r")
>
> Hmm, and it looks like we shou
On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote:
> > +(define_insn "@got_load_tls_desc"
> > + [(set (match_operand:P 0 "register_operand" "=r")
Hmm, and it looks like we should use (reg:P 4) instead of match_operand
here, because the instruct
+ ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\
> + \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\
> + \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\
> + \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n\
> + \tadd.d\t$r4,$r4,%2\n\
> + \tld.d\t$r1,$r4,%%desc_ld(%1)\n\
> + \tjirl\t$
On Thu, 2024-03-07 at 21:07 +0800, chenglulu wrote:
>
> On 2024/3/7 at 8:52 PM, Xi Ruoyao wrote:
> > It should be better to extend the expected value before the ll/sc loop
> > (like what LLVM does), instead of repeating the extending in each
> > iteration. Something l
[4],
+ operands[6]));
+}
rtx compare = operands[1];
if (operands[3] != const0_rtx)
It produces:
slli.w $r4,$r4,0
1:
ll.w $r14,$r3,0
bne $r14,$r4,2f
or $r15,$zero,$r12
sc.w $r15,$r3,0
beqz $r15,1b
b 3f
2:
dbar 0b10100
3:
for the test case and the compiled test case runs successfully. I've
not done a full bootstrap yet though.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
ns "-O2 -mcmodel=normal -mexplicit-relocs -mno-relax" } */
> > +/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } }
> > } */
i.e. -mno-relax is used compiling this test case, and the compiled
assembly code should not contain R_LARCH_RELAX.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Loops on named vector registers are not vectorized (see comment 11 of
PR113622), so these test cases have been failing for a while.
Rewrite them using check-function-bodies to avoid hard-coding register
names. A barrier is needed to always load the first operand before the
second operand.
The psABI allows using s9 as an alias of r22.
gcc/ChangeLog:
* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
---
v1 -> v2: Add a test case.
Ok for trunk?
gcc/config/loongarch/loongarch.h | 1 +