Recently I've fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros on the affected targets (r14-8786 and
r14-8801). To prevent a similar issue from happening again, add a test case.
Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch
int 1)))]
> ""
> - "slti\t%0,%.,%1"
> + "slt\t%0,%.,%1"
> [(set_attr "type" "slt")
> (set_attr "mode" "")])
Hmm, this define_insn seems to be never really used, or it would generate
something like "sltu
So allowing const_imm12_operand here
brings no benefit.
> ""
> - "slti\t%0,%.,%1"
> + "slt%i1\t%0,%.,%1"
> [(set_attr "type" "slt")
> (set_attr "mode" "")])
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-02-29 at 15:09 +0800, Xi Ruoyao wrote:
> Recently I've fixed two wrong FP vector negate implementations which
> caused wrong sign bits in zeros on the affected targets (r14-8786 and r14-8801). To
> prevent a similar issue from happening again, add a test case.
>
> Tested on x86_64
The psABI allows using s9 as an alias of r22.
gcc/ChangeLog:
* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
---
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
gcc/config/loongarch/loongarch.h | 1 +
1 file changed, 1
In Binutils we need to allow IE to LE relaxation only when there
is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12}, so an invalid
"partial" relaxation won't happen with the extreme code model. So if we
are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an
R_LARCH_RELAX
The vect_int_mod target selector is evaluated with the options in
DEFAULT_VECTCFLAGS in effect, but these options are not automatically
passed to tests outside the vect directories. So this test fails on
targets where integer vector modulo operation is supported but requires
an option to enable,
On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote:
> > + "TARGET_TLS_DESC"
> > + "la.tls.desc\t%0,%1"
>
> With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead
> of la.tls.desc. As we don't want to add too much code we can just ha
ELOCS_ALWAS ? ".." : "la.tls.desc\t%0,%1"; }
> + [(set_attr "got" "load")
> + (set_attr "mode" "")])
We need (set_attr "length" "16") in this list as this actually expands
into 16 bytes.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The specification of crc/crcc instructions is clear that the output is
sign-extended to GRLEN. Add a define_insn to tell the compiler this
fact and allow it to remove the unneeded sign extension on crc/crcc
output. As crc/crcc instructions are usually used in a tight loop,
this should produce a
Introduce an iterator for UNSPEC_CRC and UNSPEC_CRCC to make the next
change easier.
gcc/ChangeLog:
* config/loongarch/loongarch.md (CRC): New define_int_iterator.
(crc): New define_int_attr.
(loongarch_crc_w__w, loongarch_crcc_w__w): Unify
into ...
On Thu, 2024-02-22 at 19:09 +0800, chenglulu wrote:
>
> On 2024/2/22 at 6:20 PM, Xi Ruoyao wrote:
> > To improve Binutils compatibility we've had to backport relaxation
> > support. But if a user just updates to GCC 13.3 and sticks with
> > Binutils 2.41, there is no reason to
On Fri, 2024-02-23 at 11:37 +0800, chenglulu wrote:
>
> On 2024/2/23 at 11:27 AM, Xi Ruoyao wrote:
> > On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:
> > > On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote:
> > > > The gold linker has never been ported to LoongArch (and it se
On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:
>
> On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote:
> > The gold linker has never been ported to LoongArch (and it seems
> > unlikely to be ported in the future as the new architectures are
> > focusing on lld and/or mold for fast link
To improve Binutils compatibility we've had to backport relaxation
support. But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch
The gold linker has never been ported to LoongArch (and it seems
unlikely to be ported in the future as the new architectures are
focusing on lld and/or mold for fast linkers).
ChangeLog:
* configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
list.
* configure:
On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote:
>
> On 2024/2/20 at 7:31 PM, Xi Ruoyao wrote:
> > On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote:
> > > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote:
> > >
> > > > So I think that witho
On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote:
> On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote:
>
> > So I think that without worrying about performance and ensuring that
> > there is no problem
> >
> > with binutils, I think we can ma
test failures due to "excessive
errors" if running the GCC test suite with these earlier GAS versions.
Maybe we'll have to add some autoconf-based probing for the linker
anyway?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote:
>
> On 2024/2/7 at 12:23 AM, Xi Ruoyao wrote:
> > Hi Lulu,
> >
> > I'm proposing to backport r14-4674 "LoongArch: Delete macro definition
> > ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13
On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote:
> Recently I've fixed two wrong FP vector negate implementations which
> caused wrong sign bits in zeros on the affected targets (r14-8786 and r14-8801). To
> prevent a similar issue from happening again, add a test case.
>
> Tested on x86_64
eases/gcc-12 and releases/gcc-13
then?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, 2024-02-05 at 02:01:
> >
> > We expanded (neg x) to (minus const0 x) for MSA FP vectors; this is
> > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
> > fail when Pytho
We expanded (neg x) to (minus const0 x) for MSA FP vectors; this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with MSA enabled.
Use the bnegi.df instructions to simply flip the sign bit instead.
gcc/ChangeLog:
*
On Sun, 2024-02-04 at 11:19 +0800, chenglulu wrote:
>
> On 2024/2/2 at 5:55 PM, Xi Ruoyao wrote:
> > We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes.
> > But in loongarch_symbol_insns:
> >
> > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE
On Sun, 2024-02-04 at 11:20 +0800, chenglulu wrote:
>
> On 2024/2/3 at 4:58 PM, Xi Ruoyao wrote:
> > We expanded (neg x) to (minus const0 x) for LSX FP vectors; this is
> > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
> > fail when Python is built with L
On Fri, 2024-02-02 at 10:42 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-8773.
> On 2024/2/2 at 5:54 AM, Xi Ruoyao wrote:
> > When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR
> > violation is detected:
> >
> > ../../gcc/config/loo
We expanded (neg x) to (minus const0 x) for LSX FP vectors; this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with LSX enabled.
Use the vbitrevi.{d/w} instructions to simply flip the sign bit
instead. We are already doing this for LASX and
We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes.
But in loongarch_symbol_insns:
if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
return 0;
And LSX_SUPPORTED_MODE_P is defined as:
#define LSX_SUPPORTED_MODE_P(MODE) \
(ISA_HAS_LSX \
When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR
violation is detected:
../../gcc/config/loongarch/loongarch-opts.cc:57: warning:
'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr]
57 | abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES];
On Thu, 2024-02-01 at 14:55 +0100, Jakub Jelinek wrote:
> On Thu, Feb 01, 2024 at 01:42:03PM +0000, Jonathan Yong wrote:
> > On 2/1/24 13:06, Xi Ruoyao wrote:
> > > On Thu, 2024-02-01 at 14:01 +0100, Jakub Jelinek wrote:
> > > > On Thu, Feb 01, 2024 at 12:45:3
")\n",
Should use HOST_WIDE_INT_PRINT_UNSIGNED instead of PRIu64.
>(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE)
> * allocated_words_num),
>(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE)
>
at.
You need to wait until the PR is accepted by the libffi maintainers.
Frankly I don't know what the libffi maintainers are busy with and I'm
frustrated as well (having a MIPS patch unreviewed there for a month)
but this is the procedure :(.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote:
> On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
> >
> > On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > >
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
>
> On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > >
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
>
> On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > > v3 -> v4:
> > > 1. Add macro support for TLS symbols
> > > 2. Added support for loading __get_t
\t%0,%2,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%2,%1";
> +
> + default:
> + gcc_unreachable ();
> + }
> +}
> + "&& REG_P (operands[1]) && find_reg_note (insn, REG_UNUSED, operands[2]) !=
> 0"
> + [(set (match_dup 0) (match_dup 1))]
> + ""
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
Should be 20, in bytes.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"la.tls.le\t%0,%1";
> + case SYMBOL_TLS_IE:
> + return "la.tls.ie\t%0,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%1";
> + case SYMBOL_TLSGD:
> + return "la.tls.gd\t%0,%1";
/* snip */
> + default:
> + g
extreme TLS GD/LD with -mexplicit-relocs=auto.
I've rebased and attached the patch to fix the bad split in -mexplicit-
relocs={always,auto} -mcmodel=extreme on top of this series. I've not
tested it seriously though (only tested the added and modified test
cases).
--
Xi Ruoyao
School of Aerospace Scie
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
>
> On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > The failure of this test case was because the compiler believes that
> > > > > two
> > >
On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote:
> At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote:
> > On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> > > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > > > The vect_int_mo
On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > The vect_int_mod target selector is evaluated with the options in
> > DEFAULT_VECTCFLAGS in effect, but these options are not automatically
> > passed to
n __inline float
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> __frecipe_s (float _1)
> {
> - __builtin_loongarch_frecipe_s ((float) _1);
> + return (float) __builtin_loongarch_frecipe_s ((float) _1);
I don't think the (float) conversion is needed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
y
papers over the same issue that caused the spec2006 failure. I tried a bootstrap
with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there is no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
When building GCC with --enable-default-ssp, the stack protector is
enabled for got-load.C, causing additional GOT loads for
__stack_chk_guard. So mem/u will be matched more than 2 times and the
test will fail.
Disable stack protector to fix this issue.
gcc/testsuite:
*
On Tue, 2024-01-23 at 10:37 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed v2 as attached. The only change is in the comment: Qinggang told
me TLS LE relaxation actually *requires* explicit relocs.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian Univer
Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler
macro.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_explicit_relocs_p):
If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false
for SYMBOL_TLS_LDM and SYMBOL_TLS_GD.
")
(unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
With this the buggy REG_UNUSED notes were gone. But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instances of "la.global
__tls_get
derstand the purpose of adding
> '-fno-tree-vectorize' here.
I don't think -fno-tree-vectorize will make a difference here. This
test case uses __attribute__((vector_size(...))) explicitly so the
vector operation will be used even with -fno-tree-vectorize.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > >
> > >
ibstdc++-v3/testsuite/lib/dg-options.exp
> @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> || [istarget riscv*-*-*]
> || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
> + || ([istarget i?86-*-*] || [istarget x86_64-*-*])
This seems like overkill, as "dg-add-options libatomic" is not intended to
handle 16-byte atomics. Maybe we can fork this into a new dg-add-options
like "add_options_for_libatomic_16b"?
> } {
> global TOOL_OPTIONS
>
> --
> 2.25.1
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2024-01-16 at 12:58 +0800, Xi Ruoyao wrote:
> On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> > On Mon, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote:
> > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > > At 14:42 +0800 on Monday
On Tue, 2024-01-16 at 14:16 +0800, chenglulu wrote:
>
>
> On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote:
> > Ping.
> >
> > On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> > > We don't allow SImode in FCC, so constraint z is never really used
>
Ping.
On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> We don't allow SImode in FCC, so constraint z is never really used
> here.
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md (movsi_internal): Remove
> constraint z.
> ---
>
> Bootstrappe
On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> On Mon, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > > > On Mon, 2024-01-15 at
On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> > > Xi Ruoyao wrote at 12:11pm on Monday, January
> > > 15, 2024:
> > >
On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, 2024-01-15 at 12:11:
> >
> > On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> > > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > > > At 15:28 +0800 on Saturday 2024-01-1
On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong wrote:
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/pr104992.c: Added addition
1 100644
> --- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
> +++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
> @@ -2,6 +2,7 @@
> ! { dg-require-effective-target vect_double }
> ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0
> -fpredictive
On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>
> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> >
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > we n
enable-bootstrap
> --enable-checking=release
> $ make BOOT_FLAGS="-mcmodel=extreme"
>
> What did I do wrong?:-(
BOOT_CFLAGS, not BOOT_FLAGS :).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
can-assembler-times "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0"
> 0 } } */
Use scan-assembler-not instead of scan-assembler-times ... 0.
Otherwise LGTM.
> #include
> #define my_min(x, y) ((x) < (y) ? (x) : (y))
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"")])
>
> +(define_insn "*nsi_internal"
> + [(set (match_operand:SI 0 "register_operand" "=r")
> + (neg_bitwise:SI
> + (not:SI (match_operand:SI 1 "register_operand" "r"))
> + (match_operand:SI 2 "register_operand" "r")))]
> + "TARGET_64BIT"
> + "n\t%0,%2,%1"
> + [(set_attr "type" "logical")
> + (set_attr "mode" "SI")])
>
> ;;
> ;;
> @@ -3167,7 +3210,6 @@ (define_expand "condjump"
> (label_ref (match_operand 1))
> (pc)))])
>
> -
>
> ;;
> ;;
> @@ -3967,10 +4009,13 @@ (define_insn "bytepick_w_"
> (define_insn "bytepick_w__extend"
> [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
> - (ior:SI (lshiftrt (match_operand:SI 1 "register_operand" "r")
> - (const_int ))
> - (ashift (match_operand:SI 2 "register_operand" "r")
> - (const_int bytepick_w_ashift_amount)]
> + (subreg:SI
> + (ior:DI (subreg:DI (lshiftrt
> + (match_operand:SI 1 "register_operand" "r")
> + (const_int )) 0)
> + (subreg:DI (ashift
> + (match_operand:SI 2 "register_operand" "r")
> + (const_int bytepick_w_ashift_amount)) 0)) 0)))]
> "TARGET_64BIT"
> "bytepick.w\t%0,%1,%2,"
> [(set_attr "mode" "SI")])
> diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> new file mode 100644
> index 000..5753ef69db2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64d -O2" } */
> +/* { dg-final { scan-assembler-not "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" }
> } */
> +
> +struct pmop
> +{
> + unsigned int op_pmflags;
> + unsigned int op_pmpermflags;
> +};
> +unsigned int PL_hints;
> +
> +struct pmop *pmop;
> +void
> +Perl_newPMOP (int type, int flags)
> +{
> + if (PL_hints & 0x0010)
> + pmop->op_pmpermflags |= 0x0001;
> + if (PL_hints & 0x0004)
> + pmop->op_pmpermflags |= 0x0800;
> + pmop->op_pmflags = pmop->op_pmpermflags;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_rtx (DImode);
> + emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
AFAIK if !TARGET_64BIT a DImode value should actually be a pair of hardware
registers, but addsi3_extended doesn't output such a pair so this seems
invalid...
> + t = gen_lowpart (SImode, t);
> +
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
>
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > > >
SA_HAS_DIV32 etc. in the code base? It seems some of them are not
replaced.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> >
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > bool
> > > > loongarch_ex
ve_target_loongarch_sx] ||" because SIMD
requires hard float.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > bool
> > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > {
> > >
ive me several hours trying to implement this...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_effective_target_s390_vx])
> > +|| ([istarget riscv*-*-*]
> > + && [check_effective_target_riscv_v])
>
> Unless I'm missing something, we have copysign in the scalar
> floating-point ISAs as well. So I think this should be
>
> || ([istarget riscv*-*-*]
> && [check_effective_target_hard_float])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
as possible. Assuming the rest is ACK'd for the trunk we'll put it into
> the list of optimizations enabled by -O2.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote:
>
> On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
> > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
> > > +(define_insn "movdi_pcrel64"
> > > + [(set (match_operand:DI 0 "register_op
perand:DI 2 "register_operand "="))]
And use
gen_movdi_pcrel64 (operands[0], operands[1], gen_reg_rtx(DImode))
in expand.
> + "TARGET_64BIT"
> + "la.local %0,$r15,%1"
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-6890.
FWIW sometimes the tree optimizer still fails to emit .reduc_f{max,min} or
it emits them sub-optimally. I've commented in PR112457 but maybe I
should've created a new ticket...
> On 2024/1/1 AM 3:1
We already had smin/smax RTL patterns using vfmin/vfmax instructions.
But for smin/smax, it's unspecified what will happen if either operand
contains any NaN. So we would not vectorize the loop with
-fno-finite-math-only (the default for all optimization levels except
-Ofast).
But,
On Sat, 2023-12-30 at 20:25 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-30 at 12:15 +0000, Richard Sandiford wrote:
> > This shouldn't be necessary. The test does:
> >
> > for (int i = 0; i < n; i += 2)
> > {
> > x0 = __builtin_fmin (x0, ptr[i + 0]
duc_fmin_scal_*?
> If so, we probably need a new target selector for fmin/fmax reduction.
Let me test whether the [x]vf{min,max} instructions are IEEE-conformant. They
still haven't released volume 2 of the instruction manual, so I can only
try...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
gcc/ChangeLog:
* config/loongarch/loongarch.md (bstrins__for_ior_mask):
For the condition, remove unneeded trailing "\" and move "&&" to
follow GNU coding style. NFC.
---
Pushed as obvious.
gcc/config/loongarch/loongarch.md | 4 ++--
1 file changed, 2 insertions(+), 2
Pushed v4 as attached, with the format issues fixed and a minor
adjustment in the commit message ("define_insn_and_split" is changed to
"define_insn_and_rewrite" to match the actual change).
On Fri, 2023-12-29 at 19:55 +0800, Xi Ruoyao wrote:
> On Fri, 2023-12-29 at 15:57
> + return symbolic_pcrel_operand (op, Pmode) ||
> > +symbolic_pcrel_offset_operand (op, Pmode);
> > +})
> > +
> >
> The '||' operator shouldn't be at the end of the line.
Indeed.
>
> + return symbolic_pcrel_operand (op, Pmode)
> + || symbolic_pcrel_offset_operand (op, Pmode);
>
> Others LGTM.
> Thanks!
>
> /* snip */
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The problem with peephole2 is it uses a naive sliding-window algorithm
and misses many cases. For example:
float a[1];
float t() { return a[0] + a[8000]; }
is compiled to:
la.local $r13,a
la.local $r12,a+32768
fld.s $f1,$r13,0
fld.s $f0,$r12,-768
> rperm));
> + tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
Likewise.
> + emit_move_insn (tmp, sel);
> + break;
> + case E_V8SFmode:
> + sel = gen_rtx_CONST_VECTOR (E_V8SImode, gen_rtvec_v (d-
> >nelt,
> +
ymbol_ref:DI ("*.LANCHOR0") [flags 0x182])) [0 S1
> A8]))) "volatile.c":5:11 -1
> (nil))
>
> The volatile property of the mem here is gone, so the test fails.
Phew. I guess I couldn't reproduce it because I have Jeff's ext-dce
patch in my local repo, which removed the zero_extend...
I'll rework this patch.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The GCC internal doc says:
X might be a pseudo-register or a 'subreg' of a pseudo-register,
which could either be in a hard register or in memory. Use
'true_regnum' to find out; it will return -1 if the pseudo is in
memory and the hard register number if it is in a register.
On Mon, 2023-12-25 at 10:08 +0800, chenglulu wrote:
>
> On 2023/12/24 at 8:59 PM, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote:
> > > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> > > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> + "&& true"
> [(set (match_dup 0) (match_dup 1))
> (set (zero_extract:GPR (match_dup 0) (match_dup 2) (match_dup 4))
> (match_dup 3))]
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > > > The performance drop has nothing to do with this patch. I found that
> > > > the h264 performa
On Sun, 2023-12-24 at 01:04 +0800, Xi Ruoyao wrote:
> On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> > > Hi,
> > >
> > > This patch will cause the following tests to fail:
> > >
> >
gcc/ChangeLog:
* config/loongarch/loongarch.md (rotl3):
New define_expand.
* config/loongarch/simd.md (vrotl3): Likewise.
(rotl3): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/rotl-with-rotr.c: New test.
*
On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> > Hi,
> >
> > This patch will cause the following tests to fail:
> >
> > +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn,
> >
ence may be caused by a different binutils version or some
other changes in GCC. I'll figure it out...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > > The performance drop has nothing to do with this patch. I found that the
> > > h264 performance compiled
> > > by r14-6787 compared to r14-6421 dropped
here is a problem. My regression test has the following two fail
> items.(based on r14-6787)
> +FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)
> +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6
Strange. I didn't see them on r14-6650 (with or without the patch).
--
e new
define_insn_and_split produces a better result instead of solely relying
on define_insn_and_split?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Ping :).
On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:
> The problem with peephole2 is it uses a naive sliding-window algorithm
> and misses many cases. For example:
>
> float a[1];
> float t() { return a[0] + a[8000]; }
>
> is compiled to:
>