If mask is a constant with value ((1 << N) - 1) << M we can perform this
optimization.
gcc/ChangeLog:
PR target/111252
* config/loongarch/loongarch-protos.h
(loongarch_pre_reload_split): Declare new function.
(loongarch_use_bstrins_for_ior_with_mask): Likewise.
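For illustration, a minimal C sketch of the mask test described in the commit message above: it checks whether a constant has the form ((1 << N) - 1) << M and reports N and M. The name contiguous_mask_p is hypothetical and is not taken from the patch; the real check lives in the LoongArch backend and may well look different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the GCC patch: return true if MASK is a
   single run of N consecutive one bits starting at bit M, i.e.
   ((1 << N) - 1) << M, and store N and M through the pointers.  */
static bool
contiguous_mask_p (uint64_t mask, unsigned *n, unsigned *m)
{
  if (mask == 0)
    return false;

  /* Count and drop the trailing zero bits; their number is M.  */
  unsigned shift = 0;
  while ((mask & 1) == 0)
    {
      mask >>= 1;
      shift++;
    }

  /* A contiguous run now looks like 0b0...01...1, so adding one must
     clear every bit that is currently set.  */
  if ((mask & (mask + 1)) != 0)
    return false;

  /* Count the width of the run; this is N.  */
  unsigned width = 0;
  while (mask != 0)
    {
      mask >>= 1;
      width++;
    }

  *n = width;
  *m = shift;
  return true;
}
```

With this sketch, 0x10 is accepted with N = 1 and M = 4, while 0x50 is rejected because its set bits are not adjacent.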
stdc++-v3/configure.host
> +++ b/libstdc++-v3/configure.host
> @@ -315,7 +315,10 @@ esac
> # Set any OS-dependent and CPU-dependent bits.
> # THIS TABLE IS SORTED. KEEP IT THAT WAY.
> case "${host}" in
> - *-*-linux* | *-*-uclinux*)
> + loongarch*)
> +
ldptr.w $r6,$r12,8
... ... (no SIMD instructions)
Is this a bug in the driver, or did I miss something?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2023-09-05 at 20:01 +0800, chenglulu wrote:
>
> On 2023/9/5 at 7:51 PM, Xi Ruoyao wrote:
> > On Thu, 2023-08-31 at 20:48 +0800, Yang Yujie wrote:
> > > /* Note: optimize_size may vary across functions,
> > > while -m[no]-memcpy imposes a global c
, /* TARGET_EXPLICIT_RELOCS */
| ^~~
| TARGET_EXPLICIT_RELOCS
Why is this removed? If this is an unintentional change I'll add it
back.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
float"
> diff --git a/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
> b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
> new file mode 100644
> index 000..8fb04be8ff5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times {stx\..\t\$r0} 2 } } */
> +
> +extern float arr_f[];
> +extern double arr_d[];
> +
> +void
> +test_f (int base, int index)
> +{
> + arr_f[base + index] = 0.0;
> +}
> +
> +void
> +test_d (int base, int index)
> +{
> + arr_d[base + index] = 0.0;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
| 333 +-
> gcc/doc/md.texi | 11 +
> 17 files changed, 28645 insertions(+), 280 deletions(-)
> create mode 100644 gcc/config/loongarch/lasx.md
> create mode 100644 gcc/config/loongarch/lasxintrin.h
> create mode 100644 gcc/config/loongarch/lsx.md
> create mode 100644 gcc/config/loongarch/lsxintrin.h
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_move (op0_lo, op1_lo);
> + emit_insn (gen_insvdi (op0_hi, GEN_INT (63), GEN_INT (0), op1_hi));
> + DONE;
> + }
> +})
Please remove this part too, for now. I'm trying to figure out a more
generic fix, and if I fail we can add this part later.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
# 0xffef
and $r12,$r12,$r4
andi$r5,$r5,16
or $r12,$r12,$r5
slli.w $r4,$r12,0
jr $r1
.cfi_endproc
But the optimal implementation should be:
bstrpick.w $r4, $r4, 4, 4
bstrins.w $r5, $r4, 4, 4
or $r5, $r4, $r0
So to me we should fix the general case instead. Please hold this part
(you can commit the rest of the patch w/o the loongarch.md change for
now), and I'll try to fix the general case.
Created https://gcc.gnu.org/PR111252 for tracking the issue.
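To make the intended transformation concrete, here is a rough C model of the two instructions next to the generic and/and/or form. The helper names are made up for this sketch, and the unsigned 32-bit model deliberately ignores the sign-extension of .w results on LA64.

```c
#include <stdint.h>

/* Mask with the low WIDTH bits set, for WIDTH in 1..32.  */
static uint32_t
low_bits (unsigned width)
{
  return width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
}

/* Model of bstrpick.w: extract bits [msb:lsb] of SRC, zero-extended.  */
static uint32_t
model_bstrpick_w (uint32_t src, unsigned msb, unsigned lsb)
{
  return (src >> lsb) & low_bits (msb - lsb + 1);
}

/* Model of bstrins.w: replace bits [msb:lsb] of DST with the low
   (msb - lsb + 1) bits of SRC.  */
static uint32_t
model_bstrins_w (uint32_t dst, uint32_t src, unsigned msb, unsigned lsb)
{
  uint32_t mask = low_bits (msb - lsb + 1) << lsb;
  return (dst & ~mask) | ((src << lsb) & mask);
}

/* The generic form: take the bits selected by MASK from B and the
   remaining bits from A.  */
static uint32_t
merge_with_mask (uint32_t a, uint32_t b, uint32_t mask)
{
  return (a & ~mask) | (b & mask);
}
```

For the mask 0x10 (N = 1, M = 4), merge_with_mask (a, b, 0x10) gives the same value as model_bstrins_w (a, model_bstrpick_w (b, 4, 4), 4, 4), which is the equivalence the general fix would need to recognize.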
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"-D_FORTIFY_SOURCE_FAKE", and the second strcmp
will not match "-D_GLIBCXX_ASSERTIONS=1".
> + }
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
we should add -march=la64-baseline (or another name?) as the
"bottom line" of a LA64 CPU. Currently the definition of -
march=loongarch64 includes unaligned access and 64-bit FP support, so
IMO we should have a baseline definition if we need to support something
"below" loongarch64.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"${target}" in
> ;;
> esac
>
> + # Use mstrict-align for building libraries if
> --with-strict-align-lib is given.
> + loongarch_multilib_list_make="${loongarch_mult
>
>
> and the corresponding Info file says:
>
> This is ../info/emacs, produced by makeinfo version 4.8 from emacs.texi.
>
> So I'm not sure what exactly is the feature that requires Texinfo 6.8.
> What am I missing?
FWIW I tried building Binutils-2.41 with Texinfo 6.7 and it built
successfully.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_BIG_ENDIAN
> ? 0
> : (GET_MODE_SIZE (int_mode)
> - GET_MODE_SIZE (narrow_mode_iter)));
> *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> - *pop1 = GEN_INT (n);
> + *pop1 = gen_int_mode (n, narrow_mode_iter);
> return adjusted_code;
> }
> }
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
l = PL_savestack_ix;
> + int paren_elems_to_push = (PL_regsize - parenfloor) * 4;
> + int p;
> +
> + if (paren_elems_to_push < 0)
> + Perl_croak ("panic: paren_elems_to_push < 0");
> +
> + if (PL_savestack_ix + (paren_elems_to_push + 6) > PL_savestack_max)
> + Perl_savestack_grow_cnt (paren_elems_to_push + 6);
> +
> + return retval;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2023-08-24 at 11:40 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Thu, 2023-08-24 at 11:13 +0800, Chenghui Pan wrote:
> > - Add dg-skip-if for loongarch*-*-* in vshuf test in g++.dg/torture, because
> > vshuf/xvshuf insn's result is undefined when 6 or 7 bit of vector
tcases.
I'm almost sure this is wrong. You need to fix the code generation so
__builtin_shuffle will always generate something defined on LoongArch,
instead of covering up the issue.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
ither LEU or GEU. */
> > @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code,
> > machine_mode mode,
> > HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
> > HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
> > GET_MODE
* src/loongarch64/ffitarget.h: New file.
> * src/loongarch64/sysv.S: New file.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
ber it."
So when I use asm(name), the compiler has no obligation to guarantee
that it will ever work like a normal variable after a function call.
But I still need to verify that the compiler correctly understands only
the low 64 bits of the vector register are saved. I'll try to make
anothe
On Fri, 2023-08-18 at 15:05 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Fri, 2023-08-18 at 14:58 +0800, Xi Ruoyao via Gcc-patches wrote:
> > On Fri, 2023-08-18 at 14:39 +0800, chenxiaolong wrote:
> > > On Thu, 2023-08-17 at 15:08 +, Joseph Myers wrote:
> > > > On Thu,
On Fri, 2023-08-18 at 14:58 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Fri, 2023-08-18 at 14:39 +0800, chenxiaolong wrote:
> > On Thu, 2023-08-17 at 15:08 +, Joseph Myers wrote:
> > > On Thu, 17 Aug 2023, Xi Ruoyao via Gcc-patches wrote:
> > >
>
On Fri, 2023-08-18 at 14:39 +0800, chenxiaolong wrote:
> On Thu, 2023-08-17 at 15:08 +, Joseph Myers wrote:
> > On Thu, 17 Aug 2023, Xi Ruoyao via Gcc-patches wrote:
> >
> > > So I guess we just need
> > >
> > > builtin_define ("__builtin_fabsq=__builtin_fa
really need the
"q" builtins.
Joseph: the problem here is many customers of LoongArch CPUs wish to
compile their old code with minimal change. Is it acceptable to add
these builtin_define's like rs6000-c.cc? Note "a new architecture" does
not mean we'll only compile post-C2x-era programs onto it.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
| 12 +-
> gcc/config/loongarch/lsx.md | 4481 ++
> gcc/config/loongarch/lsxintrin.h | 5181
> gcc/config/loongarch/predicates.md | 333 +-
> gcc/doc/md.texi | 11 +
> 26
0644
> --- a/libgcc/config/loongarch/t-softfp-tf
> +++ b/libgcc/config/loongarch/t-softfp-tf
> @@ -1,3 +1,6 @@
> softfp_float_modes += tf
> softfp_extensions += sftf dftf
> softfp_truncations += tfsf tfdf
> +#Used to implement a special 128-bit function with a q suffix
> +LIB2ADD += $
On Mon, 2023-08-14 at 19:16 +0800, Xi Ruoyao wrote:
> On Mon, 2023-08-14 at 18:18 +0800, Yujie Yang wrote:
> > On Mon, Aug 14, 2023 at 03:48:53PM +0800, Xi Ruoyao wrote:
> > > On Mon, 2023-08-14 at 15:37 +0800, Yujie Yang wrote:
> > > > On Mon, Aug 14, 2023 at 01:
: Ditto.
> * config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
> * config/loongarch/loongarch.opt: Ditto.
/* snip */
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2023-08-14 at 18:18 +0800, Yujie Yang wrote:
> On Mon, Aug 14, 2023 at 03:48:53PM +0800, Xi Ruoyao wrote:
> > On Mon, 2023-08-14 at 15:37 +0800, Yujie Yang wrote:
> > > On Mon, Aug 14, 2023 at 01:38:40PM +0800, Xi Ruoyao wrote:
> > > > On Mon, 2023-08-14 at
On Mon, 2023-08-14 at 16:57 +0800, Yujie Yang wrote:
> On Mon, Aug 14, 2023 at 04:49:11PM +0800, Xi Ruoyao wrote:
> > On Mon, 2023-08-14 at 16:44 +0800, Yujie Yang wrote:
> > > I assume we all want:
> > >
> > > (1) -mlasx -mlsx -> enable LSX and LASX
>
x86 does this correctly:
$ echo __AVX__ + __AVX2__ | LANG= cpp -E -mno-avx -mavx2
# 0 ""
# 0 ""
# 0 ""
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "" 2
# 1 ""
1 + 1
so there must be a way to handle this...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2023-08-14 at 15:37 +0800, Yujie Yang wrote:
> On Mon, Aug 14, 2023 at 01:38:40PM +0800, Xi Ruoyao wrote:
> > On Mon, 2023-08-14 at 11:57 +0800, Yang Yujie wrote:
> >
> > > However, for LoongArch, we do not want such a "toplevel" library
> > > in
On Mon, 2023-08-14 at 13:58 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Mon, 2023-08-14 at 11:57 +0800, Yang Yujie wrote:
> > * Support options for LoongArch SIMD extensions:
> > new configure options --with-simd={none,lsx,lasx};
> > new driver options -m[no]-l[a]sx /
no real reason to make -mlasx and
-msimd=lasx two different things).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2023-08-14 at 13:38 +0800, Xi Ruoyao wrote:
>
> > However, for LoongArch, we do not want such a "toplevel" library
> > installation since the default ABI may change. We expect all
> > multilib variants of libraries to be installed to their designated
>
uration? To me with --
disable-configuration everything should be still in the toplevel
directory, not any sub-directory.
/* snip */
> ChangeLog:
>
> * config-ml.in: add loongarch support. Allow overriding
Use a tab, not 8 white spaces. Likewise for all patches in the series.
>
garch64|la464"
I think we can remove tune_pattern completely. There is no reason to
limit --with-tune setting based on --with-arch setting.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
RCH", la_target.cpu_arch);
> LARCH_CPP_SET_PROCESSOR ("_LOONGARCH_TUNE", la_target.cpu_tune);
>
> + LARCH_CPP_SET_PROCESSOR ("__loongarch_arch", la_target.cpu_arch);
> + LARCH_CPP_SET_PROCESSOR ("__loongarch_tune", la_target.cpu_tune);
> +
> /* Base architecture / ABI. */
> if (TARGET_64BIT)
> {
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
configure options --with-simd={none,lsx,lasx};
> new driver options -m[no]-l[a]sx / -msimd={none,lsx,lasx}.
What's the relationship between -mlasx and -msimd=lasx? What will
happen if the user specifies -mlasx -msimd=none or -mlasx -msimd=lsx?
--
Xi Ruoyao
School of Aerospace Science a
mma),
> ,$(TM_MULTILIB_CONFIG)),\
> $(call gen_mlib_spec,$(subst /, ,$(mlib
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
t_op & GET_MODE_MASK (int_mode),
> + GET_RTX_NAME (adjusted_code), n);
> }
> poly_int64 offset = (BYTES_BIG_ENDIAN
> ? 0
> : (GET_MODE_SIZE (int_mode)
> - GET_MODE_SIZE (narrow_mode_iter)));
> *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> - *pop1 = GEN_INT (n);
> + *pop1 = gen_int_mode (n, narrow_mode_iter);
> return adjusted_code;
> }
> }
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> +mscatter
> +Target
> +Enable vectorization for scatter instruction.
> +
> mpreferred-stack-boundary=
> Target RejectNegative Joined UInteger
> Var(ix86_preferred_stack_boundary_arg)
> Attempt to keep stack aligned to this power of 2.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> (set (match_dup 1)
> (match_operand:GPR 2 "register_operand" "r"))]
> - ""
> + "TARGET_64BIT"
> "amswap%A3.\t%0,%z2,%1"
> [(set (attr "length") (const_int 8))])
>
> @@ -182,7 +182,7 @@
> [(match_operand:QI 0 "register_operand" "") ;; bool output
> (match_operand:QI 1 "memory_operand" "+ZB") ;; memory
> (match_operand:SI 2 "const_int_operand" "")] ;; model
> - ""
> + "TARGET_64BIT"
> {
> /* We have no QImode atomics, so use the address LSBs to form a mask,
> then use an aligned SImode atomic. */
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
8.
Unfortunately, for the LP64 ABI _ABILP64 is already a part of the public API.
I've tried to raise a deprecation warning for them, but it seems doing
so needs a major change in libcpp... However, the ILP32 ABI is brand new,
so we should take the opportunity to drop the historical burden.
--
Xi
rom implementing -mabi=ilp32d -
march=loongarch64 and they should be fixed. They are not our excuse to
blindly "simulate" what RISC-V has.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2023-08-08 at 10:24 +0800, Xi Ruoyao wrote:
And I think this way to implement these functions (using libgcc calls)
is not the best.
On 64-bit LoongArch a __float128 is stored in a pair of GPRs, so
operations like copysignq and absq can be implemented much more
efficiently by expanding
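A rough sketch of the idea, assuming the IEEE binary128 layout with the sign in bit 63 of the high 64-bit half. The struct and function names are invented for illustration; they are not the libgcc or GCC interfaces, and a real implementation would be expanded in the backend rather than written in C.

```c
#include <stdint.h>

/* Hypothetical view of a __float128 as the two 64-bit halves that
   would sit in a pair of GPRs.  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} f128_bits;

#define F128_SIGN_BIT (UINT64_C (1) << 63)

/* fabsq-like operation: clear the sign bit, touch nothing else.  */
static f128_bits
f128_abs (f128_bits x)
{
  x.hi &= ~F128_SIGN_BIT;
  return x;
}

/* copysignq-like operation: magnitude of X, sign of Y.  */
static f128_bits
f128_copysign (f128_bits x, f128_bits y)
{
  x.hi = (x.hi & ~F128_SIGN_BIT) | (y.hi & F128_SIGN_BIT);
  return x;
}
```

Both operations touch only one bit of the high half, which is why a libgcc call looks heavier than necessary here.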
> +}
Same logic error. And this seems to be exactly the same as nanq, which is
definitely wrong because __builtin_nanq should return a quiet NaN, but
__builtin_nansq should return a signaling NaN.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
only used once?
> + tree type,ftype;
> + tree const_string_type
> +
> =build_pointer_type(build_qualified_type(char_type_node,TYPE_QUAL_CONST));
Really bad format. Under the GNU coding standards you should have a space
after '=', and before '(', etc. Please fix the formatting everywhere.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2023-07-21 at 16:58 +0300, Alexander Monakov wrote:
>
> On Fri, 21 Jul 2023, Xi Ruoyao via Gcc-patches wrote:
>
> > Perhaps -ffp-contract=on (not off) is enough to fix the issue (if you
> > are building GCC 14 snapshot). The default is "fast" (if no -
ger floating
> point rounding inaccuracies?)
It's possible that the test itself is flaky. Can you provide some
detail about how it fails?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
t guard it with #ifdef __GNUC__.
And IMO it's just hiding the real problem.
We need more info of the "particular machine". Is this a hardware bug
(i.e. the machine violates the AArch64 spec) or a GCC code generation
issue? Or should we generally use -ffp-contract=off in BOOT_CFLAGS?
-
If the host triple and the target triple are different but the host is
LoongArch, in some cases --with-arch=native can be useful. For example,
if we are bootstrapping a loongarch64-linux-musl toolchain on a
Glibc-based system and we don't intend to use the toolchain on other
machines, we can use
re system with these GCC patches and -mlasx in
Aug (after Glibc-2.38 release) as a field test too.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
et 1/2 reviewed and committed first.
> Also, can I tweak the commit message without being approved again,
> such as attaching the benchmark result?
Yes, as long as the ChangeLog is still correct (the Git hook will reject
a push with wrong ChangeLog format anyway).
--
Xi Ruoyao
School of Aerospace S
c-patches@gcc.gnu.org for a review, see
https://gcc.gnu.org/contribute.html for the details. Generally we
consider patches attached in bugzilla as drafts.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
stsuite/g++.dg/vect/pr110557.cc:12:8: warning: width of
> 'Item::y' exceeds its type
Ah sorry, I didn't consider ports with 32-bit long.
The attached patch should fix the issue. It has been tested and pushed
as r14-2427 and r13-7555.
--
Xi Ruoyao
School of Aerospace Science
On Mon, 2023-07-10 at 10:33 +, Richard Biener wrote:
> On Fri, 7 Jul 2023, Xi Ruoyao wrote:
>
> > If a bit-field is signed and it's wider than the output type, we
> > must
> > ensure the extracted result sign-extended. But this was not handled
> > c
If a bit-field is signed and it's wider than the output type, we must
ensure the extracted result is sign-extended. But this was not handled
correctly.
For example:
int x : 8;
long y : 55;
bool z : 1;
The vectorized extraction of y was:
vect__ifc__49.29_110 =
MEM [(struct
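The sign-extension requirement described in this message can be sketched in plain C. This is only a scalar illustration with invented names, not the vectorizer code; long long is used for y instead of long so the sketch also compiles where long is 32 bits, which is an assumption made here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the example in the mail, except for the long long.  */
struct item
{
  int x : 8;
  long long y : 55;
  bool z : 1;
};

/* Interpret the low WIDTH bits of RAW (WIDTH in 1..64) as a signed
   two's complement field and widen it to 64 bits.  */
static int64_t
sign_extend_field (uint64_t raw, unsigned width)
{
  uint64_t sign = UINT64_C (1) << (width - 1);
  uint64_t mask = (sign << 1) - 1;   /* wraps to all ones for width 64 */

  raw &= mask;                       /* drop bits above the field */
  return (int64_t) ((raw ^ sign) - sign);
}
```

If the extraction only masks and shifts without the last step, a negative y comes out as a large positive value, which is the kind of wrong result the patch is about.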
' to 'ref_sext'? As
> they
> are named it suggest they apply to the same so I originally thought sign_ext
> should be widening && !TYPE_UNSIGNED.
I'll rename them.
I'll send a v2 after testing it.
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
25 files changed, 28723 insertions(+), 290 deletions(-)
> create mode 100644 gcc/config/loongarch/lasx.md
> create mode 100644 gcc/config/loongarch/lasxintrin.h
> create mode 100644 gcc/config/loongarch/lsx.md
> create mode 100644 gcc/config/loongarch/lsxintrin.h
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2023-06-30 at 10:16 +0800, Chenghui Pan wrote:
> +(define_c_enum "unspec" [
> + UNSPEC_LSX_ASUB_S
> + UNSPEC_LSX_VABSD_U
> + UNSPEC_LSX_VAVG_S
/* ... */
To me many of them can be modeled using RTL templates, instead of an
unspec.
--
Xi Ruoyao
Schoo
P64D anymore), or we add some special switch for
it (like x86's -msseregparm and sseregparm attribute).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
+
> gcc/config/loongarch/predicates.md | 333 +-
> 25 files changed, 28723 insertions(+), 290 deletions(-)
> create mode 100644 gcc/config/loongarch/lasx.md
> create mode 100644 gcc/config/loongarch/lasxintrin.h
> create mode 100644 gcc/config/loongarch/lsx.md
> create mode 100644 gcc/config/loongarch/lsxintrin.h
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2023-06-30 at 04:08 +0800, Xi Ruoyao wrote:
> On Thu, 2023-06-29 at 16:01 -0400, Marek Polacek via Gcc-patches wrote:
> > These tests fail when the testsuite is executed with -fstack-
> > protector-strong.
> > To avoid this, this patch adds -fno-stack-pr
> @@ -1,5 +1,5 @@
> /* { dg-do compile } */
> -/* { dg-options "-O3" } */
> +/* { dg-options "-O3 -fno-stack-protector" } */
>
> static inline void memset_s(void* s, int n) {
> volatile unsigned char * p = s;
>
> base-commit: 070a6bf0bdc6761ad77ac97404c98f00a7007d54
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
@
> +// { dg-do compile { target c++11 } }
> +
> +#include
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +// Positive tests.
> +SA(__is_const(const int));
> +SA(__is_const(const volatile int));
> +SA(__is_const(cClassType));
> +SA(__is_const(cvClassType));
> +
> +// Negative tests.
> +SA(!__is_const(int));
> +SA(!__is_const(volatile int));
> +SA(!__is_const(ClassType));
> +SA(!__is_const(vClassType));
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
e link should be "../conduct.html" :).
> +
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
to close it.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> [(set (pc) (match_operand 0 "register_operand"))]
> > ""
> > @@ -2905,7 +2909,7 @@ (define_expand "indirect_jump"
> > })
> >
> > (define_insn "@indirect_jump"
> > - [(set (pc) (match_operand:P 0 "register_operand" "r"))]
> > + [(set (pc) (match_operand:P 0 "register_operand" "e"))]
> > ""
> > "jr\t%0"
> > [(set_attr "type" "jump")
> > @@ -2928,7 +2932,7 @@ (define_expand "tablejump"
> >
> > (define_insn "@tablejump"
> > [(set (pc)
> > - (match_operand:P 0 "register_operand" "r"))
> > + (match_operand:P 0 "register_operand" "e"))
> > (use (label_ref (match_operand 1 "" "")))]
> > ""
> > "jr\t%0"
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Pushed r14-1839.
On Thu, 2023-06-15 at 09:12 +0800, Lulu Cheng wrote:
> LGTM! Thanks!
>
> On 2023/6/14 at 8:43 AM, Xi Ruoyao wrote:
> > The LA464 micro-architecture is sensitive to alignment of code. The
> > Loongson team has benchmarked various combinations of function, the
On Wed, 2023-06-14 at 09:55 +0800, Jiufu Guo wrote:
> Hi,
>
> Xi Ruoyao writes:
>
> > On Tue, 2023-06-13 at 20:23 +0800, Jiufu Guo via Gcc-patches wrote:
> >
> > > Compared with the previous version, this adds a ChangeLog and removes
> > > const_anchor part
The LA464 micro-architecture is sensitive to alignment of code. The
Loongson team has benchmarked various combinations of function, the
results [1] show that 16-byte label alignment together with 32-byte
function alignment gives best results in terms of SPEC score.
Add a mtune-based table-driven
c.gnu.org/bugzilla/show_bug.cgi?id=104843 and the thread
beginning at
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591470.html. If
you want to use it for rs6000 I guess you need to fix it first...
To me const_anchor needs a complete rework but I don't want to spend my
time on it.
--
On Tue, 2023-05-30 at 09:30 +0800, Lulu Cheng wrote:
>
> On 2023/5/29 at 2:09 PM, Xi Ruoyao wrote:
> > On Tue, 2023-04-18 at 21:06 +0800, Lulu Cheng wrote:
> > > Hi, ruoyao:
> > >
> > > Thank you so much for making this submission. But we are testing
> >
Ping (in hopes that someone can review before the weekend).
On Sat, 2023-06-03 at 19:25 +0800, Xi Ruoyao wrote:
> We used to skip ifunc check when CX16 is available. But now we use
> CX16+AVX+Intel/AMD for the "perfect" 16b load implementation, so CX16
> alone is not
On Sat, 2023-06-03 at 14:53 +0200, Bernhard Reutner-Fischer wrote:
> On 3 June 2023 13:25:32 CEST, Xi Ruoyao via Gcc-patches
> wrote:
>
> > There seems no good way to check if the CPU is Intel or AMD from
> > the built-in macros (maybe we can check every known mod
We used to skip ifunc check when CX16 is available. But now we use
CX16+AVX+Intel/AMD for the "perfect" 16b load implementation, so CX16
alone is not a sufficient reason not to use ifunc (see PR104688).
This causes a subtle and annoying issue: when GCC is built with a
higher -march= setting in
a -falign-
functions= value for the build :).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2023-05-24 at 18:07 +0800, Lulu Cheng wrote:
>
> On 2023/5/24 at 5:25 PM, Xi Ruoyao wrote:
> > On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote:
> > > On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote:
> > > > On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote:
> > >
On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote:
>
> On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote:
> > On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote:
> > > An empty struct type that is not non-trivial for the purposes of
> > > calls
> > > will be treated
pass "Test" via registers, we may only allocate the registers
for Test::a and Test::b, and completely ignore Test::empty because
registers have no addresses. Is this correct or not?
On Wed, 2023-05-24 at 14:45 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Wed, 2023-05-24 at 14:0
llvm.org/D132285). So we should update the
spec here, instead of changing every implementation.
The C++ standard treats the empty struct as size 1 for ensuring the
semantics of pointer comparison operations. When we pass it through the
registers, there is no need to really consider the empty f
can only dedicate a certain amount of
> the day to reviews. And reviewing patches can be time-consuming in
> itself.
>
> So sometimes a patch will get a review within the day. Sometimes it
> will take a bit longer. The fact that a patch doesn't get a response
> within one wor
On Wed, 2023-05-10 at 22:02 +0200, Thomas Koenig wrote:
> On 10.05.23 21:29, Bernhard Reutner-Fischer via Fortran wrote:
> > On Mon, 27 Jun 2022 14:10:36 +0800
> > Xi Ruoyao wrote:
> >
> > > fgrep has been deprecated in favor of grep -F for a long time, and th
ry long. I will reply the result as soon as the test results
> > come out.:-)
> >
> Oh, I got. Thanks very much for all the tests and take your time!
Sorry if it's noisy, but I hope there is some (maybe preliminary)
result: now I finally have some spare time to rebuild the system with
GCC 13 and I'd like to use some -falign-functions= in my CFLAGS :).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2023-04-26 at 21:29 +0800, Xi Ruoyao via Gcc-patches wrote:
> >
> > Do you have any questions about the test cases mentioned by
> > Guo
> > Jie? If there is no problem, modify the test case,
> >
> > I think the code can be merged into the mai
On Sat, 2023-04-29 at 12:05 -0600, Jeff Law wrote:
>
>
> On 4/15/23 06:01, Xi Ruoyao via Gcc-patches wrote:
> > This prevents a spurious message building a cross-compiler when
> > target
> > libc is not installed yet:
> >
> > cc1: error: no i
> > > + }
> > > > +}
> >
> > I think the test case cannot fully reflect the optimization effect
> > of
> > the current patch,
> >
> > because even without the patch, -O -fshrink-wrap will still perform
> > architecture independent optimization.
> >
> > This patch considers architecture related registers as finer grained
> > optimization for shrink wrapping,
> >
> > I think a test case like the one below is more suitable:
> >
> >
> > int foo(int x)
> > {
> > if (x)
> > {
> > __asm__ ("":::"s0","s1");
> > return x;
> > }
> >
> > __asm__ ("":::"s2","s3");
> > return 0;
> > }
> >
> > Otherwise LGTM, thanks!
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
This commit implements the target macros for shrink wrapping of function
prologues/epilogues on LoongArch.
Bootstrapped and regtested on loongarch64-linux-gnu. I don't have
access to SPEC CPU so I hope the reviewer can perform a benchmark to see
if there is real benefit.
"st\\.h" 1 } } */
> > +/* { dg-final { scan-assembler-times "st\\.b" 1 } } */
> > +
> > +extern char a[], b[];
> > +void test() { __builtin_memcpy(a, b, 15); }
> > diff --git a/gcc/testsuite/gcc.target/loongarch/pr109465-2.c
> > b/gcc/testsuite/gcc.target/loongarch/pr109465-2.c
> > new file mode 100644
> > index 000..703eb951c6d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/loongarch/pr109465-2.c
> > @@ -0,0 +1,9 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mabi=lp64d -mstrict-align" } */
> > +/* { dg-final { scan-assembler-times "st\\.d|stptr\\.d" 1 } } */
> > +/* { dg-final { scan-assembler-times "st\\.w|stptr\\.w" 1 } } */
> > +/* { dg-final { scan-assembler-times "st\\.h" 1 } } */
> > +/* { dg-final { scan-assembler-times "st\\.b" 1 } } */
> > +
> > +extern long a[], b[];
> > +void test() { __builtin_memcpy(a, b, 15); }
> > diff --git a/gcc/testsuite/gcc.target/loongarch/pr109465-3.c
> > b/gcc/testsuite/gcc.target/loongarch/pr109465-3.c
> > new file mode 100644
> > index 000..d6a80659b31
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/loongarch/pr109465-3.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mabi=lp64d -mstrict-align" } */
> > +
> > +/* Three loop iterations each contains 4 st.b, and 3 st.b after the
> > loop */
> > +/* { dg-final { scan-assembler-times "st\\.b" 7 } } */
> > +
> > +/* { dg-final { scan-assembler-not "st\\.h" } } */
> > +/* { dg-final { scan-assembler-not "st\\.w|stptr\\.w" } } */
> > +/* { dg-final { scan-assembler-not "st\\.d|stptr\\.d" } } */
> > +
> > +extern char a[], b[];
> > +void test() { __builtin_memcpy(a, b, 15); }
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Tue, 2023-04-18 at 20:03 +0800, Lulu Cheng wrote:
>
> On 2023/4/18 at 7:48 PM, Xi Ruoyao wrote:
> > On Tue, 2023-04-18 at 19:21 +0800, Lulu Cheng wrote:
> > > On 2023/4/18 at 5:27 PM, Xi Ruoyao wrote:
> > > > On Mon, 2023-04-10 at 17:45 +0800, Lulu Cheng wrote:
> > >
the result comes out, this patch will
>
> not be merged into the main branch for the time being.
Ok, I'll wait for the result.
>
> Thanks!
>
> On 2023/4/18 at 8:17 PM, Xi Ruoyao wrote:
> > According to Xuerui's LLVM changeset [1], doing so can make a
> > significant p
On Tue, 2023-04-18 at 20:51 +0800, WANG Xuerui wrote:
>
> On 2023/4/18 20:45, Xi Ruoyao wrote:
> > On Tue, 2023-04-18 at 20:39 +0800, WANG Xuerui wrote:
> > > Hi,
> > >
> > > Thanks for helping confirming on GCC and porting this! I'd never know
> >
On Tue, 2023-04-18 at 20:39 +0800, WANG Xuerui wrote:
> Hi,
>
> Thanks for helping confirming on GCC and porting this! I'd never know
> even GCC lacked this adaptation without someone actually checking... Too
> many things are taken for granted these days.
>
> On 2023/
According to Xuerui's LLVM changeset [1], doing so can make a
significant performance gain.
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for GCC 14?
[1]:https://reviews.llvm.org/D148622
gcc/ChangeLog:
* config/loongarch/loongarch.cc
On Tue, 2023-04-18 at 19:21 +0800, Lulu Cheng wrote:
>
> On 2023/4/18 at 5:27 PM, Xi Ruoyao wrote:
> > On Mon, 2023-04-10 at 17:45 +0800, Lulu Cheng wrote:
> > > Sorry, it's my question. I still have some questions that I haven't
> > > understood, so I haven't replied
including extracting aggregates and
floating-point values in the va list) and the result seems correct. And
gcc/testsuite/gcc.c-torture/execute/va-arg-*.c should provide a good
enough test coverage.
Is there still something that seems problematic?
>
> On 2023/4/10 at 5:04 PM, Xi Ruoyao wrote:
> > Ping
Pushed r14-19.
On Tue, 2023-04-04 at 17:09 +0800, Lulu Cheng wrote:
>
> On 2023/4/4 at 4:38 PM, Xi Ruoyao wrote:
> > 1. Use addu16i.d for TARGET_64BIT and suitable immediates.
> > 2. Split one addition with immediate into two addu16i.d or
> > addi.{d/w}
> > instruction
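As a sketch of the splitting idea in the quoted description, the following C function decides whether an immediate can be added with one addu16i.d (a signed 16-bit immediate shifted left by 16) followed by one addi.d (a signed 12-bit immediate). The function name and interface are hypothetical, and it covers only this one pairing; the actual patch may handle more combinations and use different conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: try to write IMM as (HI16 << 16) + LO12 with
   HI16 in [-32768, 32767] and LO12 in [-2048, 2047].  Return false if
   no such pair exists.  */
static bool
split_add_imm (int64_t imm, int64_t *hi16, int64_t *lo12)
{
  /* The low part must equal IMM modulo 65536, so start from the
     sign-extended low 16 bits of IMM.  */
  int64_t lo = ((imm & 0xffff) ^ 0x8000) - 0x8000;
  if (lo < -2048 || lo > 2047)
    return false;

  /* IMM - LO is now an exact multiple of 65536.  */
  int64_t hi = (imm - lo) / 65536;
  if (hi < -32768 || hi > 32767)
    return false;

  *hi16 = hi;
  *lo12 = lo;
  return true;
}
```

For example, 0x1fffe splits into a high part of 2 and a low part of -2, while 0x12345 cannot be split this way because its low 16 bits do not fit in a signed 12-bit immediate.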
On Tue, 2023-04-18 at 09:54 +0800, Lulu Cheng wrote:
> Pushed to r14-15.
>
> Due to my reasons, this modification did not catch up with the creation
> of the releases/gcc-13 branch,
>
> can I still submit this modification to releases/gcc-13?:-(
I guess we need a decision f