Hi,
Currently move_max follows the tuning feature first, but ideally it
should sync with prefer-vector-width when it is explicitly set to keep
vector move and operation with same vector size.
Bootstrapped/regtested on x86-64-pc-linux-gnu{-m32,}
OK for trunk?
gcc/ChangeLog:
PR
After recent RVV cost model tweak, I found this PR issue has been fixed.
Add testcase and committed.
PR target/112387
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr112387.c: New test.
---
.../vect/costmodel/riscv/rvv/pr112387.c | 19 +++
1
The implementation of this patch has some issues. When I compile 521.wrf
with -Ofast -mlasx -flto -muse-movcf2gr, it results in an ICE:
during RTL pass: reload
module_mp_fast_sbm.fppized.f90: In function 'fast_sbm.constprop':
module_mp_fast_sbm.fppized.f90:1369:25: internal compiler error:
Pushed.
PR tree-optimization/110640
* gcc.dg/torture/pr110640.c: New testcase.
---
gcc/testsuite/gcc.dg/torture/pr110640.c | 22 ++
1 file changed, 22 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/torture/pr110640.c
diff --git
Hi!
On top of the previously posted patch, this simplifies say (x * 16) / (x * 4)
into 4. Unlike the previous pattern, this is something we didn't fold
previously on GENERIC, so I think it shouldn't be all wrapped with #if
GIMPLE. The question whether there should be fold_overflow_warning for
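For illustration only, the identity behind this fold can be sketched in plain C. The function names below are hypothetical and are not taken from the patch; the sketch assumes x is nonzero and that x * 16 does not overflow, since signed overflow is undefined in C.

```c
/* Hypothetical helper: computes (x * 16) / (x * 4) the long way.
   The caller must pass a nonzero x small enough that x * 16 fits
   in an int, otherwise the behavior is undefined.  */
static int quotient_unfolded(int x)
{
    return (x * 16) / (x * 4);
}

/* Hypothetical helper: the constant the simplification described
   above would leave behind under those same assumptions.  */
static int quotient_folded(int x)
{
    (void)x;
    return 4;
}
```

Under the stated assumptions both functions return the same value for any input, positive or negative.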
Hi!
The following testcase is optimized just on GENERIC (using
strict_overflow_p = false;
if (TREE_CODE (arg1) == INTEGER_CST
&& (tem = extract_muldiv (op0, arg1, code, NULL_TREE,
&strict_overflow_p)) != 0)
{
if
On Wed, Nov 29, 2023 at 11:43:05AM +, Julian Brown wrote:
> * c-c++-common/gomp/target-enter-data-1.c: Adjust scan output.
struct bar { int num_vectors; double *vectors; };
is 16 bytes only on 64-bit targets, on 32-bit ones it is just 8 bytes,
so the explicit matching of the * 16
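A small sketch of why a hard-coded 16 is fragile here: the size of struct bar follows the pointer width of the target. The helper name bar_bytes is hypothetical and only serves the illustration.

```c
#include <stddef.h>

struct bar { int num_vectors; double *vectors; };

/* Hypothetical helper: bytes needed for n elements of struct bar.
   On common LP64 targets sizeof (struct bar) is 16 (4-byte int,
   4 bytes of padding, 8-byte pointer); on common ILP32 targets it
   is 8 (4-byte int, 4-byte pointer).  */
static size_t bar_bytes(size_t n)
{
    return n * sizeof(struct bar);
}
```

A test that matches the multiplier literally therefore needs to allow for both values, or be restricted to one pointer width.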
Committed, thanks Kito.
Pan
From: Kito Cheng
Sent: Thursday, December 14, 2023 2:45 PM
To: Juzhe-Zhong
Cc: GCC Patches; Kito Cheng; Jeff Law; Robin Dapp
Subject: Re: [PATCH] RISC-V: Add RVV builtin vectorization cost model
LGTM
Juzhe-Zhong <juzhe.zh...@rivai.ai> wrote on Thursday, December 14, 2023
LGTM
Juzhe-Zhong wrote on Thursday, December 14, 2023 at 11:24:
> This patch fixes PR11153:
>
> ble a1,zero,.L8
> addiw a5,a1,-1
> li a4,4
> addi sp,sp,-16
> mv a2,a0
> sext.w a3,a1
> bleu a5,a4,.L9
> srliw a4,a3,2
>
I prefer all vector related function registration should be in the same
function groups.
like aarch64:
/* A list of all SVE ACLE functions. */
static CONSTEXPR const function_group_info function_groups[] = {
#define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \
{ #NAME, ::NAME,
This patch fixes PR11153:
ble a1,zero,.L8
addiw a5,a1,-1
li a4,4
addi sp,sp,-16
mv a2,a0
sext.w a3,a1
bleu a5,a4,.L9
srliw a4,a3,2
slli a4,a4,4
mv a5,a0
add a4,a4,a0
Hi all,
According to ISE050 published at the end of September, RAO-INT will not
be in Grand Ridge anymore. This patch aims to remove it.
The documentation comes following:
https://cdrdv2.intel.com/v1/dl/getContent/671368
Regtested on x86_64-pc-linux-gnu. Ok for trunk and backport to GCC13?
On Linux/x86_64,
5fdb150cd4bf8f2da335e3f5c3a17aafcbc66dbe is the first bad commit
commit 5fdb150cd4bf8f2da335e3f5c3a17aafcbc66dbe
Author: Julian Brown
Date: Mon Aug 14 12:41:56 2023 +
OpenMP/OpenACC: Rework clause expansion and nested struct handling
caused
FAIL:
On 2023/12/14 4:52, Thomas Schwinge wrote:
> Hi Lipeng!
>
> On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote:
> > On 2023/12/12 1:45, H.J. Lu wrote:
> >> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng
> wrote:
> >> > On 2023/12/9 23:23, Jakub Jelinek wrote:
> >> > > On Sat, Dec 09, 2023 at
On 11/27/23 10:58, Patrick Palka wrote:
gcc/cp/ChangeLog:
* cp-tree.h (type_targs_deducible_from): Adjust return type.
* pt.cc (alias_ctad_tweaks): Handle C++23 inherited CTAD.
(inherited_ctad_tweaks): Define.
(type_targs_deducible_from): Return the deduced
On 12/13/23 19:00, Marek Polacek wrote:
On Wed, Dec 13, 2023 at 11:47:37AM -0500, Jason Merrill wrote:
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location
information.
I think the following should be an obvious
From: Vladimir Mezentsev
These are fixes for releases/gcc-13 for 31109 (gprofng not built and installed
in a combined binutils+gcc build).
I only cherry-picked 24552056fd5fc677c0d032f54a5cad1c4303d312 and tested my
build.
ChangeLog:
* Makefile.def: Add gprofng module.
*
On Wed, Dec 13, 2023 at 7:59 PM Jakub Jelinek wrote:
>
> On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready to push to trunk.
> >
> > gcc/ChangeLog:
> >
> > PR target/112904
> > * config/i386/mmx.md
On 2023/12/13 at 9:20 PM, Xi Ruoyao wrote:
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote:
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
Replace the instruction costs in loongarch_rtx_cost_data constructor
based on micro-benchmark results on LA464 and LA664.
This allows optimizations like "x * 17" to alsl,
Tested x86_64-linux.
Does this look right? Can we do it faster, or simplify it?
-- >8 --
This reduces the overhead of using std::is_trivially_destructible_v and
as a result fixes some recent regressions seen with a non-default
GLIBCXX_TESTSUITE_STDS env var:
FAIL: 20_util/variant/87619.cc
On 12/13/23 02:03, Christoph Müllner wrote:
On Wed, Dec 13, 2023 at 9:22 AM Liao Shihua wrote:
Some Scalar Crypto built-in functions require immediate parameters,
but register_operand is incorrectly used in the pattern.
E.g.:
__builtin_riscv_aes64ks1i(rs1,1)
Before:
li
The alpha port recently failed its weekly test due to a lack of a
prototype for the syscall() routine. Fixed thusly and pushed to the trunk.
Jeff
commit acfd33620af3519b84baecedb0eb6618c2f599a6
Author: Jeff Law
Date: Wed Dec 13 17:24:39 2023 -0700
[committed] Minor testsuite fallout
On Wed, Dec 13, 2023 at 11:47:37AM -0500, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> -- 8< --
>
> When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location
> information.
I think the following should be an obvious fix, so I'll check it in.
--
Alex Coplan writes:
> This patch uses the new force_reload_address routine added by the
> previous patch to fix PR112906.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
OK, thanks, and sorry for the breakage.
Richard
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
> PR
Alex Coplan writes:
> Hi,
>
> In PR112906 we ICE because we try to use force_reg to reload an
> auto-increment address, but force_reg can't do this.
>
> With the aim of fixing the PR by supporting reloading arbitrary
> addresses in pre-RA splitters, this patch generalizes
>
Thanks for the update. The new comments are really nice, and I think
make the implementation much easier to follow.
I was going to say OK with the changes below, but there's one question/
comment near the end about the double list walk.
Alex Coplan writes:
> +// Convenience wrapper around
On 11/24/23 3:28 AM, Kewen.Lin wrote:
>> + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
>
> Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16?
> I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more,
> we have to add one explicit .to_constant ()
Clean up scan dump failures on linux rv64 vector targets Juzhe mentioned
could be ignored for now. This will help reduce noise and make it more obvious
if a bug or regression is introduced. The failures that are still reported
are either execution failures or failures that are also present on
This patch uses the new force_reload_address routine added by the
previous patch to fix PR112906.
Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
Thanks,
Alex
gcc/ChangeLog:
PR target/112906
* config/aarch64/aarch64-sve.md (@aarch64_vec_duplicate_vq_le):
Use
Hi,
In PR112906 we ICE because we try to use force_reg to reload an
auto-increment address, but force_reg can't do this.
With the aim of fixing the PR by supporting reloading arbitrary
addresses in pre-RA splitters, this patch generalizes
lra-constraints.cc:emit_inc and makes it available to the
Hi Lipeng!
On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote:
> On 2023/12/12 1:45, H.J. Lu wrote:
>> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote:
>> > On 2023/12/9 23:23, Jakub Jelinek wrote:
>> > > On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote:
>> > > > This patch tries to
On 12/12/23 17:48, Marek Polacek wrote:
On Fri, Dec 08, 2023 at 11:09:15PM -0500, Jason Merrill wrote:
On 12/8/23 16:15, Marek Polacek wrote:
On Fri, Dec 08, 2023 at 12:09:18PM -0500, Jason Merrill wrote:
On 12/5/23 15:31, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu,
Hi!
On 2023-12-13T20:36:40+0100, I wrote:
> On 2023-12-13T11:15:54-0800, Jerry D via Gcc wrote:
>> I am getting this failure to build from clean trunk.
>
> This is due to commit r14-6499-g348874f0baac0f22c98ab11abbfa65fd172f6bdd
> "libgomp: basic pinned memory on Linux", which supposedly was
On 12/12/23 16:21, Patrick Palka wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?
OK.
-- >8 --
When unifying constants we need to generally treat constants of
different types but same value as different, in light of auto template
parameters. This patch
On 12/13/23 03:39, Jakub Jelinek wrote:
Hi!
On the c-c++-common/cpp/pr88974.c testcase I'm seeing
==600549== Conditional jump or move depends on uninitialised value(s)
==600549==    at 0x1DD3A05: cpp_get_token_1(cpp_reader*, unsigned int*) (macro.cc:3050)
==600549==    by 0x1DBFC7F:
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
My r14-6505-g52b4b7d7f5c7c0 change to copy the location in
build_aggr_init_expr reopened PR96997; let's fix it properly this time, by
clearing the location like we do for other trees.
PR c++/96997
gcc/cp/ChangeLog:
*
David: Ping.
I guess if we want to have this merged for this release, it should be
sooner rather than later (if it's still an option).
On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote:
> On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds support for getting
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote:
> On 2023/12/13 16:48, Dan Li wrote:
> > + Likun
> >
> > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote:
> >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra
> >> wrote:
> >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote:
> >>>
>
On 12/13/23 11:26, Jakub Jelinek wrote:
On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote:
gcc/testsuite/ChangeLog:
* g++.dg/pr112822.C: Require C++17.
---
gcc/testsuite/g++.dg/pr112822.C | 1 +
1 file changed, 1 insertion(+)
diff --git a/gcc/testsuite/g++.dg/pr112822.C
On Wed, 13 Dec 2023, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> -- 8< --
>
> When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location
> information.
>
> gcc/cp/ChangeLog:
>
> * tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION.
I made
Andrew Carlotti writes:
> On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote:
>> Andrew Carlotti writes:
>> The .def files are included in TM_H by:
>>
>> TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
>> $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
>>
Andrew Carlotti writes:
> Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with
> checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the
> AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is
> retained because removing it would make processing the data
On 12/13/23 04:49, Jakub Jelinek wrote:
Hi!
With valgrind checking, there are various errors reported on some C++26
libstdc++ tests, like:
==2009913== Conditional jump or move depends on uninitialised value(s)
==2009913==    at 0x914C59: gt_ggc_mx_lang_tree_node(void*) (gt-cp-tree.h:107)
On Wed, 13 Dec 2023 at 10:51, haochen.jiang
wrote:
>
> On Linux/x86_64,
>
> a01462ae8bafa86e7df47a252917ba6899d587cf is the first bad commit
> commit a01462ae8bafa86e7df47a252917ba6899d587cf
> Author: Jonathan Wakely
> Date: Mon Dec 11 15:33:59 2023 +
>
> libstdc++: Fix std::format
After r14-2667-gceae1400cf24f329393e96dd9720, we force a constant to a register
if it is shared with one of the other operands. The problem is that we used the
comparison mode for the register, but that could be different from the operand
mode. This causes some issues on some targets.
To fix it, we either
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
When testing the proposed patch for PR71093 I noticed that it changed the
diagnostic for consteval-prop6.C. I then noticed that the diagnostic wasn't
very helpful either way; it was complaining about modification of the 'x'
variable, but
Tested x86_64-pc-linux-gnu, applying to trunk.
This is modified from Nathaniel's last version by adjusting for my recent
CLOBBER changes and removing the special handling of __in_chrg which is no
longer needed since my previous commit.
-- 8< --
This patch adds checks for using objects after
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
I was puzzled by the proposed patch for PR71093 specifically ignoring the
in-charge parameter; the problem turned out to be that when
cxx_eval_call_expression jumps from the clone to the cloned function, it
assumes that the latter has the
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location
information.
gcc/cp/ChangeLog:
* tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION.
gcc/testsuite/ChangeLog:
*
On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote:
> gcc/testsuite/ChangeLog:
>
> * g++.dg/pr112822.C: Require C++17.
> ---
> gcc/testsuite/g++.dg/pr112822.C | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/g++.dg/pr112822.C
On 12/12/23 21:36, Jason Merrill wrote:
On 12/12/23 17:50, Peter Bergner wrote:
On 12/12/23 1:26 PM, Richard Biener wrote:
On 12.12.2023 at 19:51, Peter Bergner wrote:
On 12/12/23 12:45 PM, Peter Bergner wrote:
+/* PR target/112822 */
Oops, this should be:
/* PR
> On 13.12.2023 at 17:12, Filip Kastl wrote:
>
>
>>
Hi,
this is a patch that I submitted two months ago as an RFC. I added some
polish since.
It is a new lightweight pass that removes redundant PHI functions and as a
bonus does basic copy
> On 13.12.2023 at 17:07, Martin Jambor wrote:
>
> Hi,
>
> sorry for getting to this only so late, my email backlog from my medical
> leave still isn't empty.
>
>> On Mon, Oct 16 2023, Richard Biener wrote:
>> The following addresses build_reconstructed_reference failing to
>> build
> > > Hi,
> > >
> > > this is a patch that I submitted two months ago as an RFC. I added some
> > > polish since.
> > >
> > > It is a new lightweight pass that removes redundant PHI functions and as a
> > > bonus does basic copy propagation. With Jan Hubička we measured that it is
Hi,
sorry for getting to this only so late, my email backlog from my medical
leave still isn't empty.
On Mon, Oct 16 2023, Richard Biener wrote:
> The following addresses build_reconstructed_reference failing to
> build references with a different offset than the models and thus
> the caller
> > The difference is that Cores understand the fact that fmadd does not need
> > all three parameters to start computation, while Zen cores don't.
> >
> > Since this seems a noticeable win on Zen and not a loss on Core, it seems
> > like a good default for generic.
> >
> > I plan to commit the
Some AMD GCN devices support an "XNACK" mode in which the device can
handle page-misses (and maybe other traps in memory instructions), but
it's not completely invisible to software.
We need this now to support OpenMP Unified Shared Memory (I plan to post
updated patches for that in January),
Several typos have been found and fixed: missing semicolons, using
variable name instead of type, duplicate functions and wrong types.
gcc/ChangeLog:
* doc/extend.texi (__lsx_vabsd_di): Remove extra `i' in name.
(__lsx_vfrintrm_d, __lsx_vfrintrm_s, __lsx_vfrintrne_d,
Attached is an in-between update for the release notes and also for the project
status page.
The latter contains an implementation-status page that is updated based on the
libgomp.texi entries;
I think there are more issues, but I found an incomplete update which is now
fixed. I probably need
On Sat, Dec 09, 2023 at 07:22:49PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> > ...
>
> This is the only use of native_detect_p, so it'd be good to remove
> the field itself.
Done
> > ...
> >
> > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv)
> >
Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with
checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the
AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is
retained because removing it would make processing the data in
option-extensions.def
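As a rough sketch of the kind of check described above, the following tests for the two individual feature bits rather than a combined crypto bit. The FL_* macros and their bit positions are hypothetical placeholders, not the real AARCH64_FL_* definitions, and requiring both bits at once is an assumption about how the combined mask is meant to be used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only.  */
#define FL_AES    (UINT64_C(1) << 0)
#define FL_SHA2   (UINT64_C(1) << 1)
#define FL_CRYPTO (UINT64_C(1) << 2)  /* still defined, but ignored below */

/* True only when both AES and SHA2 are enabled; the value of the
   legacy FL_CRYPTO bit does not influence the result.  */
static bool has_crypto(uint64_t isa_flags)
{
    const uint64_t need = FL_AES | FL_SHA2;
    return (isa_flags & need) == need;
}
```

With this shape, a flag set carrying only the legacy bit is not treated as having crypto support.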
Add a terminating newline to various tests, and add missing
extensions to some test strings. The current output is broken for
options_set_4.c, so this test is left unchanged, to be fixed in a
subsequent patch.
Committed as obvious, with options_set_4.c removed compared to v1.
Alex Coplan writes:
> On 12/12/2023 15:58, Richard Sandiford wrote:
>> Alex Coplan writes:
>> > Hi,
>> >
>> > This is a v2 version which addresses feedback from Richard's review
>> > here:
>> >
>> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html
>> >
>> > I'll reply inline
On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> The .def files are included in TM_H by:
>
> TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
> $(srcdir)/config/aarch64/aarch64-tuning-flags.def \
>
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote:
> On 2023/12/13 16:48, Dan Li wrote:
> > + Likun
> >
> > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote:
> >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote:
> >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote:
> >>>
> In
Richard Ball writes:
> ACLE has added intrinsics to bridge between SVE and Neon.
>
> The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
> SVE vectors.
>
> This patch adds support to GCC for the following 3 intrinsics:
> svset_neonq, svget_neonq and svdup_neonq
>
>
On 12/12/2023 09:02, Tobias Burnus wrote:
On 11.12.23 18:04, Andrew Stubbs wrote:
Implement the OpenMP pinned memory trait on Linux hosts using the mlock
syscall. Pinned allocations are performed using mmap, not malloc, to
ensure
that they can be unpinned safely when freed.
This
We used a branch to load floating-point comparison results into GPR.
This is very slow when the branch is not predictable.
Use the movcf2gr instruction to implement cstore4 if movcf2gr
is fast enough.
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in (muse-movcf2gr): New
On 12/13/23 2:05 AM, Jakub Jelinek wrote:
> On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote:
>> On Tue, 12 Dec 2023, Peter Bergner wrote:
>>
>>> On 12/12/23 8:36 PM, Jason Merrill wrote:
This test is failing for me below C++17, I think you need
// { dg-do compile {
> > > else if (vect_use_mask_type_p (stmt_info))
> > > {
> > > unsigned int precision = stmt_info->mask_precision;
> > > scalar_type = build_nonstandard_integer_type (precision, 1);
> > > vectype = get_mask_type_for_scalar_type (vinfo, scalar_type,
> > > group_size);
> > >
Committed with below comments, thanks Juzhe and Robin.
Pan
-Original Message-
From: Robin Dapp
Sent: Wednesday, December 13, 2023 9:56 PM
To: Li, Pan2; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Refine test cases for both
Thanks Richard.
LGTM for RISC-V part.
Thanks Robin for fixing it.
juzhe.zh...@rivai.ai
From: Richard Sandiford
Date: 2023-12-13 22:05
To: Robin Dapp
CC: Richard Biener; gcc-patches; juzhe.zhong\@rivai.ai
Subject: Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].
Robin Dapp
Robin Dapp writes:
> @@ -1758,16 +1759,19 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64
> bitsize, poly_uint64 bitnum,
>if (VECTOR_MODE_P (outermode) && !MEM_P (op0))
> {
>scalar_mode innermode = GET_MODE_INNER (outermode);
>enum insn_code icode
> =
Results verified by running
`RUNTESTFLAGS="aarch64-ssve.exp=*" make -k -j 56 check-gcc`
before and after the change. I initially spotted the issue because the tests
were being run a nondeterministic number of times during unrelated regression
testing.
Committed as obvious.
Thanks, LGTM but please add a comment like:
These test cases used to cause out-of-bounds writes to the stack
and therefore showed unreliable behavior. Depending on the
execution environment they can either pass or fail. As of now,
with the latest QEMU version, they will pass even without the
lgtm from my side. But I'd like to see Robin's comments.
Thanks
Replied Message
From: pan2...@intel.com
Date: 12/13/2023 21:49
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, rdapp@gmail.com
Subject: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988
From: Pan Li
Refine the test cases for:
* Name convention.
* Add run case.
PR target/112929
PR target/112988
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to...
* gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here.
*
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote:
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
Replace the instruction costs in loongarch_rtx_cost_data constructor
based on micro-benchmark results on LA464 and LA664.
This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl
and slli.
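The arithmetic behind those two rewrites can be sketched in C as follows. The function names are hypothetical, and the mapping to alsl and slli in the comments is an informal reading of the instructions as "shift left and add" and "shift left by immediate", not compiler output.

```c
#include <stdint.h>

/* x * 17 as a single shift-and-add: (x << 4) + x.
   This is the shape a shift-left-and-add instruction such as alsl
   can compute in one step.  */
static uint32_t mul17(uint32_t x)
{
    return (x << 4) + x;
}

/* x * 68 = (x * 17) * 4: the shift-and-add above followed by a
   left shift by 2, corresponding to alsl followed by slli.  */
static uint32_t mul68(uint32_t x)
{
    return mul17(x) << 2;
}
```

Unsigned arithmetic is used so that wraparound is well defined and the identities hold for every input.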
Since the crypto vector extensions depend on the Vector extension,
add the implied ISA info for the corresponding crypto vector extensions.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Modify implied ISA info.
* config/riscv/arch-canonicalize: Add crypto vector
2023-12-13 18:18 juzhe.zhong wrote:
>
>
>+ multiple_p (GET_MODE_BITSIZE (e.arg_mode (0)),
>+ GET_MODE_BITSIZE (e.arg_mode (1)), );
>
>Change it into gcc_assert (multiple_p (...))
>
>+/* A list of all Vector Crypto intrinsic functions. */
>+static function_group_info
LGTM!
Thanks!
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
Following the instruction cost fix, we are generating
alsl.w $a0, $a0, $a0, 4
instead of
li.w $t0, 17
mul.w $a0, $t0
for "x * 17", because alsl.w is 4 times faster than mul.w. But we didn't
have a sign-extending pattern for
LGTM!
Thanks.
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
With loongarch-def.cc switched from C to C++, we can include rtl.h for
COSTS_N_INSNS, instead of hard coding our own.
This is a non-functional change for now, but it will make the code more
future-proof in case COSTS_N_INSNS in rtl.h would be
The following defers, for non-gather/scatter and non-pattern stmts,
setting of STMT_VINFO_VECTYPE until after we computed the desired
vectorization factor. This allows us to use larger vector types
when the vectorization factor and the preferred vector mode allow,
reducing the number of vector
The gather_load optab and friends require the offset vector mode to
have the same number of lanes as the data vector mode. Restrict the
vector type query to that when searching for a proper offset type.
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p):
Use
The following changes the unsigned group_size argument to a poly_uint64
one to avoid too much special-casing in callers for VLA vectors when
passing down the effective maximum desirable vector size to vector
type query routines. The intent is to be able to pass down
the vectorization factor
The following makes sure to keep LOOP_VINFO_VECT_FACTOR at the
undetermined value zero until it is final, making LOOP_VINFO_VECT_FACTOR
an rvalue and changing some direct references to use the macro.
* tree-vectorizer.h (LOOP_VINFO_VECT_FACTOR): Make an rvalue.
* tree-vect-loop.cc
This reduces more calls to get_vectype_for_scalar_type.
* tree-vect-loop.cc (vect_transform_cycle_phi): Specify
the vector type for invariant/external defs.
* tree-vect-stmts.cc (vectorizable_shift): For invariant
or external shifted operands use the result vector
It seems that what I pushed didn't match what I tested, due to testing
on a different machine!
Tested x86_64-linux, on the right machine this time. Pushed to trunk.
-- >8 --
The change in r14-6468-ga01462ae8bafa8 was only supposed to apply to %C
formats, not %Y.
libstdc++-v3/ChangeLog:
The following removes get_vectype_for_scalar_type calls when we
already have the vector type computed. It also avoids some
premature and possibly redundant or unnecessary check during
data-ref analysis for gathers.
* tree-vect-data-refs.cc (vect_analyze_data_refs): Do
not check
I've been asked to look into how to best relax the current restriction
of the vectorizer that it prefers to use a single vector size throughout
loop vectorization. That size is determined by the preferred_simd_mode
and the autovectorize_vector_modes hook for other-than-first iterations.
The
OK, will add it later.
Replied Message
From: Robin Dapp
Date: 12/13/2023 20:23
To: juzhe.zhong
Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]
> Do you mean
> Do you mean add some comments in tests?
I meant add it as a run test as well and comment that the test
has caused out-of-bounds writes before and passed by the time of
adding it (or so) and is kept regardless.
Regards
Robin
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
Replace the instruction costs in loongarch_rtx_cost_data constructor
based on micro-benchmark results on LA464 and LA664.
This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl
and slli.
gcc/ChangeLog:
PR target/112936
*
Thanks. The attached v2 goes with your suggestion and adds a
vec_extractbi expander. Apart from that it keeps the
MODE_PRECISION changes from before and uses
insn_data[icode].operand[0]'s mode.
Apart from that no changes on the riscv side.
Bootstrapped and regtested on x86 and aarch64. On
Do you mean add some comments in tests?
Replied Message
From: Robin Dapp
Date: 12/13/2023 20:16
To: juzhe.zhong
Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL
> I didn't choose to run it since I had no issues running on my local
> simulator, whether qemu or spike.
Yes it was flaky. That's kind of expected with the out-of-bounds
writes we did. They can depend on runtime environment and other
factors. Of course it's a bit counterintuitive to add a
I didn't choose to run it since I had no issues running on my local simulator, whether qemu or spike.
So it's better to check the vsetvl asm.
full_available is not consistent between LCM analysis and earliest fusion, so it's safe to postpone it.
Replied Message
From: Robin Dapp
Date: 12/13/2023 20:08
Hi Juzhe,
in general looks OK to me.
Just a question for understanding:
> - if (header_info.valid_p ()
> - && (anticipated_exp_p (header_info) || block_info.full_available))
Why is full_available true if we cannot use it?
> +/* { dg-do compile } */
It would be nice if we could
On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote:
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready to push to trunk.
>
> gcc/ChangeLog:
>
> PR target/112904
> * config/i386/mmx.md (*xop_pcmov_): New define_insn.
>
> gcc/testsuite/ChangeLog:
>
> *
Committed, thanks all.
Pan
From: juzhe.zh...@rivai.ai
Sent: Wednesday, December 13, 2023 7:16 PM
To: demin.han; gcc-patches
Cc: Li, Pan2
Subject: Re: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi
LGTM.