Bootstrapped and regtested on s390x. Ok for mainline?
gcc/ChangeLog:
* config/s390/s390.cc (expand_perm_as_a_vlbr_vstbr_candidate):
New function which handles bswap patterns for vec_perm_const.
(vectorize_vec_perm_const_1): Call new function.
* config/s390/vector.
This enables the following tests which rely on instruction vperm which
is available since z13 with the initial vector support.
testsuite/gcc.dg/vect/vect-bswap16.c
42:/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target
{ vect_bswap || sse4_runtime } } } } */
testsuite/gcc
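As a rough illustration of the kind of loop such tests exercise (a sketch only, not the contents of vect-bswap16.c; the function name is hypothetical), a byte swap over an array of 16-bit elements can be written as:

```c
#include <stdint.h>

/* Swap the two bytes of every 16-bit element in place.  A vectorizer
   may turn this loop into a byte permute on targets that support one.
   __builtin_bswap16 is a GCC builtin.  */
void
bswap16_array (uint16_t *a, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = __builtin_bswap16 (a[i]);
}
```

Whether the loop is actually vectorized depends on the target and the options used.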
On Wed, Aug 2, 2023 at 11:10 PM Jeff Law via Gcc-patches
wrote:
>
>
>
> On 8/2/23 17:52, Andrew Pinski via Gcc-patches wrote:
> > This moves a few simple patterns that are done in value replacement
> > in phiopt over to match.pd. Just the simple ones which might show up
> > in other code.
> >
> >
Committed, thanks Jeff and Kito.
Pan
-----Original Message-----
From: Li, Pan2
Sent: Thursday, August 3, 2023 2:17 PM
To: Jeff Law ; Wang, Yanzhang ;
Kito Cheng
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com
Subject: RE: [PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-po
Hi Richard,
Richard Biener writes:
> On Tue, 1 Aug 2023, Jiufu Guo wrote:
>
>>
>> Hi,
>>
>> Richard Biener writes:
>>
>> > On Mon, 24 Jul 2023, Jiufu Guo wrote:
>> >
>> >>
>> >> Hi Martin,
>> >>
>> >> Not sure about your current option about re-using the ipa-sra code
>> >> in the light-e
Thanks Jeff and nice dream, I will commit this patch.
Pan
-----Original Message-----
From: Jeff Law
Sent: Thursday, August 3, 2023 2:13 PM
To: Wang, Yanzhang ; Kito Cheng
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li,
Pan2
Subject: Re: [PATCH v3] RISCV: Add -m(
From: Pan Li
This patch would like to support the rounding mode API for the
VFWMUL for the below samples.
* __riscv_vfwmul_vv_f64m2_rm
* __riscv_vfwmul_vv_f64m2_rm_m
* __riscv_vfwmul_vf_f64m2_rm
* __riscv_vfwmul_vf_f64m2_rm_m
Signed-off-by: Pan Li
gcc/ChangeLog:
* config/riscv/riscv-
On 8/1/23 19:51, Wang, Yanzhang wrote:
Hi Jeff,
Do you have any further comments about this patch ?
I thought we covered this in the meeting earlier this week. This is
fine for the trunk.
If you or Pan doesn't get around to committing it before I start my day
tomorrow, I'll go ahead and
On 8/2/23 17:52, Andrew Pinski via Gcc-patches wrote:
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.
This allows some optimizations to happen even without depending
on sinking from happeni
On 8/2/23 06:44, Richard Biener via Gcc-patches wrote:
statement_sink_location for loads is currently confused about
stores that are not on the paths we are sinking across. The
following replaces the logic that tries to ensure we are not
sinking across stores by instead of walking all immedia
On 8/2/23 06:44, Richard Biener via Gcc-patches wrote:
The following adds an on-demand global liveness computation class
computing and caching the live-out virtual operand of basic blocks
and answering live-out, live-in and live-on-edge queries. The flow
is optimized for the intended use in c
On Thu, Aug 3, 2023 at 12:18 AM Roger Sayle wrote:
>
>
> This patch is a conservative fix for PR target/110792, a wrong-code
> regression affecting doubleword rotations by BITS_PER_WORD, which
> effectively swaps the highpart and lowpart words, when the source to be
> rotated resides in memory. Th
On 7/29/23 03:14, Xiao Zeng wrote:
1 Thank you for Jeff's code review comments. I have made the modifications
and submitted the V2-patch[3/5].
Yea. I'm adjusting my tree based on those updates. For testing I've
actually got my compiler generating zicond by default and qemu allowing
zicon
On Thu, Aug 3, 2023 at 11:18, YunQiang Su wrote:
>
> PR #104914
>
> On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true,
> zero_extract (SI, SI) can be sign-extended. So a zero_extract (DI, DI)
> followed by a sign_extend (SI, DI) can be merged into a single
> zero_extract (SI, SI).
>
The
I am wondering whether it would be better to have multiple macro definitions
for FRM, like:
DECLARE_FRM_FUNCTION_BASE (NAME)\
extern const function_base *const NAME;
extern const function_base *const NAME##_frm;
DECLARE_FRM_FUNCTION (NAME, )\
DEF_RVV_FUNCTION (NAME##_frm, alu, );
DEF_RVV
From: Pan Li
This patch would like to support the rounding mode API for the
VFDIV and VFRDIV for the below samples.
* __riscv_vfdiv_vv_f32m1_rm
* __riscv_vfdiv_vv_f32m1_rm_m
* __riscv_vfdiv_vf_f32m1_rm
* __riscv_vfdiv_vf_f32m1_rm_m
* __riscv_vfrdiv_vf_f32m1_rm
* __riscv_vfrdiv_vf_f32m1_rm_m
Sig
PR #104914
On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true,
zero_extract (SI, SI) can be sign-extended. So a zero_extract (DI, DI)
followed by a sign_extend (SI, DI) can be merged into a single
zero_extract (SI, SI).
gcc/ChangeLog:
PR: 104914.
* combine.
Committed, thanks Juzhe.
Pan
From: juzhe.zh...@rivai.ai
Sent: Thursday, August 3, 2023 10:36 AM
To: Li, Pan2 ; gcc-patches
Cc: Kito.cheng ; Li, Pan2 ; Wang,
Yanzhang
Subject: Re: [PATCH v2] RISC-V: Support RVV VFMUL rounding mode intrinsic API
LGTM
juzhe.zh.
On 7/28/23 00:34, Xiao Zeng wrote:
What I like about yours is it keeps all the logic in riscv.cc rather
than scattering it across riscv.cc and riscv.md.
Yes, when I use enough test cases, I cannot find a concise way to optimize
all test cases. When I enumerated all possible cases in the
Hi, Richi.
I have fully tested in RISC-V port with adding gcc_unreachable () in V4 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626133.html
Bootstrap and regression on X86 passed.
juzhe.zh...@rivai.ai
From: Richard Biener
Date: 2023-08-02 16:33
To: juzhe.zh...@rivai.ai
CC: r
From: Ju-Zhe Zhong
Hi, Richard and Richi.
Based on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html
This patch chooses approach (1) that Richard provided, meaning:
RVV implements cond_* optabs as expanders. RVV therefore supports
both IFN_COND_ADD an
LGTM
juzhe.zh...@rivai.ai
From: pan2.li
Date: 2023-08-03 10:32
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
Subject: [PATCH v2] RISC-V: Support RVV VFMUL rounding mode intrinsic API
From: Pan Li
Update in v2:
* Sync with upstream for the vfmul duplicated declaration
So I didn't expect valueization to cause calling gimple_nop_convert
to iterate between 2 different SSA names causing an infinite loop
in gimple_bitwise_inverted_equal_p.
So we should put a bound on gimple_bitwise_inverted_equal_p calling
gimple_nop_convert and only look through one rather than al
From: Pan Li
Update in v2:
* Sync with upstream for the vfmul duplicated declaration.
Original log:
This patch would like to support the rounding mode API for the VFMUL
for the below samples.
* __riscv_vfmul_vv_f32m1_rm
* __riscv_vfmul_vv_f32m1_rm_m
* __riscv_vfmul_vf_f32m1_rm
* __riscv_vfmul
cc.c-torture/execute/20230802-1.c: New test.
---
gcc/match.pd | 6 +-
.../gcc.c-torture/execute/20230802-1.c| 68 +++
2 files changed, 72 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.c-torture/execute/20230802-1.c
Hi,
I would like to have a ping on this patch.
BR,
Jeff (Jiufu Guo)
Jiufu Guo writes:
> Hi,
>
> As discussed in previous reviews, adding overflow APIs to range-op
> would be useful. Those APIs could help to check if overflow happens
> when operating between two 'range's, like: plus, minus,
Committed, thanks Kito.
Pan
From: Kito Cheng
Sent: Thursday, August 3, 2023 10:12 AM
To: Li, Pan2
Cc: GCC Patches ; 钟居哲 ; Wang,
Yanzhang
Subject: Re: [PATCH v1] RISC-V: Remove redudant extern declaration in function
base
LGTM
On Thu, Aug 3, 2023 at 10:11, pan2...@intel.com wrote:
From: Pan Li
LGTM
On Thu, Aug 3, 2023 at 10:11, wrote:
> From: Pan Li
>
> This patch would like to remove the redundant declaration.
>
> Signed-off-by: Pan Li
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.h: Remove
> redundant declaration.
> ---
> gcc/config/riscv/riscv-vector-builti
From: Pan Li
This patch would like to remove the redundant declaration.
Signed-off-by: Pan Li
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.h: Remove
redundant declaration.
---
gcc/config/riscv/riscv-vector-builtins-bases.h | 1 -
1 file changed, 1 deletion(-)
diff
Sure thing, will prepare it on the double.
Pan
From: juzhe.zh...@rivai.ai
Sent: Thursday, August 3, 2023 10:02 AM
To: Li, Pan2 ; gcc-patches
Cc: Wang, Yanzhang ; kito.cheng
Subject: Re: RE: [PATCH v1] RISC-V: Support RVV VFMUL rounding mode intrinsic
API
Could you split it into 2 patches ?
one is cleanup patch which is removing the redundant declaration.
The other is support VFMUL API.
juzhe.zh...@rivai.ai
From: Li, Pan2
Date: 2023-08-03 09:44
To: juzhe.zh...@rivai.ai; gcc-patches
CC: Wang, Yanzhang; kito.cheng
Subject: RE: [PATCH v1] RISC-V
Yes, it looks like there are some I missed after the last cleanup. I will have a double
check after rounding API support.
Pan
From: juzhe.zh...@rivai.ai
Sent: Thursday, August 3, 2023 9:40 AM
To: Li, Pan2 ; gcc-patches
Cc: Li, Pan2 ; Wang, Yanzhang ;
kito.cheng
Subject: Re: [PATCH v1] RISC-V: Support
extern const function_base *const vfmul;
-extern const function_base *const vfmul;
+extern const function_base *const vfmul_frm;
It seems that there is a redundant declaration in the original code?
extern const function_base *const vfmul;
-extern const function_base *const vfmul;
juzhe.zh...@r
From: Pan Li
This patch would like to support the rounding mode API for the VFMUL
for the below samples.
* __riscv_vfmul_vv_f32m1_rm
* __riscv_vfmul_vv_f32m1_rm_m
* __riscv_vfmul_vf_f32m1_rm
* __riscv_vfmul_vf_f32m1_rm_m
Signed-off-by: Pan Li
gcc/ChangeLog:
* config/riscv/riscv-vecto
Committed, thanks Kito.
Pan
From: Kito Cheng
Sent: Wednesday, August 2, 2023 9:48 PM
To: Li, Pan2
Cc: GCC Patches ; 钟居哲 ; Wang,
Yanzhang
Subject: Re: [PATCH v1] RISC-V: Support RVV VFWSUB rounding mode intrinsic API
LGTM, thanks:)
Pan Li via Gcc-patches
mailto:gcc-patches@gcc.gnu.org>> 於 2
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.
This allows some optimizations to happen even without depending
on sinking from happening and in some cases where phiopt is not
invoked (cond-1.c
This patch is a conservative fix for PR target/110792, a wrong-code
regression affecting doubleword rotations by BITS_PER_WORD, which
effectively swaps the highpart and lowpart words, when the source to be
rotated resides in memory. The issue is that if the register used to
hold the lowpart of the
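For illustration only (not taken from the patch): assuming a 64-bit doubleword made of two 32-bit words, a rotation by BITS_PER_WORD is the same operation as swapping the highpart and lowpart. The helper names below are hypothetical.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by 32, i.e. by BITS_PER_WORD on an
   assumed target with 32-bit words.  */
uint64_t
rotl64_by_word (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* The same result written as an explicit swap of the two words.  */
uint64_t
swap_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) lo << 32) | hi;
}
```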
On Wed, Aug 2, 2023 at 1:25 AM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Aug 02, 2023 at 10:04:26AM +0200, Richard Biener via Gcc-patches
> wrote:
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -1157,8 +1157,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >
> > > /* Simplify ~X
Please put your testcases into:
# widening operation only test on LMUL < 8
set AUTOVEC_TEST_OPTS [list \
{-ftree-vectorize -O3 --param riscv-autovec-lmul=m1} \
{-ftree-vectorize -O3 --param riscv-autovec-lmul=m2} \
{-ftree-vectorize -O3 --param riscv-autovec-lmul=m4} \
{-ftree-vectorize -O2 -
I just checked LLVM:
https://godbolt.org/z/nMa6qnEeT
This patch generally is reasonable so LGTM.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-08-03 02:49
To: 钟居哲; gcc-patches; palmer; kito.cheng; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: Implement vector "average" autovec patte
On Wed, Aug 2, 2023 at 10:14 AM Andrew Pinski wrote:
>
> On Wed, Aug 2, 2023 at 10:13 AM Prathamesh Kulkarni via Gcc-patches
> wrote:
> >
> > On Mon, 31 Jul 2023 at 22:39, Andrew Pinski via Gcc-patches
> > wrote:
> > >
> > > This is a new version of the patch.
> > > Instead of doing the matching
On Wed, 2023-08-02 at 14:46 -0400, Eric Feng wrote:
> On Wed, Aug 2, 2023 at 1:20 PM Marek Polacek
> wrote:
> >
> > On Wed, Aug 02, 2023 at 12:59:28PM -0400, David Malcolm wrote:
> > > On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
> > >
[Dropping Joseph and Marek from the CC]
[...snip...
Richard Biener writes:
> [...]
>> >> in vect_determine_precisions_from_range. Maybe we should drop
>> >> the shift handling from there and instead rely on
>> >> vect_determine_precisions_from_users, extending:
>> >>
>> >> if (TREE_CODE (shift) != INTEGER_CST
>> >> || !wi::ltu_p (wi::to_w
Ping…
thanks.
Qing
> On Jul 10, 2023, at 3:11 PM, Qing Zhao wrote:
>
> Hi,
>
> This is the change for the GCC14 release notes on the deprecation of a C
> extension about flexible array members.
>
> Okay for committing?
>
> thanks.
>
> Qing
>
>
>
> *htdocs/gcc-14/changes.html (Ca
Ping.
This is a very simple patch to correct a URL address in GCC13’s changes.html.
Currently, it’s pointing to a wrong address.
Okay for committing?
> On Jul 21, 2023, at 3:02 PM, Qing Zhao wrote:
>
> Hi,
>
> In the current GCC13 release note, the URL to the option -fstrict-flex-array
> is
> 1. How do you model round to +Inf (avg_floor) and round to -Inf (avg_ceil) ?
That's just specified by the +1 or the lack of it in the original pattern.
Actually the IFN is just a detour because we would create perfect code
if not for the fallback. But as there is currently no way to check for
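A small sketch of the two averaging forms being discussed, assuming unsigned 8-bit elements and a widened intermediate (the function names are hypothetical and this is not code from the patch). The only difference is the +1 before the shift:

```c
#include <stdint.h>

/* Floor average: (a + b) >> 1, computed in a wider type so the sum
   cannot overflow.  */
uint8_t
avg_floor_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + b) >> 1);
}

/* Ceiling average: (a + b + 1) >> 1, again in a wider type.  */
uint8_t
avg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + b + 1) >> 1);
}
```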
On Wed, Aug 2, 2023 at 1:20 PM Marek Polacek wrote:
>
> On Wed, Aug 02, 2023 at 12:59:28PM -0400, David Malcolm wrote:
> > On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
> >
> > Hi Eric, thanks for the updated patch.
> >
> > Overall, looks good to me, although I'd drop the "Exited." from the
Revised:
-- Remove superfluous { }
-- Reword diagnostic
---
This patch adds a hook to the end of ana::on_finish_translation_unit
which calls relevant stashing-related callbacks registered during plugin
initialization. This feature is used to stash named types and global
variables for a CPython an
On Wednesday, 02.08.2023 at 16:45 +, Qing Zhao wrote:
>
> > On Aug 1, 2023, at 10:31 AM, Martin Uecker wrote:
> >
> > On Tuesday, 01.08.2023 at 13:27 +, Qing Zhao wrote:
> > >
> > > > On Aug 1, 2023, at 3:51 AM, Martin Uecker via Gcc-patches
> > > > wrote:
> > > >
> >
> >
On Wed, 2 Aug 2023, Tamar Christina via Gcc-patches wrote:
> Ping.
>
> > -----Original Message-----
> > From: Tamar Christina
> > Sent: Wednesday, July 26, 2023 8:35 PM
> > To: Tamar Christina ; gcc-patches@gcc.gnu.org
> > Cc: nd ; jos...@codesourcery.com
> > Subject: RE: [PATCH 2/2][frontend]:
On Wed, 2 Aug 2023, Richard Sandiford wrote:
> Richard Biener writes:
> > On Tue, 1 Aug 2023, Richard Sandiford wrote:
> >
> >> Richard Sandiford writes:
> >> > Richard Biener via Gcc-patches writes:
> >> >> The following makes sure to limit the shift operand when vectorizing
> >> >> (short)((i
So we're being a bit too aggressive with the .opt zicond patterns.
(define_insn "*czero.eqz..opt1"
[(set (match_operand:GPR 0 "register_operand" "=r")
(if_then_else:GPR (eq (match_operand:X 1 "register_operand" "r")
(const_int 0))
On Wed, Aug 02, 2023 at 12:59:28PM -0400, David Malcolm wrote:
> On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
>
> Hi Eric, thanks for the updated patch.
>
> Overall, looks good to me, although I'd drop the "Exited." from the
> "sorry" message (and thus from the dg-message directive), since
On Wed, Aug 2, 2023 at 10:13 AM Prathamesh Kulkarni via Gcc-patches
wrote:
>
> On Mon, 31 Jul 2023 at 22:39, Andrew Pinski via Gcc-patches
> wrote:
> >
> > This is a new version of the patch.
> > Instead of doing the matching of inversion comparison directly inside
> > match, creating a new funct
On Mon, 31 Jul 2023 at 22:39, Andrew Pinski via Gcc-patches
wrote:
>
> This is a new version of the patch.
> Instead of doing the matching of inversion comparison directly inside
> match, creating a new function (bitwise_inverted_equal_p) to do it.
> It is very similar to bitwise_equal_p that was
This implements the OpenMP low-latency memory allocator for AMD GCN using the
small per-team LDS memory (Local Data Store).
Since addresses can now refer to LDS space, the "Global" address space is
no-longer compatible. This patch therefore switches the backend to use
entirely "Flat" addressing
This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm,
is thread safe and the size of the low-latency heap can be configured using
t
The NVPTX low latency memory is not accessible outside the team that allocates
it, and therefore should be unavailable for allocators with the access trait
"all". This change means that the omp_low_lat_mem_alloc predefined
allocator now implicitly implies the "pteam" trait.
libgomp/ChangeLog:
This patch series is an updated and reworked version of some of the patch set
posted about a year ago (the other features will be posted soon), this
time supporting amdgcn, in addition to nvptx:
https://patchwork.sourceware.org/project/gcc/list/?series=10748&state=%2A&archive=both
The series impl
On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
Hi Eric, thanks for the updated patch.
Overall, looks good to me, although I'd drop the "Exited." from the
"sorry" message (and thus from the dg-message directive), since the
compiler is not exiting, it's just the particular plugin that's giving
On 8/2/23 04:05, Richard Sandiford wrote:
Jeff Law via Gcc-patches writes:
On 8/1/23 05:18, Richard Sandiford wrote:
Where were you seeing the requirement for pointer equality? genrecog.cc
at least uses rtx_equal_p, and I think it has to. E.g. some patterns
use (match_dup ...) to match o
On Thu, Jun 1, 2023 at 2:11 PM Bernhard Reutner-Fischer
wrote:
>
> Hi David, Patrick,
>
> On Thu, 1 Jun 2023 18:33:46 +0200
> Bernhard Reutner-Fischer wrote:
>
> > On Thu, 1 Jun 2023 11:24:06 -0400
> > Patrick Palka wrote:
> >
> > > On Sat, May 13, 2023 at 7:26 PM Bernhard Reutner-Fischer via
>
> On Aug 1, 2023, at 10:31 AM, Martin Uecker wrote:
>
> On Tuesday, 01.08.2023 at 13:27 +, Qing Zhao wrote:
>>
>>> On Aug 1, 2023, at 3:51 AM, Martin Uecker via Gcc-patches
>>> wrote:
>>>
>
>
Hi Martin,
Just wondering if it'd be a good idea perhaps to warn if allo
Hi Dave,
Thank you for the feedback! I've incorporated the changes and sent a
revised version of the patch.
On Tue, Aug 1, 2023 at 1:02 PM David Malcolm wrote:
>
> On Tue, 2023-08-01 at 09:52 -0400, Eric Feng wrote:
> > Hi all,
> >
> > This patch adds a hook to the end of ana::on_finish_translat
Canonicalizes (signed x << c) >> c into the lowest
precision(type) - c bits of x IF those bits have a mode precision or a
precision of 1. Also combines this rule with (unsigned x << c) >> c -> x &
((unsigned)-1 >> c) to prevent duplicate pattern. Tested successfully on
x86_64 and x86 targets.
PR
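To make the two rules concrete, here is a minimal sketch for 32-bit operands (hypothetical helper names, not code from the patch). The signed case uses c = 24, so the surviving low bits are 8 wide, and it assumes GCC's behaviour for arithmetic right shifts and for narrowing conversions to signed types.

```c
#include <stdint.h>

/* Unsigned: (x << c) >> c clears the top c bits.  Valid for c < 32.  */
uint32_t
shl_shr_u32 (uint32_t x, unsigned c)
{
  return (x << c) >> c;
}

/* Equivalent mask form: x & ((unsigned) -1 >> c).  */
uint32_t
mask_u32 (uint32_t x, unsigned c)
{
  return x & ((uint32_t) -1 >> c);
}

/* Signed, c = 24: the left shift is done on the unsigned bit pattern to
   avoid overflow, then an arithmetic right shift sign-extends.  */
int32_t
shl_shr_s32_by_24 (int32_t x)
{
  return (int32_t) ((uint32_t) x << 24) >> 24;
}

/* Equivalent form: the lowest 32 - 24 = 8 bits of x, sign-extended.  */
int32_t
low8_s32 (int32_t x)
{
  return (int8_t) x;
}
```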
Revised:
-- Fix indentation problems
-- Add more detail to Changelog
-- Add new test on handling non-CPython code case
-- Turn off debugging inform by default
-- Make on_finish_translation_unit() static
-- Remove superfluous null checks in init_py_structs()
Changes have been bootstrapped and teste
On 2023-08-02 20:41, Richard Biener wrote:
On Tue, 1 Aug 2023, Jiufu Guo wrote:
Hi,
Richard Biener writes:
> On Mon, 24 Jul 2023, Jiufu Guo wrote:
>
>>
>> Hi Martin,
>>
>> Not sure about your current option about re-using the ipa-sra code
>> in the light-expander-sra. And if anything I coul
Hi!
On Fri, Jul 28, 2023 at 06:37:23PM +, Joseph Myers wrote:
> Yes, the type used in _Generic isn't fully specified, just the type after
> integer promotions in contexts where those occur.
Ok. I've removed those static_asserts from the test then, no need to test
what isn't fully specified.
This patch extends option -mbranch-protection=bti with an optional argument
as bti[+all] to force compiler to unconditionally insert bti for all
functions. Because a direct function call at the stage of compiling might be
rewritten to an indirect call with some kind of linker-generated thunk stub
a
> On Aug 1, 2023, at 6:45 PM, Kees Cook wrote:
>
> On Mon, Jul 31, 2023 at 08:14:42PM +, Qing Zhao wrote:
>> /* In general, Due to type casting, the type for the pointee of a pointer
>> does not say anything about the object it points to,
>> So, __builtin_object_size can not directly us
> On Aug 2, 2023, at 2:25 AM, Martin Uecker wrote:
>
> On Tuesday, 01.08.2023 at 15:45 -0700, Kees Cook wrote:
>> On Mon, Jul 31, 2023 at 08:14:42PM +, Qing Zhao wrote:
>>> /* In general, Due to type casting, the type for the pointee of a pointer
>>> does not say anything about the
Tamar Christina writes:
> Hi All,
>
> Currently we segfault when len == 0 for an attribute list.
>
> essentially [cons: =0, 1, 2, 3; attrs: ] segfaults but should be equivalent to
> [cons: =0, 1, 2, 3] and [cons: =0, 1, 2, 3; attrs:]. This fixes it by just
> returning early and leaving it to the
Tamar Christina writes:
> Hi All,
>
> In GCC 11 we implemented the vectorizer optab for widening left shifts,
> however this optab is only supported for uniform shift constants.
>
> At the moment GCC still has two loop vectorization strategies (classical loop
> and
> SLP based loop vec) and the opt
Hi all,
I'm pinging to discuss again if we want to move this forward for GCC14.
I did some testing again and I haven't been able to find obvious
regressions, including testing the code from PR86270 and PR70359 that
Richard mentioned.
I still believe that zero can be considered a special case even
On 8/2/23 07:49, Stefan Schulze Frielinghaus via Gcc-patches wrote:
In certain cases a constant may not fit into the mode used to perform a
comparison. This may be the case for sign-extended constants which are
used during an unsigned comparison as e.g. in
(set (reg:CC 100 cc)
(compare:
My concerns are:
1. How do you model round to +Inf (avg_floor) and round to -Inf (avg_ceil) ?
2. Is it possible we could use vaadd[u] to model avg ?
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-08-01 22:31
To: gcc-patches; palmer; Kito Cheng; juzhe.zh...@rivai.ai; jeffreyalaw
CC: rdapp.gc
Okay. This previous small example was used to show the correct behavior of
__bos
for Fixed arrays when the allocation size and the TYPE_SIZE are mismatched.
Now we agreed on the correct behavior for each of the cases for the fixed array.
Since the new “counted_by” attribute is mainly a comple
In certain cases a constant may not fit into the mode used to perform a
comparison. This may be the case for sign-extended constants which are
used during an unsigned comparison as e.g. in
(set (reg:CC 100 cc)
(compare:CC (mem:SI (reg/v/f:SI 115 [ a ]) [1 *a_4(D)+0 S4 A64])
(const_int
On 8/2/23 06:50, Richard Biener via Gcc-patches wrote:
On Mon, 31 Jul 2023, Richard Biener wrote:
statement_sink_location for loads is currently confused about
stores that are not on the paths we are sinking across. The
following avoids this by explicitly checking whether a block
with a st
LGTM, thanks:)
On Wed, Aug 2, 2023 at 18:19, Pan Li via Gcc-patches wrote:
> From: Pan Li
>
> This patch would like to support the rounding mode API for the VFWSUB
> for the below samples.
>
> * __riscv_vfwsub_vv_f64m2_rm
> * __riscv_vfwsub_vv_f64m2_rm_m
> * __riscv_vfwsub_vf_f64m2_rm
> * _
Richard Biener writes:
> On Tue, 1 Aug 2023, Richard Sandiford wrote:
>
>> Richard Sandiford writes:
>> > Richard Biener via Gcc-patches writes:
>> >> The following makes sure to limit the shift operand when vectorizing
>> >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
>>
On 8/1/23 01:20, Surya Kumari Jangala wrote:
Ping
Sorry for delay with the answer. I was on vacation.
On 21/07/23 3:43 pm, Surya Kumari Jangala via Gcc-patches wrote:
The improve_allocation() routine does not update the
allocated_hardreg_p[] array after an allocno is assigned a register.
If
On Mon, 31 Jul 2023, Richard Biener wrote:
> statement_sink_location for loads is currently confused about
> stores that are not on the paths we are sinking across. The
> following avoids this by explicitly checking whether a block
> with a store is on any of those paths. To not perform too man
On Tue, Aug 1, 2023 at 12:15 PM Jan Hubicka via Gcc-patches
wrote:
>
> Hi,
> This patch fixes update after constant peeling in prologue. We now reached
> 0 profile
> update bugs on tramp3d vectorization and also on quite a few testcases, so I am
> enabling the
> testsuite checks so we do not re
statement_sink_location for loads is currently confused about
stores that are not on the paths we are sinking across. The
following replaces the logic that tries to ensure we are not
sinking across stores by instead of walking all immediate virtual
uses and then checking whether found stores are o
The following adds an on-demand global liveness computation class
computing and caching the live-out virtual operand of basic blocks
and answering live-out, live-in and live-on-edge queries. The flow
is optimized for the intended use in code sinking which will query
live-in and possibly can be opt
On Tue, 1 Aug 2023, Jiufu Guo wrote:
>
> Hi,
>
> Richard Biener writes:
>
> > On Mon, 24 Jul 2023, Jiufu Guo wrote:
> >
> >>
> >> Hi Martin,
> >>
> >> Not sure about your current option about re-using the ipa-sra code
> >> in the light-expander-sra. And if anything I could input please
> >>
On Tue, 1 Aug 2023, Richard Sandiford wrote:
> Richard Sandiford writes:
> > Richard Biener via Gcc-patches writes:
> >> The following makes sure to limit the shift operand when vectorizing
> >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
> >> operand otherwise invokes un
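A scalar sketch of the narrowing in question (hypothetical names, not from the patch; assumes arithmetic right shift of negative values, as GCC implements). A 16-bit shift only has counts 0..15 available, so the count 31 from the wide form has to be limited to 15 to get the same value:

```c
#include <stdint.h>

/* Wide form: promote to 32 bits, shift by 31, truncate back.
   Gives -1 for negative x and 0 otherwise.  */
int16_t
sign_mask_wide (int16_t x)
{
  return (int16_t) ((int32_t) x >> 31);
}

/* Narrow form with the count limited to 15.  C still promotes x to int
   here, but the value matches a 16-bit arithmetic shift by 15.  */
int16_t
sign_mask_narrow (int16_t x)
{
  return (int16_t) (x >> 15);
}
```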
Hello,
On Tue, 1 Aug 2023, Joseph Myers wrote:
> > Only because cmpxchg is defined in terms of memcpy/memcmp. If it were
> > defined in terms of the == operator (obviously applied recursively
> > member-wise for structs) and simple-assignment that wouldn't be a problem.
>
> It also wouldn't
ACLE has added intrinsics to bridge between SVE and Neon.
The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
SVE vectors.
This patch adds support to GCC for the following 3 intrinsics:
svset_neonq, svget_neonq and svdup_neonq
gcc/ChangeLog:
* config.gcc: Adds n
The only exported PHI allocation already adds the PHI node to a block.
Bootstrapped on x86_64-unknown-linux-gnu, pushed.
* tree-phinodes.h (add_phi_node_to_bb): Remove.
* tree-phinodes.cc (add_phi_node_to_bb): Make static.
---
gcc/tree-phinodes.cc | 3 +--
gcc/tree-phinodes.h |
Tamar Christina writes:
>> Tamar Christina writes:
>> > Hi All,
>> >
>> > When determining issue rates we currently discount non-constant MLA
>> > accumulators for Advanced SIMD but don't do it for the latency.
>> >
>> > This means the costs for Advanced SIMD with a constant accumulator are
>> >
Tamar Christina writes:
> Hi All,
>
> boolean comparisons have different cost depending on the mode. e.g.
> a && b when predicated doesn't require an addition instruction, the AND is
> free
Nit (for the commit msg): additional
Maybe:
for SVE, a && b doesn't require an additional instruction
> Tamar Christina writes:
> > Hi All,
> >
> > When determining issue rates we currently discount non-constant MLA
> > accumulators for Advanced SIMD but don't do it for the latency.
> >
> > This means the costs for Advanced SIMD with a constant accumulator are
> > wrong and results in us costing S
On 26/07/2023 16:26, Jason Merrill wrote:
> On 6/28/23 06:35, Alex Coplan wrote:
> > Hi,
> >
> > This patch implements clang's __has_feature and __has_extension in GCC.
> > This is a v2 of the original RFC posted here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617878.html
> >
>
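For readers unfamiliar with these macros, a typical portable use looks roughly like the sketch below. The fallback definitions follow the pattern from clang's documentation; `address_sanitizer` is a clang feature name, and whether this patch accepts it in GCC is an assumption here. The function name is hypothetical.

```c
/* Fall back to "not available" on compilers without these macros.  */
#ifndef __has_feature
# define __has_feature(x) 0
#endif
#ifndef __has_extension
# define __has_extension(x) __has_feature(x)
#endif

/* Return 1 if the compiler reports AddressSanitizer as enabled via
   __has_feature, 0 otherwise.  */
int
built_with_asan (void)
{
#if __has_feature(address_sanitizer)
  return 1;
#else
  return 0;
#endif
}
```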
Tamar Christina writes:
> Hi All,
>
> When determining issue rates we currently discount non-constant MLA
> accumulators
> for Advanced SIMD but don't do it for the latency.
>
> This means the costs for Advanced SIMD with a constant accumulator are wrong
> and
> results in us costing SVE and Adv
Hi All,
Currently we segfault when len == 0 for an attribute list.
essentially [cons: =0, 1, 2, 3; attrs: ] segfaults but should be equivalent to
[cons: =0, 1, 2, 3] and [cons: =0, 1, 2, 3; attrs:]. This fixes it by just
returning early and leaving it to the validators whether this should error
Hi All,
In GCC 11 we implemented the vectorizer optab for widening left shifts,
however this optab is only supported for uniform shift constants.
At the moment GCC still has two loop vectorization strategies (classical loop and
SLP based loop vec) and the optab is implemented as a scalar pattern.
Hi All,
boolean comparisons have different cost depending on the mode. e.g.
a && b when predicated doesn't require an addition instruction, the AND is free
by combining the predicate of the one operation into the second one. At the
moment though we only fuse compares so this update requires one o
Hi All,
When determining issue rates we currently discount non-constant MLA accumulators
for Advanced SIMD but don't do it for the latency.
This means the costs for Advanced SIMD with a constant accumulator are wrong and
results in us costing SVE and Advanced SIMD the same. This can cause us to