Ping*2
On Wed, Jun 14, 2023, 14:11 Aldy Hernandez wrote:
> PING
>
> On Sat, Jun 10, 2023 at 10:30 PM Aldy Hernandez wrote:
> >
> >
> >
> > On 5/29/23 16:51, Martin Jambor wrote:
> > > Hi,
> > >
> > > On Mon, May 22 2023, Aldy Hernandez via Gcc-patches wrote:
> > >> Implement hashing for
Ping*2
On Wed, Jun 14, 2023, 14:09 Aldy Hernandez wrote:
> PING
>
> On Mon, May 22, 2023 at 8:56 PM Aldy Hernandez wrote:
> >
> > This patch converts the ipa_jump_func code to use the type agnostic
> > ipa_vr suitable for GC instead of value_range which is integer specific.
> >
> > I've
Ping*2
On Wed, Jun 14, 2023, 14:10 Aldy Hernandez wrote:
> PING
>
> On Mon, May 22, 2023 at 8:56 PM Aldy Hernandez wrote:
> >
> > Minor cleanups to get rid of value_range in IPA. There's only one left,
> > but it's in the switch code which is integer specific.
> >
> > OK?
> >
> >
On Jun 21, 2023, Qing Zhao wrote:
> I see that you have a test case to check that the above built_in_trap call
> is generated by the FE.
> Do you have a test case to check that the trap happens at runtime?
I have written such tests, using type-punning, but I don't think our
testing infrastructure
Hello, Qing,
On Jun 16, 2023, Qing Zhao wrote:
> As I mentioned in the previous round of review, I think that the documentation
> might need to add more details on what’s the LEAFY mode,
> The purpose of it, and how to use it, provide more details to the end-users.
I'm afraid I'm having
Thanks for the test.
Did you mean for me to incorporate it into the patch, or do you mean to
contribute it separately, if the feature happens to be accepted?
On Jun 19, 2023, Bernhard Reutner-Fischer wrote:
> I don't see explicit tests with _Complex nor __complex__. Would we
> want to check
This patch to the Go frontend determines the types of a couple of
expression types that accidentally failed to recurse into their
subexpressions. The test case for this is https://go.dev/cl/505015.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed
to mainline.
Ian
Hi there,
I am trying to verify offloading by following the doc below.
https://gcc.gnu.org/wiki/Offloading#How_to_build_an_offloading-enabled_GCC
with some steps:
1. Build nvptx-tools.
2. Symlink nvptx-newlib into the gcc source tree.
3. Build the Nvidia PTX accel compiler.
4. Build the host compiler
Hello,
Jeff Law writes:
> On 6/19/23 22:52, Tamar Christina wrote:
>
>>> It's a bit hackish, but could we reject the stack pointer for operand1 in
>>> the stack-tie? And if we do so, does it help?
>> Yeah this one I had to defer until later this week to look at closer because
>> what
GCC maintainers:
Ver 2. Switched to using code macros to generate the call to the
builtin and test the results. Added in instruction counts for the key
instruction for the builtin. Moved the tests into an additional
function call to ensure the compiler doesn't replace the builtin call
code
On Mon, 2023-06-19 at 15:17 +0800, Kewen.Lin wrote:
> Hi Carl,
>
> on 2023/5/31 04:46, Carl Love wrote:
> > GCC maintainers:
> >
> > The following patch takes the tests in vsx-vector-6-p7.h, vsx-
> > vector-
> > 6-p8.h, vsx-vector-6-p9.h and reorganizes them into a series of
> > smaller
> >
A long time ago, I encountered an ICE when trying to set the clobber register as
Pmode, and I forgot the reason.
So, I clobbered an SI scratch and used PUT_MODE to make it Pmode after reload,
which makes the patterns look unreasonable.
Following Jeff's comments, I tried it again; it works now when we try to
set clobber
On Sat, Mar 25, 2023 at 01:11:14AM -0700, Dan Li wrote:
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
>
> Any suggestion please let me know :).
Hi Dan,
It's been a
I'd like to ping this C++ FE patch for review:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621779.html
Thanks
Dave
On Wed, 2023-06-14 at 20:28 -0400, David Malcolm wrote:
> PR c++/110164 notes that in cases where we have a forward decl
> of a std library type such as:
>
> std::array
On Wed, 21 Jun 2023, Richard Biener via Gcc-patches wrote:
> > > int32_t x = (int32_t)0x1.0p32;
> > > int32_t y = (int32_t)(int64_t)0x1.0p32;
> > >
> > > sets x to 2147483647 and y to 0.
>
> Hmm, good question. GENERIC has a direct truncation to unsigned char
> for example, the C standard
Also change some internal variables to bool and some functions to void.
gcc/ChangeLog:
* function.h (emit_initial_value_sets):
Change return type from int to void.
(aggregate_value_p): Change return type from int to bool.
(prologue_contains): Ditto.
(epilogue_contains):
I merged trunk revision 577223aebc7acdd31e62b33c1682fe54a622ae27 to
the gccgo branch.
Ian
Hi!
First of all, many thanks for the patch!
If I may, I have a question concerning the chosen style in the error
message and one nitpick concerning a return type though, the latter
primarily prompting this reply.
On Tue, 20 Jun 2023 11:54:25 +0100
Paul Richard Thomas via Fortran wrote:
> diff
libcpp/
* charset.cc: Allow `UCS_LIMIT` in UTF-8 strings.
Reported-by: Damien Guibouret
Fixes: c1dbaa6656a (libcpp: reject codepoints above 0x10, 2023-06-06)
Signed-off-by: Ben Boeckel
---
libcpp/charset.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
Hi Paul,
while I only had a minor question regarding gfc_is_ptr_fcn(),
can you still try to enlighten me why that second part
was necessary? (I believed it to be redundant and may have
overlooked the obvious.)
Cheers,
Harald
On 6/21/23 18:12, Paul Richard Thomas via Gcc-patches wrote:
On Wed, 21 Jun 2023, Richard Biener via Gcc-patches wrote:
> > This patch sets the address space of the array type to that of the
> > element type.
> >
> > Regression tests for avr look ok. Ok for trunk?
>
> The patch looks OK to me but please let a C frontend maintainer
> double-check
When stepping through the variable/alias template specialization code
paths, I noticed we perform template argument coercion twice: first from
instantiate_alias_template / finish_template_variable and again from
tsubst_decl (during instantiate_template). It should suffice to perform
coercion
On Wed, Jun 21, 2023 at 05:12:22PM +0100, Paul Richard Thomas wrote:
> Committed as r14-2022-g577223aebc7acdd31e62b33c1682fe54a622ae27
>
> Thanks for the help and the review Harald. Thanks to Steve too for
> picking up Neil Carlson's bugs.
>
It's only natural. You fix bugs in a long desired
First C++26 papers started to trickle in. Update our docs accordingly.
We don't have -std=c++2c/-std=c++26/-std=gnu++2c/-std=gnu++26 yet, but
I should have a patch for it by the end of this week.
W3C validated. Pushed.
commit 9c66e33761140358d350c5fb2d1638f6afdaead4
Author: Marek Polacek
On Jun 20, 2023, at 10:21 AM, David Malcolm wrote:
> Does this testsuite patch look OK?
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620275.html
>
> Thanks
> David
>
> On Mon, 2023-06-12 at 19:11 -0400, David Malcolm wrote:
>> Please can someone review this testsuite patch:
>>
On 6/19/23 3:39 AM, Thomas Schwinge wrote:
Hi Paul!
On 2023-06-16T11:00:02-0500, "Paul E. Murphy via Gcc-patches"
wrote:
This was noticed when fixing the gccgo usage of the macro, the
rust usage is very similar.
TARGET_AIX is defined as a non-zero value on linux/powerpc64le
which may
Committed as r14-2022-g577223aebc7acdd31e62b33c1682fe54a622ae27
Thanks for the help and the review Harald. Thanks to Steve too for
picking up Neil Carlson's bugs.
Cheers
Paul
On Tue, 20 Jun 2023 at 22:57, Harald Anlauf wrote:
>
> Hi Paul,
>
> On 6/20/23 12:54, Paul Richard Thomas via
Committed as r14-2021-gcaf0892eea67349d9a1e44590c3440768136fe2b
Thanks for the pointers, Tobias and Mikael, I used them both.
Paul
On Tue, 20 Jun 2023 at 21:47, Mikael Morin wrote:
>
> Le 20/06/2023 à 18:30, Tobias Burnus a écrit :
> > On 20.06.23 18:19, Paul Richard Thomas via Fortran wrote:
Hi, Alexandre,
>
> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index 22e240a3c2a55..f9cc609b54d94 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -2226,6 +2226,35 @@ convert_lvalue_to_rvalue (location_t loc, struct
> c_expr exp,
> exp.value = convert
Hi, Jeff.
I tried again:
+(define_expand "fma4"
+ [(parallel
+[(set (match_operand:VF_AUTO 0 "register_operand")
+ (fma:VF_AUTO
+ (match_operand:VF_AUTO 1 "register_operand")
+ (match_operand:VF_AUTO 2 "register_operand")
+ (match_operand:VF_AUTO 3
This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.
gcc/ChangeLog:
* config/riscv/autovec.md (fma4): New pattern.
(*fma): Ditto.
(fnma4): Ditto.
(*fnma): Ditto.
(fms4): Ditto.
From: Ju-Zhe Zhong
gcc/ChangeLog:
* internal-fn.cc (expand_partial_store_optab_fn): Adapt for
LEN_MASK_STORE.
(internal_load_fn_p): Add LEN_MASK_LOAD.
(internal_store_fn_p): Add LEN_MASK_STORE.
(internal_fn_mask_index): Add LEN_MASK_{LOAD,STORE}.
On 6/21/23 09:28, 钟居哲 wrote:
I have tried:
(define_expand "fms4"
[(parallel
[(set (match_operand:VF_AUTO 0 "register_operand")
(fma:VF_AUTO
(match_operand:VF_AUTO 1 "register_operand")
(match_operand:VF_AUTO 2 "register_operand")
(neg:VF_AUTO
On 6/21/23 01:49, Richard Biener via Gcc-patches wrote:
The following addresses a miscompilation by RTL scheduling related
to the representation of masked stores. For that we have
(insn 38 35 39 3 (set (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ] [90])
(const:DI
I have tried:
(define_expand "fms4"
[(parallel
[(set (match_operand:VF_AUTO 0 "register_operand")
(fma:VF_AUTO
(match_operand:VF_AUTO 1 "register_operand")
(match_operand:VF_AUTO 2 "register_operand")
(neg:VF_AUTO
(match_operand:VF_AUTO 3 "register_operand"
On 6/21/23 09:20, 钟居哲 wrote:
I failed to make the operand Pmode.
I have tried the following
clobber (match_dup_4)
But it causes too many issues. After many tries, it turns out only the current
solution can work.
Can you describe more concretely what failed?
Offhand I can't think of a
I failed to make the operand Pmode.
I have tried the following
clobber (match_dup_4)
But it causes too many issues. After many tries, it turns out only the current
solution can work.
juzhe.zh...@rivai.ai
From: Jeff Law
Date: 2023-06-21 23:15
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng;
On 6/21/23 05:12, Juzhe-Zhong wrote:
This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.
gcc/ChangeLog:
* config/riscv/autovec.md (fma4): New pattern.
(*fma): Ditto.
(fnma4): Ditto.
On Wed, Jun 21, 2023 at 10:22 AM Richard Biener
wrote:
> > > + /* For conversions between float and smaller integer types try
> > > whether we
> > > +can use intermediate signed integer types to support the
> > > +conversion. */
> >
> > I'm trying to enhance testcase
This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.
gcc/ChangeLog:
* config/riscv/autovec.md (fma4): New pattern.
(*fma): Ditto.
(fnma4): Ditto.
(*fnma): Ditto.
(fms4): Ditto.
LGTM.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-06-21 20:57
To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw
CC: rdapp.gcc
Subject: [PATCH v2] RISC-V: Implement autovec copysign.
Hi,
changes from v1:
- Removed UNSPEC_VNCOPYSIGN
- Adjusted xorsign test
On 6/21/23 00:41, Richard Biener wrote:
I thought during the introduction of erroneous path isolation that we
concluded stores, calls and such had observable side effects that must be
preserved, even when we hit a block that leads to __builtin_unreachable.
Indeed, I remember we repeatedly
Hi,
changes from v1:
- Removed UNSPEC_VNCOPYSIGN
- Adjusted xorsign test expectation.
Regards
Robin
This adds vector copysign, ncopysign and xorsign as well as the
accompanying tests.
gcc/ChangeLog:
* config/riscv/autovec.md (copysign3): Add expander.
(xorsign3): Ditto.
Hi Juzhe,
LGTM apart from a tiny nit:
> + /* We have a maximum of 11 operands for RVV instruction patterns according
> to
> + * vector.md. */
> + insn_expander<11> e (/*OP_NUM*/ op_num, /*HAS_DEST_P*/ true,
Seems like you copied this from the non-fp ternary part but the
rest of the file
Hi all,
The architecture recommends that load-gather instructions avoid using the same
Z register for the load address and the destination, and the Software
Optimization Guides for Arm cores recommend that as well.
This means that for code like:
#include
svuint64_t
food (svbool_t p, uint64_t
On 21.06.2023 09:44, Jan Beulich wrote:
> On 21.06.2023 09:37, Hongtao Liu wrote:
>> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches
>> wrote:
>>>
>>> Isn't prefix_extra use bogus here? What extra prefix does vbroadcastss
>> According to comments, yes, no extra prefix is needed.
>>
>>
Committed, thanks Richard.
Pan
-Original Message-
From: Gcc-patches On Behalf
Of Richard Biener via Gcc-patches
Sent: Wednesday, June 21, 2023 7:42 PM
To: Ju-Zhe Zhong
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH]: [NFC] Move can_vec_mask_load_store_p and
get_len_load_store_mode
I've pushed Jason's patch from https://gcc.gnu.org/PR105651#c17 to the
gcc-12 branch, because Jakub's fix on gcc-13 isn't possible to backport.
Tested x86_64-linux, pushed to gcc-12.
-- >8 --
PR tree-optimization/105651
libstdc++-v3/ChangeLog:
* include/bits/basic_string.tcc
On Wed, 21 Jun 2023, Richard Biener wrote:
> PR110243 shows strip_offset has some correctness issues, the following
> avoids using it from loop distribution which can use the more correct
> split_constant_offset from data-ref analysis instead. The patch then
> un-exports the function and
On Wed, 21 Jun 2023, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong
>
> Since we want both can_vec_mask_load_store_p and get_len_load_store_mode
> to see "internal_fn", move these 2 functions into optabs-tree.
OK.
Thanks,
Richard.
> gcc/ChangeLog:
>
> * optabs-query.cc
From: Ju-Zhe Zhong
Since we want both can_vec_mask_load_store_p and get_len_load_store_mode
to see "internal_fn", move these 2 functions into optabs-tree.
gcc/ChangeLog:
* optabs-query.cc (can_vec_mask_load_store_p): Move to optabs-tree.cc.
(get_len_load_store_mode): Ditto.
This avoids one strip_offset use in add_iv_candidate_for_use where
we know it operates on a sizetype quantity.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
* tree-ssa-loop-ivopts.cc (add_iv_candidate_for_use): Use
split_constant_offset for the POINTER_PLUS_EXPR
This avoids a strip_offset use in record_group_use where we know
it operates on addresses.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
* tree-ssa-loop-ivopts.cc (record_group_use): Use
split_constant_offset.
---
gcc/tree-ssa-loop-ivopts.cc | 8
1 file
PR110243 shows strip_offset has some correctness issues, the following
avoids using it from loop distribution which can use the more correct
split_constant_offset from data-ref analysis instead. The patch then
un-exports the function and refactors it to make it obvious the
actual constant offset
Richard Biener writes:
> On Wed, Jun 21, 2023 at 11:32 AM Richard Sandiford
> wrote:
>>
>> Richard Sandiford writes:
>> > Richard Biener via Gcc-patches writes:
>> >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches
>> >> wrote:
>> >>>
>> >>> We already use an intermediate type in
Hi all,
This patch converts the SVE load gather patterns to the new compact syntax
that Tamar introduced. This allows for a future patch I want to contribute
to add more alternatives that are better viewed in the more compact form.
The lines in some patterns are >80 long now, but I think that's
Richard Biener writes:
> The issue in the PR the change is fixing is that we end up with
> an expression that overflows but uses signed arithmetic and so
> we miscompile it later. IIRC the fixes to split_constant_offset
> always were that the sum of the base + offset wasn't equal to
> the
This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.
gcc/ChangeLog:
* config/riscv/autovec.md (fma4): New pattern.
(*fma): Ditto.
(fnma4): Ditto.
(*fnma): Ditto.
(fms4): Ditto.
On Wed, Jun 21, 2023 at 11:32 AM Richard Sandiford
wrote:
>
> Richard Sandiford writes:
> > Richard Biener via Gcc-patches writes:
> >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches
> >> wrote:
> >>>
> >>> We already use an intermediate type in case WIDEN, but not for NONE,
> >>>
Hi, Richi.
I received this again:
"host smtp-in1.suse.de[195.135.220.23] said: 452 4.3.1 Insufficient system
storage (in reply to MAIL FROM command)"
so I failed to send you the last email.
I am CCing you again now; this is the last email:
Hi, Richi. Thanks so much for the review and comments.
>> Can
On Wed, 21 Jun 2023, Richard Biener wrote:
> On Tue, 20 Jun 2023, Richard Sandiford wrote:
>
> > Richard Biener writes:
> > > On Mon, 19 Jun 2023, Richard Sandiford wrote:
> > >
> > >> Jeff Law writes:
> > >> > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote:
> > >> >> IVOPTs has
>
> If I manually add a __builtin_unreachable () to the above case
> I see the *(int *)0 = 0; store DSEd. Maybe we should avoid
> removing stores that might trap here? POSIX wise such a trap
> could be a way to jump out of the path leading to unreachable ()
> via siglongjmp ...
I am not sure
Hi All,
It seems like @backslashchar{} is a relatively new addition
to texinfo. Other parts of the docs use @samp{\} so use it
here too so older distros work.
Bootstrapped on aarch64-none-linux-gnu and no issues.
committed under obvious rule.
Thanks,
Tamar
gcc/ChangeLog:
PR
Richard Sandiford writes:
> Richard Biener via Gcc-patches writes:
>> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches
>> wrote:
>>>
>>> We already use an intermediate type in case WIDEN, but not for NONE,
>>> this patch extended that.
>>>
>>> I didn't do that in pattern recog since we
On Tue, 20 Jun 2023, Richard Sandiford wrote:
> Richard Biener writes:
> > On Mon, 19 Jun 2023, Richard Sandiford wrote:
> >
> >> Jeff Law writes:
> >> > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote:
> >> >> IVOPTs has strip_offset which suffers from the same issues regarding
> >> >>
Hi, Richi. Thanks so much for the review and comments.
>> Can you instead adjust get_len_load_store_mode and
>>can_vec_mask_load_store_p to provide the optab they matched on
>>via the corresponding IFN code as additional output (add a
>>pointer argument, you can default it to nullptr and only
Richard Biener via Gcc-patches writes:
> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches
> wrote:
>>
>> We already use an intermediate type in case WIDEN, but not for NONE,
>> this patch extended that.
>>
>> I didn't do that in pattern recog since we need to know whether the
>> stmt
On Tue, 20 Jun 2023, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong
>
> gcc/ChangeLog:
>
> * internal-fn.cc (expand_partial_store_optab_fn): Add
> LEN_MASK_{LOAD,STORE} vectorizer support.
> (internal_load_fn_p): Ditto.
> (internal_store_fn_p): Ditto.
>
On Wed, Jun 21, 2023 at 9:50 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Tue, Jun 20, 2023 at 6:11 PM liuhongt via Gcc-patches
> wrote:
> >
> > I noticed there was some refactoring in vectorizable_conversion
> > for code_helper, so I've adjusted my patch to that.
> > Here's the patch I'm going to
On Wed, Jun 21, 2023 at 7:57 AM wrote:
>
> Hi,
>
> When c-typeck.cc:c_build_qualified_type builds an array type
> from its element type, it does not copy the address space of
> the element type to the array type itself. This is unlike
> tree.cc:build_array_type_1, which explicitly does
>
Hi Jeff, sorry for the late reply.
> The long branch handling is done at the assembler level. So the clobbering
> of $ra isn't visible to the compiler. Thus the compiler has to be
> extremely careful to not hold values in $ra because the assembler may
> clobber $ra.
If assembler will modify
From: Pan Li
We have already extended the machine mode from 8 to 16 bits, but there is still
one place missing in the streamer: it has a hard-coded array
for the machine mode with a size of 256.
In the lto pass, we memset the array by MAX_MACHINE_MODE count but the
value of the MAX_MACHINE_MODE will
On Tue, Jun 20, 2023 at 6:11 PM liuhongt via Gcc-patches
wrote:
>
> I noticed there was some refactoring in vectorizable_conversion
> for code_helper, so I've adjusted my patch to that.
> Here's the patch I'm going to commit.
>
> We already use an intermediate type in case WIDEN, but not for NONE,
>
The following addresses a miscompilation by RTL scheduling related
to the representation of masked stores. For that we have
(insn 38 35 39 3 (set (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ] [90])
(const:DI (plus:DI (symbol_ref:DI ("b") [flags 0x2] )
LGTM. Thanks.
juzhe.zh...@rivai.ai
From: shiyulong
Date: 2023-06-21 15:39
To: gcc-patches
CC: palmer; kito.cheng; jeffreyalaw; juzhe.zhong; pan2.li; wuwei2016; jiawei;
shihua; dje.gcc; pinskia; yulong
Subject: [PATCH V1] RISC-V:Add float16 tuple type abi
From: yulong
gcc/ChangeLog:
On 21.06.2023 09:37, Hongtao Liu wrote:
> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches
> wrote:
>>
>> Is there a reason why vec_dupv4sf uses sseshuf1 for its shuffle
>> alternatives, but *vec_dupv4si uses sselog1? I'd be happy to correct
>> this in whichever is the appropriate
From: yulong
gcc/ChangeLog:
* config/riscv/vector.md: Add float16 attr at sew, vlmul and ratio.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-10.c: Add float16 tuple type case.
* gcc.target/riscv/rvv/base/abi-11.c: Ditto.
*
On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches
wrote:
>
> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
> never longer (yet sometimes shorter) than the corresponding VSHUFPS /
> VPSHUFD, due to the immediate operand of the shuffle insns balancing the
>
No, I don't think we need another UNSPEC.
You just need to modify the predicate of (match_operand: 4 "reg_or_0_operand")
juzhe.zh...@rivai.ai
From: Wang, Yanzhang
Date: 2023-06-21 15:08
To: juzhe.zh...@rivai.ai; Robin Dapp; gcc-patches
CC: Robin Dapp; Kito.cheng; Li, Pan2
Subject: RE: Re: [PATCH]
Thanks Jakub, will fix the format issue and send the V3 patch, as well as try
to validate it for offloading.
Pan
-Original Message-
From: Jakub Jelinek
Sent: Wednesday, June 21, 2023 3:16 PM
To: Li, Pan2
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; rdapp@gmail.com;
On Wed, Jun 21, 2023 at 06:59:08AM +, Li, Pan2 wrote:
> inline machine_mode
> bp_unpack_machine_mode (struct bitpack_d *bp)
> {
> - return (machine_mode)
> -((class lto_input_block *)
> - bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 8)];
> + int last = 1
Of course, I'd like to make it generic. Thanks for Robin's advice! He's right,
there are many similar situations.
But I'm not sure how to distinguish different operations. Currently, the
VMULH is fixed as below.
+ (unspec:VI_QHS
+ [(vec_duplicate:VI_QHS
+
Thanks Jakub for the useful comments. I went through the mailing list and have a
refined version as below. But I am not sure I understand correctly about
adding a new field named mode_bits in struct lto_file_decl_data; it looks
unnecessary up to a point.
Thanks again for your patient coaching.
diff
Thanks, you are right. I have not considered the iterator much. I picked it
from one of pred_mulh directly. It should be able to work with VFULL_I.
Yanzhang
From: juzhe.zh...@rivai.ai
Sent: Wednesday, June 21, 2023 2:21 PM
To: Wang, Yanzhang ; gcc-patches
Cc: Kito.cheng ; Li, Pan2 ; Wang,
On Tue, 20 Jun 2023, Jeff Law wrote:
>
>
> On 6/20/23 00:59, Richard Biener via Gcc-patches wrote:
> > DSE isn't good at identifying program points that end lifetime
> > of variables that are not associated with virtual operands. But
> > at least for those that end basic-blocks we can handle
LGTM as long as you remove all stuff related to UNSPEC_VNCOPYSIGN
Thanks.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-06-21 14:36
To: 钟居哲; gcc-patches; palmer; kito.cheng; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: Implement autovec copysign.
> You should remove all "unspec"
> You should remove all "unspec" related of "n" ncopysign including
> riscv-vector-builtins-bases.cc
> vector.md/ vector-iterators.md
Ah, there was indeed one stray UNSPEC_VNCOPYSIGN in the iterators, thanks. Any
other comments before I send V2?
Regards
Robin
Oh, yes. Thanks to Robin for pointing this out.
@yanzhang, could you refine this patch further to gain more optimizations?
Thanks.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-06-21 14:27
To: yanzhang.wang; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; pan2.li
Subject: Re: [PATCH]
Following two-operand bitwise operations, add another splitter to also
deal with not followed by broadcast all on its own, which can be
expressed as simple embedded broadcast instead once a broadcast operand
is actually permitted in the respective insn. While there also permit
a broadcast operand
Hi Yanzhang,
while I appreciate the optimization, I'm a bit wary about just adding a special
case for "0". Is that so common? Wouldn't we also like to have
* pow2_p (val) == << val and others?
* 1 should also be covered.
Regards
Robin
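For readers following the thread, the special cases mentioned above can be
checked against a scalar model. This is only a sketch, assuming the operation
under discussion is the signed high-half multiply (vmulh) from the patch being
reviewed; mulh32 is a hypothetical helper name, not anything in GCC or the RVV
intrinsics.

```c
#include <stdint.h>

/* Hypothetical scalar model of a signed 32-bit high-half multiply:
   returns floor ((a * b) / 2^32).  Not GCC code.  */
static int32_t
mulh32 (int32_t a, int32_t b)
{
  int64_t product = (int64_t) a * (int64_t) b;

  /* Floor division by 2^32 written out explicitly, so the model does not
     depend on how the compiler right-shifts negative values.  */
  int64_t high = product >= 0
                 ? product / 4294967296LL
                 : -((-product + 4294967295LL) / 4294967296LL);

  return (int32_t) high;
}
```

In this model a multiplier of 0 always gives 0, and a multiplier of 1 gives
the sign fill of the other operand (0 for non-negative inputs, -1 for
negative ones), which is why both look like candidates for a cheaper insn.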
With respective two-operand bitwise operations now expressible by a
single VPTERNLOG, add splitters to also deal with ior and xor
counterparts of the original and-only case. Note that the splitters need
to be separate, as the placement of "not" differs in the final insns
(*iornot3, *xnor3) which
The intended broadcast (with AVX512) can very well be done right from
memory.
gcc/
* config/i386/sse.md: Permit non-immediate operand 1 in AVX2
form of splitter for PR target/100711.
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17356,7 +17356,7 @@
When it's the memory operand which is to be inverted, using VPANDN*
requires a further load instruction. The same can be achieved by a
single VPTERNLOG*. Add two new alternatives (for plain memory and
embedded broadcast), adjusting the predicate for the first operand
accordingly.
Two pre-existing
All combinations of and, ior, xor, and not involving two operands can be
expressed that way in a single insn.
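As an illustration of that claim (not part of the patch): the 8-bit VPTERNLOG
immediate is a truth table, and the immediate for any function of the inputs
can be derived by applying that function to the constants 0xF0, 0xCC and 0xAA.
The sketch below assumes the two operands of interest occupy the first two
truth-table positions; the names TL_*, IMM_* and ternlog32 are hypothetical,
and which operand slots the sse.md patterns actually use is not shown here.

```c
#include <stdint.h>

/* Truth-table columns: bit i of each constant is that input's value in
   row i, where i = (a << 2) | (b << 1) | c.  */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Immediates for two-operand functions of A and B (C is ignored).  */
enum {
  IMM_AND    = TL_A & TL_B,             /* 0xC0 */
  IMM_IOR    = TL_A | TL_B,             /* 0xFC */
  IMM_XOR    = TL_A ^ TL_B,             /* 0x3C */
  IMM_ANDNOT = (~TL_A & TL_B) & 0xFF,   /* 0x0C */
  IMM_IORNOT = (~TL_A | TL_B) & 0xFF,   /* 0xCF */
  IMM_XNOR   = ~(TL_A ^ TL_B) & 0xFF    /* 0xC3 */
};

/* Software model of the ternary-logic operation on 32-bit values: each
   result bit is looked up in IMM using the three input bits as index.  */
static uint32_t
ternlog32 (uint8_t imm, uint32_t a, uint32_t b, uint32_t c)
{
  uint32_t result = 0;
  for (int i = 0; i < 32; i++)
    {
      unsigned idx = (((a >> i) & 1) << 2)
                     | (((b >> i) & 1) << 1)
                     | ((c >> i) & 1);
      result |= (uint32_t) ((imm >> idx) & 1) << i;
    }
  return result;
}
```

For example, ternlog32 (IMM_IORNOT, a, b, c) computes ~a | b for any c, which
a plain two-operand VPANDN/VPOR sequence would need two operations for.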
gcc/
PR target/93768
* config/i386/i386.cc (ix86_rtx_costs): Further special-case
bitwise vector operations.
* config/i386/sse.md (*iornot3): New insn.
While there are some quite sophisticated 4-operand expanders,
2-operand binary logic which can't be expressed by just VPAND,
VPANDN, VPOR, or VPXOR doesn't utilize this insn to carry out
such operations in a single insn. Therefore the first two
patches address one of the sub-aspects of PR
+machine_mode mask_mode = riscv_vector::get_mask_mode (mode)
+ .require ();
+emit_insn (gen_pred_mov (mode, operands[0], CONST1_RTX (mask_mode),
+ RVV_VUNDEF (mode), CONST0_RTX (GET_MODE (operands[0])),
+ operands[5], operands[6], operands[7], operands[8]));
I don't think you
Good catch!
vmulh.vx v24,v24,zero -> vmv.v.i v1,0
can eliminate use of v24 and reduce register pressure.
But I wonder why you pick only VI_QHS?
+ [(set (match_operand:VI_QHS 0 "register_operand")
SEW = 64 should always have such optimization.
Thanks.
juzhe.zh...@rivai.ai
From:
From: Yanzhang Wang
This patch will optimize the below mulh example,
vint32m1_t shortcut_for_riscv_vmulh_case_0(vint32m1_t v1, size_t vl) {
return __riscv_vmulh_vx_i32m1(v1, 0, vl);
}
from mulh pattern
vsetvli zero, a2, e32, m1, ta, ma
vmulh.vx v24, v24, zero
vs1r.v v24, 0(a0)
to
... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
never longer (yet sometimes shorter) than the corresponding VSHUFPS /
VPSHUFD, due to the immediate operand of the shuffle insns balancing the
possible need for VEX3 in the broadcast ones. When EVEX encoding is
required the
This is to cover testing also being done with -march=cascadelake.
---
Committing as obvious.
--- a/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c
@@ -1,5 +1,5 @@
/* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-mavx512f