"juzhe.zhong" writes:
> Thanks Richard.
> I will do that as you suggested. I have a question for the first patch. How
> to
> enable decrement IV? Should I add a target hook or something to let target
> decide whether enable decrement IV?
At the moment, the only other targets that use
"Li, Pan2 via Gcc-patches" writes:
> Thanks Richard for comments. In previous, I am not sure it is reasonable to
> let everywhere consume the same macro in rtl.h (As the includes you
> mentioned). Thus, make a conservative change in PATCH v1.
>
> I will address the comments and try to align the
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong
>
> 1. Fix document description according to Jeff && Richard.
> 2. Add LOOP_VINFO_USING_SELECT_VL_P for single rgroup.
> 3. Add LOOP_VINFO_USING_SLP_ADJUSTED_LEN_P for SLP multiple rgroup.
>
> Fix bugs for V5 after testing:
>
pan2...@intel.com writes:
> From: Pan Li
>
> We are running out of the machine_mode(8 bits) in RISC-V backend. Thus
> we would like to extend the machine mode bit size from 8 to 16 bits.
> However, it is sensitive to extend the memory size in common structure
> like tree or rtx. This patch would
Tejas Belagod writes:
> From: Tejas Belagod
>
> This PR optimizes an SVE intrinsics sequence where
> svlasta (svptrue_pat_b8 (SV_VL1), x)
> a scalar is selected based on a constant predicate and a variable vector.
> This sequence is optimized to return the corresponding element of a
Prathamesh Kulkarni writes:
> On Tue, 2 May 2023 at 18:22, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Tue, 2 May 2023 at 17:32, Richard Sandiford
>> > wrote:
>> >>
>> >> Prathamesh Kulkarni writes:
>> >> > On Tue, 2 May 2023 at 14:56, Richard Sandiford
>> >> >
Prathamesh Kulkarni writes:
> diff --git a/gcc/testsuite/gcc.target/aarch64/vec-init-18.c
> b/gcc/testsuite/gcc.target/aarch64/vec-init-18.c
> new file mode 100644
> index 000..598a51f17c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vec-init-18.c
> @@ -0,0 +1,20 @@
> +/* {
"Roger Sayle" writes:
> This patch proposes adding run-time library support for bit reversal,
> by adding a __bitrevsi2 function to libgcc. Thoughts/opinions?
>
> I'm also tempted to add __popcount[qh]i2 and __parity[qh]i2 to libgcc,
> to allow the RTL optimizers to perform narrowing operations,
"Roger Sayle" writes:
> An analysis of backend UNSPECs reveals that two of the most common UNSPECs
> across target backends are for copysign and bit reversal. This patch
> adds RTX codes for these expressions to allow their representation to
> be standardized, and them to be optimized by the
"juzhe.zh...@rivai.ai" writes:
> Thanks. I have read rgroup descriptions again.
> Still I am not fully understand it clearly, bear with me :)
>
> I don't known how to differentiate Case 2 and Case 3.
>
> Case 2 is multiple rgroup for SLP.
> Case 3 is multiple rgroup for non-SLP (VEC_PACK_TRUNC)
>
The trunk patch for this PR corrected the ABI for enums that have
a defined underlying type. We shouldn't change the ABI on the branches
though, so this patch just removes the assertions that highlighted
the problem.
I think the same approach makes sense longer-term: keep the assertions
at
"juzhe.zh...@rivai.ai" writes:
> Oh, I see. But I saw there is a variable using_partial_vectors_p
> in the loop data structure.
>
> Can I add a variable called using_select_vl_p ?
Yeah. Please also add a wrapper macro like
LOOP_VINFO_USING_PARTIAL_VECTORS_P. (I'm not really a fan of the
"juzhe.zh...@rivai.ai" writes:
> Hi, Richard. Since create_iv has been approved and soon will be committed
> after
> we bootstrap && regression.
>
> Now, I plan to send patch for "decrement IV".
>
> After reading your comments, I have several questions:
>
> 1.
>>if (use_bias_adjusted_len)
Christophe Lyon writes:
> On 5/11/23 10:30, Richard Sandiford wrote:
>> Christophe Lyon writes:
>>> On 5/10/23 16:52, Kyrylo Tkachov wrote:
> -Original Message-
> From: Christophe Lyon
> Sent: Wednesday, May 10, 2023 2:31 PM
> To: gcc-patches@gcc.gnu.org;
Christophe Lyon writes:
> On 5/10/23 16:52, Kyrylo Tkachov wrote:
>>
>>
>>> -Original Message-
>>> From: Christophe Lyon
>>> Sent: Wednesday, May 10, 2023 2:31 PM
>>> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
>>> Richard Earnshaw ; Richard Sandiford
>>>
>>> Cc: Christophe Lyon
juzhe.zh...@rivai.ai writes:
> From: Juzhe-Zhong
>
> This patch is a separate patch preparing for supporting decrement IV.
>
> gcc/ChangeLog:
>
> * cfgloopmanip.cc (create_empty_loop_on_edge): Add PLUS_EXPR.
> * gimple-loop-interchange.cc
>
pan2...@intel.com writes:
> From: Pan Li
>
> The decl_or_value is defined as void * before this PATCH. It will take
> care of both the tree_node and rtx_def. Unfortunately, given a void
> pointer cannot tell the input is tree_node or rtx_def.
>
> Then we have some implicit structure layout
钟居哲 writes:
> I am sorry that I am still confused about that.
>
> Is this what you want ?
>
> bool use_minus_p = TREE_CODE (step) == INTEGER_CST && ((TYPE_UNSIGNED
> (TREE_TYPE (step)) && tree_int_cst_lt (step1, step))
> || (!TYPE_UNSIGNED (TREE_TYPE (step)) &&
>
"Li, Pan2" writes:
> Thanks Richard Sandiford. Update PATCH v4 here ->
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618099.html.
>
>> - if (dv_as_opaque (node->dv) != decl || node->offset != offset)
>> + if (node->dv.first_or_null () != decl || node->offset !=
>> + offset)
>
钟居哲 writes:
> Thanks Richard.
> I am planning to separate out a patch with only the create_iv stuff.
>
> Are you suggesting that I remove "tree_code incr_op = code;"
> Use the argument directly ?
>
> I saw the codes here:
>
> /* For easier readability of the created code, produce MINUS_EXPRs
>
Alexandre Oliva via Gcc-patches writes:
> On vxworks, isunordered is defined as a macro that ultimately calls a
> _Fpcomp function, that GCC doesn't recognize as a builtin, so it
> can't optimize accordingly.
>
> Use __builtin_isunordered instead to get the desired code for the
> test.
>
>
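To illustrate the point made in the message above, here is a minimal sketch of calling the builtin directly rather than going through the C library's isunordered macro. The helper name unordered_p is hypothetical and is not taken from the patch or its test; only __builtin_isunordered itself is a GCC builtin.

```c
#include <stdbool.h>

/* Hypothetical helper for illustration only; not part of the patch.
   Returns true when x and y are unordered, i.e. at least one of them
   is a NaN.  __builtin_isunordered is a GCC builtin, so the compiler
   can optimize the comparison without seeing through a library macro.  */
static bool
unordered_p (double x, double y)
{
  return __builtin_isunordered (x, y);
}
```

A caller would use unordered_p (a, b) wherever isunordered (a, b) was used before. Whether the real test needs a wrapper at all is an assumption; the message only says that the builtin is used instead of the macro.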
In addition to Jeff's comments:
juzhe.zh...@rivai.ai writes:
> [...]
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index cc4a93a8763..99cf0cdbdca 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -4974,6 +4974,40 @@ for (i = 1; i < operand3; i++)
>operand0[i] = operand0[i - 1]
Thanks, mostly looks good to me. Some minor comments below.
pan2...@intel.com writes:
> From: Pan Li
>
> The decl_or_value is defined as void * before this PATCH. It will take
> care of both the tree_node and rtx_def. Unfortunately, given a void
> pointer cannot tell the input is tree_node or
Richard Biener writes:
> On Wed, May 10, 2023 at 11:49 AM Richard Biener
> wrote:
>>
>> On Wed, May 10, 2023 at 11:01 AM Richard Sandiford
>> wrote:
>> >
>> > Oluwatamilore Adebayo writes:
>> > > From 0b5f469171c340ef61a48a31877d495bb77bd35f Mon Sep 17 00:00:00 2001
>> > > From: oluade01
>> >
Jakub Jelinek writes:
> On Wed, May 10, 2023 at 07:57:05PM +0800, pan2...@intel.com wrote:
>> --- a/gcc/var-tracking.cc
>> +++ b/gcc/var-tracking.cc
>> @@ -116,9 +116,14 @@
>> #include "fibonacci_heap.h"
>> #include "print-rtl.h"
>> #include "function-abi.h"
>> +#include "mux-utils.h"
>>
>>
Oluwatamilore Adebayo writes:
> From 0b5f469171c340ef61a48a31877d495bb77bd35f Mon Sep 17 00:00:00 2001
> From: oluade01
> Date: Fri, 14 Apr 2023 10:24:43 +0100
> Subject: [PATCH 1/4] Missed opportunity to use [SU]ABD
>
> This adds a recognition pattern for the non-widening
> absolute difference
Richard Biener writes:
> On Wed, 10 May 2023, pan2...@intel.com wrote:
>
>> From: Pan Li
>>
>> The decl_or_value is defined as void * before this PATCH. It will take
>> care of both the tree_node and rtx_def. Unfortunately, given a void
>> pointer cannot tell the input is tree_node or rtx_def.
Richard Biener writes:
> On Wed, May 10, 2023 at 12:05 AM Richard Sandiford via Gcc-patches
> wrote:
>>
>> Andrew Pinski writes:
>> > On Tue, May 9, 2023 at 11:02 AM Richard Sandiford via Gcc-patches
>> > wrote:
>> >>
>>
aarch64-sve2-acle-asm.exp tried to prevent --with-cpu/tune
from affecting the results, but it used sve_flags rather than
sve2_flags. This was a silent failure when running the full
testsuite, but was a fatal error when running the harness
individually.
Tested on aarch64-linux-gnu, pushed to
钟居哲 writes:
> Hi, Richards. I would like to give more information about this patch so that
> it will make this patch easier for you to review.
>
> Currently, I saw we have 3 situations that we need to handle in case of loop
> control IV in auto-vectorization:
> 1. Single rgroup loop control
"Li, Pan2" writes:
> After the bits patch like below.
>
> rtx_def code 16 => 8 bits.
> rtx_def mode 8 => 16 bits.
> tree_base code unchanged.
>
> The structure layout of both the rtx_def and tree_base will be something
> similar as below. As I understand, the lower 8-bits of tree_base will be
>
Some tests hard-coded specific allocations for temporary registers,
whereas the RA should be free to pick anything that doesn't force
unnecessary moves or spills.
gcc/testsuite/
* gcc.target/aarch64/asimd-mul-to-shl-sub.c: Allow any register
allocation for temporary results,
There were many tests that used [0-9] to match an FP or vector register,
but that should allow any of 0-31 instead.
asm-x-constraint-1.c required s0-s7, but that's the range for "y"
rather than "x". "x" allows s0-s15.
sve/pcs/return_9.c required z2-z7 (the initial set of available
Some ACLE intrinsics map to instructions that tie the output
operand to an input operand. If all the operands are allocated
to different registers, and if MOVPRFX can't be used, we will need
a move either before the instruction or after it. Many tests only
matched the "before" case; this patch
Most governing predicate operands require p0-p7, but some
instructions also allow p8-p15. Non-gp uses of predicates
often also allow all of p0-p15.
This patch fixes up cases where we required p0-p7 unnecessarily.
In some cases we match the definition (typically a comparison,
PFALSE or PTRUE),
Some of the svdup tests expand to a SEL between two constant vectors.
This patch allows the constants to be formed in either order.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/dup_s16.c: When using SEL to select
between two constant vectors, allow the constant moves to appear
Some of the SVE ACLE asm tests tried to be agnostic about the
instruction order, but only one of the alternatives was exercised
in practice. This patch fixes latent typos in the other versions.
gcc/testsuite/
* gcc.target/aarch64/sve2/acle/asm/aesd_u8.c: Fix expected register
I have a patch that seems to improve register allocation for SIMD
lane operations, and for similar instructions that require a reduced
register range. However, it showed that a lot of asm tests are
sensitive to the current register allocation. This patch series
tries to correct the affected
This patch follows on from g:9f635bd13fe9e85872e441b6f3618947f989909a
("the previous patch"). To start by quoting that:
If an insn requires two operands to be tied, and the input operand dies
in the insn, IRA acts as though there were a copy from the input to the
output with the same execution
Richard Biener writes:
> On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:
>
>> This patch replaces the existing tree_code widen_plus and widen_minus
>> patterns with internal_fn versions.
>>
>> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides
>> convenience wrappers
YunQiang Su writes:
> speculation_barrier for MIPS needs sync+jr.hb (r2+),
> so we implement __speculation_barrier in libgcc, like arm32 does.
Looks reasonable, but do you have a source for the fallback
pre-r2 handling? (Thanks for adding that btw, since I realise
it's not your focus here.)
aarch64_function_arg_alignment has traditionally taken the alignment
of a scalar type T from TYPE_ALIGN (TYPE_MAIN_VARIANT (T)). This is
supposed to discard any user alignment and give the alignment of the
underlying fundamental type.
PR109661 shows that this did the wrong thing for enums with
a
aarch64_function_arg_alignment has two related abi_break
parameters: abi_break for a change in GCC 9, and abi_break_packed
for a related follow-on change in GCC 13. In a sense, abi_break_packed
is a "subfix" of abi_break.
PR109661 now requires a third ABI break that is independent
of the other
Prathamesh Kulkarni writes:
> On Tue, 2 May 2023 at 17:32, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Tue, 2 May 2023 at 14:56, Richard Sandiford
>> > wrote:
>> >> > [aarch64] Improve code-gen for vector initialization with single
>> >> > constant element.
>> >> >
>>
Andrew Pinski via Gcc-patches writes:
> There is no canonical form for this case defined. So the aarch64 backend needs
> a pattern to match both of these forms.
>
> The forms are:
> (set (reg/i:SI 0 x0)
> (if_then_else:SI (eq (reg:CC 66 cc)
> (const_int 0 [0]))
> (reg:SI
Prathamesh Kulkarni writes:
> On Tue, 2 May 2023 at 14:56, Richard Sandiford
> wrote:
>> > [aarch64] Improve code-gen for vector initialization with single constant
>> > element.
>> >
>> > gcc/ChangeLog:
>> > * config/aarch64/aarch64.cc (aarch64_expand_vector_init): Tweak
>> > condition
Michael Collison writes:
> While working on autovectorizing for the RISCV port I encountered an issue
> where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
> evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> where GET_MODE_NUNITS is equal to one.
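The divisibility assumption described above can be sketched with plain integers. This is only an illustration under stated assumptions: the helper name can_split_evenly_p is hypothetical, and GCC itself works on poly_int element counts rather than a plain unsigned int.

```c
#include <stdbool.h>

/* Hypothetical helper for illustration only; not GCC code.
   An interleave scheme that repeatedly halves a vector needs an
   element count that is at least two and divisible by two.  A mode
   with a single element (the VNx1DImode case mentioned above, taking
   its element count as exactly one) fails this check.  */
static bool
can_split_evenly_p (unsigned int nunits)
{
  return nunits >= 2 && nunits % 2 == 0;
}
```

With this sketch, can_split_evenly_p (1) is false, which is the case the message says the existing code did not expect.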
Prathamesh Kulkarni writes:
> On Tue, 25 Apr 2023 at 16:29, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > Hi Richard,
>> > While digging thru aarch64_expand_vector_init, I noticed it gives
>> > priority to loading a constant first:
>> > /* Initialise a vector which is
Jakub Jelinek via Gcc-patches writes:
> Hi!
>
> On Thu, Apr 27, 2023 at 10:34:59AM +, Richard Biener wrote:
>> OK. As said the patch itself looks good to me, let's go ahead. We
>> have plenty of time to backtrack until GCC 14.
>
> Thanks. Unfortunately when I started using it, I've
"juzhe.zh...@rivai.ai" writes:
> Hi, Richard.
> Would you mind take a look at the loop control part again:
>
> static gcond *
> vect_set_loop_condition_partial_vectors (class loop *loop,
> loop_vec_info loop_vinfo, tree niters,
> tree final_iv, bool niters_maybe_zero,
> gimple_stmt_iterator
"juzhe.zh...@rivai.ai" writes:
> Thank you so much for pointing out this issue.
>
> After reading your comments carefully, I need to revise
> "vect_set_loop_controls_by_while_len" in loop control like this:
>
> vect_set_loop_controls_by_while_len
> ...
> tree X = NULL_TREE;
> FOR_EACH_VEC_ELT
"juzhe.zh...@rivai.ai" writes:
> Thanks Richard so much.
>
>>> I don't think that's guaranteed by the proposed definition of WHILE_LEN.
>>> The first int64_t WHILE_LEN could come up short, and return something
>>> less than VF/2.
>
> I am so sorry that the comments of
Richard Biener writes:
> On Tue, 25 Apr 2023, Richard Sandiford wrote:
>> juzhe.zh...@rivai.ai writes:
>> > diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
>> > index a52277abdbf..54845a62298 100644
>> > --- a/gcc/tree-ssa-loop-manip.cc
>> > +++ b/gcc/tree-ssa-loop-manip.cc
juzhe.zh...@rivai.ai writes:
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 6e81dc05e0e..5f44def90d3 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -127,6 +127,7 @@ init_internal_fns ()
> #define cond_binary_direct { 1, 1, true }
> #define cond_ternary_direct {
Prathamesh Kulkarni writes:
> Hi Richard,
> While digging thru aarch64_expand_vector_init, I noticed it gives
> priority to loading a constant first:
> /* Initialise a vector which is part-variable. We want to first try
> to build those lanes which are constant in the most efficient way we
Richard Biener writes:
> On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
> wrote:
>>
>> Rebased all three patches and made some small changes to the second one:
>> - removed sub and abd optabs from commutative_optab_p, I suspect this
>> was a copy paste mistake,
>> - removed
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Friday, April 21, 2023 6:19 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>>
>> Subject: Re: [PATCH] RFC: New compact syntax for insn and insn_split in
>> Machine
Prathamesh Kulkarni writes:
> gcc/ChangeLog:
> * tree-ssa-forwprop.cc (is_combined_permutation_identity): Try to
> simplify two successive VEC_PERM_EXPRs with single operand and same
> mask, where mask chooses elements in reverse order.
>
> gcc/testsuite/ChangeLog:
> *
Prathamesh Kulkarni writes:
> [aarch64] Recursively initialize even and odd sub-parts and merge with zip1.
>
> gcc/ChangeLog:
> * config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback):
> Rename
> aarch64_expand_vector_init to this, and remove interleaving case.
>
Kyrylo Tkachov writes:
> Hi all,
>
> SVE2 supports an unpredicated vector integer MUL form that we can emit from
> our SVE expanders
> without using up a predicate register. This patch does so.
> As the SVE MUL expansion currently is templated away through a code iterator
> I did not split it
Richard Sandiford writes:
> Tamar Christina writes:
>> Hi All,
>>
>> This patch adds support for a compact syntax for specifying constraints in
>> instruction patterns. Credit for the idea goes to Richard Earnshaw.
>>
>> I am sending up this RFC to get feedback for its inclusion in GCC 14.
>>
Tamar Christina writes:
> Hi All,
>
> This patch adds support for a compact syntax for specifying constraints in
> instruction patterns. Credit for the idea goes to Richard Earnshaw.
>
> I am sending up this RFC to get feedback for its inclusion in GCC 14.
> With this new syntax we want a clean
Prathamesh Kulkarni writes:
> On Wed, 19 Apr 2023 at 16:17, Richard Biener
> wrote:
>>
>> On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
>> wrote:
>> >
>> > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
>> > wrote:
>> > >
>> > > On Tue, 11 Apr 2023 at 14:17, Richard Biener
>> > >
"Victor L. Do Nascimento" writes:
> The backend pattern for storing a pair of identical values in 32 and
> 64-bit modes with the machine instruction STP was missing, and
> multiple instructions were needed to reproduce this behavior as a
> result of failed RTL pattern match in the combine pass.
>
Prathamesh Kulkarni writes:
> Hi Richard,
> Based on your suggestions in the other thread, the patch uses force_reg
> to avoid creating pseudo if value is already in a register.
> Bootstrap+test passes on aarch64-linux-gnu.
> OK to commit ?
>
> Thanks,
> Prathamesh
>
> [aarch64] Use force_reg
"Andre Vieira (lists)" writes:
> On 20/04/2023 17:13, Richard Sandiford wrote:
>> "Andre Vieira (lists)" writes:
>>> On 20/04/2023 15:51, Richard Sandiford wrote:
"Andre Vieira (lists)" writes:
> Hi all,
>
> This is a series of patches/RFCs to implement support in GCC to be
Palmer Dabbelt writes:
> On Thu, 20 Apr 2023 09:55:23 PDT (-0700), Vineet Gupta wrote:
>> ChangeLog:
>>
>> * MAINTAINERS (Write After Approval): Add myself.
>>
>> (Ref: <680c7bbe-5d6e-07cd-8468-247afc65e...@gmail.com>)
>>
>> Signed-off-by: Vineet Gupta
>> ---
>> MAINTAINERS | 1 +
>> 1
Prathamesh Kulkarni writes:
> Hi,
> I tested the interleave+zip1 for vector init patch and it segfaulted
> during bootstrap while trying to build
> libgfortran/generated/matmul_i2.c.
> Rebuilding with --enable-checking=rtl showed out of bounds access in
> aarch64_unzip_vector_init in following
"Andre Vieira (lists)" writes:
> On 20/04/2023 15:51, Richard Sandiford wrote:
>> "Andre Vieira (lists)" writes:
>>> Hi all,
>>>
>>> This is a series of patches/RFCs to implement support in GCC to be able
>>> to target AArch64's libmvec functions that will be/are being added to glibc.
>>> We
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch replaces the uses of simd_clone_subparts with
> TYPE_VECTOR_SUBPARTS and removes the definition of the first.
>
> gcc/ChangeLog:
>
>      * omp-simd-clone.cc (simd_clone_subparts): Remove.
> (simd_clone_init_simd_arrays): Replace
"Andre Vieira (lists)" writes:
> Hi all,
>
> This is a series of patches/RFCs to implement support in GCC to be able
> to target AArch64's libmvec functions that will be/are being added to glibc.
> We have chosen to use the omp pragma '#pragma omp declare variant ...'
> with a simd construct as
writes:
> Yes, like kito said.
> We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix
> it here.
> We dynamically adjust the minimum vector-length for different '-march' according
> to RVV ISA specification.
> So we strongly suggest that we should drop this fix.
I think the
Richard Biener writes:
> On Thu, 20 Apr 2023, Richard Sandiford wrote:
>
>> "juzhe.zh...@rivai.ai" writes:
>> > OK. Thanks Richard.
>> > So let me conclude:
>> > 1. Community agree that I should support variable IV in the middle-end.
>> > 2. We can keep WHILE_LEN pattern when "not only final
"juzhe.zh...@rivai.ai" writes:
> OK. Thanks Richard.
> So let me conclude:
> 1. Community agree that I should support variable IV in the middle-end.
> 2. We can keep WHILE_LEN pattern when "not only final iteration is partial".
> And I should describe it more clearly in the doc.
>
> I should
"juzhe.zh...@rivai.ai" writes:
> Thanks Richard reminding me. I originally think community does not allow me
> support variable amount IV and let me do this in RISC-V backend.
No, I think that part should and needs to be done in the middle-end,
since if the initial IVs are incorrect, it's very
钟居哲 writes:
> Hi, Richards.
> Since GCC 14 is open and this patch has been bootstrapped && tested on X86.
> Is this patch supporting variable IV OK for the trunk ?
Doesn't the patch need updating based on the previous discussion?
I thought the outcome was that WHILE_LEN isn't a simple MIN
Christophe Lyon via Gcc-patches writes:
> When I committed the patches to enable support for DFP on AArch64, I
> forgot to update the installation documentation.
>
> This patch adds AArch64 as needed (same as i386/x86_64).
>
> OK for trunk and gcc-13?
OK for both, thanks.
Richard
> 2023-04-17
Jeff Law via Gcc-patches writes:
> On 4/18/23 07:02, Richard Sandiford via Gcc-patches wrote:
>> "Victor L. Do Nascimento" writes:
>>> The function `constrain_operands' lacked the logic to consider relaxed
>>> memory constraints when "traditional&
"Victor L. Do Nascimento" writes:
> The function `constrain_operands' lacked the logic to consider relaxed
> memory constraints when "traditional" memory constraints were not
> satisfied, creating potential issues as observed during the reload
> compilation pass.
>
> In addition, it was observed
"Victor L. Do Nascimento" writes:
> The function `constrain_operands' lacked the logic to consider relaxed
> memory constraints when "traditional" memory constraints were not
> satisfied, creating potential issues as observed during the reload
> compilation pass.
>
> In addition, it was observed
"juzhe.zh...@rivai.ai" writes:
>>> But the issue is the same in the reverse with WHILE_LEN, no?
>>>WHILE_LEN just computes a scalar value - you seem to suggest
>>>there's a hidden side-effect of "coalescing" the result with
>>>a hardware vector length register? I don't think that's good design.
The aarch64 PCS rules ignore user alignment for scalars and
vectors and use the "natural" alignment of the type. GCC tried
to calculate that natural alignment using:
TYPE_ALIGN (TYPE_MAIN_VARIANT (type))
But as discussed in the PR, it's possible that the main variant
of a pointer type is an
钟居哲 writes:
> Yeah, like kito said.
> Turns out the tuple type model in ARM SVE is the optimal solution for RVV.
> And we like ARM SVE style implementation.
>
> And now we see swapping rtx_code and mode in rtx_def can make rtx_def overall
> not exceed 64 bit.
> But it seems that there is still
"juzhe.zh...@rivai.ai" writes:
>>> I think that already works for them (could be misremembering).
>>> However, IIUC, they have no special instruction to calculate the
>>> length (unlike for RVV), and so it's open-coded using vect_get_len.
>
> Yeah, the current flow using min, sub, and then min in
Richard Biener writes:
> On Wed, 12 Apr 2023, juzhe.zh...@rivai.ai wrote:
>
>>
>> >> Thanks for the detailed explanation. Just to clarify - with RVV
>> >> there's only a single mask register, v0.t, or did you want to
>> >> say an instruction can only specify a single mask register?
>>
>> RVV
Prathamesh Kulkarni writes:
> On Thu, 6 Apr 2023 at 16:05, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Tue, 4 Apr 2023 at 23:35, Richard Sandiford
>> > wrote:
>> >> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> >
"juzhe.zh...@rivai.ai" writes:
> Hi, Richards.
> Kindly Ping this patch.
> This is the most important patch for RVV auto-vectorization support.
> Bootstrap on X86 has passed.
Can it wait for GCC 14? It doesn't seem like stage 4 material.
Also, pinging after 5 days seems a bit soon. It's
Richard Biener writes:
> On Tue, 11 Apr 2023, Richard Sandiford wrote:
>
>> writes:
>> > ARM SVE has: svint8_t, svint8x2_t, svint8x3_t, svint8x4_t
>> > As far as I known, they don't have tuple type for partial vector.
>>
>> Yeah, there are no separate types for partial vectors, but there
>> are
Richard Earnshaw writes:
> On 11/04/2023 10:46, Richard Sandiford via Gcc-patches wrote:
>> writes:
>>> ARM SVE has:svint8_t, svint8x2_t, svint8x3_t, svint8x4_t
>>> As far as I known, they don't have tuple type for partial vector.
>>
>> Yeah, there ar
Richard Biener via Gcc-patches writes:
> On Mon, Mar 27, 2023 at 6:02 PM Kevin Lee wrote:
>>
>> This patch is a proper fix to the previous patch
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614463.html
>> vect_grouped_store_supported checks if the count is a power of 2, but
>> doesn't
writes:
> ARM SVE has:svint8_t, svint8x2_t, svint8x3_t, svint8x4_t
> As far as I known, they don't have tuple type for partial vector.
Yeah, there are no separate types for partial vectors, but there
are separate modes. E.g. VNx2QI is a partial vector of QIs,
with each QI stored in a 64-bit
Prathamesh Kulkarni writes:
> On Tue, 4 Apr 2023 at 23:35, Richard Sandiford
> wrote:
>> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> > b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> > index cd9cace3c9b..3de79060619 100644
>> > ---
Jeff Law writes:
> On 4/5/23 10:48, Jakub Jelinek wrote:
>> On Wed, Apr 05, 2023 at 10:17:59AM -0600, Jeff Law wrote:
It is true that an instruction like
(insn 8 7 9 2 (set (reg:HI 141)
(subreg:HI (reg:SI 142) 0)) "aauu.c":6:18 181 {*movhi_internal}
(nil))
Prathamesh Kulkarni writes:
> On Mon, 13 Mar 2023 at 13:03, Richard Biener wrote:
>> On GIMPLE it would be
>>
>> _1 = { a, ... }; // (a)
>> _2 = { _1, ... }; // (b)
>>
>> but I'm not sure if (b), a VL CTOR of fixed len(?) sub-vectors is
>> possible? But at least a CTOR of vectors is what we
Andrew Pinski via Gcc-patches writes:
> The problem here is that aarch64_expand_setmem does not change the alignment
> for strict alignment case.
> This is version 4 of the fix, major changes from the last version is fixing
> the way store pairs are handled which allows handling of storing 2 SI
Thanks for the patch and sorry for the slow reply.
Jakub Jelinek writes:
> Hi!
>
> The testcase in the PR (which unfortunately because of my lack of experience
> with SVE I'm not able to turn into a runtime testcase that verifies it)
> is miscompiled on aarch64-linux in the regname pass, because
Segher Boessenkool writes:
> On Fri, Mar 31, 2023 at 03:06:41PM +0100, Richard Sandiford wrote:
>> This is an alternative presentation of the change that we discussed
>> a few weeks ago, and that you already tested:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613486.html
>>
>>
Segher Boessenkool writes:
> On Thu, Mar 09, 2023 at 12:10:51PM +, Richard Sandiford wrote:
>> g:c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f uses nonzero_bits
>> information to convert sign_extends into zero_extends.
>> That change is semantically correct in itself, but for the
>> testcase in
Segher Boessenkool writes:
> Hi!
>
> On Thu, Mar 09, 2023 at 12:09:59PM +, Richard Sandiford wrote:
>> This patch just splits some code out of make_compound_operation_int
>> into a new function called make_compound_operation_and. It is a
>> prerequisite for the fix for PR106594.
>>
>> It
Michael Collison writes:
> While working on autovectorizing for the RISCV port I encountered an issue
> where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
> evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> where GET_MODE_NUNITS is equal to one.