Richard Earnshaw writes:
> On 13/06/2022 15:33, Richard Sandiford via Gcc-patches wrote:
>> On aarch64, --with-arch, --with-cpu and --with-tune only have an
>> effect on the driver, so “./xgcc -B./ -O3” can give significantly
>> different results from “./cc1 -O3”. --with-ar
"Roger Sayle" writes:
> This patch to the middle-end's RTL expansion reorders the code in
> emit_store_flag_1 so that the backend has more control over how best
> to expand/split double word equality/inequality comparisons against
> zero or minus one. With the current implementation, the
Prathamesh Kulkarni writes:
> Hi Richard,
> Following from off-list discussion, in the attached patch, I wrote a pattern
> similar to vec_duplicate_reg, which seems to work for the svld1rq tests.
> Does it look OK ?
>
> Sorry, I didn't fully understand your suggestion on integrating with
>
Richard Biener writes:
>> +/* Create vector init for vectorized iv. */
>> +static tree
>> +vect_create_nonlinear_iv_init (gimple_seq* stmts, tree init_expr,
>> + tree step_expr, poly_uint64 nunits,
>> + tree vectype,
>> +
Takayuki 'January June' Suwa writes:
> Thanks for your response.
>
> On 2022/08/03 16:52, Richard Sandiford wrote:
>> Takayuki 'January June' Suwa via Gcc-patches
>> writes:
>>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>>> data flow consistent, but it also
Jeff Law via Gcc-patches writes:
> On 8/3/2022 1:52 AM, Richard Sandiford via Gcc-patches wrote:
>> Takayuki 'January June' Suwa via Gcc-patches
>> writes:
>>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>>>
Richard Biener writes:
> On Tue, 2 Aug 2022, Richard Sandiford wrote:
>
>> Currently SLP tries to force permute operations "down" the graph
>> from loads in the hope of reducing the total number of permutes
>> needed or (in the best case) removing the need for the permutes
>> entirely. This
Martin Jambor writes:
> Hi Richard,
>
> On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
>> A later patch wants to be able to pass around subarray views of an
>> existing array. The standard class to do that is std::span, but it's
>> a C++20 thing. This
Takayuki 'January June' Suwa via Gcc-patches writes:
> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
> data flow consistent, but it also increases register allocation pressure
> and thus often creates many unwanted register-to-register moves that
> cannot be optimized
Richard Sandiford via Gcc-patches writes:
> "Roger Sayle" writes:
>> Many thanks to Segher and Richard for pointing out that my removal
>> of optimizations of ABS(ABS(x)) and ABS(FFS(x)) in the original version
>> of this patch was incorrect, and my assumption t
"Roger Sayle" writes:
> Many thanks to Segher and Richard for pointing out that my removal
> of optimizations of ABS(ABS(x)) and ABS(FFS(x)) in the original version
> of this patch was incorrect, and my assumption that these would be
> subsumed by val_signbit_known_clear_p was mistaken. That
Philipp Rimmele via Gcc writes:
> Hi,
>
> I'm developing a GCC plugin, and I don't understand why there is a
> "try_finally_expr" in a must_not_throw area in my AST. It happens in the
> destructors.
> Here is my AST:
> function_decl Exception::__dt_base
> 1:
"Roger Sayle" writes:
> This patch implements some additional zero-extension and sign-extension
> related optimizations in simplify-rtx.cc. The original motivation comes
> from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:
>
> Failed to match this instruction:
> (set (reg:DI
Richard Biener writes:
> The following teaches VN to handle reads from .MASK_STORE and
> .LEN_STORE. For this push_partial_def is extended first for
> convenience so we don't have to handle the full def case in the
> caller (possibly other paths can be simplified then). Also
> the partial
Ping^3 for the configure bits.
Richard Sandiford via Gcc-patches writes:
> On aarch64, --with-arch, --with-cpu and --with-tune only have an
> effect on the driver, so “./xgcc -B./ -O3” can give significantly
> different results from “./cc1 -O3”. --with-arch did have a limited
> eff
Currently SLP tries to force permute operations "down" the graph
from loads in the hope of reducing the total number of permutes
needed or (in the best case) removing the need for the permutes
entirely. This patch tries to extend it as follows:
- Allow loads to take a different permutation from
Seems this thread has become a bit heated, so I'll try to proceed
with caution :-)
In the below, I'll use "X-mode const_int" to mean "a const_int that
is known from context to represent an X-mode value". Of course,
the const_int itself always stores VOIDmode.
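To make the "X-mode const_int" terminology concrete, here is a minimal sketch of how one stored integer can read differently depending on the width that the context supplies. It assumes the usual GCC convention that a const_int holds a value sign-extended from the precision of the mode it represents; `sign_extend` is a hypothetical helper written for this illustration, not a GCC function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a GCC API): interpret the low `bits` bits of
// `value` as a two's-complement number and sign-extend it to 64 bits.
// This mimics how a mode-less integer is read once context supplies a width.
std::int64_t sign_extend(std::int64_t value, unsigned bits)
{
  assert(bits >= 1 && bits <= 64);
  if (bits == 64)
    return value;
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  // Flipping and then subtracting the sign bit propagates it upwards.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}
```

Under that assumption, the bit pattern 0xFF reads as -1 when the context is 8 bits wide and as 255 when it is 16 bits wide, which is why the mode has to come from context rather than from the constant itself.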
"Roger Sayle" writes:
> Hi Segher,
Dimitrije Milosevic writes:
>> Do you know someone very familiar with MIPS and GCC and capable as a
>> port maintainer? An active MIPS port maintainer will make the situation
>> better.
> Sadly, no. I agree it would make things easier.
Yeah, I agree that's what we need. I stepped down from
Richard Biener writes:
> On Wed, 27 Jul 2022, juzhe.zh...@rivai.ai wrote:
>
>> From: zhongjuzhe
>>
>> gcc/ChangeLog:
>>
>> * expr.cc (expand_assignment): Change GET_MODE_PRECISION to
>> GET_MODE_BITSIZE
>>
>> ---
>> gcc/expr.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1
graphds_scc says that it uses Tarjan's algorithm, but it looks like
it uses Kosaraju's algorithm instead (dfs one way followed by dfs
the other way).
OK to install?
Richard
gcc/
* graphds.cc (graphds_scc): Fix algorithm attribution.
---
gcc/graphds.cc | 2 +-
1 file changed, 1
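For reference, the "dfs one way followed by dfs the other way" shape described in the graphds_scc note is what distinguishes Kosaraju's algorithm from Tarjan's single-pass approach. The following is a minimal standalone sketch of Kosaraju's algorithm, not the graphds.cc code; the `Graph` alias and all function names are assumptions made for this illustration.

```cpp
#include <vector>

// Adjacency-list graph: g[u] lists the successors of vertex u.
using Graph = std::vector<std::vector<int>>;

// First pass: record vertices in order of DFS completion.
static void dfs_postorder(const Graph &g, int u, std::vector<bool> &seen,
                          std::vector<int> &order)
{
  seen[u] = true;
  for (int v : g[u])
    if (!seen[v])
      dfs_postorder(g, v, seen, order);
  order.push_back(u);
}

// Second pass: label everything reachable from u in the reversed graph.
static void dfs_label(const Graph &rev, int u, int label,
                      std::vector<int> &comp)
{
  comp[u] = label;
  for (int v : rev[u])
    if (comp[v] < 0)
      dfs_label(rev, v, label, comp);
}

// Returns the component index of each vertex.
std::vector<int> kosaraju_scc(const Graph &g)
{
  const int n = static_cast<int>(g.size());
  std::vector<bool> seen(n, false);
  std::vector<int> order;
  for (int u = 0; u < n; ++u)
    if (!seen[u])
      dfs_postorder(g, u, seen, order);

  // Build the reversed graph for the second traversal.
  Graph rev(n);
  for (int u = 0; u < n; ++u)
    for (int v : g[u])
      rev[v].push_back(u);

  // Visit vertices in decreasing completion order; each new DFS tree in the
  // reversed graph is one strongly connected component.
  std::vector<int> comp(n, -1);
  int label = 0;
  for (auto it = order.rbegin(); it != order.rend(); ++it)
    if (comp[*it] < 0)
      dfs_label(rev, *it, label++, comp);
  return comp;
}
```

Tarjan's algorithm would instead find the components in one traversal by tracking low-link values on an explicit stack, so two separate DFS passes over opposite edge directions point to Kosaraju.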
Andrew Carlotti writes:
> On Wed, Jul 13, 2022 at 05:36:04PM +0100, Richard Sandiford wrote:
>> I like the part about getting rid of:
>>
>> static tree
>> aarch64_simd_builtin_type (machine_mode mode,
>> bool unsigned_p, bool poly_p)
>>
>> and the flow of the new
Richard Ball writes:
> Replace manual swapping idiom with std::swap in aarch64.cc
>
> gcc/config/aarch64/aarch64.cc has a few manual swapping idioms of the form:
>
> x = in0, in0 = in1, in1 = x;
>
> The preferred way is using the standard:
>
> std::swap (in0, in1);
>
> We should just fix these to
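The two forms quoted above behave identically, as this small sketch shows. It uses plain `int` operands as a stand-in, since the operand types in aarch64.cc are not shown in the quoted message; the function names are hypothetical.

```cpp
#include <utility>

// Manual swap through a temporary, written as in the quoted idiom.
void swap_manual(int &in0, int &in1)
{
  int x;
  x = in0, in0 = in1, in1 = x;
}

// The same effect using the standard library.
void swap_std(int &in0, int &in1)
{
  std::swap(in0, in1);
}
```

The std::swap form needs no named temporary and states the intent directly, which is presumably why the patch prefers it.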
Richard Biener writes:
> On Thu, Jul 14, 2022 at 9:55 AM Prathamesh Kulkarni
> wrote:
>>
>> On Wed, 13 Jul 2022 at 12:22, Richard Biener
>> wrote:
>> >
>> > On Tue, Jul 12, 2022 at 9:12 PM Prathamesh Kulkarni via Gcc-patches
>> > wrote:
>> > >
>> > > Hi Richard,
>> > > For the following test:
Prathamesh Kulkarni writes:
> Hi,
> For the following test case:
>
> svint32_t foo()
> {
> int32x4_t v = (int32x4_t) { 1, 2, 3, 4 };
> svint32_t v2 = svld1rq_s32 (svptrue_b8(), &v[0]);
> return v2;
> }
>
> After applying workaround in forwprop to not simplify VEC_PERM_EXPR in
>
Andrew Carlotti writes:
> This removes a significant number of intrinsic definitions from the arm_neon.h
> header file, and reduces the amount of code duplication. The new macros and
> data structures are intended to also facilitate moving other intrinsic
> definitions out of the header file in
Andrew Carlotti writes:
> There were several similarly-named functions, which each built or looked up a
> type using a different subset of valid modes or qualifiers.
>
> This change combines these all into a single function, which can additionally
> handle const and pointer qualifiers.
I like
Andrew Carlotti writes:
> This has been unused since 2014, so there's no reason to retain it.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.cc
> (enum aarch64_type_qualifiers): Remove qualifier_internal.
> (aarch64_init_simd_builtin_functions): Remove
Andrew Carlotti writes:
> We already have a V1DF mode, so this makes the vector modes more consistent.
>
> Additionally, this allows us to recognise uint64x1_t and int64x1_t types given
> only the mode and type qualifiers (e.g. in aarch64_lookup_simd_builtin_type).
>
> gcc/ChangeLog:
>
> *
Andrew Carlotti writes:
> This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables
> better optimisation during GIMPLE passes.
>
> gcc/
>
> * config/aarch64/aarch64-builtins.c
> (aarch64_general_gimple_fold_builtin): Add combine.
>
> gcc/testsuite/
>
> *
This patch extends the fix for PR106253 to AArch32. As with AArch64,
we were using ACLE intrinsics to vectorise scalar built-ins, even
though the two sometimes have different ECF_* flags. (That in turn
is because the ACLE intrinsics should follow the instruction semantics
as closely as possible,
Richard Biener via Gcc-patches writes:
> On Tue, Jul 12, 2022 at 4:38 PM Andrew Carlotti
> wrote:
>>
>> aarch64_general_gimple_fold_builtin doesn't check whether the LHS of a
>> function call is null before converting it to an assign statement. To avoid
>> returning an invalid GIMPLE statement
Ping^2 for the configure bits.
Richard Sandiford via Gcc-patches writes:
> On aarch64, --with-arch, --with-cpu and --with-tune only have an
> effect on the driver, so “./xgcc -B./ -O3” can give significantly
> different results from “./cc1 -O3”. --with-arch did have a limited
> eff
The PR is about the aarch64 port using an ACLE built-in function
to vectorise a scalar function call, even though the ECF_* flags for
the ACLE function didn't match the ECF_* flags for the scalar call.
To some extent that kind of difference is inevitable, since the
ACLE intrinsics are supposed to
aarch64_builtin_vectorized_function handles some built-in functions
that already have equivalent internal functions. This seems to be
redundant now, since the target builtins that it chooses are mapped
to the same optab patterns as the internal functions.
Tested on aarch64-linux-gnu & pushed.
In g:76c3041b856cb0 I'd removed a "C ? optab_vector : optab_mixed_sign"
argument from a call to directly_supported_p, thinking that the argument
only existed because of the condition (which I was removing). But the
difference between the scalar and vector forms matters for shifts,
so we do still
I know it'll seem like make-work, but could you put the combine flag
in a separate follow-on patch? Reorganising the existing flags
(very welcome!) and adding new ones seem like different things.
TBH I'm a bit suspicious of the combine flag. What fundamental
property holds true after combine
Tamar Christina writes:
>> > so that the multiple_p test is skipped if the structure is undefined.
>>
>> Actually, we should probably skip the constant_multiple_p test as well.
>> Keeping it would only be meaningful for little-endian.
>>
>> simplify_gen_subreg should already do the necessary
Richard Biener writes:
> The final loop IV use after the loop has that not in LC SSA
> (and inserts not simplified _2 = _3 - 0 stmts). In particular
> since it splits the exit edge when there's a virtual PHI in the
> destination it breaks virtual LC SSA form (but likely also
> non-virtual).
>
>
Sorry for the slow review.
Andrew Carlotti via Gcc-patches writes:
> Hi,
>
> This removes a significant number of intrinsic definitions from the arm_neon.h
> header file, and reduces the amount of code duplication. The new macros and
> data structures are intended to also facilitate moving other
>> > > > -Original Message-
>> >> > > > From: Richard Sandiford
>> >> > > > Sent: Thursday, June 16, 2022 7:54 PM
>> >> > > > To: Tamar Christina
>> >> > > > Cc: gc
Xi Ruoyao via Gcc-patches writes:
> On Fri, 2022-07-01 at 12:40 +, Dimitrije Milosevic wrote:
>> Building the ASAN for the n32 MIPS ABI currently fails, due to a few reasons:
>> - defined(__mips64), which is set solely based on the architecture type
>> (32-bit/64-bit),
>> was still used in
Aldy Hernandez via Gcc-patches writes:
> Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of
> wide_int-like objects along with the nonzero bits. We frequently lose
> precision when streaming out our higher resolution iranges. The plan
> was always to store the full irange
Richard Biener via Gcc-patches writes:
> This reverts the change as discussed.
Thanks!
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
>
> 2022-07-01 Richard Biener
>
> * tree-ssa-sccvn.cc (vn_reference_lookup_3): Revert
> back to using maybe_ne (off, -1).
> ---
>
"Andre Vieira (lists)" writes:
> On 29/06/2022 08:18, Richard Sandiford wrote:
>>> + break;
>>> +case AARCH64_RBIT:
>>> +case AARCH64_RBITL:
>>> +case AARCH64_RBITLL:
>>> + if (mode == SImode)
>>> + icode = CODE_FOR_aarch64_rbitsi;
>>> + else
>>> + icode =
Richard Biener writes:
> On Fri, 1 Jul 2022, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches writes:
>> > The following makes sure to not use the original TBAA type for
>> > looking up a value across an aggregate copy when we had to offset
>> > the read.
>> >
>> > Bootstrapped and
Richard Biener via Gcc-patches writes:
> The following makes sure to not use the original TBAA type for
> looking up a value across an aggregate copy when we had to offset
> the read.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.
>
> 2022-06-30 Richard Biener
>
>
Jakub Jelinek writes:
> Hi!
>
> As the following self-test testcase shows, wi::shifted_mask sometimes
> doesn't create canonicalized wide_ints, which then fail to compare equal
> to canonicalized wide_ints with the same value.
> In particular, wi::mask (128, false, 128) gives { -1 } with len 1
; nd ; Richard Earnshaw
>> > > > ; Marcus Shawcroft
>> > > > ; Kyrylo Tkachov
>> >
>> > > > Subject: Re: [PATCH 1/2]AArch64 Add fallback case using sdot for
>> > > > usdot
>> > > >
>> > > > Richard Sandif
"Andre Vieira (lists)" writes:
> On 17/06/2022 11:54, Richard Sandiford wrote:
>> "Andre Vieira (lists)" writes:
>>> Hi,
>>>
>>> This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
>>>
>>> Bootstrapped and regression tested on aarch64-none-linux.
>>>
>>> OK for trunk?
>>
Richard Biener writes:
> On Thu, 23 Jun 2022, Richard Sandiford wrote:
>> In a reduction pair like:
>>
>> typedef float T;
>>
>> void
>> f1 (T *x)
>> {
>> T res1 = 0;
>> T res2 = 0;
>> for (int i = 0; i < 100; ++i)
>> {
>> res1 += x[i * 2];
>> res2 += x[i * 2
In a reduction pair like:
typedef float T;
void
f1 (T *x)
{
T res1 = 0;
T res2 = 0;
for (int i = 0; i < 100; ++i)
{
res1 += x[i * 2];
res2 += x[i * 2 + 1];
}
x[0] = res1;
x[1] = res2;
}
it isn't easy to predict whether the initial
The fix for PR106019 regressed slsr-39.c for -m32 -march=cascadelake
because we are now able to vectorise the code. (Whether the code model
should be allowing that is a different question -- the vectorised code
looked worse to me.)
The test runs at -O2 and predates vectorisation being enabled at
Alexandre Oliva writes:
> On Jun 21, 2022, Richard Sandiford wrote:
>
>> Could we instead have a new target selector for whether the memory
>> map includes xGB of RAM?
>
> How about this? Testing on aarch64-rtems6.0. Ok to install?
>
>
> aarch64: testsuite: symbol-range fallback to compile
>
>
When dr_may_alias_p is called without a loop context, it tries
to use the tree-affine interface to calculate the difference
between the two addresses and use that difference to check whether
the gap between the accesses is known at compile time. However, as the
example in the PR shows, this
Alexandre Oliva writes:
> On some of our embedded aarch64 targets, RAM size is too small for
> this test to fit. It doesn't look like this test requires linking,
> and if it does, the -tiny version may presumably get most of the
> coverage without going overboard in target system requirements.
Tamar Christina via Gcc-patches writes:
>> -Original Message-
>> From: Richard Biener
>> Sent: Monday, June 20, 2022 12:56 PM
>> To: Tamar Christina
>> Cc: Andrew Pinski via Gcc-patches ; nd
>>
>> Subject: RE: [PATCH]middle-end Add optimized float addsub without
>> needing
Richard Biener via Gcc-patches writes:
> On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> wrote:
>>
>> Hi All,
>>
>> This adds a match.pd rule that drops the bitwise NOTs when both arguments
>> to a
>> subtract are inverted. i.e. for:
>>
>> float g(float a, float b)
>> {
>>
Ping for the configure bits
Richard Sandiford via Gcc-patches writes:
> On aarch64, --with-arch, --with-cpu and --with-tune only have an
> effect on the driver, so “./xgcc -B./ -O3” can give significantly
> different results from “./cc1 -O3”. --with-arch did have a limited
> eff
Richard Sandiford via Gcc-patches writes:
> Tamar Christina writes:
>>> -Original Message-
>>> From: Richard Sandiford
>>> Sent: Monday, June 13, 2022 9:41 AM
>>> To: Tamar Christina
>>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.d
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Monday, June 13, 2022 9:41 AM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
>> Subject: Re: [PATCH]middle-end Use subregs to expand COMPLEX_EXPR to
>> set the lowpart.
>>
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
>
> Bootstrapped and regression tested on aarch64-none-linux.
>
> OK for trunk?
Sorry for the slow review.
>
> gcc/ChangeLog:
>
> 2022-06-10 Andre Vieira
>
> *
Richard Sandiford via Gcc-patches writes:
> Tamar Christina writes:
>> Hi All,
>>
>> The usdot operation is common in video encoders and decoders including some of
>> the most widely used ones.
>>
>> This patch adds a +dotprod version of the optab as
Tamar Christina writes:
> Hi All,
>
> The usdot operation is common in video encoders and decoders including some of
> the most widely used ones.
>
> This patch adds a +dotprod version of the optab as a fallback for when you do
> have sdot but not usdot available.
>
> The fallback works by adding
The recent internal-fn “clean-ups” triggered problems on nvptx
because some of the omp_simt_* patterns had modeless operands.
I wondered about adapting expand_fn_using_insn to cope with that,
but then the problem becomes: what should the mode of operand 0
be when there is no lhs? The answer
Jeff Law via Gcc-patches writes:
> On 6/13/2022 5:54 AM, Richard Biener wrote:
>> On Sun, Jun 12, 2022 at 7:27 PM Jeff Law via Gcc-patches
>> wrote:
>> [...]
>>> On a related topic, any thoughts on keeping complex objects as complex
>>> types/modes through gimple and into at least parts of the
In f2ebf2d98efe0ac2314b58cf474f44cb8ebd5244 I'd forced the
chosen unroll factor to be a factor of the VF, in order to
work around an exact_div ICE in PR105254. This was completely
bogus -- clearly I didn't look in enough detail at why we ended
up with an unrolled VF that wasn't a multiple of the
Tamar pointed out that:
(unspec:M ... )
didn't work when a value of attribute FOO was defined by
define_constant, such as in:
(define_int_attribute FOO [(UNSPEC_A "UNSPEC_B") ...])
This is because symbolic constants are substituted during lexing
and only apply to bare symbol names, not
Richard Biener writes:
> On Mon, 13 Jun 2022, Tamar Christina wrote:
>
>> > -Original Message-
>> > From: Richard Biener
>> > Sent: Monday, June 13, 2022 12:48 PM
>> > To: Tamar Christina
>> > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
>> >
>> > Subject: RE: [PATCH
On aarch64, --with-arch, --with-cpu and --with-tune only have an
effect on the driver, so “./xgcc -B./ -O3” can give significantly
different results from “./cc1 -O3”. --with-arch did have a limited
effect on ./cc1 in previous releases, although it didn't work
entirely correctly.
Being of a lazy
Tamar Christina writes:
> Hi All,
>
> When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs. One for the
> lowpart and one for the highpart.
>
> The problem with this is that in RTL the lvalue of the RTX is the only thing
> tying the two instructions together.
>
> This means that e.g.
Richard Biener writes:
> On Wed, 8 Jun 2022, Richard Sandiford wrote:
>> Tamar Christina writes:
>> >> -Original Message-
>> >> From: Richard Sandiford
>> >> Sent: Wednesday, June 8, 2022 11:31 AM
>> >> To: Tamar Christina
>> >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> >>
Several existing internal functions map directly to an instruction
defined in target-insns.def. This patch makes it easier to define
more such functions in future.
This should help to reduce cut-&-paste, but more importantly, it allows
the difference between optab functions and target-insns.def
internal-fn.c has quite a few functions that simply map the result
of the call to an instruction's output operand (if any) and map
each argument to an instruction's input operand, in order.
This patch adds a single function for doing that. It's really
just a generalisation of
Andrew Carlotti via Gcc-patches writes:
> Hi all,
>
> This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables
> better optimisation during GIMPLE passes.
>
> Bootstrapped and tested on aarch64-none-linux-gnu, and tested for
> aarch64_be-none-linux-gnu via
Jakub Jelinek via Gcc-patches writes:
> Hi!
>
> This patch adds support for dlopening libmemkind.so on Linux and uses it
> for some kinds of allocations (but not yet e.g. pinned memory).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (with libmemkind
> around) and compile tested with
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Wednesday, June 8, 2022 3:36 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> ; rguent...@suse.de; ro...@eyesopen.com
>> Subject:
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Wednesday, June 8, 2022 11:31 AM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> Subject: Re: [PATCH]AArch64 relax predicate on
Christophe Lyon writes:
> On 6/7/22 19:44, Richard Sandiford wrote:
>> Christophe Lyon via Gcc-patches writes:
>>> While working on enabling DFP for AArch64, I noticed new failures in
>>> gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
>>> caused by DFP types handling. These
Tamar Christina writes:
> Hi All,
>
> At some point in time we started lowering the ld1r instructions in gimple.
>
> That is:
>
> uint8x8_t f1(const uint8_t *in) {
> return vld1_dup_u8(&in[1]);
> }
>
> generates at gimple:
>
> _3 = MEM[(const uint8_t *)in_1(D) + 1B];
> _4 = {_3, _3, _3, _3,
Christophe Lyon via Gcc-patches writes:
> While working on enabling DFP for AArch64, I noticed new failures in
> gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
> caused by DFP types handling. These tests are generated during 'make
> check' and enabling DFP made generation
Prathamesh Kulkarni writes:
> On Mon, 6 Jun 2022 at 16:29, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> >> > {
>> >> >/* The pattern matching functions above are written to look for a
>> >> > small
>> >> > number to begin the sequence (0, 1, N/2). If we begin
Joel Hutton writes:
>> > Patches attached. They already incorporated the .cc rename, now
>> > rebased to be after the change to tree.h
>>
>> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
>>2, oprnd, half_type, unprom, vectype);
>>
>>tree var =
Prathamesh Kulkarni writes:
>> > {
>> >/* The pattern matching functions above are written to look for a small
>> > number to begin the sequence (0, 1, N/2). If we begin with an index
>> > @@ -24084,6 +24112,12 @@ aarch64_expand_vec_perm_const_1 (struct
>> > expand_vec_perm_d *d)
>>
"Kewen.Lin" writes:
> Hi,
>
> r10-3912 updated the format of VECTOR_MODES_WITH_PREFIX by
> adding one more parameter ORDER, the related document is out
> of date. So update the document for ORDER.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -
>
> gcc/ChangeLog:
>
> * machmode.def
Richard Biener writes:
> On Sat, May 21, 2022 at 5:31 PM Roger Sayle
> wrote:
>> This patch simplifies vec_unpack_hi_expr/vec_unpack_lo_expr of a uniform
>> constructor or vec_duplicate operand. The motivation is from PR 105621
>> where after optimization, we're left with:
>>
>> vect_cst__21
Prathamesh Kulkarni writes:
> On Thu, 12 May 2022 at 16:15, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Wed, 11 May 2022 at 12:44, Richard Sandiford
>> > wrote:
>> >>
>> >> Prathamesh Kulkarni writes:
>> >> > On Fri, 6 May 2022 at 16:00, Richard Sandiford
>> >> >
"H.J. Lu" writes:
> On Mon, May 30, 2022 at 09:35:43AM +0100, Richard Sandiford wrote:
>> "H.J. Lu" writes:
>> > ---
>> > RTL DSE tracks redundant constant stores within a basic block. When RTL
>> > loop invariant motion hoists a constant initialization out of the loop
>> > into a separate
Vladimir Makarov via Gcc-patches writes:
> On 2022-05-29 23:05, Hongtao Liu wrote:
>> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches
>> wrote:
>>>
>>> On 2022-05-24 23:39, liuhongt wrote:
Right now, mem_cost for separate mem alternative is 1 * frequency which
is pretty
"H.J. Lu" writes:
> On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
>> "H.J. Lu" writes:
>> > On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
>> > wrote:
>> >>
>> >> "H.J. Lu via Gcc-patches" writes:
>> >> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
Jeff Law via Gcc-patches writes:
> On 5/29/2022 3:43 PM, H.J. Lu wrote:
>> On Sat, May 28, 2022 at 11:37 AM Jeff Law via Gcc-patches
>> wrote:
>>>
>>>
>>> On 5/26/2022 2:43 PM, H.J. Lu via Gcc-patches wrote:
On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
> "H.J. Lu"
(Sorry for the slow reply, was off on Friday)
Richard Biener writes:
> On Wed, May 25, 2022 at 10:24 PM Prathamesh Kulkarni
> wrote:
>>
>> On Thu, 26 May 2022 at 00:37, Richard Biener
>> wrote:
> [...]
>> > x86 now accepts V4SI V8SI permutes because we don’t ask it correctly and
>> > thus my
Wilco Dijkstra writes:
> Further cleanup option processing. Remove the duplication of global
> variables for CPU and tune settings so that CPU option processing is
> simplified even further. Move global variables that need save and
> restore due to target option processing into aarch64.opt. This
"H.J. Lu" writes:
> On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
> wrote:
>>
>> "H.J. Lu via Gcc-patches" writes:
>> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>> >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>> >> wrote:
>> >> >
>> >> > When recording
Wilco Dijkstra writes:
> Hi Richard,
>
> I've added a comment - as usual it's just a number. A quick grep in gcc and
> glibc showed that priorities 98-101 are used, so I just went a bit below so it
> has a higher priority than typical initializations.
Thanks. OK for trunk, and for backports
Richard Biener via Gcc-patches writes:
> On Tue, May 24, 2022 at 10:11 PM H.J. Lu wrote:
>>
>> On Mon, May 23, 2022 at 11:42 PM Richard Biener
>> wrote:
>> >
>> > On Mon, May 23, 2022 at 8:34 PM H.J. Lu wrote:
>> > >
>> > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>> > >
Szabolcs Nagy writes:
> The 05/13/2022 16:35, Richard Sandiford wrote:
>> Szabolcs Nagy via Gcc-patches writes:
>> > The RA_SIGN_STATE dwarf pseudo-register is normally only set using the
>> > DW_CFA_AARCH64_negate_ra_state (== DW_CFA_window_save) operation which
>> > toggles the return address
"H.J. Lu via Gcc-patches" writes:
> On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>> wrote:
>> >
>> > When recording store for RTL dead store elimination, check if the source
>> > register is set only once to a
Wilco Dijkstra writes:
> Increase the priority of the init_have_lse_atomics constructor so it runs
> before other constructors. This improves chances that rr works when LSE
> atomics are supported.
Can you add a comment above the function explaining why we chose 90
in particular? I see 100 was
Richard Biener writes:
> This is another place where we fail to pass down the mode of a
> CONST_INT.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> 2022-05-24 Richard Biener
>
> PR middle-end/105711
> * expmed.cc
Prathamesh Kulkarni writes:
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index c5006afc00d..0a3c733ada9 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}.
> @var{is_packed} is false if the scalar
> access using