Hi, all
This patch aims to extend __builtin_ia32_cmp[p|s][s|d] from avx to
sse/sse2/avx, where the immediate is in the range [0, 7].
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
BRs,
Lin
gcc/ChangeLog:
* config/i386/avxintrin.h: Move cmp[p|s][s|d] to
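As a rough illustration of what the immediate range [0, 7] selects, here is a scalar sketch of the eight legacy comparison predicates. The function name cmp_imm_model is hypothetical and this is only a portable model of the predicate semantics, not the builtin or the patch's implementation.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical scalar model of the eight legacy comparison predicates
   selected by immediates 0..7.  It ignores exception signaling and
   returns a plain bool instead of an all-ones/all-zeros lane mask.  */
static bool
cmp_imm_model (float a, float b, int imm)
{
  bool unord = isnan (a) || isnan (b);

  switch (imm & 7)
    {
    case 0: return !unord && a == b;    /* EQ: ordered and equal */
    case 1: return !unord && a < b;     /* LT: ordered and less than */
    case 2: return !unord && a <= b;    /* LE: ordered and less or equal */
    case 3: return unord;               /* UNORD: either operand is NaN */
    case 4: return unord || a != b;     /* NEQ: unordered or not equal */
    case 5: return unord || !(a < b);   /* NLT: unordered or not less than */
    case 6: return unord || !(a <= b);  /* NLE: unordered or not less/equal */
    default: return !unord;             /* ORD: neither operand is NaN */
    }
}
```

Predicates 4 to 6 are the negations of 0 to 2, which is why they hold when an operand is NaN.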
Hi,
This patch optimizes vector construction with two vector doubleword loads.
It generates an optimal insn sequence as "xxlor" has lower latency than
"mtvsrdd" on Power10.
Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. OK for the trunk?
Thanks
Gui Haochen
Looks like this new version works the same to fix the warning without
the issues reported here.
All 23_containers/vector tests run in C++98/14/20 so far.
OK to commit once I've completed the testsuite run (or some bot has done it
for me!)?
I'll look for a PR to associate, if you have one in mind do
mkdir, chdir and chmod functions are defined in librtemscpu, that
doesn't get linked in during libstdc++-v3 configure, but applications
use -qrtems for linking, which brings those symbols in, so it makes
sense to mark them as available so that the C++ filesystem APIs are
enabled.
Regstrapped on
While looking at the index, I noticed that some options had a leading
`-` in their index entries, which is wrong. I then noticed that some
targets had no index entry for `mcmodel=` or used `-mcmodel`
incorrectly.
This fixes both of those and regenerates the urls files; see that the
`-mcmodel=` option now has
On 5/29/24 8:41 PM, Hans-Peter Nilsson wrote:
I do bootstraps and regression testsuite runs on a variety of systems
via qemu (alpha, m68k, aarch64, s390, ppc64, etc). It ain't fast, but
it does work if QEMU is in pretty good shape and you can find a root
filesystem to use.
That might
Hi Kewen,
On 2024/5/29 13:26, Kewen.Lin wrote:
> I can understand re-using "unordered" and "eq" will save some efforts than
> doing with unspecs, but they are actually RTL codes instead of bits on the
> specific hardware CR, a downside is that people who aren't aware of this
> design point can have
Gently ping :)
Hi Richard, is it OK to adopt the ccmp change? Or do you know who can
help to review this part?
Thanks.
On Thu, May 23, 2024 at 16:27, Hongyu Wang wrote:
>
> Gently ping for this :)
> Hi Richard, is it OK to adopt the ccmp change? Or do you know who can
> help to review this part?
> Thanks.
Hi,
The builtin isinf is not folded at the front end if the corresponding optab
exists. This causes range evaluation to fail on targets that have
optab_isinf. For instance, range-sincos.c will fail on targets that
have optab_isinf, as it calls builtin_isinf.
This patch fixes the problem
Hi,
This patch adds the range op for builtin isnormal. It also adds two
helper functions in frange to detect the range of normal floating-point
values and the range of subnormal or zero values.
Compared to the previous version, the main change is to set the range to
1 if the value is a normal number and to 0 otherwise.
Hi,
This patch adds the range op for builtin isfinite.
Compared to the previous version, the main change is to set the range to
1 if the value is a finite number and to 0 otherwise.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652220.html
Bootstrapped and tested on x86 and powerpc64-linux BE and LE
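To make the "1 if finite, otherwise 0" folding concrete, here is a toy model over a closed interval of doubles. The names fold_result and fold_isfinite are hypothetical; this is not GCC's frange API and only sketches the idea under the assumption that a range is described by its bounds plus a "may be NaN" flag.

```c
#include <math.h>

/* Hypothetical tri-state result of folding isfinite over a range.  */
enum fold_result { FOLD_FALSE = 0, FOLD_TRUE = 1, FOLD_UNKNOWN = -1 };

/* Toy model: the value is known to lie in [lo, hi], and may_be_nan
   says whether NaN is also possible.  */
static enum fold_result
fold_isfinite (double lo, double hi, int may_be_nan)
{
  /* Both bounds finite and NaN excluded: every value is finite.  */
  if (!may_be_nan && isfinite (lo) && isfinite (hi))
    return FOLD_TRUE;

  /* Both bounds are the same infinity: no finite value is possible,
     and NaN (if allowed) is not finite either.  */
  if (isinf (lo) && isinf (hi) && (lo > 0) == (hi > 0))
    return FOLD_FALSE;

  /* Otherwise the range mixes finite and non-finite values.  */
  return FOLD_UNKNOWN;
}
```

A range such as [0, +Inf] stays unknown because it contains both finite values and an infinity.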
> Date: Wed, 29 May 2024 20:07:22 -0600
> From: Jeff Law
> > There appears to be only a single supported SPARC machine in
> > cfarm: cfarm216, and I currently can't reach it due to what
> > appears to be issues at my end. I guess I'll either fix
> > that or breathe life into sparc-elf+sim.
> Or
On 5/29/24 7:28 PM, Hans-Peter Nilsson wrote:
From: Hans-Peter Nilsson
Date: Mon, 27 May 2024 19:51:47 +0200
2: Does not depend on 1, but corrects an incidentally found wart:
find_basic_block calls fail too often. Replace it with "modern"
insn-to-basic-block cross-referencing.
3: Just
> From: Hans-Peter Nilsson
> Date: Mon, 27 May 2024 19:51:47 +0200
> 2: Does not depend on 1, but corrects an incidentally found wart:
> find_basic_block calls fail too often. Replace it with "modern"
> insn-to-basic-block cross-referencing.
>
> 3: Just an addendum to 2: removes an "if",
This patch improves vectorization of certain floating point widening operations
for the aarch64 target by adding vector floating point extend patterns for
V2SF->V2DF and V4HF->V4SF conversions.
PR target/113880
PR target/113869
gcc/ChangeLog:
*
From: Greg McGary
Add option -m(no-)autovec-segment to enable/disable autovectorizer
from emitting vector segment load/store instructions. This is useful for
performance experiments.
gcc/ChangeLog:
* config/riscv/autovec.md (vec_mask_len_load_lanes,
vec_mask_len_store_lanes):
From: Greg McGary
gcc/ChangeLog:
* gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent
divide-by-zero.
* testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice.
---
No changes in v3. Depends on the risc-v backend option added in patch 1 to
trigger the ICE.
---
Sending v3 to fixup testsuite issues and whitespace linter issue.
v2 changelog:
Rebased to squash Edwin's fixup into Greg's patch. Split out the middle-end
change and xfailed the associated testcase so the second patch can land
separately.
Relying on pre-commit CI for full testing.
v3
On 5/28/24 1:01 AM, Richard Biener wrote:
On Fri, May 24, 2024 at 10:46 AM Mariam Arutunian
wrote:
This patch adds a new compiler pass aimed at identifying naive CRC
implementations, characterized by the presence of a loop calculating a CRC
(polynomial long division).
Upon detection of a
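For readers unfamiliar with the term, a "naive CRC implementation" in the sense described above is a bit-by-bit loop like the sketch below. The function name crc8_naive and the choice of CRC-8 with polynomial 0x07 are assumptions made for illustration only; the message does not say which widths or polynomials the pass handles.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8 (polynomial 0x07, initial value 0, no reflection),
   chosen only as an example of the loop shape: one shift and one
   conditional XOR with the polynomial per input bit, i.e. polynomial
   long division over GF(2).  */
static uint8_t
crc8_naive (const uint8_t *data, size_t len)
{
  uint8_t crc = 0;

  for (size_t i = 0; i < len; i++)
    {
      crc ^= data[i];
      for (int bit = 0; bit < 8; bit++)
        {
          if (crc & 0x80)
            crc = (uint8_t) ((crc << 1) ^ 0x07);
          else
            crc = (uint8_t) (crc << 1);
        }
    }
  return crc;
}
```

The inner loop processes one bit at a time, which is the sort of code a table-based or hardware-assisted replacement would aim to speed up.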
On Wed, 2024-05-29 at 15:26 -0400, David Edelsohn wrote:
> On Mon, May 20, 2024 at 1:56 PM David Edelsohn
> wrote:
>
> > Hi, David
> >
> > Unfortunately r15-636-g770657d02c986c causes a bootstrap failure on AIX
> > when building f951 in stage2. cc1 and cc1plus link successfully. There
On Wed, 2024-05-29 at 16:35 -0400, Eric Gallager wrote:
> On Tue, May 28, 2024 at 1:21 PM David Malcolm
> wrote:
> >
> > Ping.
> >
> > This patch has actually been *very* helpful to me when debugging
> > selftest failures involving ASSERT_STREQ.
> >
> > Thanks
> > Dave
> >
>
> Currently
Maybe also add a mention of the toolchain's Mastodon account while
you're there? https://fosstodon.org/@gnutools
On Sun, May 26, 2024 at 6:05 PM Gerald Pfeifer wrote:
>
> Keep the reference as text; just not the link.
>
> Gerald
> ---
> htdocs/news.html | 3 +--
> 1 file changed, 1
On Tue, May 28, 2024 at 1:21 PM David Malcolm wrote:
>
> Ping.
>
> This patch has actually been *very* helpful to me when debugging
> selftest failures involving ASSERT_STREQ.
>
> Thanks
> Dave
>
Currently `diff` is only listed under the "Tools/packages necessary
for modifying GCC" section of
On May 23, 2024, at 6:28 AM, Alexandre Oliva wrote:
> I came up with an entirely different approach:
>
>
> g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when
> check_vect_support_and_set_flags finds vector support lacking for
> execution tests: tests decay to compile tests,
On Mon, May 20, 2024 at 1:56 PM David Edelsohn wrote:
> Hi, David
>
> Unfortunately r15-636-g770657d02c986c causes a bootstrap failure on AIX
> when building f951 in stage2. cc1 and cc1plus link successfully. There
> doesn't seem to be a similar failure for powerpc64-linux BE or LE.
>
> The
Richard and Joseph:
> On May 28, 2024, at 17:09, Qing Zhao wrote:
>
>>>
>>> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
>>> index fa17eff551e8..d75b23668925 100644
>>> --- a/gcc/varasm.cc
>>> +++ b/gcc/varasm.cc
>>> @@ -5082,6 +5082,11 @@ initializer_constant_valid_p_1 (tree value, tree
>>>
After reimplementing late resolution of "declare variant" to use the
same mechanisms as metadirective, the declare_variant_alt and
calls_declare_variant_alt flags on struct cgraph_node are no longer
used by anything. For the purposes of marking functions that need
late resolution, the
The code and test case previously implemented the OpenMP 5.0 spec,
which said in section 2.3.1:
"For functions within a declare target block, the target trait is added
to the beginning of the set..."
In OpenMP 5.1, this was changed to
"For device routines, the target trait is added to the
gcc/testsuite/ChangeLog
* c-c++-common/gomp/metadirective-1.c: New.
* c-c++-common/gomp/metadirective-2.c: New.
* c-c++-common/gomp/metadirective-3.c: New.
* c-c++-common/gomp/metadirective-4.c: New.
* c-c++-common/gomp/metadirective-5.c: New.
*
This patch extends the mechanisms previously added to support dynamic
selectors in metavariant constructs to also apply to "declare
variant". The front-end mechanisms used to handle "declare variant"
via attributes attached to the function decls remain the same, but the
gimplifier now uses the
This patch adds support for metadirectives to the Fortran front end.
gcc/fortran/ChangeLog
* decl.cc (gfc_match_end): Handle metadirectives.
* dump-parse-tree.cc (show_omp_node): Likewise.
(show_code_node): Likewise.
* gfortran.h (enum gfc_statement): Add
libgomp/ChangeLog
* libgomp.texi (OpenMP 5.0): Mark metadirective and declare variant
as implemented.
(OpenMP 5.1): Mark target_device as supported.
Add changed interaction between declare target and OpenMP context
and dynamic selector support.
This patch implements the libgomp runtime support for the dynamic
target_device selector via the GOMP_evaluate_target_device function.
include/ChangeLog
* cuda/cuda.h (CUdevice_attribute): Add definitions for
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
This patch adds middle-end support for OpenMP metadirectives. Some
context selectors can be resolved during gimplification, but others need to
be deferred until the omp_device_lower pass, which requires that cgraph,
LTO streaming, inlining, etc all know about this construct as well.
The OpenMP spec says:
"If trait-property any is specified in the kind trait-selector of the
device selector set or the target_device selector sets, no other
trait-property may be specified in the same selector set."
GCC was not previously enforcing this restriction and several testcases
included
This patch adds C++ support for metadirectives. It uses the
c-family support committed with the corresponding C front end patch
to do early parse-time metadirective resolution when possible.
Additional C/C++ common testcases are provided in a subsequent
patch in the series.
gcc/cp/ChangeLog
This patch adds support to the C front end to parse OpenMP metadirective
constructs. It includes support for early parse-time resolution
of metadirectives (when possible) that will also be used by the C++ front
end.
Additional common C/C++ testcases are in a later patch in the series.
This patch adds the OMP_METADIRECTIVE tree node and shared tree-level
support for manipulating metadirectives. It defines/exposes
interfaces that will be used in subsequent patches that add front-end
and middle-end support, but nothing generates these nodes yet.
This patch also adds compile-time
This is an updated version of the patch series I posted a few weeks
ago:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650725.html
I won't duplicate the full list of things implemented/fixed here from
the original patch mail. The incremental changes since then include:
* I rebased the
On 5/29/24 00:20, Richard Biener wrote:
On Wed, May 29, 2024 at 1:39 AM Patrick O'Neill wrote:
From: Greg McGary
gcc/ChangeLog:
* gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent
divide-by-zero.
* testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove xfail.
On May 27, 2024, "Kewen.Lin" wrote:
> OK with these nits tweaked and re-tested well, thanks!
Thanks, here's what I've retested on ppc64le-linux-gnu, and will push
onto trunk eventually, after retesting also on ppc- and ppc64-vx7r2:
[testsuite] [powerpc] adjust -m32 counts for
This was patch 13 from the previous series. Note the previous series patch 12
was dropped. This patch is the same as the previous version. The additional
work to remove __builtin_vec_set_v1ti, __builtin_vec_set_v2di,
__builtin_vec_set_v2d per the feedback comments with equivalent gimple
This was patch 11 from the previous series. Patch was updated to address
feedback comments.
Carl
--
rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in
The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the
This was patch 8 in the previous series. Updated patch per the feedback
comments.
Carl
rs6000, remove __builtin_vsx_vperm_* built-ins
The undocumented built-ins:
__builtin_vsx_vperm_16qi_uns,
This was patch 10 from the previous series. The patch was updated to address
feedback comments.
Carl
---
rs6000, extend vec_xxpermdi built-in for __int128 args
Add new signed and unsigned overloaded instances for
This was patch 9 in the previous series. It was previously approved.
Reposting for completeness.
Carl
-
rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins
The undocumented
This was patch 7 in the previous series. Patch was updated to address the
feedback comments.
Carl
rs6000, remove the vec_xxsel built-ins, they are duplicates
The following undocumented
This was patch 5 in the previous series. It was previously approved. No
changes in this version. Being posted for completeness.
Carl
rs6000, remove duplicated built-ins of vecmergl and
vec_mergeh
The
This was patch 6 in the previous series. Updated the documentation file per
the comments. No functional changes to the patch.
Carl
rs6000, add overloaded vec_sel with int128 arguments
Extend the vec_sel
This is a new patch to remove the built-ins that were inadvertently missing in
the previous series.
Carl
--
rs6000, Remove redundant float/double type conversions
The following built-ins are redundant
Updated the patch per the feedback comments from the previous version.
Carl
---
rs6000, extend the current vec_{un,}signed{e,o} built-ins
The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
This patch was updated per the feedback comment from the previous version in
series 2.
Carl
---
rs6000, fix error in unsigned vector float to unsigned int built-in definitions
The built-in
I responded to comments about the patch from the previous patch series. No
functional changes were made to this patch.
Carl
--
rs6000, Remove __builtin_vsx_xvcvspsxws built-in.
The built-in __builtin_vsx_xvcvspsxws
This patch was approved in the previous series. There are no changes to this
patch. Reposting for completeness.
Carl
---
rs6000, Remove __builtin_vsx_cmple* builtins
The built-ins __builtin_vsx_cmple_u16qi,
GCC maintainers:
The following is an updated patch series to remove duplicate built-ins.
There are patches to extend an existing overloaded built-in to cover additional
input types.
A new patch, 0005-rs6000-Remove-redundant-float-double-type-conversion.patch,
was added to remove
Two-vector TBL instructions are fed by an aarch64_combinev16qi, whose
purpose is to put the two input data vectors into consecutive registers.
This aarch64_combinev16qi was then split after reload into individual
moves (from the first input to the first half of the output, and from
the second
> On 29.05.2024 at 15:30, Eric Botcazou wrote:
>
> Hi,
>
> Ada doesn't have an equivalent to transparent union types in GNU C so, when it
> needs to interface a C function that takes a parameter of a transparent union
> type, GNAT uses the type of the first member of the union on the Ada
Now that unified-shared memory works (with some devices), mark it as 'Y'
and link to the device-specific chapter. While there is always room for
improvement (like having opt-in partial support for managed-memory
semi-USM devices), it works sufficiently well for a 'Y'.
Additionally, I saw that 5.2
Hi,
this testcase shows another problem with missing comparators for metadata
in ICF. With value ranges available to loop optimizations during early
opts we can estimate number of iterations based on guarding condition that
can be split away by the fnsplit pass. This patch disables ICF when
number
On 5/29/24 03:19, Richard Biener wrote:
On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote:
The original patch causing the PR made ranger's cache re-entrant to
enable SCEV to use the current range_query when called from within ranger.
SCEV uses the currently active range query (via
Thanks Richard for the suggestion and review.
I used some tricky/ugly restrictions in v3 for the phi gen, as there are
all sorts of (cond in match.pd; I will try your proposal in v4.
Thanks again for the help.
Pan
-----Original Message-----
From: Richard Biener
Sent: Wednesday, May 29, 2024 8:36 PM
Currently, gcc warns about noreturn marked functions that return both
explicitly and implicitly, with no way to turn this warning off. clang does
have an option for these classes of warnings, -Winvalid-noreturn. However, we
can do better. Instead of just having 1 option that switches the
Revised to drop the cgraph change so I can self-approve the remaining patch.
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
#pragma target and optimize should also apply to implicitly-generated
functions like static initialization functions and defaulted special member
functions.
Revised to change mkdeps and the docs.
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
There is a trend in the broader C++ community to use a different extension
for module interface units, even though (in GCC) they are compiled in the
same way as other source files. Let's recognize
On Tue, 21 May 2024, Andi Kleen wrote:
> - Give error messages for all causes of non sibling call generation
> - When giving error messages clear the musttail flag to avoid ICEs
> - Error out when tree-tailcall failed to mark a must-tail call
> sibcall. In this case it doesn't know the true
Hi,
Ada doesn't have an equivalent to transparent union types in GNU C so, when it
needs to interface a C function that takes a parameter of a transparent union
type, GNAT uses the type of the first member of the union on the Ada side
(which is the type used to determine the passing mechanism
When vectorizing an early break loop with LENs (do we miss some
check here to disallow this?) we can end up deciding to insert
stmts after a GIMPLE_COND when doing SLP scheduling and trying
to be conservative with placing of stmts only dependent on
the implicit loop mask/len. The following avoids
The following avoids dumping 'vectorizing stmts using SLP' for
single-lane instances since that causes extra testsuite fallout.
* tree-vect-slp.cc (vect_schedule_slp): Gate dumping
'vectorizing stmts using SLP' on > 1 lanes.
---
gcc/tree-vect-slp.cc | 3 ++-
1 file changed, 2
The following performs single-lane SLP discovery for reductions.
It requires a fixup for outer loop vectorization where a check
for multiple types needs adjustments as otherwise bogus pointer
IV increments happen when there are multiple copies of vector stmts
in the inner loop.
For the reduction
On 5/28/24 23:55, Patrick O'Neill wrote:
> From: Greg McGary
>
> Add option -m(no-)autovec-segment to enable/disable autovectorizer
> from emitting vector segment load/store instructions. This is useful for
> performance experiments.
I think the question was raised before but does a vector tune
> On May 29, 2024, at 02:57, Richard Biener wrote:
>
> On Tue, May 28, 2024 at 11:09 PM Qing Zhao wrote:
>>
>> Thank you for the comments. See my answers below:
>>
>> Joseph, please see the last question, I need your help on it. Thanks a lot
>> for the help.
>>
>> Qing
>>
>>> On May 28,
Pushed, thanks!
On 2/27/24 20:13, Oskari Pirhonen wrote:
Add proper hints for implicit declaration of strerror.
The results could be confusing depending on the other included headers.
These example messages are from compiling a trivial program to print the
string for an errno value. It only
On Wed, May 29, 2024 at 08:20:01AM +0200, Tobias Burnus wrote:
> + if (num_devices > 0
> + && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
> +for (int dev = 0; dev < num_devices; dev++)
> + {
> + int pi;
> + CUresult r;
> + r = CUDA_CALL_NOCHECK
On Fri, 2024-05-24 at 12:42 +0400, Mariam Arutunian wrote:
> This patch adds a new compiler pass aimed at identifying naive CRC
> implementations, characterized by the presence of a loop calculating a CRC
> (polynomial long division).
> Upon detection of a potential CRC, the pass prints an
On Mon, May 27, 2024 at 8:29 AM wrote:
>
> From: Pan Li
>
> After we support one gassign form of the unsigned .SAT_ADD, we
> would like to support more forms including both the branch and
> branchless. There are 5 other forms of .SAT_ADD, list as below:
>
> Form 1:
> #define SAT_ADD_U_1(T)
On Wed, May 29, 2024 at 02:15:07PM +0200, Tobias Burnus wrote:
> + bool b;
> + hsa_status_t status;
> + status = hsa_fns.hsa_system_get_info_fn (
> + HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT, );
> + if (status != HSA_STATUS_SUCCESS)
> + GOMP_PLUGIN_error (
This patch depends (for the libgomp/target.c parts) on the patch
"[patch] libgomp: Enable USM for some nvptx devices",
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652987.html
AMD GPUs that are either APU devices or MI200 [or MI300X]
(with HSA_XNACK=1 set) can access host memory; the
On Mon, May 27, 2024 at 2:48 AM Andrew Pinski wrote:
>
> While looking into something else, I noticed that `a ^ CST` needed to be
> special-cased in bitwise_inverted_equal_p as it would simplify to `a ^ ~CST`
> for the bitwise not.
>
> Bootstrapped and tested on x86_64-linux-gnu with no
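The identity behind the quoted remark is that the bitwise not of `a ^ CST` equals `a ^ ~CST`. A minimal sketch follows; the two helper names are hypothetical and have nothing to do with the actual match.pd or bitwise_inverted_equal_p code.

```c
#include <stdint.h>

/* Bitwise not applied to the result of the XOR.  */
static uint32_t
not_of_xor (uint32_t a, uint32_t cst)
{
  return ~(a ^ cst);
}

/* The simplified form: XOR with the inverted constant.  Flipping every
   bit of the result is the same as flipping every bit of one operand.  */
static uint32_t
xor_with_not_cst (uint32_t a, uint32_t cst)
{
  return a ^ ~cst;
}
```

Since the two forms agree for every input, recognizing `a ^ CST` and `a ^ ~CST` as bitwise inverses of each other needs a case that looks through the constant.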
On Mon, May 27, 2024 at 2:47 AM Andrew Pinski wrote:
>
> While working on adding matching of negative expressions of `a - b`,
> I noticed that we started to have "duplicated" patterns due to not having
> a way to match maybe negative expressions. So I went back to what I did for
> bit_not and
From: Pan Li
This patch adds support for .SAT_SUB on unsigned integer vectors.
Given the example code below:
void
vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = (x[i] - y[i]) & (-(uint64_t)(x[i] >= y[i]));
}
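The per-element expression in the loop above can be read as a branchless saturating subtraction. The scalar sketch below compares it with an explicit-branch reference; both helper names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Branchless form from the loop body: the mask is all ones when
   x >= y and zero otherwise, so a wrapped-around difference is
   cleared to 0.  */
static uint64_t
sat_sub_u64 (uint64_t x, uint64_t y)
{
  return (x - y) & (-(uint64_t) (x >= y));
}

/* Reference form with an explicit branch, for comparison.  */
static uint64_t
sat_sub_u64_ref (uint64_t x, uint64_t y)
{
  return x >= y ? x - y : 0;
}
```

For example, sat_sub_u64 (3, 5) yields 0 rather than the wrapped value, and sat_sub_u64 (5, 3) yields 2.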
On Fri, May 24, 2024 at 9:29 AM liuhongt wrote:
>
> Update in V3:
> > Since this was about vectorization can you instead add a testcase to
> > gcc.dg/vect/ and check for
> > vectorization to happen?
> Move to vect/pr112325.c.
> >
> > I believe the if (unr_insn <= 0) check can go as well.
>
On Tue, May 28, 2024 at 8:20 AM Jeff Law wrote:
>
>
> On 5/24/24 2:42 AM, Mariam Arutunian wrote:
> > This patch adds a new compiler pass aimed at identifying naive CRC
> > implementations, characterized by the presence of a loop calculating a
> > CRC (polynomial long division).
> > Upon
On Wed, 29 May 2024, Richard Biener wrote:
> On Wed, 29 May 2024, Richard Sandiford wrote:
>
> > Richard Biener writes:
> > > Code generation for contiguous load vectorization can already deal
> > > with generalized avoidance of loading from a gap. The following
> > > extends detection of
The following arranges for the pre-SLP vectorization scalar cleanup
to be run when predictive commoning was applied to a loop in the
function. This is similar to the complete unroll situation and
facilitates SLP vectorization. Avoiding the SSA copies in predictive
commoning itself isn't easy
This fixes the link failure of the GNAT tools on 32-bit SPARC/Linux (as well
as on 32-bit PowerPC/Linux probably) coming from an incorrect binding to the
64-bit compare-and-exchange builtin.
Tested by Rainer on 32-bit SPARC/Linux, applied on mainline and 14 branch.
2024-05-29 Eric Botcazou
Much like AT_HWCAP is already provided in case the platform headers
don't have the value (yet).
libgcc/
* config/aarch64/cpuinfo.c: Provide AT_HWCAP2.
---
Observed as build failure with 14.1.0, so may want backporting there.
--- a/libgcc/config/aarch64/cpuinfo.c
+++
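The kind of fallback the message describes usually takes the shape below. This is a sketch only, assuming the Linux auxiliary-vector numbering where AT_HWCAP2 is 26; the helper hwcap2_key is hypothetical and the actual hunk is truncated above.

```c
/* Provide AT_HWCAP2 only when the platform headers do not define it.
   The value 26 is the Linux auxiliary-vector key (assumption: a Linux
   target, as for the existing AT_HWCAP fallback).  */
#ifndef AT_HWCAP2
#define AT_HWCAP2 26
#endif

/* Hypothetical helper returning the key that would be passed to an
   auxiliary-vector lookup.  */
static unsigned long
hwcap2_key (void)
{
  return AT_HWCAP2;
}
```

Guarding the definition with #ifndef keeps the header's own value when it exists, so the fallback only matters on older platform headers.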
On Wed, 29 May 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > Code generation for contiguous load vectorization can already deal
> > with generalized avoidance of loading from a gap. The following
> > extends detection of peeling for gaps requirement with that,
> > gets rid of the
On Thu, 23 May 2024, Hu, Lin1 wrote:
> gcc/ChangeLog:
>
> PR target/107432
> * tree-vect-generic.cc
> (supportable_indirect_narrowing_operation): New function for
> support indirect narrowing convert.
> (supportable_indirect_widening_operation): New function for
>
On Tue, 28 May 2024 at 21:55, François Dumont wrote:
>
> I can indeed restore _M_initialize_dispatch as it was before. It was not
> fixing my initial problem. I simply kept the code simplification.
>
> libstdc++: Use RAII to replace try/catch blocks
>
> Move _Guard into std::vector
Richard Biener writes:
> Code generation for contiguous load vectorization can already deal
> with generalized avoidance of loading from a gap. The following
> extends detection of peeling for gaps requirement with that,
> gets rid of the old special casing of a half load and makes sure
> when
Richard Biener writes:
> On Fri, 24 May 2024, Richard Biener wrote:
>
>> This is the second merge proposed from the SLP vectorizer branch.
>> I have again managed without adding and using --param vect-single-lane-slp
>> but instead this provides always enabled functionality.
>>
>> This makes us
Following Hongtao's suggestion, I added support for some truncations in
mmx.md under x86-64-v3 and optimized ix86_expand_trunc_with_avx2_noavx512f.
BRs,
Lin
gcc/ChangeLog:
PR 107432
* config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f):
New function to generate a
On Wed, May 29, 2024 at 4:56 PM Hu, Lin1 wrote:
>
> Apart from adding TARGET_MMX_WITH_SSE, I merged the two patterns.
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR target/107432
> * config/i386/mmx.md
> (VI2_32_64): New mode iterator.
> (mmxhalfmode): New mode attr.
> (mmxhalfmodelower):
Apart from adding TARGET_MMX_WITH_SSE, I merged the two patterns.
BRs,
Lin
gcc/ChangeLog:
PR target/107432
* config/i386/mmx.md
(VI2_32_64): New mode iterator.
(mmxhalfmode): New mode attr.
(mmxhalfmodelower): Ditto.
(truncv2hiv2qi2): Extend mode v4hi and change name from
On Thu, May 16, 2024 at 5:15 PM Hongyu Wang wrote:
>
> On Thu, May 16, 2024 at 15:05, Richard Biener wrote:
>
> >
> > On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote:
> > >
> > > Hi,
> > >
> > > In ix86_override_options_after_change, calls to ix86_default_align
> > > and ix86_recompute_optlev_based_flags
On Wed, May 29, 2024 at 10:39 AM Feng Xue OS
wrote:
>
> Ok. Then I will add a TODO comment on the "bbs" field to describe it.
Fine with me.
Thanks,
Richard.
> Thanks,
> Feng
>
>
>
> From: Richard Biener
> Sent: Wednesday, May 29, 2024 3:14 PM
> To: Feng
Ok. Then I will add a TODO comment on the "bbs" field to describe it.
Thanks,
Feng
From: Richard Biener
Sent: Wednesday, May 29, 2024 3:14 PM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] vect: Unify bbs in loop_vec_info and bb_vec_info
Hi Harald,
thanks for the review. Very much appreciated.
Committed as 2f97d98d174e3ef9f3a9a83c179d787abde5e066.
I have some patches for memory leaks that I will post in the next few days.
I am inclined to backport them together to 14-line, if no new bugs arise.
About the SAVE_EXPR, Richard Biener shed
On Wed, May 29, 2024 at 1:39 AM Patrick O'Neill wrote:
>
> From: Greg McGary
>
> gcc/ChangeLog:
> * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent
> divide-by-zero.
> * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove xfail.
> ---
>
On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote:
>
> The original patch causing the PR made ranger's cache re-entrant to
> enable SCEV to use the current range_query when called from within ranger.
>
> SCEV uses the currently active range query (via get_range_query()) for
> picking up