[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
--- Comment #7 from GCC Commits ---
The master branch has been updated by Richard Biener:
https://gcc.gnu.org/g:941df4bc7ac83574374a3623ac11b8685c2282cb
commit r16-5898-g941df4bc7ac83574374a3623ac11b8685c2282cb
Author: Richard Biener
Date: Fri Dec 5 09:04:02 2025 +0100
Make gcc.dg/vect/vect-simd-clone-24.c more robust
When -march=cascadelake is added we get 256bit vectorization by default
but there's no OMP SIMD ABI for this case with inbranch. So add
-mprefer-vector-width=512 to the testcase.
PR tree-optimization/122776
* gcc.dg/vect/vect-simd-clone-24.c: Add -mprefer-vector-width=512.
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
Richard Biener changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Target Milestone|--- |16.0
Resolution|--- |FIXED
--- Comment #6 from Richard Biener ---
Fixed in GCC 16.
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
--- Comment #5 from GCC Commits ---
The master branch has been updated by Richard Biener:
https://gcc.gnu.org/g:cea34ac07e3bd74086f854dd43afb29b4940d8de
commit r16-5888-gcea34ac07e3bd74086f854dd43afb29b4940d8de
Author: Richard Biener
Date: Sun Nov 23 14:01:03 2025 +0100
Select both inbranch and notinbranch clone during SIMD call analysis
The following records both a possibly notinbranch and an inbranch
SIMD clone during analysis so that we can properly handle the
late decision on loop masking. Recording of linear-clause data
from analysis is extended to cover linear-clause arguments from
both clones.
This also fixes AVX512 masked loop code generation in line with
the previous fixes.
PR tree-optimization/122776
* tree-vectorizer.h (vect_simd_clone_data::clone,
vect_simd_clone_data::clone_inbranch): New fields for
the two selected clones.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Record
both a possibly notinbranch and an inbranch clone. Delay
the choice between both to code generation based on
LOOP_VINFO_FULLY_MASKED_P.
* gcc.dg/vect/vect-simd-clone-24.c: New testcase.
* gcc.dg/gomp/pr110485.c: Adjust.
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
Richard Biener changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #4 from Richard Biener ---
I'll see if this works out next week.
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
--- Comment #3 from Richard Biener ---
(In reply to Jakub Jelinek from comment #2)
> (In reply to Richard Biener from comment #1)
> > Not only due to this check the following testcase fails to use fully masking
> > or a masked epilog. I'm not sure how common it would be to have both inbranch
> > and notinbranch variants, I did not check if there's a syntax to get both
> > created for a function definition.
> >
> > #pragma omp declare simd simdlen(16) inbranch
> > int __attribute__((const)) baz (int x);
> > #pragma omp declare simd simdlen(16) notinbranch
> > int __attribute__((const)) baz (int x);
>
> Note, just
> #pragma omp declare simd simdlen(16)
> int __attribute__((const)) baz (int x);
> should create both inbranch and notinbranch clones, or you can use
> #pragma omp declare simd simdlen(16) inbranch
> #pragma omp declare simd simdlen(16) notinbranch
> int __attribute__((const)) baz (int x);
> too.
>
> I believe the #c0 mentioned hunk is from Andre, and if
> LOOP_VINFO_FULLY_MASKED_P isn't computed until late, I think we need some
> alternative for that.
I think we need to compute both an inbranch and a notinbranch "optimal" SIMD
clone. We only know after vectorizable_simd_clone_call analysis whether we
will actually use a masked loop or not. If we don't, we can code-generate
for the notinbranch variant if that's available.
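The idea in comment #3 (remember both candidates during analysis, pick one once the masking decision is known) can be illustrated with a small stand-alone model. This is only a sketch and not the GCC implementation: the `toy_*` types and functions are hypothetical, and only the two slot names `clone` and `clone_inbranch` echo the fields named in the later ChangeLog.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a SIMD clone; hypothetical, not GCC's data structure.  */
struct toy_clone
{
  const char *name;
  bool inbranch;
};

/* Two slots, named after the fields in the ChangeLog; the rest of this
   model is an assumption made for illustration.  */
struct toy_simd_clone_data
{
  const struct toy_clone *clone;          /* best clone, notinbranch if any */
  const struct toy_clone *clone_inbranch; /* best inbranch clone, or NULL */
};

/* "Analysis": the masking decision is not known yet, so record both.  */
static struct toy_simd_clone_data
toy_analyze (const struct toy_clone *clones, size_t n)
{
  struct toy_simd_clone_data data = { NULL, NULL };
  for (size_t i = 0; i < n; ++i)
    {
      if (clones[i].inbranch && !data.clone_inbranch)
        data.clone_inbranch = &clones[i];
      /* The general slot prefers a notinbranch clone when one exists.  */
      if (!data.clone || (data.clone->inbranch && !clones[i].inbranch))
        data.clone = &clones[i];
    }
  return data;
}

/* "Code generation": masking is known now, so choose.  Returns NULL when
   the loop is fully masked but no inbranch clone was recorded.  */
static const struct toy_clone *
toy_select (const struct toy_simd_clone_data *data, bool fully_masked)
{
  return fully_masked ? data->clone_inbranch : data->clone;
}
```

In this model a fully masked loop always takes the inbranch slot, while an unmasked loop takes the notinbranch clone when one was found and otherwise falls back to whatever the general slot holds.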
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
Jakub Jelinek changed:
What|Removed |Added
CC||avieira at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek ---
(In reply to Richard Biener from comment #1)
> Not only due to this check the following testcase fails to use fully masking
> or a masked epilog. I'm not sure how common it would be to have both inbranch
> and notinbranch variants, I did not check if there's a syntax to get both
> created for a function definition.
>
> #pragma omp declare simd simdlen(16) inbranch
> int __attribute__((const)) baz (int x);
> #pragma omp declare simd simdlen(16) notinbranch
> int __attribute__((const)) baz (int x);
Note, just
#pragma omp declare simd simdlen(16)
int __attribute__((const)) baz (int x);
should create both inbranch and notinbranch clones, or you can use
#pragma omp declare simd simdlen(16) inbranch
#pragma omp declare simd simdlen(16) notinbranch
int __attribute__((const)) baz (int x);
too.
I believe the #c0 mentioned hunk is from Andre, and if
LOOP_VINFO_FULLY_MASKED_P isn't computed until late, I think we need some
alternative for that.
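For anyone who wants to try the clause-free form from comment #2 on a function definition, a minimal self-contained variant might look like the following. The body of `baz` is an assumption added here so the example links and runs; the bug report only declares it. Per comment #2, building with -fopenmp or -fopenmp-simd should then produce both clone kinds; without those options GCC ignores the pragma and the loop stays scalar.

```c
/* Hypothetical definition of baz, not part of the bug report.  With no
   inbranch/notinbranch clause, both clone kinds should be created.  */
#pragma omp declare simd simdlen(16)
int __attribute__((const)) baz (int x)
{
  return x & 1;
}

int a[1024];

/* Same loop shape as the testcase in comment #1: one call feeds the
   condition, the other call is executed under that condition.  */
void foo (int n, int * __restrict b)
{
  for (int i = 0; i < n; ++i)
    if (baz (a[i]))
      b[i] = baz (b[i]);
}
```

The call in the condition runs on every iteration, while the call in the body runs only when the condition holds.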
[Bug tree-optimization/122776] vectorizable_simd_clone_call looks at LOOP_VINFO_FULLY_MASKED_P during analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776
Richard Biener changed:
What|Removed |Added
Last reconfirmed||2025-11-21
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
CC||jakub at gcc dot gnu.org
--- Comment #1 from Richard Biener ---
Not only due to this check the following testcase fails to use fully masking or
a masked epilog. I'm not sure how common it would be to have both inbranch
and notinbranch variants, I did not check if there's a syntax to get both
created for a function definition.
#pragma omp declare simd simdlen(16) inbranch
int __attribute__((const)) baz (int x);
#pragma omp declare simd simdlen(16) notinbranch
int __attribute__((const)) baz (int x);
int a[1024];
void foo (int n, int * __restrict b)
{
  for (int i = 0; i < n; ++i)
    if (baz (a[i]))
      b[i] = baz (b[i]);
}
