[PATCH] adjust vectorization expectations for ppc costmodel 76b
Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html

This test expects vectorization at power8+ because strict alignment is
not required for vectors.  For power7, vectorization is not to take
place because it's not deemed profitable: 12 iterations would be
required to make it so.

But for power6 and below, the test's 10 iterations are enough to make
vectorization profitable, but the test doesn't expect this.  Assuming
the decision is indeed appropriate, I'm adjusting the expectations.


for gcc/testsuite/ChangeLog

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
	expectations for cpus below power7.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..0dab2c08acdb4 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -46,9 +46,10 @@ int main (void)
   return 0;
 }
 
-/* Peeling to align the store is used. Overhead of peeling is too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
-/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
+/* Peeling to align the store is used. Overhead of peeling is too high
+   for power7, but acceptable for earlier architectures.  */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
 
 /* Versioning to align the store is used. Overhead of versioning is not too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive
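
[Editorial sketch] To make the patched selectors easier to read, here is a minimal Python rendering of the three dg-final target expressions from the diff in this message. The `Target` record and the function names are hypothetical, introduced only for illustration. The real evaluation is done by the DejaGnu testsuite machinery, not by code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical record of the effective-target results the test consults."""
    has_arch_pwr7: bool
    vector_alignment_reachable: bool
    vect_no_align: bool
    vect_hw_misalign: bool


def expects_no_vectorization(t: Target) -> bool:
    # has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} }
    return t.has_arch_pwr7 and (t.vector_alignment_reachable and not t.vect_no_align)


def expects_not_profitable_message(t: Target) -> bool:
    # has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} }
    return t.has_arch_pwr7 and (t.vector_alignment_reachable and not t.vect_hw_misalign)


def expects_one_vectorized_loop(t: Target) -> bool:
    # vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } }
    return t.vect_no_align or (not t.vector_alignment_reachable) or (not t.has_arch_pwr7)
```

Written this way, the first and third selectors are exact complements of each other, so under the proposed patch every target falls under exactly one of the two "vectorized 1 loops" expectations. A pre-power7 target with reachable alignment, for instance, lands in `expects_one_vectorized_loop`.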
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
Hi,

on 2024/4/22 17:28, Alexandre Oliva wrote:
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html
>
>
> This test expects vectorization at power8+ because strict alignment is
> not required for vectors.  For power7, vectorization is not to take
> place because it's not deemed profitable: 12 iterations would be
> required to make it so.
>
> But for power6 and below, the test's 10 iterations are enough to make
> vectorization profitable, but the test doesn't expect this.  Assuming
> the decision is indeed appropriate, I'm adjusting the expectations.

For the record, the cost difference between power6 and power7 is the
cost of vec_perm:

* p6 *

ic[i_23] 2 times vector_stmt costs 2 in prologue
ic[i_23] 1 times vector_stmt costs 1 in prologue
ic[i_23] 1 times vector_load costs 2 in body
ic[i_23] 1 times vec_perm costs 1 in body

vs.

* p7 *

ic[i_23] 2 times vector_stmt costs 2 in prologue
ic[i_23] 1 times vector_stmt costs 1 in prologue
ic[i_23] 1 times vector_load costs 2 in body
ic[i_23] 1 times vec_perm costs 3 in body

which in turn causes the difference in the minimum number of iterations
for profitability.

>
>
> for gcc/testsuite/ChangeLog
>
> 	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
> 	expectations for cpus below power7.
> ---
>  .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..0dab2c08acdb4 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -46,9 +46,10 @@ int main (void)
>    return 0;
>  }
>
> -/* Peeling to align the store is used. Overhead of peeling is too high. */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
> +/* Peeling to align the store is used. Overhead of peeling is too high
> +   for power7, but acceptable for earlier architectures.  */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
>
>  /* Versioning to align the store is used. Overhead of versioning is not too high. */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

For !has_arch_pwr7 case, it still adopts peeling but as the comment (one
line above) shows the original intention of this case is to expect not
profitable for peeling so it's not expected to be handled here, can we
just tweak the loop bound instead, such as:

-#define N 14
+#define N 13
 #define OFF 4

?, it can make this loop not profitable to be vectorized for
!vect_no_align with peeling (both pwr7 and pwr6) and keep consistent.

BR,
Kewen
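
[Editorial sketch] The effect of the suggested loop-bound change can be checked with simple arithmetic. The sketch below assumes the loop in question runs from OFF to N, i.e. N - OFF iterations, which matches the 10 iterations quoted for N = 14 and OFF = 4. It uses the minimum profitable iteration counts stated in this thread (10 for power6, 12 for power7). The function names are hypothetical, and treating "profitable" as "trip count >= threshold" is a simplification of the vectorizer's cost model, not its actual implementation.

```python
# Thresholds quoted in this thread: minimum iterations for vectorization
# with peeling to be considered profitable.
MIN_PROFITABLE_ITERS = {"power6": 10, "power7": 12}


def iterations(n: int, off: int) -> int:
    # Assumption: the loop runs for i = OFF .. N-1.
    return n - off


def profitable(cpu: str, n: int, off: int = 4) -> bool:
    # Simplified model: profitable once the trip count reaches the threshold.
    return iterations(n, off) >= MIN_PROFITABLE_ITERS[cpu]


for n in (14, 13):
    verdicts = {cpu: profitable(cpu, n) for cpu in MIN_PROFITABLE_ITERS}
    print(f"N={n}: {iterations(n, 4)} iterations -> {verdicts}")
```

Under these assumptions, N = 14 gives 10 iterations, which reaches the power6 threshold but not the power7 one. N = 13 gives 9 iterations, below both thresholds, which is the consistency the suggestion is after.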
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On Apr 24, 2024, "Kewen.Lin" wrote:

> For !has_arch_pwr7 case, it still adopts peeling but as the comment
> (one line above) shows the original intention of this case is to
> expect not profitable for peeling so it's not expected to be handled
> here, can we just tweak the loop bound instead, such as:

> -#define N 14
> +#define N 13
>  #define OFF 4

> ?, it can make this loop not profitable to be vectorized for
> !vect_no_align with peeling (both pwr7 and pwr6) and keep consistent.

Like this?  I didn't feel I could claim authorship of this one-liner
just because I turned it into a patch and tested it, so I took the
liberty of turning your own words above into the commit message.  So
far, tested on ppc64le-linux-gnu (ppc9).  Testing with vxworks targets
now.  Would you like to tweak the commit message to your liking?
Otherwise, is this ok to install?

Thanks,


adjust iteration count for ppc costmodel 76b

From: Kewen Lin

The original intention of this case is to expect not profitable for
peeling.  Tweak the loop bound to make this loop not profitable to be
vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
keep consistent.


for gcc/testsuite/ChangeLog

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..e48b0ab759e75 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -6,7 +6,7 @@
 
 /* On Power7 without misalign vector support, this case is to check it's not
    profitable to perform vectorization by peeling to align the store. */
-#define N 14
+#define N 13
 #define OFF 4
 
 /* Check handling of accesses for which the "initial condition" -
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
Hi,

on 2024/4/28 16:14, Alexandre Oliva wrote:
> On Apr 24, 2024, "Kewen.Lin" wrote:
>
>> For !has_arch_pwr7 case, it still adopts peeling but as the comment
>> (one line above) shows the original intention of this case is to
>> expect not profitable for peeling so it's not expected to be handled
>> here, can we just tweak the loop bound instead, such as:
>
>> -#define N 14
>> +#define N 13
>>  #define OFF 4
>
>> ?, it can make this loop not profitable to be vectorized for
>> !vect_no_align with peeling (both pwr7 and pwr6) and keep consistent.
>
> Like this?  I didn't feel I could claim authorship of this one-liner
> just because I turned it into a patch and tested it, so I took the
> liberty of turning your own words above into the commit message.  So

Feel free to do so!

> far, tested on ppc64le-linux-gnu (ppc9).  Testing with vxworks targets
> now.  Would you like to tweak the commit message to your liking?

OK, tweaked as below.

> Otherwise, is this ok to install?
>
> Thanks,
>
>
> adjust iteration count for ppc costmodel 76b

Nit: Maybe add a prefix "testsuite: ".

>
> From: Kewen Lin

Thanks, you can just drop this. :)

>
> The original intention of this case is to expect not profitable for
> peeling.  Tweak the loop bound to make this loop not profitable to be
> vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
> keep consistent.

For hardware that doesn't support unaligned vector memory access,
test case costmodel-vect-76b.c expects the cost model to decide that
peeling is not profitable, according to the commit history, the test
case comments and the way it checks.  The existing loop bound 14 works
well for Power7, but not for targets on which the cost of vec_perm
differs from Power7's, such as Power6 (3 on Power7 vs. 1 on Power6).
This in turn changes the minimum iteration count for profitability
(12 on Power7 vs. 10 on Power6) and causes the failure.  To keep the
original test point, this patch tweaks the loop bound to ensure that
vectorization with peeling is not profitable for !vect_no_align.

OK for trunk (assuming the testing runs well on p6/p7 too), thanks!

BR,
Kewen

>
>
> for gcc/testsuite/ChangeLog
>
> 	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
> ---
>  .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..e48b0ab759e75 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -6,7 +6,7 @@
>
>  /* On Power7 without misalign vector support, this case is to check it's not
>     profitable to perform vectorization by peeling to align the store. */
> -#define N 14
> +#define N 13
>  #define OFF 4
>
>  /* Check handling of accesses for which the "initial condition" -
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On Apr 28, 2024, "Kewen.Lin" wrote:

> Nit: Maybe add a prefix "testsuite: ".

ACK

>> From: Kewen Lin

> Thanks, you can just drop this. :)

I've turned it into Co-Authored-By, since you insist.

But unfortunately with the patch it still fails when testing for
-mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13
iterations.  We need 16 iterations, as in an earlier version of this
test, for it to pass for -mcpu=power7, but then it doesn't pass for
-mcpu=power6.

It looks like we're going to have to adjust the expectations.
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
on 2024/4/29 14:28, Alexandre Oliva wrote:
> On Apr 28, 2024, "Kewen.Lin" wrote:
>
>> Nit: Maybe add a prefix "testsuite: ".
>
> ACK
>
>>> From: Kewen Lin
>
>> Thanks, you can just drop this. :)
>
> I've turned it into Co-Authored-By, since you insist.
>
> But unfortunately with the patch it still fails when testing for
> -mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13
> iterations.  We need 16 iterations, as in an earlier version of this
> test, for it to pass for -mcpu=power7, but then it doesn't pass for
> -mcpu=power6.
>
> It looks like we're going to have to adjust the expectations.
>

I had a look at the failure; it's because "vect_no_align" unexpectedly
evaluates as true:

"selector_expression: ` vect_no_align || {! vector_alignment_reachable} ' 1"

Currently powerpc* checks check_p8vector_hw_available.  ppc64le-linux-gnu
has at least Power8 support (that is, the testing machine supports
p8vector runs), so it concludes that vect_no_align is true.

proc check_effective_target_vect_no_align { } {
    return [check_cached_effective_target_indexed vect_no_align {
      expr { [istarget mipsisa64*-*-*]
	     || [istarget mips-sde-elf]
	     || [istarget sparc*-*-*]
	     || [istarget ia64-*-*]
	     || [check_effective_target_arm_vect_no_misalign]
	     || ([istarget powerpc*-*-*] && [check_p8vector_hw_available])

I'll fix this in PR113535, which was filed previously for revisiting the
powerpc-specific checks in these vect* effective targets.

If the testing just goes with the native cpu type, this issue becomes
invisible.  I think you can still push the patch, as the testing just
exposes another issue.

BR,
Kewen
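
[Editorial sketch] The powerpc clause quoted in this message can be paraphrased in a few lines. This is a hypothetical Python rendering of just that one clause and of the selector from the log line, not the DejaGnu implementation. `p8vector_hw_available` stands in for `check_p8vector_hw_available`, the triplet match uses `fnmatch`, and the canonical triplet in the usage note is an assumption. The point it illustrates is that the result depends on the test machine's hardware and not on the -mcpu option under test.

```python
from fnmatch import fnmatch


def powerpc_vect_no_align(target_triplet: str, p8vector_hw_available: bool) -> bool:
    # Mirrors: [istarget powerpc*-*-*] && [check_p8vector_hw_available]
    # Note that no -mcpu= setting is an input here.
    return fnmatch(target_triplet, "powerpc*-*-*") and p8vector_hw_available


def versioning_selector(vect_no_align: bool, vector_alignment_reachable: bool) -> bool:
    # Mirrors: vect_no_align || {! vector_alignment_reachable}
    return vect_no_align or not vector_alignment_reachable
```

For example, `powerpc_vect_no_align("powerpc64le-unknown-linux-gnu", True)` is true whatever -mcpu value the test run passes, and `versioning_selector(True, True)` then evaluates to true, matching the `1` in the selector_expression log line.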
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On Apr 29, 2024, "Kewen.Lin" wrote:

> I think you can still push the patch as the testing just exposes
> another issue.

ACK, thanks, I've just confirmed that the problem I reported on
ppc64le-linux-gnu didn't come up when testing on ppc64-vx7r2 with a
non-power8 emulated cpu, so I'm going to install it.
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On May 23, 2024, Alexandre Oliva wrote:

> On Apr 29, 2024, "Kewen.Lin" wrote:
>> I think you can still push the patch as the testing just exposes
>> another issue.

> ACK, thanks, I've just confirmed that the problem I reported on
> ppc64le-linux-gnu didn't come up when testing on ppc64-vx7r2 with a
> non-power8 emulated cpu, so I'm going to install it.

I see I hadn't posted the latest version of the patch, with the updated
attribution and commit message.  Here it is.  I'm checking it in.


testsuite: adjust iteration count for ppc costmodel 76b

From: Alexandre Oliva

For hardware that doesn't support unaligned vector memory access,
test case costmodel-vect-76b.c expects the cost model to decide that
peeling is not profitable, according to the commit history, the test
case comments and the way it checks.  The existing loop bound 14 works
well for Power7, but not for targets on which the cost of vec_perm
differs from Power7's, such as Power6 (3 on Power7 vs. 1 on Power6).
This in turn changes the minimum iteration count for profitability
(12 on Power7 vs. 10 on Power6) and causes the failure.  To keep the
original test point, this patch tweaks the loop bound to ensure that
vectorization with peeling is not profitable for !vect_no_align.

Co-Authored-By: Kewen Lin


for gcc/testsuite/ChangeLog

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..e48b0ab759e75 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -6,7 +6,7 @@
 
 /* On Power7 without misalign vector support, this case is to check it's not
    profitable to perform vectorization by peeling to align the store. */
-#define N 14
+#define N 13
 #define OFF 4
 
 /* Check handling of accesses for which the "initial condition" -