> -----Original Message----- > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On > Behalf Of Richard Biener > Sent: Thursday, November 5, 2020 10:17 AM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late > > On Wed, 4 Nov 2020, Tamar Christina wrote: > > > Hi Richi, > > > > > -----Original Message----- > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On > > > Behalf Of Richard Biener > > > Sent: Wednesday, November 4, 2020 8:07 AM > > > To: Tamar Christina <tamar.christ...@arm.com> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late > > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote: > > > > > > > Hi Richi, > > > > > > > > We decided to take the regression in any code-gen this could give > > > > and fix it properly next stage-1. As such here's a new patch > > > > based on your previous feedback. > > > > > > > > Ok for master? > > > > > > Looks good sofar but be aware that you elide the > > > > > > - && vect_store_lanes_supported > > > - (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size, > > > false)) > > > > > > part of the check - that is, you don't verify the store part of the > > > instance can use store-lanes. Btw, this means the original code > > > cancelled an instance only when the SLP graph entry is a store-lane > > > capable store but your variant would also cancel in case there's a load- > lane capable reduction. > > > > > > > I do still have it, > > > > if (loads_permuted > > && vect_store_lanes_supported (vectype, group_size, false)) > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which > > should be the store if one exists. > > > > > I think that you eventually want to re-instantiate the store-lane > > > check but treat it the same as any of the load checks (thus not > > > require all instances to be stores for the cancellation). > > > But at least when a store cannot use store-lanes we probably > > > shouldn't cancel the SLP. > > > > I did however elide the kind check, that was added as part of the > > rebase, it looked like kind wasn't Being stored inside the SLP instance and > I'd have to redo the analysis to find it. > > > > Does it does reasonable to include kind as a field in the SLP instance? > > > > > > > > Anyway, the patch is OK for master. The store-lane check part can > > > be re- added as followup. > > > > > > > Thanks! Will do. > > Btw, the patch regressed > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects scan-tree-dump-times vect > "vectorizing stmts using SLP" 1 > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts > using SLP" 1 > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects scan-tree-dump vect > "Built SLP cancelled: can use load/store-lanes" > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects scan-tree-dump vect > "LOAD_LANES" > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects scan-tree-dump vect > "STORE_LANES" > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled: > can use load/store-lanes" > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES" > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES" > > on x86_64. The slp-11b.c testcase is interesting since there extract_muldiv > folding makes the group of four stores not matching so we split into a size of > 3 and one remaining store. > This causes us to arrive at > > note: node 0x441a940 (max_nunits=4, refcnt=2) > note: stmt 0 _2 = in[_1]; > note: stmt 1 _6 = in[_5]; > note: stmt 2 _10 = in[_8]; > note: load permutation { 0 2 1 } > > which on x86_64 we in the end cannot handle (without SSE4 I think) so it fails > to SLP there. Guess arm can do the permute but not the load-lane here. > > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done > for !vect_load_lanes targets. Not sure if that's possible easily, like with a > { target vect_load_lanes } { xfail vect_load_lanes } combo ...? I suggest to > make it xfail everywhere instead and add a comment as to we're expecting > those only for vect_load_lanes targets.
Yes just fixed these, the change in gcc.dg/vect/slp-11b.c shouldn't be there and I updated the target selector properly :) Cheers, Tamar > > Richard. > > > > Thanks, > > > Richard. > > > > > > > Thanks, > > > > Tamar > > > > > > > > gcc/ChangeLog: > > > > > > > > * tree-vect-slp.c (vect_analyze_slp_instance): Moved load/store > > > lanes > > > > check to ... > > > > * tree-vect-loop.c (vect_analyze_loop_2): ..Here > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.dg/vect/slp-11b.c: Update output scan. > > > > * gcc.dg/vect/slp-perm-6.c: Likewise. > > > > > > > > > -----Original Message----- > > > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> > > > > > On Behalf Of Richard Biener > > > > > Sent: Thursday, October 22, 2020 9:44 AM > > > > > To: Tamar Christina <tamar.christ...@arm.com> > > > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late > > > > > > > > > > On Wed, 21 Oct 2020, Tamar Christina wrote: > > > > > > > > > > > Hi All, > > > > > > > > > > > > This moves the code that checks for load/store lanes further > > > > > > in the pipeline and places it after slp_optimize. This would > > > > > > allow us to perform optimizations on the SLP tree and only > > > > > > bail out if we really have a > > > > > permute. > > > > > > > > > > > > With this change it allows us to handle permutes such as > > > > > > {1,1,1,1} which should be handled by a load and replicate. > > > > > > > > > > > > This change however makes it all or nothing. Either all > > > > > > instances can be handled or none at all. This is why some of > > > > > > the test cases have been > > > > > adjusted. > > > > > > > > > > So this possibly leaves a loop unvectorized in case there's a > > > > > ldN/stN opportunity but another SLP instance with a permutation > > > > > not handled by interleaving is present. What I was originally > > > > > suggesting is to only cancel the SLP build if _all_ instances > > > > > can be handled > > > with ldN/stN. > > > > > > > > > > Of course I'm also happy with completely removing this heuristics. > > > > > > > > > > Note some of the comments look off now, also the assignment to > > > > > ok before the goto is pointless and you should probably turn > > > > > this into a dump print instead. > > > > > > > > > > Thanks, > > > > > Richard. > > > > > > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, > > > > > > -x86_64-pc-linux-gnu and no issues. > > > > > > > > > > > > Ok for master? > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > Tamar > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > * tree-vect-slp.c (vect_analyze_slp_instance): Moved > > > > > > load/store > > > > > lanes > > > > > > check to ... > > > > > > * tree-vect-loop.c (vect_analyze_loop_2): ..Here > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > * gcc.dg/vect/slp-11b.c: Update output scan. > > > > > > * gcc.dg/vect/slp-perm-6.c: Likewise. > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: > > > > > Felix Imend > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 > > > Nuernberg, Germany; GF: Felix Imend > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 > Nuernberg, Germany; GF: Felix Imend