Hi Richi, Thanks for your review comments on this and some others!
on 2023/6/30 19:37, Richard Biener wrote: > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <li...@linux.ibm.com> wrote: >> >> This patch series follows Richi's suggestion at the link [1], >> which suggest structuring vectorizable_load to make costing >> next to the transform, in order to make it easier to keep >> costing and the transform in sync. For now, it's a known >> issue that what we cost can be inconsistent with what we >> transform, as the case in PR82255 and some other associated >> test cases in the patches of this series show. >> >> Basically this patch series makes costing not call function >> vect_model_load_cost any more. To make the review and >> bisection easy, I organized the changes according to the >> memory access types of vector load. For each memory access >> type, firstly it follows the handlings in the function >> vect_model_load_costto avoid any missing, then refines >> further by referring to the transform code, I also checked >> them with some typical test cases to verify. Hope the >> subjects of patches are clear enough. >> >> The whole series can be bootstrapped and regtested >> incrementally on: >> - x86_64-redhat-linux >> - aarch64-linux-gnu >> - powerpc64-linux-gnu P7, P8 and P9 >> - powerpc64le-linux-gnu P8, P9 and P10 >> >> By considering the current vector test buckets are mainly >> tested without cost model, I also verified the whole patch >> series was neutral for SPEC2017 int/fp on Power9 at O2, >> O3 and Ofast separately. > > I went through the series now and I like it overall (well, I suggested > the change). > Looking at the changes I think we want some followup to reduce the > mess in the final loop nest. We already have some VMAT_* cases handled > separately, maybe we can split out some more cases. Maybe we should At first glance, the simple parts look to be the handlings for VMAT_LOAD_STORE_LANES, and VMAT_GATHER_SCATTER (with ifn and emulated). It seems a bit straightforward if it's fine to duplicate the nested loop, but may need to care about removing some useless code. > bite the bullet and duplicate that loop nest for the different VMAT_* cases. > Maybe we can merge some of the if (!costing_p) checks by clever > re-ordering. I've tried a bit to merge them if possible, like the place to check VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE. But will keep in mind for the following updates. > So what > this series doesn't improve is overall readability of the code (indent and our > 80 char line limit). Sorry about that. > > The change also makes it more difficult(?) to separate analysis and transform > though in the end I hope that analysis will actually "code generate" to a > (SLP) > data structure so the target will have a chance to see the actual flow of > insns. > > That said, I'd like to hear from Richard whether he thinks this is a step > in the right direction. > > Are you willing to followup with doing the same re-structuring to > vectorizable_store? Yes, vectorizable_store was also pointed out in your original suggestion [1], I planned to update this once this series meets your expectations and gets landed. > > OK from my side with the few comments addressed. The patch likely needs > refresh > after the RVV changes in this area? Thanks! Yes, I've updated 2/9 and 3/9 according to your comments, and updated 5/9 and 9/9 as they had some conflicts when rebasing. Re-testing is ongoing, do the updated versions look good to you? Is this series ok for trunk if all the test runs go well again as before? BR, Kewen