On Mon, Jul 3, 2023 at 5:39 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi Richi,
>
> Thanks for your review comments on this and some others!
>
> on 2023/6/30 19:37, Richard Biener wrote:
> > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <li...@linux.ibm.com> wrote:
> >>
> >> This patch series follows Richi's suggestion at the link [1],
> >> which suggests restructuring vectorizable_load so that the costing
> >> is placed next to the transform, in order to make it easier to keep
> >> the costing and the transform in sync.  For now, it's a known
> >> issue that what we cost can be inconsistent with what we
> >> transform, as shown by the case in PR82255 and some other
> >> associated test cases in the patches of this series.
> >>
> >> Basically this patch series makes the costing no longer call the
> >> function vect_model_load_cost.  To make review and bisection easy,
> >> I organized the changes according to the memory access types of the
> >> vector load.  For each memory access type, the costing first follows
> >> the handling in vect_model_load_cost to avoid missing anything, and
> >> is then refined further by referring to the transform code; I also
> >> checked the result with some typical test cases.  I hope the
> >> subjects of the patches are clear enough.
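> >>
> >> As a rough illustration of the structural change (very simplified
> >> pseudo code rather than the actual hunks; apart from
> >> vect_model_load_cost and costing_p, the names are only for
> >> illustration):
> >>
> >>   /* Before: analysis costs the load via a separate function, so the
> >>      costing can easily drift away from what the transform emits.  */
> >>   if (!doing_transform)
> >>     {
> >>       vect_model_load_cost (...);
> >>       return true;
> >>     }
> >>   /* ... big transform loop nest ...  */
> >>
> >>   /* After: the transform loop nest is walked for both phases, and
> >>      costs are recorded right where the statements would be built.  */
> >>   for (j = 0; j < ncopies; j++)
> >>     {
> >>       if (costing_p)
> >>         /* record the cost of what this copy would emit */;
> >>       else
> >>         /* emit the vector load statements for this copy */;
> >>     }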
> >>
> >> The whole series can be bootstrapped and regtested
> >> incrementally on:
> >>   - x86_64-redhat-linux
> >>   - aarch64-linux-gnu
> >>   - powerpc64-linux-gnu P7, P8 and P9
> >>   - powerpc64le-linux-gnu P8, P9 and P10
> >>
> >> Since the current vector test buckets are mainly tested without the
> >> cost model, I also verified that the whole patch series was neutral
> >> for SPEC2017 int/fp on Power9 at O2, O3 and Ofast separately.
> >
> > I went through the series now and I like it overall (well, I suggested
> > the change).
> > Looking at the changes I think we want some followup to reduce the
> > mess in the final loop nest.  We already have some VMAT_* cases handled
> > separately, maybe we can split out some more cases.  Maybe we should
>
> At first glance, the simple parts seem to be the handling of
> VMAT_LOAD_STORE_LANES and VMAT_GATHER_SCATTER (both the ifn and the
> emulated forms).  Splitting them out looks fairly straightforward if
> it's fine to duplicate the nested loop, though we may need to take care
> to remove some code that becomes useless.
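>
> Something like the following split, just as a rough sketch (the VMAT_*
> enumerators and costing_p are the real names, the rest is pseudo code):
>
>   if (memory_access_type == VMAT_LOAD_STORE_LANES)
>     {
>       /* Dedicated loop nest, with its own costing_p handling.  */
>     }
>   else if (memory_access_type == VMAT_GATHER_SCATTER)
>     {
>       /* Dedicated loop nest covering both the ifn and the emulated
>          forms.  */
>     }
>   else
>     {
>       /* The remaining cases, e.g. VMAT_CONTIGUOUS,
>          VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE, stay in
>          the current loop nest.  */
>     }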
>
> > bite the bullet and duplicate that loop nest for the different VMAT_* cases.
> > Maybe we can merge some of the if (!costing_p) checks by clever
> > re-ordering.
>
> I've tried a bit to merge them where possible, for example at the place
> that checks VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and
> VMAT_CONTIGUOUS_PERMUTE, but I'll keep this in mind for the following
> updates.
>
> > So what this series doesn't improve is overall readability of the code
> > (indent and our 80 char line limit).
>
> Sorry about that.
>
> >
> > The change also makes it more difficult(?) to separate analysis and
> > transform, though in the end I hope that analysis will actually "code
> > generate" to a (SLP) data structure so the target will have a chance
> > to see the actual flow of insns.
> >
> > That said, I'd like to hear from Richard whether he thinks this is a step
> > in the right direction.
> >
> > Are you willing to follow up with doing the same re-structuring to
> > vectorizable_store?
>
> Yes, vectorizable_store was also pointed out in your original suggestion
> [1]; I plan to update it once this series meets your expectations and
> gets landed.
>
> >
> > OK from my side with the few comments addressed.  The patch likely
> > needs a refresh after the RVV changes in this area?
>
> Thanks!  Yes, I've updated 2/9 and 3/9 according to your comments, and
> updated 5/9 and 9/9 as they had some conflicts when rebasing.
> Re-testing is ongoing; do the updated versions look good to you?  Is
> this series OK for trunk if all the test runs go well again as before?

Yes.

Thanks,
Richard.

> BR,
> Kewen
