On Tue, 31 May 2016, Yuri Rumyantsev wrote:

> Richard,
> 
> I built compiler with your patch and did not find out any issues with
> vectorization of loops marked with pragma simd. I also noticed that
> the size of the vectorized loop looks smaller (I can't tell you exact
> numbers since the fresh compiler performs fool unroll even if
> "-funroll-loops" option was not passed).

Thanks for checking - and yes, I expect us to generate better code
as we can't always recover from the un-CSE ifcvt did.

I have installed the patch now, watching for fallout on other archs
(I have installed it w/o re-instantiating bool patterns on x86).

Richard.

> 2016-05-30 15:55 GMT+03:00 Ilya Enkovich <enkovich....@gmail.com>:
> > 2016-05-30 14:04 GMT+03:00 Richard Biener <rguent...@suse.de>:
> >>
> >> The following patch removes the restriction on seeing a tree of stmts
> >> in vectorizer bool pattern detection (aka single-use).  With this
> >> it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern
> >> and that compile-time hog can go (it's now enabled unconditionally for GCC 
> >> 7).
> >>
> >> Instead the pattern detection code will now "unshare" the condition tree
> >> for each bool pattern root my means of adding all pattern stmts of the
> >> condition tree to its pattern def sequence (so we still get some
> >> unnecessary copying, worst-case quadratic rather than exponential).
> >>
> >> Ilja - I had to disable the
> >>
> >>           tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE
> >> (rhs1));
> >>           if (mask_type
> >>               && expand_vec_cmp_expr_p (comp_vectype, mask_type))
> >>             return false;
> >>
> >> check you added to check_bool_pattern to get any coverage for bool
> >> patterns on x86_64.  Doing that regresses
> >>
> >> FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1
> >> loops" 10
> >> FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized
> >> 1 loops" 10
> >> FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1
> >> loops" 2
> >>
> >> so somehow bool patterns mess up things here (I didn't investigate).
> >> The final patch will enable the above path again, avoiding the regression.
> >
> > Mask conversion patterns handle some cases bool patterns don't.  So it's
> > expected we can't vectorize some loops if bool patterns are enforced.
> >
> > Thanks,
> > Ilya
> >
> >>
> >> Yuri - I suppose you have a larger set of testcases using OMP simd
> >> or other forced vectorization you added ifcvt_repair_bool_pattern for.
> >> I'd appreciate testing (and testcases if anything fails unexpectedly).
> >>
> >> Testing on other targets is of course appreciated as well.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0).
> >>
> >> Comments?
> >>
> >> I agree with Ilya elsewhere to remove bool patterns completely
> >> (as a first step making them apply to individual stmts).  We do
> >> currently not support mixed cond/non-cond uses anyway, like
> >>
> >> _Bool a[64];
> >> unsigned char b[64];
> >>
> >> void foo (void)
> >> {
> >>   for (int i = 0; i < 64; ++i)
> >>     {
> >>       _Bool x = a[i] && b[i] < 10;
> >>       a[i] = x;
> >>     }
> >> }
> >>
> >> and stmt-local "patterns" can be added when vectorizing the stmt
> >> (like in the above case a tem = a[i] != 0 ? -1 : 0).  Doing
> >> bool "promotion" optimally requires a better vectorizer IL
> >> (similar to placing of shuffles).
> >>
> >> Thanks,
> >> Richard.
> >>
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reply via email to