Re: Fix optimize_mask_stores profile update

2023-07-24 Thread Richard Biener via Gcc-patches
On Fri, Jul 21, 2023 at 7:34 PM Jan Hubicka  wrote:
>
> > On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
> >  wrote:
> > >
> > > Hi,
> > > While looking into sphinx3 regression I noticed that vectorizer produces
> > > BBs with overall probability count 120%.  This patch fixes it.
> > > Richi, I don't know how to create a testcase, but having one would
> > > be nice.
> > >
> > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > > late email)
> >
> > This should trigger with sth like
> >
> >   for (i)
> > if (cond[i])
> >   out[i] = 1.;
> >
> > so a masked store and then using AVX2+.  ISTR we disable AVX masked
> > stores on zen (but not AVX512).
>
> Richard,
> if we know probability of if (cond[i]) to be p,
> then we know that the combined conditional is somewhere between
>   low = p  (the strategy packing true and falses into VF sized
> blocks)
> and
>   high = min (p*vf,1)
>(the stragegy doing only one true per block if possible)
> Likely value is
>
>   likely = 1-pow(1-p, vf)
>
> I wonder if we can work out p at least in common cases.
> Making store unlikely as we do right now will place it offline with
> extra jump.  Making it likely is better unless p is very small.
>
> I think if p is close to 0 or 1 which may be common case the analysis
> above may be useful. If range [low...high] is small, we can use likely
> and keep it as reliable.
> If it is high, we can probably just end up with guessed value close but
> above 50% so the store stays inline.

I'd say we want to keep the store inline in all cases (we likely lost
any explicit profile info during if-conversion), not sure what we gain
with providing a better "guess" here.  So I think we should simply
go with 'likely' derived from statistical independent events.

If in future we can tie if-conversion and vectorization even closer
we might be able to preserve profile data here.

Richard.

>
> Honza


Re: Fix optimize_mask_stores profile update

2023-07-21 Thread Jan Hubicka via Gcc-patches
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>  wrote:
> >
> > Hi,
> > While looking into sphinx3 regression I noticed that vectorizer produces
> > BBs with overall probability count 120%.  This patch fixes it.
> > Richi, I don't know how to create a testcase, but having one would
> > be nice.
> >
> > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > late email)
> 
> This should trigger with sth like
> 
>   for (i)
> if (cond[i])
>   out[i] = 1.;
> 
> so a masked store and then using AVX2+.  ISTR we disable AVX masked
> stores on zen (but not AVX512).

Richard,
if we know probability of if (cond[i]) to be p,
then we know that the combined conditional is somewhere between
  low = p  (the strategy packing true and falses into VF sized
blocks)
and
  high = min (p*vf,1)
   (the stragegy doing only one true per block if possible)
Likely value is

  likely = 1-pow(1-p, vf)

I wonder if we can work out p at least in common cases. 
Making store unlikely as we do right now will place it offline with
extra jump.  Making it likely is better unless p is very small.

I think if p is close to 0 or 1 which may be common case the analysis
above may be useful. If range [low...high] is small, we can use likely
and keep it as reliable.
If it is high, we can probably just end up with guessed value close but
above 50% so the store stays inline.

Honza


Re: Fix optimize_mask_stores profile update

2023-07-17 Thread Richard Biener via Gcc-patches



> Am 17.07.2023 um 14:38 schrieb Jan Hubicka :
> 
> 
>> 
>>> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>>>  wrote:
>>> 
>>> Hi,
>>> While looking into sphinx3 regression I noticed that vectorizer produces
>>> BBs with overall probability count 120%.  This patch fixes it.
>>> Richi, I don't know how to create a testcase, but having one would
>>> be nice.
>>> 
>>> Bootstrapped/regtested x86_64-linux, commited last night (sorry for
>>> late email)
>> 
>> This should trigger with sth like
>> 
>>  for (i)
>>if (cond[i])
>>  out[i] = 1.;
>> 
>> so a masked store and then using AVX2+.  ISTR we disable AVX masked
>> stores on zen (but not AVX512).
> 
> OK, let me see if I can get a testcase out of that.
>>>   efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>>>   /* Put STORE_BB to likely part.  */
>>>   efalse->probability = profile_probability::unlikely ();
>>> +  e->probability = efalse->probability.invert ();
>>>   store_bb->count = efalse->count ();
>> 
>> isn't the count also wrong?  Or rather efalse should be likely().   We're
>> testing doing
>> 
>>  if (!mask all zeros)
>>masked-store
>> 
>> because a masked store with all zero mask can end up invoking COW page fault
>> handling multiple times (because it doesn't actually write).
> 
> Hmm, I only fixed the profile, efalse was already set to unlikely, but
> indeed I think it should be likely. Maybe we can compute some bound on
> actual probability by knowing if(cond[i]) probability.
> If the loop always does factor many ones or zeros, the probability would
> remain the same.
> If that is p and they are all independent, the outcome would be
> (1-p)^factor
> 
> sp we know the conditoinal shoul dbe in ragne (1-p)^factor(1-p),
> right?

Yes.  I think the heuristic was added for
The case of bigger ranges with all 0/1 for
Purely random one wouldn’t expect all zeros ever in practice.  Maybe the 
probability was also set with that special case in mind (which is of course 
broken)

Richard 

> Honza
> 
>> 
>> Note -Ofast allows store data races and thus does RMW instead of a masked 
>> store.
>> 
>>>   make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>>>   if (dom_info_available_p (CDI_DOMINATORS))


Re: Fix optimize_mask_stores profile update

2023-07-17 Thread Jan Hubicka via Gcc-patches
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>  wrote:
> >
> > Hi,
> > While looking into sphinx3 regression I noticed that vectorizer produces
> > BBs with overall probability count 120%.  This patch fixes it.
> > Richi, I don't know how to create a testcase, but having one would
> > be nice.
> >
> > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > late email)
> 
> This should trigger with sth like
> 
>   for (i)
> if (cond[i])
>   out[i] = 1.;
> 
> so a masked store and then using AVX2+.  ISTR we disable AVX masked
> stores on zen (but not AVX512).

OK, let me see if I can get a testcase out of that.
> >efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
> >/* Put STORE_BB to likely part.  */
> >efalse->probability = profile_probability::unlikely ();
> > +  e->probability = efalse->probability.invert ();
> >store_bb->count = efalse->count ();
> 
> isn't the count also wrong?  Or rather efalse should be likely().   We're
> testing doing
> 
>   if (!mask all zeros)
> masked-store
> 
> because a masked store with all zero mask can end up invoking COW page fault
> handling multiple times (because it doesn't actually write).

Hmm, I only fixed the profile, efalse was already set to unlikely, but
indeed I think it should be likely. Maybe we can compute some bound on
actual probability by knowing if(cond[i]) probability.
If the loop always does factor many ones or zeros, the probability would
remain the same.
If that is p and they are all independent, the outcome would be
(1-p)^factor

sp we know the conditoinal shoul dbe in ragne (1-p)^factor(1-p),
right?

Honza

> 
> Note -Ofast allows store data races and thus does RMW instead of a masked 
> store.
> 
> >make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
> >if (dom_info_available_p (CDI_DOMINATORS))


Re: Fix optimize_mask_stores profile update

2023-07-17 Thread Richard Biener via Gcc-patches
On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
 wrote:
>
> Hi,
> While looking into sphinx3 regression I noticed that vectorizer produces
> BBs with overall probability count 120%.  This patch fixes it.
> Richi, I don't know how to create a testcase, but having one would
> be nice.
>
> Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> late email)

This should trigger with sth like

  for (i)
if (cond[i])
  out[i] = 1.;

so a masked store and then using AVX2+.  ISTR we disable AVX masked
stores on zen (but not AVX512).

Richard.

> gcc/ChangeLog:
>
> PR tree-optimization/110649
> * tree-vect-loop.cc (optimize_mask_stores): Set correctly
> probability of the if-then-else construct.
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 7d917bfd72c..b44fb9c7712 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
>efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>/* Put STORE_BB to likely part.  */
>efalse->probability = profile_probability::unlikely ();
> +  e->probability = efalse->probability.invert ();
>store_bb->count = efalse->count ();

isn't the count also wrong?  Or rather efalse should be likely().   We're
testing doing

  if (!mask all zeros)
masked-store

because a masked store with all zero mask can end up invoking COW page fault
handling multiple times (because it doesn't actually write).

Note -Ofast allows store data races and thus does RMW instead of a masked store.

>make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>if (dom_info_available_p (CDI_DOMINATORS))


Fix optimize_mask_stores profile update

2023-07-17 Thread Jan Hubicka via Gcc-patches
Hi,
While looking into sphinx3 regression I noticed that vectorizer produces
BBs with overall probability count 120%.  This patch fixes it.
Richi, I don't know how to create a testcase, but having one would
be nice.

Bootstrapped/regtested x86_64-linux, commited last night (sorry for
late email)

gcc/ChangeLog:

PR tree-optimization/110649
* tree-vect-loop.cc (optimize_mask_stores): Set correctly
probability of the if-then-else construct.

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7d917bfd72c..b44fb9c7712 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
   efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
   /* Put STORE_BB to likely part.  */
   efalse->probability = profile_probability::unlikely ();
+  e->probability = efalse->probability.invert ();
   store_bb->count = efalse->count ();
   make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
   if (dom_info_available_p (CDI_DOMINATORS))