Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

Sergey Ostanevich Fri, 15 Nov 2013 06:07:17 -0800

Richard,

here's an example that triggers the cost model. As soon as elemental
functions appear and we update the vectorizer so that it can accept an
elemental function inside the loop, we will have the same situation as we
have now: the cost model will bail out on the profitability estimate.
We still have no way to find out how efficient the bar() function is when
it is in vector form.
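
A minimal sketch of that situation, assuming the vectorizer already accepts
an elemental function inside the loop. The element type, the body of bar()
and the name sum_bar() are made up for illustration; in the real case bar()
is only declared, so its vector cost is invisible to the caller:

```c
/* Hypothetical element type: only f1 is read in the loop, so the loads
   are strided, as in the attached test case.  */
struct elem
{
  float f1;
  unsigned char pad[20];
};

/* Hypothetical stand-in for the real bar().  The pragma asks the compiler
   to emit a vector (elemental) variant in addition to the scalar one.  */
#pragma omp declare simd
float bar (float value)
{
  return value * 2.0f + 1.0f;
}

/* The loop the developer explicitly asks to vectorize.  The reduction
   clause makes the accumulation into total well defined.  */
float sum_bar (const struct elem *e, int n)
{
  float total = 0.0f;
  int i;
#pragma omp simd reduction(+:total)
  for (i = 0; i < n; i++)
    total += bar (e[i].f1);
  return total;
}
```

Without OpenMP support enabled the pragmas are ignored and the code is plain
scalar C, so the result is the same either way; only the generated code differs.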


I believe I should repeat: #pragma omp simd is intended to introduce an
instruction-level parallel region at the developer's request, and hence
should be treated in the same manner as #pragma omp parallel. The vectorizer
cost model is an obstacle here, not a help.

Regards,
Sergos


On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguent...@suse.de> wrote:
> Sergey Ostanevich <sergos....@gmail.com> wrote:
>>Is this only for the whole file? I mean, to have a particular loop
>>vectorized in a file while all the others are left up to the compiler's
>>cost model. Is there such machinery?
>
> No, there is not.
>
> Richard.
>
>>Sergos
>>
>>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguent...@suse.de>
>>wrote:
>>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>>>
>>>> I will get some tests.
>>>> As for cost analysis - simply consider the pragma as a request to
>>>> vectorize. How can I - as a developer - enforce it beyond the pragma?
>>>
>>> You can disable the cost model via -fvect-cost-model=unlimited
>>>
>>> Richard.
>>>
>>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguent...@suse.de> wrote:
>>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>>>> >
>>>> >> The reason the patch was in its original state is that we want
>>>> >> to notify the user that his assumption of profitability may be wrong.
>>>> >> This is not part of any spec, and as far as I know ICC does not
>>>> >> notify the user about this case. Still, it can be a good hint for
>>>> >> users who try to get as much performance as possible.
>>>> >>
>>>> >> Richard's comment on the vectorization problems is about the same:
>>>> >> to inform the user that his attempt to force vectorization has failed.
>>>> >>
>>>> >> As for profitable or not - sometimes I believe it's impossible to be
>>>> >> precise. For OMP we have the case of a vector version of a function,
>>>> >> and we have no chance to figure out whether it is profitable to use
>>>> >> it or to lose it. If we can't map the loop to any vector length
>>>> >> other than 1, I believe we have to bail out and report.
>>>> >> Is it about 'never profitable'?
>>>> >
>>>> > For example.  I think we should report non-vectorized loops
>>>> > that are marked with force_vect anyway, with -Wdisabled-optimization.
>>>> > Another case is that a loop may be profitable to vectorize if
>>>> > the ISA supports a gather instruction but otherwise not.  Or if the
>>>> > ISA supports efficient vector construction from N non-loop-invariant
>>>> > scalars (for vectorization of strided loads).
>>>> >
>>>> > Simply disregarding all of the cost analysis sounds completely
>>>> > bogus to me.
>>>> >
>>>> > I'd simply go for the diagnostic for now, not changing anything else.
>>>> > We want to have a good understanding about why the cost model is
>>>> > so bad that we have to force to ignore it for #pragma simd - thus we
>>>> > want testcases.
>>>> >
>>>> > Richard.
>>>> >
>>>> >>
>>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener <rguent...@suse.de> wrote:
>>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich wrote:
>>>> >> >>> ivdep just substitutes for all cross-iteration data analysis;
>>>> >> >>> it has nothing to do with the cost model. ICC does not cancel its
>>>> >> >>> cost model in case of #pragma ivdep.
>>>> >> >>>
>>>> >> >>> As for safelen, the OMP standard treats it as a limitation
>>>> >> >>> on the vector length. This means that if no safelen is present,
>>>> >> >>> an arbitrary vector length can be used.
>>>> >> >>
>>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
>>>> >> >> #pragma omp simd without safelen clause or #pragma simd without
>>>> >> >> vectorlength clause.
>>>> >> >>
>>>> >> >>> so I believe loop->force_vect is the only trigger to disregard
>>>> >> >>> the cost model
>>>> >> >>
>>>> >> >> Anyway, in that case I think the originally posted patch is wrong,
>>>> >> >> if we want to treat force_vect as disregard all the cost model and
>>>> >> >> force vectorization (well, the name of the field already kind of
>>>> >> >> suggests that), then IMHO we should treat it the same as
>>>> >> >> -fvect-cost-model=unlimited for those loops.
>>>> >> >
>>>> >> > Err - the user may have a specific sub-architecture in mind when
>>>> >> > using #pragma simd, if you say we should completely ignore the cost
>>>> >> > model then should we also sorry () if we cannot vectorize the loop
>>>> >> > (either because of GCC deficiencies or lack of sub-target support)?
>>>> >> >
>>>> >> > That said, at least in the cases that the cost model says the loop
>>>> >> > is never profitable to vectorize we should follow its advice.
>>>> >> >
>>>> >> > Richard.
>>>> >> >
>>>> >> >> Thus (untested):
>>>> >> >>
>>>> >> >> 2013-11-12  Jakub Jelinek  <ja...@redhat.com>
>>>> >> >>
>>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters): Use
>>>> >> >>       unlimited cost model also for force_vect loops.
>>>> >> >>
>>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000 +0100
>>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330 +0100
>>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>>>> >> >>
>>>> >> >>    /* Cost model disabled.  */
>>>> >> >> -  if (unlimited_cost_model ())
>>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>>>> >> >>      {
>>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
>>>> >> >>        *ret_min_profitable_niters = 0;
>>>> >> >>
>>>> >> >>       Jakub
>>>> >> >>
>>>> >> >
>>>> >>
>>>> >>
>>>> >
>>>> > --
>>>> > Richard Biener <rguent...@suse.de>
>>>> > SUSE / SUSE Labs
>>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>>>>
>>>>
>>>
>>> --
>>> Richard Biener <rguent...@suse.de>
>>> SUSE / SUSE Labs
>>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>
>

typedef float K[5];


struct Str1
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  K k1; 
};

struct Str2
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  float f4;
  float f5;
};


struct Str3
{
  float f1;
  unsigned char u1;
  union
  {
   K k1;
   struct Str1 *str1;
   struct Str2 *str2;
  } Un1;
};


struct str4
{
  int i1;
  short s1;
  char c1, u1;
  struct Str3 *str1;
};

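/* bar() is an elemental function: a vector variant is requested, but its
   cost is not visible to the vectorizer at the call site.  */
#pragma omp declare simd
extern float bar (float value);

float foo (struct str4 *Map)
{
  int i;
  float Value;
  float Total = 0.0;
#pragma omp simd
  for (i = 0; i < Map->s1; i++)
    {
      /* Strided load: only f1 of each Str3 element is read.  */
      Value = Map->str1[i].f1;
      /* Call left disabled until the vectorizer accepts an elemental
         function inside the loop.  */
//     Value = bar (Value);
      Total += Value;
    }
  return Total;
}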
