Richard,

here's an example that triggers the cost model. As soon as elemental
functions appear and we update the vectorizer to accept an elemental
function inside the loop, we will have the same situation as we have now:
the cost model will bail out on its profitability estimate.
And we still have no way to find out how efficient the bar() function is
in its vector form.

I believe I should repeat: #pragma omp simd is intended to introduce an
instruction-level parallel region at the developer's request, hence it
should be treated in the same manner as #pragma omp parallel. The
vectorizer cost model is an obstacle here, not a help.

Regards,
Sergos


On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguent...@suse.de> wrote:
> Sergey Ostanevich <sergos....@gmail.com> wrote:
>>this is only for the whole file? I mean to have a particular loop
>>vectorized in a file while all others - up to compiler's cost model.
>>is there such a machinery?
>
> No, there is not.
>
> Richard.
>
>>Sergos
>>
>>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguent...@suse.de>
>>wrote:
>>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>>>
>>>> I will get some tests.
>>>> As for cost analysis - simply consider the pragma as a request to
>>>> vectorize. How can I - as a developer - enforce it beyond the
>>pragma?
>>>
>>> You can disable the cost model via -fvect-cost-model=unlimited
>>>
>>> Richard.
>>>
>>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguent...@suse.de>
>>wrote:
>>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>>>> >
>>>> >> The reason patch was in its original state is because we want
>>>> >> to notify user that his assumption of profitability may be wrong.
>>>> >> This is not a part of any spec and as far as I know ICC does not
>>>> >> notify user about the case. Still it can be a good hint for those
>>>> >> users who tries to get as much as possible performance.
>>>> >>
>>>> >> Richard's comment on the vectorization problems is about the same
>>-
>>>> >> to inform user that his attempt to force vectorization is failed.
>>>> >>
>>>> >> As for profitable or not - sometimes I believe it's impossible to
>>be
>>>> >> precise. For OMP we have case of a vector version of a function
>>>> >> and we have no chance to figure out whether it is profitable to
>>use
>>>> >> it or to loose it. If we can't map the loop for any vector length
>>>> >> other than 1 - I believe in this case we have to bail out and
>>report.
>>>> >> Is it about 'never profitable'?
>>>> >
>>>> > For example.  I think we should report non-vectorized loops
>>>> > that are marked with force_vect anyway, with
>>-Wdisabled-optimization.
>>>> > Another case is that a loop may be profitable to vectorize if
>>>> > the ISA supports a gather instruction but otherwise not.  Or if
>>the
>>>> > ISA supports efficient vector construction from N not loop
>>>> > invariant scalars (for vectorization of strided loads).
>>>> >
>>>> > Simply disregarding all of the cost analysis sounds completely
>>>> > bogus to me.
>>>> >
>>>> > I'd simply go for the diagnostic for now, not changing anything
>>else.
>>>> > We want to have a good understanding about why the cost model is
>>>> > so bad that we have to force to ignore it for #pragma simd - thus
>>we
>>>> > want testcases.
>>>> >
>>>> > Richard.
>>>> >
>>>> >>
>>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
>><rguent...@suse.de> wrote:
>>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
>>wrote:
>>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
>>>> >> >>> nothing related to cost model. ICC does not cancel its
>>>> >> >>> cost model in case of #pragma ivdep
>>>> >> >>>
>>>> >> >>> as for the safelen - OMP standart treats it as a limitation
>>>> >> >>> for the vector length. this means if no safelen is present
>>>> >> >>> an arbitrary vector length can be used.
>>>> >> >>
>>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
>>#pragma omp simd
>>>> >> >> without safelen clause or #pragma simd without vectorlength
>>clause.
>>>> >> >>
>>>> >> >>> so I believe loop->force_vect is the only trigger to
>>disregard
>>>> >> >>> the cost model
>>>> >> >>
>>>> >> >> Anyway, in that case I think the originally posted patch is
>>wrong,
>>>> >> >> if we want to treat force_vect as disregard all the cost model
>>and
>>>> >> >> force vectorization (well, the name of the field already kind
>>of suggest
>>>> >> >> that), then IMHO we should treat it the same as
>>-fvect-cost-model=unlimited
>>>> >> >> for those loops.
>>>> >> >
>>>> >> > Err - the user may have a specific sub-architecture in mind
>>when using
>>>> >> > #pragma simd, if you say we should completely ignore the cost
>>model
>>>> >> > then should we also sorry () if we cannot vectorize the loop
>>(either
>>>> >> > because of GCC deficiencies or lack of sub-target support)?
>>>> >> >
>>>> >> > That said, at least in the cases that the cost model says the
>>loop
>>>> >> > is never profitable to vectorize we should follow its advice.
>>>> >> >
>>>> >> > Richard.
>>>> >> >
>>>> >> >> Thus (untested):
>>>> >> >>
>>>> >> >> 2013-11-12  Jakub Jelinek  <ja...@redhat.com>
>>>> >> >>
>>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters):
>>Use
>>>> >> >>       unlimited cost model also for force_vect loops.
>>>> >> >>
>>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000
>>+0100
>>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330
>>+0100
>>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
>>(loop_vinfo);
>>>> >> >>
>>>> >> >>    /* Cost model disabled.  */
>>>> >> >> -  if (unlimited_cost_model ())
>>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
>>(loop_vinfo)->force_vect)
>>>> >> >>      {
>>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model
>>disabled.\n");
>>>> >> >>        *ret_min_profitable_niters = 0;
>>>> >> >>
>>>> >> >>       Jakub
>>>> >> >>
>>>> >> >
>>>> >>
>>>> >>
>>>> >
>>>> > --
>>>> > Richard Biener <rguent...@suse.de>
>>>> > SUSE / SUSE Labs
>>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend
>>>>
>>>>
>>>
>>> --
>>> Richard Biener <rguent...@suse.de>
>>> SUSE / SUSE Labs
>>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>
>
typedef float K[5];


struct Str1
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  K k1; 
};

struct Str2
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  float f4;
  float f5;
};


struct Str3
{
  float f1;
  unsigned char u1;
  union
  {
   K k1;
   struct Str1 *str1;
   struct Str2 *str2;
  } Un1;
};


struct str4
{
  int i1;
  short s1;
  char c1, u1;
  struct Str3 *str1;
};

#pragma omp declare simd 
extern float bar (float value);

float foo (struct str4 *Map)
{
  int i;
  float Value;
  float Total = 0.0;
#pragma omp simd
  for (i = 0; i < Map->s1; i++)
    {
      Value = Map->str1[i].f1;
//    Value = bar (Value);
      Total += Value;
    }
  return Total;
}
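
For reference, the situation discussed at the top of this message (an
elemental function called from inside a #pragma omp simd loop) might look
roughly like the sketch below. It is only an illustration under stated
assumptions: struct item is a hypothetical stand-in for struct Str3,
bar_sketch () has an invented body because the test case only declares
bar () as extern, and the reduction clause is an addition that is not in
the test case above.

```c
/* Hypothetical element type standing in for struct Str3 in the test case. */
struct item
{
  float f1;
  unsigned char u1;
};

/* Hypothetical elemental function.  The body is invented for illustration;
   the test case above only has an extern declaration of bar ().  */
#pragma omp declare simd
static float bar_sketch (float value)
{
  return value * 2.0f + 1.0f;
}

/* Same loop shape as foo (), with the elemental call enabled and an
   explicit reduction clause added (an assumption, not in the original).  */
float sum_items (const struct item *items, int n)
{
  float total = 0.0f;
  int i;
#pragma omp simd reduction(+:total)
  for (i = 0; i < n; i++)
    total += bar_sketch (items[i].f1);
  return total;
}
```

Built without OpenMP support the pragmas are ignored and the loop simply
runs as scalar code, so the result is the same either way; only the
vectorizer's decision differs.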
