[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Richard Biener  ---
Should be fixed now - I guess it's latent on branches when you enable
vectorization.

[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:b789c44c6463452900f7b1e6d2a0af6567534bfc

commit r12-8054-gb789c44c6463452900f7b1e6d2a0af6567534bfc
Author: Richard Biener 
Date:   Wed Apr 6 11:18:12 2022 +0200

tree-optimization/105175 - avoid -Wvector-operation-performance

This avoids -Wvector-operation-performance diagnostics for vectorizer
produced code.  It's unfortunate the warning_at code in
tree-vect-generic.cc needs adjustments but the diagnostic suppression
code doesn't magically suppress those otherwise.

2022-04-06  Richard Biener  

PR tree-optimization/105175
* tree-vect-stmts.cc (vectorizable_operation): Suppress
-Wvector-operation-performance if using emulated vectors.
* tree-vect-generic.cc (expand_vector_piecewise): Do not diagnose
-Wvector-operation-performance when suppressed.
(expand_vector_parallel): Likewise.
(expand_vector_comparison): Likewise.
(expand_vector_condition): Likewise.
(lower_vec_perm): Likewise.
(expand_vector_conversion): Likewise.

* gcc.dg/pr105175.c: New testcase.

[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener  changed:

   What|Removed |Added

 CC||dmalcolm at gcc dot gnu.org,
   ||msebor at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
@item -Wvector-operation-performance
@opindex Wvector-operation-performance
@opindex Wno-vector-operation-performance
Warn if vector operation is not implemented via SIMD capabilities of the
architecture.  Mainly useful for the performance tuning.
Vector operation can be implemented @code{piecewise}, which means that the
scalar operation is performed on every vector element;
@code{in parallel}, which means that the vector operation is implemented
using scalars of wider type, which normally is more performance efficient;
and @code{as a single scalar}, which means that vector fits into a
scalar type.

--
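
For readers who have not met the three lowering strategies the documentation
names, here is a minimal sketch in plain C. It is an illustration only, not
what tree-vect-generic.cc emits. The helper names (and_piecewise,
and_parallel, demo_and_strategies) and the LANES constant are made up for
this example. "Piecewise" does one scalar operation per element. "In
parallel" handles two 32-bit lanes at a time through a 64-bit scalar. A
vector that fits entirely into one such scalar is the "single scalar" case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 4 };  /* hypothetical: a 16-byte vector of four 32-bit lanes */

/* Piecewise: one scalar AND per vector element.  */
static void and_piecewise(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = a[i] & b[i];
}

/* In parallel: use a wider scalar, covering two lanes per 64-bit AND.
   memcpy keeps the type punning well defined.  */
static void and_parallel(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    for (int i = 0; i < LANES; i += 2) {
        uint64_t wa, wb;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        wa &= wb;
        memcpy(dst + i, &wa, sizeof wa);
    }
}

/* Both strategies must agree lane for lane.  */
static void demo_and_strategies(void)
{
    const uint32_t a[LANES] = {0xF0F0F0F0u, 0x12345678u, 0xFFFFFFFFu, 0u};
    const uint32_t b[LANES] = {0x0FF00FF0u, 0xFFFF0000u, 0x00000001u, 0xFFFFFFFFu};
    uint32_t p[LANES], q[LANES];

    and_piecewise(p, a, b);
    and_parallel(q, a, b);
    assert(memcmp(p, q, sizeof p) == 0);
}
```

Bitwise operations are the easy case here because lanes never interact, so
the wider scalar gives the same result as the per-element loop.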

So the point is the vector lowering pass cannot distinguish people writing

typedef int v2si __attribute__((vector_size(8)));

v2si a, b;
void foo()
{
   a &= b;
}

and the vectorizer producing such code.  So technically the diagnostic is
correct but it was the vectorizer producing the operation.

So a proper way would be to suppress OPT_Wvector_operation_performance for
the vectorizer generated stmt.  Unfortunately

  if (using_emulated_vectors_p)
    suppress_warning (new_stmt, OPT_Wvector_operation_performance);

will not magically make

  warning_at (loc, OPT_Wvector_operation_performance,
              "vector operation will be expanded with a "
              "single scalar operation");

not warn.  suppress_warning_at returns true, and supp is true as well
(that parameter is not documented as far as I can see).  So we need to
guard all the warning_at with stmt-based warning_suppressed_p and
there's no warning_at overload with a gimple * as location that would
automagically do that it seems?  There is one with rich_location * but
AFAIK that doesn't cover gimple * or tree.

I'm testing a patch that is IMHO too verbose (adjusting all warning_at
in tree-vect-generic.cc).
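
The shape of the problem can be shown with a toy model in C. None of this is
GCC code: toy_stmt, toy_warning_at, toy_lower_unguarded and toy_lower_guarded
are hypothetical stand-ins for a gimple statement, warning_at, and a lowering
routine before and after guarding. The sketch assumes only what is described
above: the diagnostic call sees a location, not the statement, so a
per-statement suppression flag has no effect unless the caller checks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a statement carrying a suppression flag.  */
struct toy_stmt {
    const char *text;
    bool perf_warning_suppressed;
};

static int toy_warning_count;

/* Hypothetical stand-in for a location-based diagnostic: it receives only
   a location string and never sees the statement or its flag.  */
static void toy_warning_at(const char *loc, const char *msg)
{
    fprintf(stderr, "%s: warning: %s\n", loc, msg);
    toy_warning_count++;
}

/* Unguarded lowering: warns even if the statement was marked suppressed.  */
static void toy_lower_unguarded(const struct toy_stmt *stmt, const char *loc)
{
    (void) stmt;
    toy_warning_at(loc, "vector operation will be expanded with a "
                        "single scalar operation");
}

/* Guarded lowering: consult the statement's flag before diagnosing.  */
static void toy_lower_guarded(const struct toy_stmt *stmt, const char *loc)
{
    if (!stmt->perf_warning_suppressed)
        toy_warning_at(loc, "vector operation will be expanded with a "
                            "single scalar operation");
}

static void demo_toy_guard(void)
{
    struct toy_stmt s = { "a &= b", true };
    int before = toy_warning_count;

    toy_lower_unguarded(&s, "t.c:7");
    assert(toy_warning_count == before + 1);  /* flag was ignored */

    toy_lower_guarded(&s, "t.c:7");
    assert(toy_warning_count == before + 1);  /* flag was honored */
}
```

Every call site needs its own check in this model, which is the verbosity
complained about above.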

[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #3 from Richard Biener  ---
So vectorizable_operation correctly says target_support_p == false and then
goes on with

      target_support_p = (optab_handler (optab, vec_mode)
                          != CODE_FOR_nothing);
    }

  bool using_emulated_vectors_p = vect_emulated_vector_p (vectype);
  if (!target_support_p)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "op not supported by target.\n");
      /* Check only during analysis.  */
      if (maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)
          || (!vec_stmt && !vect_can_vectorize_without_simd_p (code)))
        return false;
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "proceeding using word mode.\n");
      using_emulated_vectors_p = true;

using emulated (word-mode) vectors.  In the end it will still emit code
using v2si types which will have V2SImode.  For those it will rely on
vector lowering to perform this very lowering.

So the vectorization works as desired and the lowering as well.  But I agree
the diagnostic in this case is questionable.

[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-06 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #2 from Andreas Krebbel  ---
I would expect the vectorizer to only generate vector modes which would fit
into word mode if no hardware vector support is available. E.g. for:

struct {
  unsigned a, b, c, d;
} s;
foo() {
  s.a &= 42;
  s.b &= 42;
  s.c &= 42;
  s.d &= 42;
}

I see two "vector 2 unsigned" operations being generated when compiling with
-mno-sse but with sse I get a 4 element vector as expected.

[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization

2022-04-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener  changed:

   What|Removed |Added

  Component|rtl-optimization|tree-optimization
   Last reconfirmed||2022-04-06
   Target Milestone|--- |12.0
 Target||x86_64-*-*
 Status|UNCONFIRMED |ASSIGNED
   Keywords||diagnostic,
   ||missed-optimization
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
I think the issue is that the vectorizer generates

[local count: 1073741824]:
+  vect__1.5_8 = MEM <vector(2) unsigned int> [(unsigned int *)&qemuMigrationCookieGetPersistent_mig];
+  vect__2.6_9 = vect__1.5_8 & { 1, 1 };
   _1 = qemuMigrationCookieGetPersistent_mig.flags;
   _2 = _1 & 1;
-  qemuMigrationCookieGetPersistent_mig.flags = _2;
   _3 = qemuMigrationCookieGetPersistent_mig.flagsMandatory;
   _4 = _3 & 1;
-  qemuMigrationCookieGetPersistent_mig.flagsMandatory = _4;
+  MEM <vector(2) unsigned int> [(unsigned int *)&qemuMigrationCookieGetPersistent_mig] = vect__2.6_9;

but apparently it does not check availability of the bitwise AND V2SImode
operation.  vector lowering recognizes this missing operation and lowers
it to

+  long unsigned int _6;
+  long unsigned int _7;
+  vector(2) unsigned int _11;

[local count: 1073741824]:
   vect__1.5_8 = MEM <vector(2) unsigned int> [(unsigned int *)&qemuMigrationCookieGetPersistent_mig];
-  vect__2.6_9 = vect__1.5_8 & { 1, 1 };
+  _7 = VIEW_CONVERT_EXPR<long unsigned int>(vect__1.5_8);
+  _6 = _7 & 4294967297;
+  _11 = VIEW_CONVERT_EXPR<vector(2) unsigned int>(_6);
+  vect__2.6_9 = _11;

_maybe_ V2SImode is also a real thing now even with -mno-sse, but the
AND is cut out (or not implemented).

That is, this is probably a vectorizer missed target check.
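
The constant 4294967297 in the lowered code is the vector { 1, 1 } viewed as
a single 64-bit word: 0x0000000100000001, i.e. 2^32 + 1. A small C sketch of
that arithmetic follows. The helpers pack2, and_word_mode and
demo_word_mode_and are hypothetical names for this example, and the packing
uses shifts so the check does not depend on byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack two 32-bit lanes into one 64-bit word,
   lane0 in the low half and lane1 in the high half.  */
static uint64_t pack2(uint32_t lane0, uint32_t lane1)
{
    return (uint64_t) lane0 | ((uint64_t) lane1 << 32);
}

/* The whole two-lane AND done as one scalar operation, mirroring
   "_6 = _7 & 4294967297" in the dump.  */
static uint64_t and_word_mode(uint64_t v, uint64_t mask)
{
    return v & mask;
}

static void demo_word_mode_and(void)
{
    /* { 1, 1 } as a 64-bit word is 2^32 + 1.  */
    assert(pack2(1, 1) == 4294967297ULL);

    /* Lanes { 3, 6 } masked with { 1, 1 } give { 3 & 1, 6 & 1 } = { 1, 0 }.  */
    assert(and_word_mode(pack2(3, 6), pack2(1, 1)) == pack2(1, 0));
}
```

Because AND never carries between bit positions, the single 64-bit operation
yields the same lanes as two separate 32-bit ANDs.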