[Bug tree-optimization/105175] [12 Regression] Pointless warning about missed vector optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Richard Biener ---
Should be fixed now - I guess it's latent on branches when you enable
vectorization.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #5 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:b789c44c6463452900f7b1e6d2a0af6567534bfc

commit r12-8054-gb789c44c6463452900f7b1e6d2a0af6567534bfc
Author: Richard Biener
Date:   Wed Apr 6 11:18:12 2022 +0200

    tree-optimization/105175 - avoid -Wvector-operation-performance

    This avoids -Wvector-operation-performance diagnostics for vectorizer
    produced code.  It's unfortunate the warning_at code in
    tree-vect-generic.cc needs adjustments but the diagnostic suppression
    code doesn't magically suppress those otherwise.

    2022-04-06  Richard Biener

            PR tree-optimization/105175
            * tree-vect-stmts.cc (vectorizable_operation): Suppress
            -Wvector-operation-performance if using emulated vectors.
            * tree-vect-generic.cc (expand_vector_piecewise): Do not diagnose
            -Wvector-operation-performance when suppressed.
            (expand_vector_parallel): Likewise.
            (expand_vector_comparison): Likewise.
            (expand_vector_condition): Likewise.
            (lower_vec_perm): Likewise.
            (expand_vector_conversion): Likewise.

            * gcc.dg/pr105175.c: New testcase.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmalcolm at gcc dot gnu.org,
                   |                            |msebor at gcc dot gnu.org

--- Comment #4 from Richard Biener ---
@item -Wvector-operation-performance
@opindex Wvector-operation-performance
@opindex Wno-vector-operation-performance
Warn if vector operation is not implemented via SIMD capabilities of the
architecture.  Mainly useful for the performance tuning.
Vector operation can be implemented @code{piecewise}, which means that the
scalar operation is performed on every vector element;
@code{in parallel}, which means that the vector operation is implemented
using scalars of wider type, which normally is more performance efficient;
and @code{as a single scalar}, which means that vector fits into a
scalar type.

--

So the point is the vector lowering pass cannot distinguish people writing

typedef int v2si __attribute__((vector_size(8)));
v2si a, b;
void foo()
{
  a &= b;
}

and the vectorizer producing such code.  So technically the diagnostic is
correct but it was the vectorizer producing the operation.  So a proper way
would be to suppress OPT_Wvector_operation_performance for the vectorizer
generated stmt.  Unfortunately

          if (using_emulated_vectors_p)
            suppress_warning (new_stmt, OPT_Wvector_operation_performance);

will not magically make

          warning_at (loc, OPT_Wvector_operation_performance,
                      "vector operation will be expanded with a "
                      "single scalar operation");

not warn.  suppress_warning_at returns true, and supp is true as well (that
parameter is not documented as far as I can see).

So we need to guard all the warning_at with stmt-based warning_suppressed_p
and there's no warning_at overload with a gimple * as location that would
automagically do that it seems?  There is one with rich_location * but AFAIK
that doesn't cover gimple * or tree.

I'm testing a patch that is IMHO too verbose (adjusting all warning_at in
tree-vect-generic.cc).
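To make the lowering strategies named in the quoted documentation concrete, here is a minimal sketch in C that writes two of them out by hand for a bitwise AND. It is only an illustration of the idea, not what the vector lowering pass emits; the type name v4su and both function names are made up for this example, and the vector_size attribute is the GCC extension already used in the snippet above.

```c
#include <stdint.h>
#include <string.h>

/* GCC vector extension: four 32-bit lanes in 16 bytes.
   The name v4su is hypothetical, chosen for this sketch.  */
typedef uint32_t v4su __attribute__ ((vector_size (16)));

/* "piecewise": one scalar AND per vector element.  */
v4su
and_piecewise (v4su a, v4su b)
{
  v4su r = a;
  for (int i = 0; i < 4; i++)
    r[i] &= b[i];
  return r;
}

/* "in parallel": reinterpret the vector as two 64-bit words so that
   each scalar AND covers two lanes at once.  This is valid for bitwise
   operations because no bit depends on its neighbours.  */
v4su
and_in_parallel (v4su a, v4su b)
{
  uint64_t wa[2], wb[2];
  v4su r;
  memcpy (wa, &a, sizeof wa);
  memcpy (wb, &b, sizeof wb);
  wa[0] &= wb[0];
  wa[1] &= wb[1];
  memcpy (&r, wa, sizeof r);
  return r;
}
```

The "as a single scalar" case is the same trick when the whole vector fits into one word, as with the 8-byte v2si above: a single 64-bit AND then replaces the vector operation.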
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #3 from Richard Biener ---
So vectorizable_operation correctly says target_support_p == false and then
goes on with

      target_support_p = (optab_handler (optab, vec_mode)
                          != CODE_FOR_nothing);
    }

  bool using_emulated_vectors_p = vect_emulated_vector_p (vectype);
  if (!target_support_p)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "op not supported by target.\n");
      /* Check only during analysis.  */
      if (maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)
          || (!vec_stmt && !vect_can_vectorize_without_simd_p (code)))
        return false;
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "proceeding using word mode.\n");
      using_emulated_vectors_p = true;

using emulated (word-mode) vectors.  In the end it will still emit code
using v2si types which will have V2SImode.  For those it will rely on
vector lowering to perform this very lowering.

So the vectorization works as desired and the lowering as well.  But I agree
the diagnostic in this case is questionable.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

--- Comment #2 from Andreas Krebbel ---
I would expect the vectorizer to only generate vector modes which would fit
into word mode if no hardware vector support is available.

E.g. for:

struct {
  unsigned a, b, c, d;
} s;

foo()
{
  s.a &= 42;
  s.b &= 42;
  s.c &= 42;
  s.d &= 42;
}

I see two "vector 2 unsigned" operations being generated when compiling with
-mno-sse but with sse I get a 4 element vector as expected.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105175

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization
   Last reconfirmed|                            |2022-04-06
   Target Milestone|---                         |12.0
             Target|                            |x86_64-*-*
             Status|UNCONFIRMED                 |ASSIGNED
           Keywords|                            |diagnostic,
                   |                            |missed-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
I think the issue is that the vectorizer generates

   [local count: 1073741824]:
+  vect__1.5_8 = MEM [(unsigned int *)&qemuMigrationCookieGetPersistent_mig];
+  vect__2.6_9 = vect__1.5_8 & { 1, 1 };
   _1 = qemuMigrationCookieGetPersistent_mig.flags;
   _2 = _1 & 1;
-  qemuMigrationCookieGetPersistent_mig.flags = _2;
   _3 = qemuMigrationCookieGetPersistent_mig.flagsMandatory;
   _4 = _3 & 1;
-  qemuMigrationCookieGetPersistent_mig.flagsMandatory = _4;
+  MEM [(unsigned int *)&qemuMigrationCookieGetPersistent_mig] = vect__2.6_9;

but apparently it does not check availability of the bitwise AND V2SImode
operation.  Vector lowering recognizes this missing operation and lowers it
to

+  long unsigned int _6;
+  long unsigned int _7;
+  vector(2) unsigned int _11;

   [local count: 1073741824]:
   vect__1.5_8 = MEM [(unsigned int *)&qemuMigrationCookieGetPersistent_mig];
-  vect__2.6_9 = vect__1.5_8 & { 1, 1 };
+  _7 = VIEW_CONVERT_EXPR(vect__1.5_8);
+  _6 = _7 & 4294967297;
+  _11 = VIEW_CONVERT_EXPR(_6);
+  vect__2.6_9 = _11;

_maybe_ V2SImode is also a real thing now even with -mno-sse, but the AND is
cut out (or not implemented).

That is, this is probably a vectorizer missed target check.
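The constant 4294967297 in the lowered statement is 0x0000000100000001, that is, the vector constant { 1, 1 } packed into one 64-bit word. A small C sketch of the equivalence follows, assuming a target where long unsigned int is 64 bits wide, as in the dump. The function names are hypothetical, and memcpy stands in for the VIEW_CONVERT_EXPR reinterpretation.

```c
#include <stdint.h>
#include <string.h>

/* Scalar form, mirroring _2 = _1 & 1 and _4 = _3 & 1 on two adjacent
   32-bit fields.  */
void
mask_fields_scalar (uint32_t f[2])
{
  f[0] &= 1;
  f[1] &= 1;
}

/* Lowered form: view both fields as one 64-bit word and AND it with
   the packed constant.  The constant has the same value in both halves,
   so the result does not depend on byte order.  */
void
mask_fields_word (uint32_t f[2])
{
  uint64_t w;
  memcpy (&w, f, sizeof w);        /* like _7 = VIEW_CONVERT_EXPR (...) */
  w &= UINT64_C (4294967297);      /* 0x0000000100000001 */
  memcpy (f, &w, sizeof w);        /* like _11 = VIEW_CONVERT_EXPR (_6) */
}
```

Both functions leave the same values in f, which is why the lowering itself is correct and only the diagnostic about it is questionable.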