[Bug tree-optimization/77399] Poor code generation for vector casts and loads

rguenth at gcc dot gnu.org Tue, 30 Aug 2016 03:13:35 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399


--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #6)
> Thanks. Any comment on having gimple lowering emit cleaner code in the first
> place?

Well, I'm not sure it is worth the trouble.  The FEs emit

  return <<< Unknown tree: compound_literal_expr
    v4sf D.1795 = {(float) VIEW_CONVERT_EXPR<int[4]>(f)[0],
                   (float) VIEW_CONVERT_EXPR<int[4]>(f)[1],
                   (float) VIEW_CONVERT_EXPR<int[4]>(f)[2],
                   (float) VIEW_CONVERT_EXPR<int[4]>(f)[3]}; >>>;

and gimplification then forces the scalar computations to temporaries:

  _1 = BIT_FIELD_REF <f, 32, 0>;
  _2 = (float) _1;
  _3 = BIT_FIELD_REF <f, 32, 32>;
  _4 = (float) _3;
  _5 = BIT_FIELD_REF <f, 32, 64>;
  _6 = (float) _5;
  _7 = BIT_FIELD_REF <f, 32, 96>;
  _8 = (float) _7;
  D.1797 = {_2, _4, _6, _8};

This is, in theory, a point where stmt folding could replace it with

  D.1797 = (float) f;

For that, the code in forwprop would need to be moved to gimple-fold.c, and
eventually the gimplifier would need to be changed to fold more stmts (it
really should fold all of them, with SSA edge following enabled -- the point
being that we now have SSA names as early as gimplification).

I think the real issue for writing vector code is that our GCC generic
vector extension has no casting support (or pack-/unpack-support).  The
extension is closely modeled after OpenCL and IIRC OpenCL uses
"intrinsics" for these kinds of operations?

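To make the missing cast concrete, here is a minimal sketch of what one has
to write today with the generic vector extension.  It is presumably close to
the source behind the dump above, but the function names and the self-check
are made up for illustration and are not taken from the bug report.

```c
#include <assert.h>

/* GCC generic vector extension: four 32-bit lanes in 16 bytes. */
typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

/* Lane-by-lane int -> float conversion.  It is spelled out with
   subscripts because the extension offers no value-converting
   vector cast.  The function name is hypothetical.  */
v4sf cvt_v4si_to_v4sf(v4si f)
{
    return (v4sf){ (float)f[0], (float)f[1], (float)f[2], (float)f[3] };
}

/* Hypothetical self-check: every lane must be converted by value,
   not reinterpreted bit-for-bit.  */
void cvt_selfcheck(void)
{
    v4si in = { 1, -2, 3, 40 };
    v4sf out = cvt_v4si_to_v4sf(in);
    assert(out[0] == 1.0f && out[1] == -2.0f);
    assert(out[2] == 3.0f && out[3] == 40.0f);
}
```

With a real vector cast in the extension, the body of cvt_v4si_to_v4sf would
be a single expression, and the per-lane BIT_FIELD_REF extraction would not
have to be pattern-matched back into one vector conversion.
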
So I still believe that the forwprop code should be extended and eventually
moved to gimple-fold.c (invoked via fold_stmt), i.e. as "manually written"
match.pd patterns.
