On Fri, May 1, 2015 at 6:41 AM, Kugan <kugan.vivekanandara...@linaro.org> wrote: > >>> Thanks for the comments. Here is a prototype patch that implements a >>> type promotion pass. This pass records SSA variables that will have >>> values in higher bits (than the original type precision) if promoted and >>> uses this information in inserting appropriate truncations and >>> extensions. This pass also classifies some of the stmts that sets ssa's >>> to be unsafe to promote. Here is a gimple difference for the type >>> promotion as compared to previous dump for a testcase. >> >> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR) >> it has no convenient way to sign-extend other than truncating to a signed >> (non-promoted) type and then extending to the promoted type. Thus >> I think such pass should be accompanied with a new tree code, >> SEXT_EXPR. Otherwise we end up with "spurious" un-promoted >> signed types which later optimizations may be confused about. >> >> Not sure if that is the actual issue though. >> >> Instead op "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn >> should be gimple-ssa-type-promote.c. In the end all targets with >> non-trivial PROMOTE_MODE should run the pass as a lowering step >> so it should be enabled even at -O0 (and not disablable). >> >> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the >> end I'd like to run it before IVOPTs ... which means moving IVOPTs >> later, after VRP which should be the pass optimizing away some of >> the extensions). >> >> In get_promoted_type I don't understand why you preserve qualifiers. >> Also even for targets without PROMOTE_MODE it may be >> beneficial to expose truncations required by expanding bit-precision >> arithmetic earlier (that is, if !PROMOTE_MODE at least promote >> to GET_MODE_PRECISION (TYPE_MODE (type))). A testcase >> for that is for example >> >> struct { long i : 33; long j : 33; } a; >> return a.i + a.j; >> >> where bitfields of type > int do not promote so you get a >> 33 bit add which we expand to a 64bit add plus a sign-extension >> (and nothing optimizes that later usually). >> >> insert_next_bb sounds like you want to use insert_on_edge >> somewhere. >> >> in assign_rhs_promotable_p you handle comparisons special >> but the ternary COND_EXPR and VEC_COND_EXPR can have >> comparisons embedded in their first operand. The comment >> confuses me though - with proper sign- or zero-extensions inserted >> you should be able to promote them anyway? >> >> You seem to miss that a GIMPLE_ASSIGN can have 3 operands >> in promote_cst_in_stmt as well. >> >> In promote_assign_stmt_use I consider a default: case that ends >> up doing nothing dangerous ;) Please either use gcc_unreachable () >> or do the safe thing (fix = true;?). You seem to be working with >> a lattice of some kind - fixing up stmt uses the way you do - walking >> over immediate uses - is not very cache friendly. Why not use >> a lattice for this - record promoted vars to be used for old SSA names >> and walk over all stmts instead, replacing SSA uses on them? >> Btw, you don't need to call update_stmt if you SET_USE and not >> replace an SSA name with a constant. >> >> You seem to "fix" with a single stmt but I don't see where you insert >> zero- or sign-extensions for ssa_overflows_p cases? >> >> Note that at least for SSA names with !SSA_NAME_VAR (thus >> anonymous vars) you want to do a cheaper promotion by not >> allocating a new SSA name but simply "fixing" its type by >> assigning to its TREE_TYPE. For SSA names with SSA_NAME_VAR >> there is of course debug-info to consider and thus doing what you >> do is better (but probably still will wreck debuginfo?). >> >> GIMPLE_NOPs are not only used for parameters but also uninitialized >> uses - for non-parameters you should simply adjust their type. No >> need to fixup their value. >> >> The pass needs more comments. >> >> It looks like you are not promoting all variables but only those >> where compensation code (zero-/sign-extensions) is not necessary? >> > > Thanks for the comments. Please find an updated version of this which > addresses your review comments above. I am still to do full benchmarking > on this, but tried with few small benchmarks. I will do proper > benchmarking after getting feedback on the implementation. I have > however bootstrapped on x86-64-none-linux and regression tested on > x86-64, ARM and AArch64. > > I am also not clear with how I should handle the gimple debug statements > when the intermediate temporary variable that maps to the original > variable is promoted.
A few notes. +/* Sign-extend operation. */ +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2) this needs an extended comment documenting the operands. + case SEXT_EXPR: + { + rtx op0 = expand_normal (treeop0); + rtx temp; + if (!target) + target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0))); + + machine_mode inner_mode = smallest_mode_for_size (tree_to_shwi (treeop1), + MODE_INT); + temp = convert_modes (inner_mode, + TYPE_MODE (TREE_TYPE (treeop0)), op0, 0); + convert_move (target, temp, 0); + return target; + } I think that if you allow arbitrary treeop1 you have to properly implement fallbacks for the case where direct expansion to (sign_extend:<target-mode> (subreg:<inner-mode> reg)) does not work which is the intended operation modeled by SEXT_EXPR. Direct expansion to that RTL would also be best I suppose. +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + if (POINTER_TYPE_P (type) + || TYPE_PRECISION (type) == 1 + || !INTEGRAL_TYPE_P (type)) + return type; you should check for INTEGRAL_TYPE_P before looking at TYPE_PRECISION. +#ifdef PROMOTE_MODE + mode = TYPE_MODE (type); + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#else + mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)), + MODE_INT); +#endif That smallest_mode_for_size should be a no-op. Just hoist out mode = TYPE_MODE (type). Now before I get into too much details at this point. You compute which promotions are unsafe, like sources/sinks of memory (I think you miss call arguments/return values and also asm operands here). But instead of simply marking those SSA names as not to be promoted I'd instead split their life-ranges, thus replace _1 = mem; with _2 = mem; _1 = [zs]ext (_2, ...); and promote _1 anyway. So in the first phase I'd do that (and obviously note that _2 isn't to be promoted in the specific example). For promotions that apply I wouldn't bother allocating new SSA names but just "fix" their types (assign to their TREE_TYPE). This also means they have to become anonymous and if they didn't have a !DECL_IGNORED_P decl before then a debug stmt should be inserted at the point of the promotions. So bar_3 = _1 + _2; when promoted would become _4 = _1 + _2; _3 = sext <_4, ...>; # DEBUG bar = (orig-type) _4; // or _3? so you'd basically always promote defs (you have a lot of stmt/operand walking code I didn't look too closely at - but it looks like too much) and the uses get promoted automagically (because you promote the original SSA name). Promotion of constants has to remain, of course. I wouldn't promote pointers at all (are targets doing that?) There are existing various helpers for stuff you re-invent. I've just spotted get_single_successor_bb for which there is single_succ_p () plus single_succ (). Generally most of your stmt walking code could either use walk_stmt () or the low-level gimple_op (...) interface and a loop over all gimple_num_ops () operands. You seem to mix in optimization and lowering - you are extending at uses, not at defs for example. IMHO that complicates the code or do you think that a later optimization pass cannot recover from some obviously bad decisions here? If so then I suggest to implement sth less ad-hoc by using a SSA lattice to track this and propagate the info properly, still emitting the truncations at the defs where necessary. Thanks, Richard. > Thanks, > Kugan > > > gcc/ChangeLog: > > 2015-05-01 Kugan Vivekanandarajah <kug...@linaro.org> > > * Makefile.in: Add gimple-ssa-type-promote.o. > * cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR. > * common.opt: New option -ftree-type-promote. > * expr.c (expand_expr_real_2): Handle SEXT_EXPR. > * fold-const.c (int_const_binop_1): > * gimple-ssa-type-promote.c: New file. > * passes.def: Define new pass_type_promote. > * timevar.def: Define new TV_TREE_TYPE_PROMOTE. > * tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR. > * tree-inline.c (estimate_operator_cost): > * tree-pass.h (make_pass_type_promote): New. > * tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR. > (op_symbol_code): Likewise. > * tree-vrp.c (extract_range_from_binary_expr_1): Likewise. > * tree.def: Define new SEXT_EXPR.