Re: [wide-int] tree-ssa-ccp fix
On Thu, Apr 24, 2014 at 11:54 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
> The asm comparison showed a problem with my r204593 change, which dropped
> a "val.mask &" in the second hunk below.  Seeing that the problem was in
> ccp made me look at the whole file again.  I noticed that we'd changed the
> VARYING mask value from -1 to 1, which didn't look intentional.
>
> Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
> Index: gcc/tree-ssa-ccp.c
> ===
> --- gcc/tree-ssa-ccp.c	2014-04-23 19:13:20.488547331 +0100
> +++ gcc/tree-ssa-ccp.c	2014-04-23 19:30:07.025416946 +0100
> @@ -607,7 +607,7 @@ get_value_for_expr (tree expr, bool for_
>    else
>      {
>        val.lattice_val = VARYING;
> -      val.mask = 1;
> +      val.mask = -1;
>        val.value = NULL_TREE;
>      }
>    return val;
> @@ -1848,7 +1848,7 @@ evaluate_stmt (gimple stmt)
>        if (nonzero_bits == 0)
>          val.mask = 0;
>        else
> -        val.mask = extend_mask (nonzero_bits);
> +        val.mask = val.mask & extend_mask (nonzero_bits);
Re: wide-int, tree-ssa
On 11/26/2013 07:34 AM, Richard Biener wrote:
> --- a/gcc/tree-ssa-ccp.c
> +++ b/gcc/tree-ssa-ccp.c
> @@ -98,6 +98,15 @@ along with GCC; see the file COPYING3.  If not see
>     array CONST_VAL[i].VALUE.  That is fed into substitute_and_fold for
>     final substitution and folding.
>
> +   This algorithm uses wide-ints at the max precision of the target.
> +   This means that, with one uninteresting exception, variables with
> +   UNSIGNED types never go to VARYING because the bits above the
> +   precision of the type of the variable are always zero.  The
> +   uninteresting case is a variable of UNSIGNED type that has the
> +   maximum precision of the target.  Such variables can go to VARYING,
> +   but this causes no loss of information since these variables will
> +   never be extended.
> +
>
> I don't understand this.  In CCP an SSA name drops to varying when it's
> not constant.  How is that related to signedness or precision?!

Richi,

This is a bogosity inherited from the double-int code.  Consider an unsigned int in double-int (or, for that matter, in widest-int).  It has a rep of a bunch of leading zeros followed by 32 bits of real stuff.  Even if you know nothing about the value, it still has upper bits that you know are constant zeros.  Of course this is not true for signed numbers, because the upper bits are smeared with the sign bit.

This is what I thought you were talking about many months ago when you argued that infinite precision was more natural and that I would have trouble with a fixed precision based on the size of the type or mode; it is the reason that tree-ssa-ccp uses widest_int rather than wide_int.  I do actually believe that wide_int is more natural here.  However, this bogosity does buy you a few constants.
My first cut at this pass used wide_int rather than widest_int, but there are in fact constants that the pass finds that depend on this being done this way.  So, to satisfy the bit-for-bit same-code rule for values smaller than TImode, I left this the way it was but decided to document it.  The proper way to do this is in fact to use wide_int, not widest_int, but that would require major surgery to the code that converts values from one type to another.  I plan to do this during the next stage 1.  At that time, I will also enhance the code to extend the sign bit if it is known for signed types.

Kenny
Re: wide-int, tree-ssa
On Fri, Nov 29, 2013 at 3:59 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
> Only a partial reply.  I'll leave Kenny and Mike to answer the VARYING
> question.
>
> Richard Biener <richard.guent...@gmail.com> writes:
>> On Sat, Nov 23, 2013 at 8:23 PM, Mike Stump <mikest...@comcast.net> wrote:
>>> Richi has asked that we break the wide-int patch up so that the
>>> individual port and front end maintainers can review their parts
>>> without having to go through the entire patch.  This patch covers
>>> the tree-ssa code.
>>>
>>> Ok?
>>
>> Same comments as to tree-affine.c apply to tree-ssa-address.c.
>>
>> @@ -887,8 +886,8 @@ copy_ref_info (tree new_ref, tree old_ref)
>>        && (TREE_INT_CST_LOW (TMR_STEP (new_ref)) & align)
>>      {
>> -      unsigned int inc = (mem_ref_offset (old_ref)
>> -                          - mem_ref_offset (new_ref)).low;
>> +      unsigned int inc = (mem_ref_offset (old_ref).to_uhwi ()
>> +                          - mem_ref_offset (new_ref).to_uhwi ());
>>        adjust_ptr_info_misalignment (new_pi, inc);
>>      }
>>    else
>>
>> other patches used .to_short_addr (), also your conversion isn't
>> correct - previously the subtraction was in infinite precision and
>> only the result (possibly) truncated.  Now the subtraction operands
>> are truncated - that looks wrong.
>
> Sorry, forgot about .to_short_addr () when doing the merge.  The
> conversion should be OK with that fixed though, since truncation
> distributes through subtraction (and truncating first is cheaper).
>
>> +/* Return a widest_int that can be used for bitwise simplifications
>> +   from VAL.  */
>>
>> -static double_int
>> -value_to_double_int (prop_value_t val)
>> +static widest_int
>> +value_to_wide_int (prop_value_t val)
>>  {
>>    if (val.value && TREE_CODE (val.value) == INTEGER_CST)
>> -    return tree_to_double_int (val.value);
>> -  else
>> -    return double_int_zero;
>> +    return wi::to_widest (val.value);
>> +
>> +  return 0;
>>  }
>>
>> you also get to hope that we optimize all the widest_int
>> return-by-value code to elide the implied copying ... (not sure if
>> you can do that by adding a not implemented copy / assignment
>> constructor - C++ folks?)
>
> Are you worried about a copy from one widest_int to another?
> Won't the return-value optimisation stop that?

If it works and applies - as far as I know this is not guaranteed.  It would be interesting to know whether any non-NRV cases are left.

> Or are you worried about the copy from the tree to the widest_int?
> In this particular case I suppose we could use:
>
>   wi::to_widest (integer_zero_node)
>
> instead of 0 and return a:
>
>   generic_wide_int <extended_tree <MAX_BITSIZE_MODE_ANY_INT> >
>
> (typedefed :-)).  Is the function really hot enough to justify the
> uglification though?  We're thinking about these kinds of tweaks now
> because it's flavour of the month, but I bet post-merge patches will
> use the more obvious implementation instead.

I wasn't worried about this specifically.  OTOH maybe this is a natural candidate for C++11 auto...

>>      case LT_EXPR:
>>      case LE_EXPR:
>>        {
>> +	widest_int o1val, o2val, o1mask, o2mask;
>>  	int minmax, maxmin;
>> +
>> +	if ((code == GE_EXPR) || (code == GT_EXPR))
>> +	  {
>> +	    o1val = r2val;
>> +	    o1mask = r2mask;
>> +	    o2val = r1val;
>> +	    o2mask = r1mask;
>> +	    code = swap_tree_comparison (code);
>> +	  }
>> +	else
>> +	  {
>> +	    o1val = r1val;
>> +	    o1mask = r1mask;
>> +	    o2val = r2val;
>> +	    o2mask = r2mask;
>> +	  }
>>
>> that are pretty expensive copies, no?  Consider using
>>
>>   const widest_int &o1val = swap ? r2val : r1val;
>>
>> and so on.  (C++ folks?  Any idea how to avoid the repeated
>> conditional init in favor of sth that more resembles the original?)
>
> Not a C++ person, but I can't think of one either.  It probably
> wouldn't be as readable as the four separate statements though.
>
>> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
>> index 6ea634c..c975a97 100644
>> --- a/gcc/tree-ssa-loop-im.c
>> +++ b/gcc/tree-ssa-loop-im.c
>> @@ -1643,7 +1643,7 @@ mem_refs_may_alias_p (mem_ref_p mem1, mem_ref_p mem2,
>>    /* Perform BASE + OFFSET analysis -- if MEM1 and MEM2 are based on
>>       the same object and their offset differ in such a way that the
>>       locations cannot overlap, then they cannot alias.  */
>> -  double_int size1, size2;
>> +  widest_int size1, size2;
>>    aff_tree off1, off2;
>>
>> from the names you can know this is offset_int.
>
> I agree that's true in the overlap and get_inner_reference_aff cases,
> but most of tree-affine seems to be more general than that.  Is it
> really worth bunging in conversions between offset_int and widest_int
> to save a few HWIs of stack space?

I think so.

>> @@ -529,15 +526,15 @@ end:
>>     difference of two values in TYPE.  */
>>
>>  static void
>> -bounds_add (bounds *bnds, double_int delta, tree type)
>> +bounds_add (bounds *bnds, widest_int delta, tree type)
>>  {
Re: wide-int, tree-ssa
Only a partial reply.  I'll leave Kenny and Mike to answer the VARYING question.

Richard Biener <richard.guent...@gmail.com> writes:
> On Sat, Nov 23, 2013 at 8:23 PM, Mike Stump <mikest...@comcast.net> wrote:
>> Richi has asked that we break the wide-int patch up so that the
>> individual port and front end maintainers can review their parts
>> without having to go through the entire patch.  This patch covers
>> the tree-ssa code.
>>
>> Ok?
>
> Same comments as to tree-affine.c apply to tree-ssa-address.c.
>
> @@ -887,8 +886,8 @@ copy_ref_info (tree new_ref, tree old_ref)
>        && (TREE_INT_CST_LOW (TMR_STEP (new_ref)) & align)
>      {
> -      unsigned int inc = (mem_ref_offset (old_ref)
> -                          - mem_ref_offset (new_ref)).low;
> +      unsigned int inc = (mem_ref_offset (old_ref).to_uhwi ()
> +                          - mem_ref_offset (new_ref).to_uhwi ());
>        adjust_ptr_info_misalignment (new_pi, inc);
>      }
>    else
>
> other patches used .to_short_addr (), also your conversion isn't
> correct - previously the subtraction was in infinite precision and
> only the result (possibly) truncated.  Now the subtraction operands
> are truncated - that looks wrong.

Sorry, forgot about .to_short_addr () when doing the merge.  The conversion should be OK with that fixed though, since truncation distributes through subtraction (and truncating first is cheaper).

> +/* Return a widest_int that can be used for bitwise simplifications
> +   from VAL.  */
>
> -static double_int
> -value_to_double_int (prop_value_t val)
> +static widest_int
> +value_to_wide_int (prop_value_t val)
>  {
>    if (val.value && TREE_CODE (val.value) == INTEGER_CST)
> -    return tree_to_double_int (val.value);
> -  else
> -    return double_int_zero;
> +    return wi::to_widest (val.value);
> +
> +  return 0;
>  }
>
> you also get to hope that we optimize all the widest_int
> return-by-value code to elide the implied copying ... (not sure if
> you can do that by adding a not implemented copy / assignment
> constructor - C++ folks?)

Are you worried about a copy from one widest_int to another?  Won't the return-value optimisation stop that?

Or are you worried about the copy from the tree to the widest_int?  In this particular case I suppose we could use:

  wi::to_widest (integer_zero_node)

instead of 0 and return a:

  generic_wide_int <extended_tree <MAX_BITSIZE_MODE_ANY_INT> >

(typedefed :-)).  Is the function really hot enough to justify the uglification though?  We're thinking about these kinds of tweaks now because it's flavour of the month, but I bet post-merge patches will use the more obvious implementation instead.  OTOH maybe this is a natural candidate for C++11 auto...

>      case LT_EXPR:
>      case LE_EXPR:
>        {
> +	widest_int o1val, o2val, o1mask, o2mask;
>  	int minmax, maxmin;
> +
> +	if ((code == GE_EXPR) || (code == GT_EXPR))
> +	  {
> +	    o1val = r2val;
> +	    o1mask = r2mask;
> +	    o2val = r1val;
> +	    o2mask = r1mask;
> +	    code = swap_tree_comparison (code);
> +	  }
> +	else
> +	  {
> +	    o1val = r1val;
> +	    o1mask = r1mask;
> +	    o2val = r2val;
> +	    o2mask = r2mask;
> +	  }
>
> that are pretty expensive copies, no?  Consider using
>
>   const widest_int &o1val = swap ? r2val : r1val;
>
> and so on.  (C++ folks?  Any idea how to avoid the repeated
> conditional init in favor of sth that more resembles the original?)

Not a C++ person, but I can't think of one either.  It probably wouldn't be as readable as the four separate statements though.

> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index 6ea634c..c975a97 100644
> --- a/gcc/tree-ssa-loop-im.c
> +++ b/gcc/tree-ssa-loop-im.c
> @@ -1643,7 +1643,7 @@ mem_refs_may_alias_p (mem_ref_p mem1, mem_ref_p mem2,
>    /* Perform BASE + OFFSET analysis -- if MEM1 and MEM2 are based on
>       the same object and their offset differ in such a way that the
>       locations cannot overlap, then they cannot alias.  */
> -  double_int size1, size2;
> +  widest_int size1, size2;
>    aff_tree off1, off2;
>
> from the names you can know this is offset_int.

I agree that's true in the overlap and get_inner_reference_aff cases, but most of tree-affine seems to be more general than that.  Is it really worth bunging in conversions between offset_int and widest_int to save a few HWIs of stack space?

> @@ -529,15 +526,15 @@ end:
>     difference of two values in TYPE.  */
>
>  static void
> -bounds_add (bounds *bnds, double_int delta, tree type)
> +bounds_add (bounds *bnds, widest_int delta, tree type)
>  {
>    mpz_t mdelta, max;
>
> const widest_int &delta please (please double-check the patches for
> widest-int-by-value passing).
>
> @@ -2592,7 +2587,7 @@ derive_constant_upper_bound_ops (tree type, tree op0,
>
>  static void
>  do_warn_aggressive_loop_optimizations (struct loop *loop,
> -
Re: wide-int, tree-ssa
On Sat, Nov 23, 2013 at 8:23 PM, Mike Stump <mikest...@comcast.net> wrote:
> Richi has asked that we break the wide-int patch up so that the
> individual port and front end maintainers can review their parts
> without having to go through the entire patch.  This patch covers
> the tree-ssa code.
>
> Ok?

Same comments as to tree-affine.c apply to tree-ssa-address.c.

@@ -887,8 +886,8 @@ copy_ref_info (tree new_ref, tree old_ref)
       && (TREE_INT_CST_LOW (TMR_STEP (new_ref)) & align)
     {
-      unsigned int inc = (mem_ref_offset (old_ref)
-                          - mem_ref_offset (new_ref)).low;
+      unsigned int inc = (mem_ref_offset (old_ref).to_uhwi ()
+                          - mem_ref_offset (new_ref).to_uhwi ());
       adjust_ptr_info_misalignment (new_pi, inc);
     }
   else

other patches used .to_short_addr (), also your conversion isn't correct - previously the subtraction was in infinite precision and only the result (possibly) truncated.  Now the subtraction operands are truncated - that looks wrong.

--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -98,6 +98,15 @@ along with GCC; see the file COPYING3.  If not see
    array CONST_VAL[i].VALUE.  That is fed into substitute_and_fold for
    final substitution and folding.

+   This algorithm uses wide-ints at the max precision of the target.
+   This means that, with one uninteresting exception, variables with
+   UNSIGNED types never go to VARYING because the bits above the
+   precision of the type of the variable are always zero.  The
+   uninteresting case is a variable of UNSIGNED type that has the
+   maximum precision of the target.  Such variables can go to VARYING,
+   but this causes no loss of information since these variables will
+   never be extended.
+

I don't understand this.  In CCP an SSA name drops to varying when it's not constant.  How is that related to signedness or precision?!

@@ -511,21 +523,21 @@ set_lattice_value (tree var, prop_value_t new_val)

 static prop_value_t get_value_for_expr (tree, bool);
 static prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, double_int *, double_int *,
-			       tree, double_int, double_int,
-			       tree, double_int, double_int);
+static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
+			       tree, widest_int, widest_int,
+			       tree, widest_int, widest_int);

please don't pass widest_int by value but instead pass it by const reference.  Compared to double_int it is really large (most targets passed double_int in two registers).

+/* Return a widest_int that can be used for bitwise simplifications
+   from VAL.  */

-static double_int
-value_to_double_int (prop_value_t val)
+static widest_int
+value_to_wide_int (prop_value_t val)
 {
   if (val.value && TREE_CODE (val.value) == INTEGER_CST)
-    return tree_to_double_int (val.value);
-  else
-    return double_int_zero;
+    return wi::to_widest (val.value);
+
+  return 0;
 }

you also get to hope that we optimize all the widest_int return-by-value code to elide the implied copying ... (not sure if you can do that by adding a not implemented copy / assignment constructor - C++ folks?)

+  val.lattice_val = val.mask == -1 ? VARYING : CONSTANT;
   if (val.lattice_val == CONSTANT)

you mean this check wrt the toplevel VARYING comment?  It would be bad if that no longer ends up VARYING.  OTOH I don't know whether the current check is correct either - we extend the mask according to the sign of the lattice element.  Thus the wide-int variant looks equivalent.  Note that for unsigned values we have knowledge about the bits outside of the precision - they are zero, so technically unsigneds never go VARYING.

     case LT_EXPR:
     case LE_EXPR:
       {
+	widest_int o1val, o2val, o1mask, o2mask;
 	int minmax, maxmin;
+
+	if ((code == GE_EXPR) || (code == GT_EXPR))
+	  {
+	    o1val = r2val;
+	    o1mask = r2mask;
+	    o2val = r1val;
+	    o2mask = r1mask;
+	    code = swap_tree_comparison (code);
+	  }
+	else
+	  {
+	    o1val = r1val;
+	    o1mask = r1mask;
+	    o2val = r2val;
+	    o2mask = r2mask;
+	  }

that are pretty expensive copies, no?  Consider using

  const widest_int &o1val = swap ? r2val : r1val;

and so on.  (C++ folks?  Any idea how to avoid the repeated conditional init in favor of sth that more resembles the original?)

diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 6ea634c..c975a97 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1643,7 +1643,7 @@ mem_refs_may_alias_p (mem_ref_p mem1, mem_ref_p mem2,
   /* Perform BASE + OFFSET analysis -- if MEM1 and MEM2 are based on the same