On Fri, Nov 11, 2016 at 4:56 PM, Martin Sebor <[email protected]> wrote:
> Thanks for the review and comments!
First of all sorry for the late response.
>>
>> @@ -158,14 +170,149 @@ compute_object_offset (const_tree expr, const_tree
>> var)
>> return size_binop (code, base, off);
>> }
>>
>> +static bool
>> +operand_unsigned_p (tree op)
>> +{
>> + if (TREE_CODE (op) == SSA_NAME)
>>
>> new functions need a comment. But maybe you want to use
>> tree_expr_nonnegative_p
>> to also allow signed but known positive ones?
>
>
> Let me add a comment.
>
> operand_unsigned_p returns true if the type of the original offset
> before conversion to sizetype is unsigned. tree_expr_nonnegative_p
> returns true if the argument's type is unsigned, which is always
> the case here.
Sure - but then you maybe instead want to check for op being in
range [0, max-of-signed-type-of-op] instead? So similar to
expr_not_equal_to add a expr_in_range helper?
Your function returns true for sizetype vars even if it might be
effectively signed, like for
sizetype i_1 = -4;
i_2 = i_1 + 1;
operand_unsigned_p (i) returns true. I suppose you may have
meant
+static bool
+operand_unsigned_p (tree op)
+{
+ if (TREE_CODE (op) == SSA_NAME)
+ {
+ gimple *def = SSA_NAME_DEF_STMT (op);
+ if (is_gimple_assign (def))
+ {
+ tree_code code = gimple_assign_rhs_code (def);
+ if (code == NOP_EXPR
&& TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION
(TREE_TYPE (gimple_assign_rhs1 (def))))
return tree_expr_nonnegative_p (gimple_assign_rhs1 (def)));
+ }
+ }
+
+ return false;
+}
? because only if you do see a cast and that cast is widening from an
nonnegative number
the final one will be unsigned (if interpreted as signed number).
>>
>> +/* Fill the 2-element OFFRANGE array with the range of values OFF
>> + is known to be in. Postcodition: OFFRANGE[0] <= OFFRANGE[1]. */
>> +
>> +static bool
>> +get_offset_range (tree off, HOST_WIDE_INT offrange[2])
>> +{
>> + STRIP_NOPS (off);
>>
>> why strip nops (even sign changes!) here?
>
>
> That might be a leftover from an earlier/failed attempt to simplify
> things that I forgot to remove. Let me do that in a followup patch.
> Unless I misunderstand your comment there are no sign changes (AFAIK)
> because the offset is always represented as sizetype.
>
>> Why below convert things
>> via to_uhwi when offrange is of type signed HOST_WIDE_INT[2]?
>
>
> The offset is always represented as sizetype but the code treats
> it as signed because in reality it can be negative. That said,
> I don't find dealing with ranges very intuitive so I could very
> well be missing something and there may be a better way to code
> this. I welcome suggestions.
You might want to use offset_ints here (see mem_ref_offset for example)
>>
>> + gimple *def = SSA_NAME_DEF_STMT (off);
>> + if (is_gimple_assign (def))
>> + {
>> + tree_code code = gimple_assign_rhs_code (def);
>> + if (code == PLUS_EXPR)
>> + {
>> + /* Handle offset in the form VAR + CST where VAR's type
>> + is unsigned so the offset must be the greater of
>> + OFFRANGE[0] and CST. This assumes the PLUS_EXPR
>> + is in a canonical form with CST second. */
>> + tree rhs2 = gimple_assign_rhs2 (def);
>>
>> err, what? What about overflow? Aren't you just trying to decompose
>> 'off' into a variable and a constant part here and somehow extracting a
>> range for the variable part? So why not just do that?
>
>
> Sorry, what about overflow?
>
> The purpose of this code is to handle cases of the form
>
> & PTR [range (MIN, MAX)] + CST
what if MAX + CST overflows?
Note pointer plus int is a POINTER_PLUS_EXPR, not a PLUS_EXPR.
Only for a POINTER_PLUS_EXPR you might argue that overflow is
undefined.
> where CST is unsigned implying that the lower bound of the offset
> is the greater of CST and MIN. For instance, in the following it
> determines that bos(p, 0) is 4 (and if the 3 were greater than 7
> and overflowed the addition the result would be zero). I'm not
> sure I understand what you suggest I do differently to make this
> work.
>
> char d[7];
>
> #define bos(p, t) __builtin_object_size (p, t)
>
> long f (unsigned i)
> {
> if (2 < i) i = 2;
>
> char *p = &d[i] + 3;
>
> return bos (p, 0);
> }
I'm sure that doesn't work as you match for PLUS_EXPR.
>>
>>
>> + else if (range_type == VR_ANTI_RANGE)
>> + {
>> + offrange[0] = max.to_uhwi () + 1;
>> + offrange[1] = min.to_uhwi () - 1;
>> + return true;
>> + }
>>
>> first of all, how do you know it fits uhwi? Second, from ~[5, 9] you get
>> [10, 4] !? That looks bogus (and contrary to the function comment
>> postcondition)
>
>
> I admit I have some trouble working with anti-ranges. It's also
> difficult to fully exercise them in this pass because it only runs
> after EVRP but not after VRP1 (except with -g), so only limited
> range information is available. (I'm hoping to eventually change
> it but moving the passes broke a test in a way that seemed too
> complex to fix for this project).
Maybe simply ignore VR_ANTI_RANGEs for now then?
> The code above is based on the observation that an anti-range is
> often used to represent the full subrange of a narrower signed type
> like signed char (as ~[128, -129]). I haven't been able to create
> an anti-range like ~[5, 9]. When/how would a range like that come
> about (so I can test it and implement the above correctly)?
if (a < 4)
if (a > 8)
b = a;
then b should have ~[5, 9]
Note a trick VRP uses internally is to treat an anti-range as
the union of two ranges. ~[X, Y] == [MIN, X-1] u [Y+1, MAX].
That essentially removes anti-range handling from VRPs
propagation - not sure if it would help your case.
>>
>> + else if (range_type == VR_VARYING)
>> + {
>> + gimple *def = SSA_NAME_DEF_STMT (off);
>> + if (is_gimple_assign (def))
>> + {
>> + tree_code code = gimple_assign_rhs_code (def);
>> + if (code == NOP_EXPR)
>> + {
>>
>> please trust range-info instead of doing your own little VRP here.
>> VR_VARYING -> return false.
>
>
> I would prefer to rely on the range information and not have to work
> around it like I do here but, unfortunately, it doesn't always appear
> to be available. For example, in the following test case:
>
> char d[130];
>
> #define bos(p, t) __builtin_object_size (p, t)
>
> void f (void*);
>
> void g (signed char i)
> {
> char *p = &d [i] + 128;
> f(__builtin___memset_chk (p, '*', 2, bos (p, 0))); // okay
>
> f(__builtin___memset_chk (p, '*', 3, bos (p, 0))); // overflow
> }
>
> the range type is VR_VARYING but the code makes it possible to diagnose
> the overflow in the second call to memset.
>
> Maybe the poor range info i a consequence of the pass only benefiting
> from EVRP and not VRP?
The range of 'p' is indeed not known (we only represent integer bound ranges).
You seem to want the range of p - &d[0] here, something that is not present
in the IL.
>>
>> stopping review here noting that other parts of the compiler
>> split constant parts from variable parts and it looks like this is
>> what you want to do here? That is, enhance
>>
>> static vec<unsigned HOST_WIDE_INT> object_sizes[4];
>>
>> and cache a SSA-NAME, constant offset pair in addition? Or just a range
>> (range of that SSA name + offset), if that's good enough -- wait, that's
>> what you get from get_range_info!
>
>
> I agree that storing the offsets could be an enhancement to consider.
> The patch mentions it in the FIXME part of the following comment:
>
> /* Bitmaps of variables whose object sizes have been computed without
> regard to the (non-constant) offset into them. A bit in each bitmap
> is valid only if the corresponding bit in the COMPUTED bitmap is
> non-zero (otherwise it's zero).
> FIXME: Change this to a vector of offset ranges to make it possible
> to handle cases like &A[I] + J where both I and J are non-const and
> bounded by some range. */
> static bitmap nonconst_offsets[4];
>
> It's just something I haven't had time to work on yet and with the
> close of stage 1 approaching I wanted to put out this version for
> review. Do you view this enhancement as prerequisite for approving
> the patch or is it something that you'd be fine with adding later?
I find the patch adds quite some ad-hoc ugliness to a pass that is
already complex and nearly impossible to understand.
>>
>> So I'm not sure where the whole complication in the patch comes from...
>>
>> Possibly from the fact that VRP on pointers is limited and thus &a[i] + 5
>> doesn't get you a range for i + 5?
>
>
> get_range_info doesn't accept pointers at all (perhaps that's what
> you meant by "VRP on pointers is limited"). In Gimple, &a[i] + 5
> is represented as just that (i.e., _1 = &a[i_3]; p_6 = _1 + 5;) and
> so to get the right answer for bos(&a[i] + 5) the patch jumps though
> some hoops. But again, I could very well be missing something that's
> obvious to you so if you think that's the case I'd be happy to simplify
> the code if you point me in the right direction.
Yes, get_range_info is limited. IMHO of you want to enhance object-size
to cover ranges by implicitely working on a different IL then it probably
should be re-written with that in mind. The EVRP pass provides a
good template for how to re-use the VRP machinery and decomposing
the lattice a bit further into BASE + range shouldn't be difficult.
I'm leaving it to others if we have to have this improvement for GCC 7
(bos has its own share of know issues besides missing features).
Richard.
>
> Martin