On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <mse...@gmail.com> wrote:
> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>
>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <mse...@gmail.com> wrote:
>>>
>>> While testing my latest -Wrestrict changes I noticed a number of
>>> opportunities to improve the -Warray-bounds warning.  Attached
>>> is a patch that implements a solution for the following subset
>>> of these:
>>>
>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>   bounds index into string literal
>>> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>>>   large index
>>> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>>>   inner indices
>>>
>>> The patch also adds more detail to the -Warray-bounds diagnostics
>>> to make it easier to see the cause of the problem.
>>>
>>> Richard, since the changes touch tree-vrp.c, I look in particular
>>> for your comments.
>>
>>
>> +      /* Accesses to trailing arrays via pointers may access storage
>> +        beyond the types array bounds.  For such arrays, or for flexible
>> +        array members as well as for other arrays of an unknown size,
>> +        replace the upper bound with a more permissive one that assumes
>> +        the size of the largest object is SSIZE_MAX.  */
>>
>> I believe handling those cases are somewhat academic, but ...
>
>
> Thanks for the quick review!
>
> I agree the SSIZE_MAX tests handle corner cases and there are
> arguably more important gaps here to plug (e.g., VLAs).  Then
> again, most bugs tend to lurk in corner cases of one kind or
> another and these seemed like a good way for me to come up to
> speed on the implementation before tackling those.  If you
> have suggestions for which to dive into next I'm happy to take
> them.
>
>> +      tree eltype = TREE_TYPE (ref);
>> +      tree eltsize = TYPE_SIZE_UNIT (eltype);
>>
>> needs to use array_ref_element_size.  Note that this size can be
>> non-constant in which case you now ICE so you have to check
>> this is an INTEGER_CST.
>
>
> Thanks.  I did reproduce a few ICEs due to VLAs.  I've fixed
> the problems and added tests for them.  One-dimensional VLAs
> are now handled the same way arrays of unknown bound are.
> Handling them more intelligently (i.e., taking into account
> the ranges their bounds are in) and especially handling
> multidimensional VLAs will take some effort.
>
>>
>> +      tree maxbound = TYPE_MAX_VALUE (ssizetype);
>>
>> please don't use ssizetype - sizetype can be of different precision
>> than pointers (and thus objects).  size_type_node would come
>> close (maps to size_t), eventually ptrdiff_type_node is also
>> defined by all frontends.
>
>
> Okay, I've changed it to sizetype (it would be nice if there
> were a cleaner way of doing it rather than by bit-twiddling.)
>
> ptrdiff_type would have been my first choice but it's only defined
> by the C/C++ front ends and not available in the middle-end, so
> the other warnings that consider object sizes deal with ssizetype
> (e.g., -Walloc-size-larger-than, -Wstringop- overflow, and
> -Wformat-overflow).
>
> That said, I'm not sure I understand under what conditions
> ssizetype is not the signed equivalent of sizetype.  Can you
> clarify?

I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.

> As an aside, at some point I would like to get away from a type
> based limit in all these warnings and instead use one that can
> be controlled by an option so that a user can impose a lower limit
> on the maximum size of an object and have all size-related warnings
> (and perhaps even optimizations) enforce it and benefit from it.

You could add a --param that is initialized from ptrdiff_type_node.

>> +      up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound,
>> eltsize);
>> +
>>
>> int_const_binop if you insist on using trees...
>
>
> Sure.  (I think offset_int would be more convenient but the rest
> of the function uses trees so I just stuck to those to avoid
> converting things back and forth or disrupting more of the code
> than I had to.)

Understood.

>>
>> +      tree arg = TREE_OPERAND (ref, 0);
>> +      tree_code code = TREE_CODE (arg);
>> +      if (code == COMPONENT_REF)
>> +       {
>> +         HOST_WIDE_INT off;
>> +         if (tree base = get_addr_base_and_unit_offset (ref, &off))
>> +           up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
>> +                                      TYPE_SIZE_UNIT (TREE_TYPE (base)));
>> +         else
>> +           return;
>>
>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
>> false).
>> simply not subtracting anyhing instead of returning would be
>> conservatively
>> correct, no?  Likewise subtracting the offset of the array for all
>> "previous"
>> variably indexed components with assuming the lowest value for the index.
>> But as above I think compensating for the offset of the array within the
>> object
>> is academic ... ;)
>
>
> I was going to say yes (it gives up) but on second thought I don't
> think it does.  Only the major index can be unbounded and the code
> does consider the size of the sub-array when checking the major
> index.  So, IIUC, I think this works correctly as is (*).  What
> doesn't work is VLAs but those are a separate problem.  Let me
> know if I misunderstood your question.

get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)

I was asking you to remove the 'else return' because w/o subtracting
the upper bound is just more conservative.

>
>> +      else if (code == STRING_CST)
>> +       up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg));
>>
>> that one is more interesting -- why's the TYPE_DOMAIN of the STRING_CST
>> lacking
>> a max value?  Iff we use build_string_literal it should have the proper
>> type.
>
>
> Good question!  STRING_CST does have a domain.  The problem is
> that array_at_struct_end_p() returns true for STRING_CST.  I've
> added the handling to the function and removed the block above
> from the latest patch.

Can you split out the STRING_CST handling and commit that separately
(split the testcase)?  That part looks ok.

Thanks,
Richard.

> Martin
>
> [*] This is diagnosed:
>
>   struct C { char d[4]; };
>   struct B { struct C c; };
>   struct A { struct B b[7]; };
>
>   int f (struct A a, unsigned i, unsigned k)
>   {
>     if (i < 7) i = 7;
>     if (k < 4) k = 4;
>
>     return a.b[i].c.d[k];
>   }
>
>   warning: array subscript 4 is above array bounds of ‘char[4]’
> [-Warray-bounds]
>      return a.b[i].c.d[k];
>             ~~~~~~~~~~^~~
>   warning: array subscript 7 is above array bounds of ‘struct B[7]’
> [-Warray-bounds]
>      return a.b[i].c.d[k];
>             ~~~^~~
>

Reply via email to