On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <mse...@gmail.com> wrote: > On 10/18/2017 04:48 AM, Richard Biener wrote: >> >> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <mse...@gmail.com> wrote: >>> >>> While testing my latest -Wrestrict changes I noticed a number of >>> opportunities to improve the -Warray-bounds warning. Attached >>> is a patch that implements a solution for the following subset >>> of these: >>> >>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of >>> bounds index into string literal >>> PR tree-optimization/82588 - missing -Warray-bounds on an excessively >>> large index >>> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds >>> inner indices >>> >>> The patch also adds more detail to the -Warray-bounds diagnostics >>> to make it easier to see the cause of the problem. >>> >>> Richard, since the changes touch tree-vrp.c, I look in particular >>> for your comments. >> >> >> + /* Accesses to trailing arrays via pointers may access storage >> + beyond the types array bounds. For such arrays, or for flexible >> + array members as well as for other arrays of an unknown size, >> + replace the upper bound with a more permissive one that assumes >> + the size of the largest object is SSIZE_MAX. */ >> >> I believe handling those cases are somewhat academic, but ... > > > Thanks for the quick review! > > I agree the SSIZE_MAX tests handle corner cases and there are > arguably more important gaps here to plug (e.g., VLAs). Then > again, most bugs tend to lurk in corner cases of one kind or > another and these seemed like a good way for me to come up to > speed on the implementation before tackling those. If you > have suggestions for which to dive into next I'm happy to take > them. > >> + tree eltype = TREE_TYPE (ref); >> + tree eltsize = TYPE_SIZE_UNIT (eltype); >> >> needs to use array_ref_element_size. Note that this size can be >> non-constant in which case you now ICE so you have to check >> this is an INTEGER_CST. > > > Thanks. I did reproduce a few ICEs due to VLAs. I've fixed > the problems and added tests for them. One-dimensional VLAs > are now handled the same way arrays of unknown bound are. > Handling them more intelligently (i.e., taking into account > the ranges their bounds are in) and especially handling > multidimensional VLAs will take some effort. > >> >> + tree maxbound = TYPE_MAX_VALUE (ssizetype); >> >> please don't use ssizetype - sizetype can be of different precision >> than pointers (and thus objects). size_type_node would come >> close (maps to size_t), eventually ptrdiff_type_node is also >> defined by all frontends. > > > Okay, I've changed it to sizetype (it would be nice if there > were a cleaner way of doing it rather than by bit-twiddling.) > > ptrdiff_type would have been my first choice but it's only defined > by the C/C++ front ends and not available in the middle-end, so > the other warnings that consider object sizes deal with ssizetype > (e.g., -Walloc-size-larger-than, -Wstringop- overflow, and > -Wformat-overflow). > > That said, I'm not sure I understand under what conditions > ssizetype is not the signed equivalent of sizetype. Can you > clarify?
I meant to use size_type_node (size_t), not sizetype. But I just checked that ptrdiff_type_node is initialized in build_common_tree_nodes and thus always available. > As an aside, at some point I would like to get away from a type > based limit in all these warnings and instead use one that can > be controlled by an option so that a user can impose a lower limit > on the maximum size of an object and have all size-related warnings > (and perhaps even optimizations) enforce it and benefit from it. You could add a --param that is initialized from ptrdiff_type_node. >> + up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound, >> eltsize); >> + >> >> int_const_binop if you insist on using trees... > > > Sure. (I think offset_int would be more convenient but the rest > of the function uses trees so I just stuck to those to avoid > converting things back and forth or disrupting more of the code > than I had to.) Understood. >> >> + tree arg = TREE_OPERAND (ref, 0); >> + tree_code code = TREE_CODE (arg); >> + if (code == COMPONENT_REF) >> + { >> + HOST_WIDE_INT off; >> + if (tree base = get_addr_base_and_unit_offset (ref, &off)) >> + up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1, >> + TYPE_SIZE_UNIT (TREE_TYPE (base))); >> + else >> + return; >> >> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be >> false). >> simply not subtracting anyhing instead of returning would be >> conservatively >> correct, no? Likewise subtracting the offset of the array for all >> "previous" >> variably indexed components with assuming the lowest value for the index. >> But as above I think compensating for the offset of the array within the >> object >> is academic ... ;) > > > I was going to say yes (it gives up) but on second thought I don't > think it does. Only the major index can be unbounded and the code > does consider the size of the sub-array when checking the major > index. So, IIUC, I think this works correctly as is (*). What > doesn't work is VLAs but those are a separate problem. Let me > know if I misunderstood your question. get_addr_base_and_unit_offset will return NULL if there's any variable component in 'ref'. So as written it seems to be dead code (you want to pass 'arg'?) I was asking you to remove the 'else return' because w/o subtracting the upper bound is just more conservative. > >> + else if (code == STRING_CST) >> + up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg)); >> >> that one is more interesting -- why's the TYPE_DOMAIN of the STRING_CST >> lacking >> a max value? Iff we use build_string_literal it should have the proper >> type. > > > Good question! STRING_CST does have a domain. The problem is > that array_at_struct_end_p() returns true for STRING_CST. I've > added the handling to the function and removed the block above > from the latest patch. Can you split out the STRING_CST handling and commit that separately (split the testcase)? That part looks ok. Thanks, Richard. > Martin > > [*] This is diagnosed: > > struct C { char d[4]; }; > struct B { struct C c; }; > struct A { struct B b[7]; }; > > int f (struct A a, unsigned i, unsigned k) > { > if (i < 7) i = 7; > if (k < 4) k = 4; > > return a.b[i].c.d[k]; > } > > warning: array subscript 4 is above array bounds of ‘char[4]’ > [-Warray-bounds] > return a.b[i].c.d[k]; > ~~~~~~~~~~^~~ > warning: array subscript 7 is above array bounds of ‘struct B[7]’ > [-Warray-bounds] > return a.b[i].c.d[k]; > ~~~^~~ >