On Tue, Nov 7, 2017 at 4:23 AM, Martin Sebor <mse...@gmail.com> wrote: > On 11/06/2017 11:41 AM, Jeff Law wrote: >> >> On 10/29/2017 10:15 AM, Martin Sebor wrote: >>> >>> Ping -- please see my reply below. >>> >>> On 10/20/2017 09:57 AM, Richard Biener wrote: >>>>>>>> >>>>>>>> get_addr_base_and_unit_offset will return NULL if there's any >>>>> >>>>> variable >>>>>>>> >>>>>>>> component in 'ref'. So as written it seems to be dead code (you >>>>>>>> want to pass 'arg'?) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Sorry, I'm not sure I understand what you mean. What do you think >>>>>>> is dead code? The call to get_addr_base_and_unit_offset() is also >>>>>>> made for an array of unspecified bound (up_bound is null) and for >>>>>>> an array at the end of a struct. For those the function returns >>>>>>> non-null, and for the others (arrays of runtime bound) it returns >>>>>>> null. (I passed arg instead of ref but I see no difference in >>>>>>> my tests.) >>>>>> >>>>>> >>>>>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') >>>>> >>>>> it will >>>>>> >>>>>> return the offset of 'c'. If you pass a.b[j].c it will still return >>>>> >>>>> NULL. >>>>>> >>>>>> You could use get_ref_base_and_extent which will return the offset >>>>>> of a.b[0].c in this case and sets max_size != size - but you are only >>>>>> interested in offset. The disadvantage of get_ref_base_and_extent >>>>>> is it returns offset in bits thus if the offset is too large for a >>>>> >>>>> HWI >>>>>> >>>>>> you'll instead get offset == 0 and max_size == -1. >>>>>> >>>>>> Thus I'm saying this is dead code for variable array accesses >>>>>> (even for the array you are warning about). Yes, for constant index >>>>>> and at-struct-end you'll get sth, but the warning is in VRP because >>>>>> of variable indexes. >>>>>> >>>>>> So I suggest to pass 'arg' and use get_ref_base_and_extent >>>>>> for some extra precision (and possible lossage for very very large >>>>>> structures). >>>>> >>>>> >>>>> Computing bit offsets defeats the out-of-bounds flexible array >>>>> index detection because the computation overflows (the function >>>>> sets the offset to zero). >>>> >>>> >>>> It shouldn't if you pass arg rather than ref. >>> >>> >>> I suspect you missed my reply in IRC where I confirmed that this >>> approach doesn't work for the reason you yourself mentioned above: >>> >>>>>> The disadvantage of get_ref_base_and_extent >>>>>> is it returns offset in bits thus if the offset is too large >>>>>> for a HWI you'll instead get offset == 0 and max_size == -1. >>> >>> >>> This means that using the function you suggest would either prevent >>> detecting the out-of-bounds indices that overflow due to the bit >>> offset, thus largely defeating the purpose of the patch (to detect >>> excessively large indices), or not printing the value of the out-of >>> bounds index in the warning in this case, which would in turn mean >>> further changes to the rest of the logic. >> >> I think Richi's point is that it's more important (in his opinion) to >> handle the varying objects within an access like a.b.c[i] than to handle >> cases that would overflow when converted to bits. Thus he'd prefer to >> use get_ref_base_and_extent over get_addr_base_and_unit_offset. >> >> I wonder if a hybrid approach here would work. >> >> ie, use get_ref_base_and_extent per Richi's request. When that returns >> an max_size of -1, then call get_addr_base_and_unit_offset and if it >> returns that the offset is huge, then warn without any deeper analysis. >> >> Yea, this could trip a false positive if someone makes an array that is >> half the address space (give or take), but that's probably not at all >> common. Whereas the additional cases handled by get_ref_base_and_extent >> are perhaps more useful. >> >> Thoughts? > > > I'm always open to suggestions for improvements. I just haven't > been able to construct a test case to demonstrate the improvement > gained by using get_ref_base_and_extent. The only test cases I've > come up with show false positives. It could very well be because > of my lack of experience with the APIs so if there is a trade-off > here that could be compensated for by the approach you suggest I'm > happy to incorporate it into the patch. I just need a test case > to verify that the solution works as intended.
The difficulty with a testcase like struct { struct A { int b[1]; } a[5]; } x; x.a[i].b[j] is that b is not considered an array at struct end since one of my recent changes to array_at_struct_end (basically it disallows having a flex array as the last member of an array). It would still stand for non-array components with variable offset but you can't create C testcases for that. So yes, for the specific case within the array_at_struct_end_p condition get_addr_base_and_unit_offset is enough. IIRC the conditon was a bit more than just get_addr_base_and_unit_offset. up_bound != INTEGER_CST for example. So make the above void foo (int n, int i) { struct { struct A { int b[n]; } a[5]; } x; return x.a[i].b[PTRDIFF_MAX/2]; } with appropriately adjusted constant. Does that give you the testcase you want? As of "it works, catches corner-cases, ..." - yes, it does, but it adds code that needs to be maintained, may contain bugs, is executed even for valid code. Richard. > Martin