Re: [PATCH] Make strlen range computations more conservative

Richard Biener Mon, 20 Aug 2018 03:07:02 -0700

On Fri, Aug 3, 2018 at 9:19 AM Jeff Law <l...@redhat.com> wrote:
>
> On 07/25/2018 01:23 AM, Richard Biener wrote:
> > On Tue, 24 Jul 2018, Bernd Edlinger wrote:
> >
> >> On 07/24/18 23:46, Jeff Law wrote:
> >>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
> >>>> Hi!
> >>>>
> >>>> This patch makes strlen range computations more conservative.
> >>>>
> >>>> Firstly if there is a visible type cast from type A to B before passing
> >>>> then value to strlen, don't expect the type layout of B to restrict the
> >>>> possible return value range of strlen.
> >>> Why do you think this is the right thing to do?  ie, is there language
> >>> in the standards that makes you think the code as it stands today is
> >>> incorrect from a conformance standpoint?  Is there a significant body of
> >>> code that is affected in an adverse way by the current code?  If so,
> >>> what code?
> >>>
> >>>
> >>
> >> I think if you have an object, of an effective type A say char[100], then
> >> you can cast the address of A to B, say typedef char (*B)[2] for instance
> >> and then to const char *, say for use in strlen.  I may be wrong, but I 
> >> think
> >> that we should at least try not to pick up char[2] from B, but instead
> >> use A for strlen ranges, or leave this range open.  Currently the range
> >> info for strlen is [0..1] in this case, even if we see the type cast
> >> in the generic tree.
> >
> > You raise a valid point - namely that the middle-end allows
> > any object (including storage with a declared type) to change
> > its dynamic type (even of a piece of it).  So unless you can
> > prove that the dynamic type of the thing you are looking at
> > matches your idea of that type you may not derive any string
> > lengths (or ranges) from it.
> >
> > BUT - for the string_constant and c_strlen functions we are,
> > in all cases we return something interesting, able to look
> > at an initializer which then determines that type.  Hopefully.
> > I think the strlen() folding code when it sets SSA ranges
> > now looks at types ...?
> I'm leaning towards a similar conclusion, namely that we can only rely
> on type information for the pointer that actually gets passed to strlen,
> which 99.9% of the time is (char *), potentially with const qualifiers.


It's 100% (char *) because the C standard says arguments are converted
to the argument type.

> It's tempting to look back through the cast to find a cast from a char
> array but I'm more and more concerned that it's not safe unless we can
> walk back to an initializer.
>
> What this might argue is that we need to distinguish between a known
> range and a likely range.  I really dislike doing that again.  We may
> have to see more real world cases where the likely range allows us to
> improve the precision of the sprintf warnings (since that's really the
> goal of improved string length ranges).
>
>
>
> >
> > Consider
> >
> > struct X { int i; char c[4]; int j;};
> > struct Y { char c[16]; };
> >
> > void foo (struct X *p, struct Y *q)
> > {
> >   memcpy (p, q, sizeof (struct Y));
> >   if (strlen ((char *)(struct Y *)p + 4) < 7)
> >     abort ();
> > }
> >
> > here the GIMPLE IL looks like
> >
> >   const char * _1;
> >
> >   <bb 2> [local count: 1073741825]:
> >   _5 = MEM[(char * {ref-all})q_4(D)];
> >   MEM[(char * {ref-all})p_6(D)] = _5;
> >   _1 = p_6(D) + 4;
> >   _2 = __builtin_strlen (_1);
> >
> > and I guess Martin would argue that since p is of type struct X
> > + 4 gets you to c[4] and thus strlen of that cannot be larger
> > than 3.
> But _1 is of type const char * and that's what's passed to strlen.  The
> type of P and Q are irrelevant ISTM.
>
> Jeff
>
>

Re: [PATCH] Make strlen range computations more conservative

Reply via email to