On 11/2/21 4:29 PM, David Malcolm wrote:
On Mon, 2021-11-01 at 16:17 -0600, Martin Sebor via Gcc-patches wrote:
Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for come conditional
uses, and level 3 also for uses in equality expressions.

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

For reference, this is:
   https://sourceware.org/bugzilla/show_bug.cgi?id=28521


The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?????? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.


[...snip...]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c5730228821..eb4ecb56dcc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4341,6 +4341,60 @@ annotations.
  Warn about overriding virtual functions that are not marked with the
  @code{override} keyword.
+@item -Wuse-after-free
+@itemx -Wuse-after-free=@var{n}
+@opindex Wuse-after-free
+@opindex Wno-use-after-free
+Warn about uses of pointers to dynamically allocated objects that
have
+been rendered indeterminate by a call to a deallocation function.
+
+@table @gcctabopt
+@item -Wuse-after-free=1
+At level 1 the warning attempts to diagnose only unconditional uses
of
+pointers made indeterminate by a deallocation call.  This includes
+double-@code{free} calls.  Although undefined, uses of indeterminate
+pointers in equality (or inequality) expressions are not diagnosed
at
+this level.
+@item -Wuse-after-free=2
+At level 2, in addition to unconditional uses the warning also
diagnoses
+conditional uses of pointers made indeterminate by a deallocation
call.
+As at level 1, uses in (or inequality) equality expressions are not
+diagnosed.  For example, the second call to @code{free} in the
following
+function is diagnosed at this level:
+@smallexample
+struct A @{ int refcount; void *data; @};
+
+void release (struct A *p)
+@{
+  int refcount = --p->refcount;
+  free (p);
+  if (refcount == 0)
+    free (p->data);   // warning: p may be used after free
+@}
+@end smallexample
+@item -Wuse-after-free=3
+At level 3, the warning also diagnoses uses of indeterminate
pointers in
+equality expressions.  All uses of indeterminate pointers are
undefined
+but equality tests sometimes appear after calls to @code{realloc} as
+an attempt to determine whether the call resulted in relocating the
object
+to a different address.  They are diagnosed at a separate level to
aid
+legacy code gradually transition to safe alternatives.  For example,
+the equality test in the function below is diagnosed at this level:
+@smallexample
+void adjust_pointers (int**, int);
+
+void grow (int **p, int n)
+@{
+  int **q = (int**)realloc (p, n *= 2);
+  if (q == p)
+    return;
+  adjust_pointers ((int**)q, n);
+@}
+@end smallexample
+@end table
+
+@option{-Wuse-after-free=3} is included in @option{-Wall}.

Recapping our chat earlier today, I confess to not being familiar with
this aspect of the C standard, but IIRC you were saying that the
pointer passed in to "realloc" is always "indeterminate" after a
successful call, and indeed I see on:
   https://en.cppreference.com/w/c/memory/realloc
"The original pointer ptr is invalidated and any access to it is
undefined behavior (even if reallocation was in-place)."

Does this really mean that it's undefined to use the original "p" after
an in-place reallocation (as opposed to "*p"?).  I find this
surprising, and I think many of our users would too.

Yes, it's undefined to use the pointer after a successful
reallocation (it's okay to use it after a failed one, although
as discussed in WG14 DR400, even that can be problematic due
to a lack of clarity in the standards).

One rationale for it is to make it possible to implement realloc
with no in-place extension, as a sequence of three calls: malloc,
memcpy, and free.  Another is that using invalidated pointers is
inherently unsafe and the C committee wanted to leave room for
debugging hardware that traps on the use of such pointers
(US7966480 describes one such solution).  Such solutions are
only feasible when all invalid pointers are treated the same.

If that's the case, is there a standards-conforming way for code to
distinguish between a successful in-place vs a successful moving
realloc?  How would a user "transition [their code] to safe
alternatives"?  (the docs should probably spell that out).  i.e. what's
an idiomatic way for the user to write this logic correctly?

There's no portable mechanism to detect whether a realloc call
moved an object.  Strictly conforming programs must avoid it.
The idiomatic way of avoiding it is to store offsets rather
than pointers to reallocated memory.

How is this behavior unsafe, or how could it become unsafe?  If it's
merely a theoretical concern, I think level 2 would be better for -Wall
(or the default).

It's difficult to quantify how unsafe this or that undefined
construct is.  I think it's probably comparable to passing null
pointers and zero size to memcpy (diagnosed by -Wnonnull, also
in -Wall).  Or to passing T* to %p in a call to printf where T
isn't void (diagnosed by -Wformat, in -Wall).  Or to [ab]using
uninitialized variables as a source of entropy.

It's a judgment call what checkers to include at what level of
what option.  I made this choice not just because it's undefined
but also because it could help find mistakes in nearby code (as
in the Binutils case below or ideally more impactful).  Because
it's little known I'm hoping even the benign instances will help
raise awareness of the issue.  But I made it a separate level
to make it easy for users to suppres, and for us to adjust if
necessary.


Have you seen level 3 catch anything that *isn't* a usage of realloc
checking success for in-place resizing vs moving?

The Glibc use in ldconfig.c is a use after free.  But it's just
sloppy coding, not a "real" bug.  It also found one instance in
Binutils (libctf/ctf-open-bfd.c).  That one is also not really
a bug, more like a thinko:

      isymbuf = bfd_elf_get_elf_syms (abfd, symhdr, symcount, 0,
                                      NULL, symtab, NULL);
      free (isymbuf);
      if (isymbuf == NULL)
        {
          bfderrstr = N_("cannot read symbol table");
          goto err_free_sym;
        }

The call to free should follow the if statement.

I expect it to find plenty more benign misues like this in other
code, but what I'm really hoping of course is that it will help
find actual bugs, even if indirectly.


[...snip...]

Hope this is constructive; sorry about my ignorance here.
Dave

Sure.

Martin

Reply via email to