Re: Using source-level annotations to help GCC detect buffer overflows

David Brown Thu, 01 Jul 2021 01:19:09 -0700


Thanks for the reply here.  I've snipped a bit to save space.

On 30/06/2021 19:12, Martin Sebor wrote:

On 6/29/21 12:31 PM, David Brown wrote:

On 29/06/2021 17:50, Martin Sebor wrote:

On 6/29/21 6:27 AM, David Brown wrote:

On 28/06/2021 21:06, Martin Sebor via Gcc wrote:

I wrote an article for the Red Hat Developer blog about how
to annotate code to get the most out of GCC's access checking
warnings like -Warray-bounds, -Wformat-overflow, and
-Wstringop-overflow.  The article published last week:
https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows


Could these attributes not be attached to the arguments when the
function is called, or the parameters when the function is expanded?
After all, in cases such as the "access" attribute it is not the
function as such that has the access hints, it is the parameters of the
function.

(I'm talking here based on absolutely no knowledge of how this is
implemented, but it's always possible that a different view, unbiased by
knowing the facts, can inspire new ideas.)


Attaching these attributes to function parameters is an interesting
idea that might be worth exploring.  We've talked about letting
attribute access apply to variables for other reasons (detecting
attempts to modify immutable objects, as I mention in the article),
so your suggestion would be in line with that.  Associating two
variables that aren't parameters might be tricky.

It has always seemed to me that some of these attributes are about the
parameters, rather than the functions.  It would make more sense when
using them if the attributes worked as qualifiers for the parameter
(vaguely like "restrict") and were written before the parameter itself,
rather than as a function attribute with a parameter index.  Of course,
that gets messy when you have an attribute that ties two parameters
together (readonly with a size, for example).
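
For reference, a minimal sketch of the current function-level form, where
the attribute refers to parameters by 1-based position.  The function
count_zeros and the macro ATTR_ACCESS are hypothetical names made up for
this illustration; the guard only exists so the snippet still compiles
with compilers that lack the attribute (it needs GCC 10 or later to have
any effect).

```c
#include <stddef.h>

/* Hypothetical helper macro: expands to the GCC "access" attribute where
   it is available, and to nothing elsewhere. */
#if defined(__has_attribute)
#  if __has_attribute(access)
#    define ATTR_ACCESS(mode, ptr, size) \
        __attribute__((access(mode, ptr, size)))
#  endif
#endif
#ifndef ATTR_ACCESS
#  define ATTR_ACCESS(mode, ptr, size)
#endif

/* Parameter 1 is only read, and parameter 2 gives its size in elements.
   The annotation uses positional indices; nothing is written next to
   "buf" itself. */
ATTR_ACCESS(read_only, 1, 2)
size_t count_zeros(const unsigned char *buf, size_t len);

size_t count_zeros(const unsigned char *buf, size_t len)
{
    size_t zeros = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```

A qualifier-style spelling would put the annotation on "buf" directly,
but the link to "len" would still have to be expressed somehow.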


Certainly since first reading about the "access" attributes, I have been
considering adding them to my current project.  I have also been mulling
around in my head possibilities of making variadic templates in C++ that
add access attributes in the right places for some kinds of pointers -
but now that I know the attributes will get dropped for inline
functions, and such templates would involve inline functions, there is
little point.  (Maybe I will still figure a neat way to do this for
external functions - it just won't be useful in as many places.)


Unfortunately, with extensive inlining and templates, C++ support
for these attributes is less robust than it ideally would be.
Improving it is on my to do list.

I think with modern coding styles and compiler usage, you have to assume
that /everything/ is likely to get inlined somewhere.  Lots of C++
coding is done with header-only libraries, full of templates and inline
functions.  Even variables can be templates or inline variables with
current standards.  And no matter how the code is written, once you use
LTO then pretty much anything can be inlined back and forth, or
partially inlined, or partially outlined (is that the right term?), or
cloned.  The concept of "function" in C or C++ that corresponds to an
externally visible assembly label, a sequence of assembly instructions
following a strict ABI and a "return", is fast disappearing.


I don't foresee you or the other gcc developers getting bored anytime soon!

Whether an attribute has an effect depends on the compilation stage
where it's handled.  warn_unused_result is handled very early (well
before inlining) so it always has the expected effect.  Attribute
nonnull is handled both early (to catch the simple cases) and also
later, after inlining, to benefit from some flow analysis, so its
effect is lost if the function it attaches to is inlined.  Attribute
access is handled very late and so it suffers from this problem
even more.


I suppose some attributes are not needed for inline functions, since the
compiler has the full function definition and can figure some things out
itself.  That would apply to "pure" and "const" functions, I expect.


I was going to agree, but then I tested it and found out that const
(and most likely pure) do actually make a difference on inline
functions.  For example in the test case below the inequality is
folded to false.

int f (int);

__attribute__ ((const))
int g (int i) { return f (i); }

void h (int i)
{
   if (g (i) != g (i))
     __builtin_abort ();
}

On the other hand, the equality below is not folded unless f() is
also declared with attribute malloc:

void* f (void);

static int a[1];

__attribute__ ((malloc))
void* g (void) { return f (); }

void h (void)
{
   if (g () == a)
     __builtin_abort ();
}

With heavy inlining (e.g., with LTO) whether a function attribute
will have an effect or not in a given caller is anyone's guess :(

And if you want a parameter to be non-null, it's possible to do a check
inside the function:

extern void __attribute__((error("Nonnull check failed")))
        nonnull_check_failed(void);

#define nonnull(x) \
     do { \
         if (__builtin_constant_p(!(x))) { \
             if (!(x)) nonnull_check_failed(); \
         } \
         if (!(x)) __builtin_unreachable(); \
     } while (0)


inline int foo(int *p) {
    nonnull(p);
    (*p)++;
    return *p;
}


(The "__builtin_unreachable()" line could also be a call to an error
handler, or missing entirely, according to need.)


If you try to call "foo(0)" and the compiler can see at compile time
that the parameter is null, you'll get a compile-time error.  I've used
that kind of check in my code, but it's a little uglier than
__attribute__((nonnull)) !


This is a possible workaround.  GCC should be able to do the same
thing for you (that's the __builtin_warning idea).

__builtin_warning is, AFAIK, not documented in the gcc user manual.  But
if there were builtins to give a warning or an error that could be used
instead of my usual method of calling a non-existent function with
"error" attribute, that would be neater.

(I sometimes use macros like this to give the compiler extra information
that it can use for optimisation or other checks.  I have exactly the
same macro, with the name "assume", that is useful in some code as a
check, an optimisation hint, and documentation to the programmer.)
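
As a rough illustration of the optimisation-hint use, here is a cut-down
variant that keeps only the __builtin_unreachable() part of the nonnull
macro above and drops the compile-time error check.  The function
table_lookup is a made-up example, not code from a real project.

```c
/* Reduced "assume": only the optimisation hint, no compile-time error. */
#define assume(x) \
    do { \
        if (!(x)) __builtin_unreachable(); \
    } while (0)

/* Hypothetical example: the caller promises that idx is within the
   8-entry table.  The macro documents that promise and lets the
   compiler rely on it. */
static inline int table_lookup(const int table[8], unsigned idx)
{
    assume(idx < 8);
    return table[idx];
}
```

If the assumption is violated at run time the behaviour is undefined, so
this form is a promise to the compiler rather than a check.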

The new attribute malloc (to associate allocators with deallocators)
is also handled very late but it deals with the same problem by
disabling inlining.  This was done to avoid false positives, not
to prevent false negatives, but it works for both.  Disabling
inlining is obviously suboptimal and wouldn't be appropriate for
simple accessor functions but for memory allocation it seems like
an acceptable tradeoff.


I've sometimes had allocator functions that are so simple that inlining
is appropriate (on dedicated embedded systems it is not unusual to need
to allocate some memory early on in the code, but never need to free it,
leading to minimal allocation functions).  But the cost of a function
call would not be noticeable.
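
A minimal sketch of the kind of allocate-only function meant here,
assuming a fixed static pool.  The pool size and all names are invented
for the example, and the classic "malloc" and "alloc_size" attributes are
shown only as the sort of annotation such a function might carry.

```c
#include <stddef.h>

#define POOL_SIZE 1024u

/* Hypothetical fixed pool; memory is handed out once and never freed. */
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;

__attribute__((malloc, alloc_size(1)))
static inline void *early_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    /* Round the request up to a multiple of the strictest alignment. */
    size_t rounded = (size + align - 1) & ~(align - 1);

    /* Reject requests that wrapped around or do not fit in the pool. */
    if (rounded < size || rounded > POOL_SIZE - pool_used)
        return NULL;

    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}
```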


The inlining problem is not unique to attributes that affect
warnings.  It impacts all function (and function type) attributes,
including those that affect optimization.  Those that specifically
change optimization options disable inlining to avoid meaningless
mismatches between options controlling the codegen of the caller
and those intended to control the codegen for the callee.


It's obvious that some attributes don't play well with inlining, such as
"section" (unless inlined in a function with the same section
attribute), but it looks like there is a lot of detail that is missing
from the manual pages here.  And some of these effects are
counter-intuitive and unhelpful.


Agreed.  I hope my articles will be helpful here (it would be nice
if GCC had a blog where they could be posted too).

Yes, that would be good.  I can't speak for anyone else, but I find
articles like yours helpful.

The manual
should be updated to mention these basic limitations; the challenge
is the inconsistency between how they're applied.

I would say it is better to note in the manual that particular
attributes may be applied inconsistently, than to let users assume that
they are always applied.  People can then be explicit in their code if
they need consistency, such as using the "noinline" attribute to ensure
that a function is not inlined and its other attributes are applied.
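
A sketch of what being explicit could look like, using a made-up function
copy_text.  Which diagnostics actually survive in a given GCC version is
not verified here; the point is only that "noinline" keeps the call as a
real out-of-line call with the attribute still attached.

```c
#include <stddef.h>

/* Hypothetical example: "noinline" keeps this function out of line, so
   the "nonnull" annotation stays attached to an actual call. */
__attribute__((noinline, nonnull(1, 2)))
void copy_text(char *dst, const char *src, size_t dst_size)
{
    size_t i = 0;

    if (dst_size == 0)
        return;

    /* Copy at most dst_size - 1 characters, then terminate. */
    while (i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
}
```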


For example, it is very occasionally useful to do arithmetic operations on
signed types with wrapping semantics rather than the usual "overflow
doesn't happen" semantics that lets gcc produce more efficient code.  A
neat and convenient way to write that in C++ would be to make an enum
class for wrapping ints:

enum class WInt : int {};

__attribute__((optimize("-fwrapv")))
WInt operator + (WInt x, WInt y) {
     return (WInt) ((int) x + (int) y);
}

__attribute__((optimize("-fwrapv")))
WInt operator - (WInt x, WInt y) {
     return (WInt) ((int) x - (int) y);
}

// etc.


Simple, clear, safe (you can't mix "int" and "WInt" by mistake) and
efficient - one might think.  But it turns out this is not the case -
using these operators from a function that does not also have "-fwrapv"
in effect will lead to function calls.

I'm glad I found out now, and not in a situation where inlining was
important.  But I think it would be a good thing to mention this in the
documentation.  (It would be even better to remove the restriction on
inlining, but I expect that will take more time!)


I agree (on both counts).  Raising bugs (ideally with test cases
showing counterintuitive effects) would help.

Martin

I've filed this as a bug at
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101279>.


David
