https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59856

--- Comment #2 from Josh Triplett <josh at joshtriplett dot org> ---
(In reply to Tom Tromey from comment #1)
> Created attachment 36815 [details]
> simple version in python
> 
> This implements the basics of the checker using the gcc python plugin.

Awesome!

> A few notes on this:
> 
> There didn't seem to be a great way to insert __context__ into the IR.
> I ended up making a const function with a special attribute.  I used a
> function to
> get it into the IR, const so that it would be optimized away later, and
> an attribute so we could recognize it.

That sounds completely sensible.

> The attribute syntax in the test case doesn't work with gcc.  The attributes
> on the function definitions can't appear at the end.

Placing attributes at the end of the function definition works with Sparse, and
as far as I can tell with GCC as well.  The Linux kernel uses trailing
attributes on functions (both prototypes and definitions) fairly frequently,
including with the context attribute.

> I didn't handle the version of __context__ or the attribute that takes
> an explicit lock.  For __context__ this can be done (varargs function maybe);
> but for the attribute I am not so sure, because I don't think the
> function parameters are in scope when the attribute is parsed.  (Not sure,
> maybe there's some way.)

As mentioned, Sparse doesn't currently pay attention to that argument either,
for much the same reason.  As a result, users of that argument currently don't
always pass something sensible.  I'd love to see a solution that makes it
possible to detect things like "foo_lock(&foo); foo_unlock(&foo2);", but the
first pass can just ignore those arguments (though it needs to parse and ignore
them for compatibility).

In general, it seems useful to have the ability to reference argument names in
attributes on a function.  For instance, imagine using argument names rather
than indexes for the printf attribute or another attribute like it.

The context attribute, though, wants something more complex than that: the
ability to provide expressions based on those arguments, such as arg->lock. 
That does seem like a tall order.

> The upstream sparse validation/context.c has a bug where _ca is declared
> returning void (though this is fixed in the version attached here).
> 
> One further issue is that I chose to implement this relatively early in
> the pipeline.  I did it after gimplification and after the cfg is built
> (these both make it simpler to implement); but before most optimizations
> are done.  This is the uninitialized warning situation again, I think,
> where exactly when the pass is invoked will affect the results.  It also
> brings up the question of what exactly the semantics are; for example
> whether code like this is valid:
> 
> if (cond)
>   lock();
> blah;
> if (cond)
>   unlock();
> 
> This plugin will warn about this code; but maybe it should not.

One of the reasons I wanted to see this in GCC, in addition to making it more
widely available, was to take advantage of GCC's better (or rather *existent*)
optimization and simplification passes.  While I don't necessarily think
conditional locks like the above should work (it seems easy enough to rework
that code into something more obvious that doesn't require the compiler to know
if blah can affect cond), I *do* think a GCC-based plugin has the potential to
infer more information without as many annotations.  For instance, consider a
function which calls a function annotated as __must_hold, but only if passed a
given flag.  GCC could then know that the function only needs the lock *if*
passed that flag (or if it can't tell until runtime).  (Going too far in that
direction leads to algebraic evaluators, but still...)

So I suspect it may make sense to run this after inlining, specialization,
dead-code elimination, and similar.

> Python might be too slow for your taste, but it was easy to write :) and
> I think the IR and syntax issues above are problems for any approach.

I don't mind the use of Python, though I wonder how easily such a script could
work by default.  Given this script, how do you invoke GCC and run it?  How
much stability does this interface provide?  Could the Linux kernel ship such a
script and invoke it as part of the compile, given some dependency checks (and
likely a Kconfig option if it increases build time significantly)?

Reply via email to