On Tue, Apr 3, 2012 at 6:02 PM, David Malcolm <dmalc...@redhat.com> wrote:
> On Tue, 2012-04-03 at 15:23 +0200, Richard Guenther wrote:
>> On Tue, Apr 3, 2012 at 12:03 PM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>> > On Mon, Apr 2, 2012 at 7:21 PM, David Malcolm <dmalc...@redhat.com> wrote:
>> >> I wrote a script and ported my proposed API for GCC plugins from my
>> >> CamelCase naming convention to an underscore_based_convention (and
>> >> manually fixed up things in a few places also).
>> >>
>> >> The result compiles and passes the full test suite for the Python
>> >> plugin; that said, I'm still breaking the encapsulation in quite a few
>> >> places (hey, this is an experimental prototype).
>> >>
>> >> You can see the latest version of it within the "proposed-plugin-api"
>> >> branch of the Python plugin here:
>> >> http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=shortlog;h=refs/heads/proposed-plugin-api
>> >>
>> >> within the "proposed-plugin-api" subdirectory.
>> >
>> > Hmm, how do I get it?  I did
>> >
>> > git clone http://git.fedorahosted.org/git/proposed-plugin-api
>> >
>> > but there is nothing in gcc-python-plugin/.  And
>> >
>> > git checkout proposed-plugin-api
>> >
>> > says I'm already there ...?
>>
>> Meanwhile the directory magically appeared (heh ...).
>
> [The ways of git are something of a mystery to me: 95% of the time it's
> the best revision control system I've ever used, but 5% of the time the
> most obtuse]
>
>> Overall it looks good
> Thanks for taking a look.
>
>>  - but it seems to leak GCC headers into the
>> plugin API (via gcc-plugin.h and input.h inclusion).  Thus, it
>> lacks separating the plugin API headers from the plugin API implementation
>> headers?
> That's true.  The big information "leak" happens inside
> gcc-semiprivate-types.h, which defines the various structs that act like
> pointers, each with a decl like this:
>
> struct gcc_something {
>   some_private_gcc_pointer_type inner;
> };
>
> It would be possible to make this more opaque like this:
>
> struct gcc_something {
>   struct some_private_gcc_struct *inner;
> };
>
> given that you then don't need a full definition of that inner struct
> visible to users.  Though location_t is leaked, and in this approach,
> there isn't a way around that, I think.

See the reply by Roman.

>
>> Or maybe I'm not looking at the right place.
>> I also dislike the use of GCC_PUBLIC_API, etc. - everything in
>> the plugin API headers should be obviously public - and the implementation
>> detail should be an implementation detail that should not need to care.
>
> I added that macro based on the example of other plugin/embedding
> systems I've seen (e.g. Python), where it's handy to make that macro be
> non-trivial on some platforms.  See e.g. CPython's pyport.h:
>  http://hg.python.org/cpython/file/9599f091faa6/Include/pyport.h
> where the macros PyAPI_FUNC and PyAPI_DATA have a 44-line definition,
> handling various different compatibility cases.
>
> For example, GCC_PRIVATE_API could have some magic linker flag that lets
> us automatically strip out those symbols so that they're not visible
> externally after the compiler has been linked.

That's for the implementation side - the public API side by definition has
only declarations considered public.

>> >> If this landed e.g. in GCC 4.8, would this be usable by other plugins?
>>
>> For sure.  I'd say get the copyright stuff sorted out and start pushing 
>> things
>> into the main GCC repository - where the obvious starting point is to
>> get the makefile, configure and install parts correct.
>>
>> I'd concentrate on providing enough API for introspection, like simple
>> things, counting basic-blocks, counting calls, etc.  I'm not sure we
>> want to expose gcc_gimple_walk_tree or gcc_gimple_print (or
>> the gcc_printers for a start) - the latter would something that the
>> python plugin
>> would provide, resorting to introspecting the stmt itself.
> FWIW the Python plugin already heavily uses gcc's pretty-printer code,
> so that *is* something I'd want to wrap (but it's fairly easy to do so).

Ok, fair enough.

>> I also wonder about gcc_gimple_phi_upcast - why would you specialize
>> PHI nodes but not any others?  I'd have exposed gcc_gimple_code.
>> In general the plugin API should provide exactly one (and the most flexible)
>> way to do a thing (thus, not have gcc_gimple_assign_single_p) and
>> the high-level consumers like the python plugin should provide
>> nice-to-use interfaces.
>
> This touches on the biggest thing that's missing in the API: the
> multitude of tree types, gimple statement types, and rtl types.  All I
> added were the "base classes", and gimple-phi is the only instance so
> far in the API of a subclass.

Well, one way is surely to view it as sort-of "classes", the other way is
to view it as tagged structures which you can simply expose.  Of course
if there will be a day where somebody converts all of gimple and tree
to a C++ class hierarchy the class way would be a better representation.

> I would like the API to expose these (my plugin uses them to good
> effect), but doing it in C is likely to be messy: lots of upcasting and
> downcasting functions.

I'd not do up/down-casting.  I'd at most expose a RTTI-like interface,
gcc_gimple_is_phi, etc..  Using an opaque "base" pointer in all
functions should be ok (ok, you will miss some type safety - but so
does GCC internally).

> An alternative approach might be to bite the bullet and go to C++; maybe
> something like this (caveat: I last did any C++ about a decade ago, no
> idea if the following compiles):
>
> // Everything exposed in the public headers are pure virtual functions
> // leading to abstract classes: no public data
> namespace gcc {
>
>   // Abstract base class for objects managed by GCC's garbage-collector
>   class gcmanaged {
>   public:
>      virtual void mark_in_use() const = 0;
>   };
>
>   class location : public gcmanaged {
>       // it isn't gcmanaged yet, but might become so
>   public:
>      virtual std::string get_filename() const = 0;
>      virtual int get_line() const = 0;
>      virtual int get_column() const = 0;
>   };
>
>   namespace gimple {
>       // enum to assist in RTTI by non-C++ code:
>       enum code {
>           INVALID_GIMPLE_CODE = -1,
>           GIMPLE_NOP = 0,
>           GIMPLE_ASSIGNMENT,
>           GIMPLE_SWITCH,
>           //etc; don't have to expose all of the regular gimple codes.
>           NUM_CODES,
>       };
>
>       class statement : public gcmanaged {
>       public:
>           virtual location *get_location() const = 0;
>           virtual enum code get_code() const = 0;
>       };
>
>       class nop : public statement {
>       };
>
>       class phi : public statement {
>       public:
>           virtual tree::ssaname* get_lhs() const = 0;
>           virtual std::list<tree::expr*, cfg::edge*> get_rhs() const = 0;
>           // or whatever
>       };
>   };
>
>   namespace tree {
>       class declaration : public gcmanaged {
>       public:
>           virtual std::string get_name() const = 0;
>           virtual location*   get_location() const = 0;
>       };
>
>       class funcdecl :: public declaration {
>       public:
>           virtual function* get_function() const = 0;
>           virtual std::list<parmdecl*> get_arguments() const = 0;
>           virtual resultdecl* get_result() const = 0;
>           // etc: accessor functions
>       };
>
>       class type : public gcmanaged {
>       // etc...
>       };
>
>       class integertype :: public type {
>       // etc...
>       };
>
>       // does there actually need to be a "tree" base class?  I think
>       // the tcc_codes express the base classes, at a first approximation
>       // could even lose the "tree" concept altogether
>   };
> };

But that of course requires real objects to mediate between the plugin API
and GCC internals - something I'd try to avoid hard.  Objects should be
identified by pointers to the original internal structures - that also makes
it simple to eventually add a garbage collector friendly reference taking
mechanism (by simply having a new internal GC root being a pointer-map,
ptr:refcount and a plugin API that does gcc_push_ref () / gcc_pop_ref ()).

> [though I'd greatly prefer to capitalize the classnames (sigh) :) ]
>
> The implementation would have entirely separate (private) headers, with
> something like:
>
> class location_impl : public gcc::location {
> private:
>   location_t loc;
>
> public:
>    // ctor:
>    location_impl(location_t loc) : loc(loc) {}
>
>    // implement the public API:
>    void mark_in_use() const {
>         // empty: location_t currently isn't gc managed
>    }
>
>    std::string get_filename() const;
>    int get_line() const;
>    int get_column() const;
> };
>
> class phi_impl : public gcc::gimple::phi {
> private:
>    gimple stmt; // of correct subtype
>
> public:
>    // gcmanaged:
>    void mark_in_use() const;
>
>    // ctor:
>    phi(gimple stmt);
>
>    // accessors:
>    location *get_location() const;
>    enum code get_code() const;
>
>    tree::ssaname *get_lhs() const;
>    std::list<tree::expr*, cfg::edge*> get_rhs() const;
> };
>
> That way the implementation is *entirely* hidden, with entirely separate
> headers (though during prototyping I'd need an escape hatch).  But it
> does mean that the API is handing off pointers-to-pointers, rather than
> just pointers (allocating lots of tiny wrapper objects, like the ones
> above, so that the caller doesn't know the size, and so there can be a
> vtable).
>
> Does this sound sane?

No, requiring extra storage allocation will greatly complicate API use.

Richard.

> Long-term, perhaps these impl classes could perhaps *become* the
> implementations - everything's hidden.
>
> Thanks again for looking at this
> Dave
>

Reply via email to