On Tue, Apr 3, 2012 at 6:02 PM, David Malcolm <dmalc...@redhat.com> wrote: > On Tue, 2012-04-03 at 15:23 +0200, Richard Guenther wrote: >> On Tue, Apr 3, 2012 at 12:03 PM, Richard Guenther >> <richard.guent...@gmail.com> wrote: >> > On Mon, Apr 2, 2012 at 7:21 PM, David Malcolm <dmalc...@redhat.com> wrote: >> >> I wrote a script and ported my proposed API for GCC plugins from my >> >> CamelCase naming convention to an underscore_based_convention (and >> >> manually fixed up things in a few places also). >> >> >> >> The result compiles and passes the full test suite for the Python >> >> plugin; that said, I'm still breaking the encapsulation in quite a few >> >> places (hey, this is an experimental prototype). >> >> >> >> You can see the latest version of it within the "proposed-plugin-api" >> >> branch of the Python plugin here: >> >> http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=shortlog;h=refs/heads/proposed-plugin-api >> >> >> >> within the "proposed-plugin-api" subdirectory. >> > >> > Hmm, how do I get it? I did >> > >> > git clone http://git.fedorahosted.org/git/proposed-plugin-api >> > >> > but there is nothing in gcc-python-plugin/. And >> > >> > git checkout proposed-plugin-api >> > >> > says I'm already there ...? >> >> Meanwhile the directory magically appeared (heh ...). > > [The ways of git are something of a mystery to me: 95% of the time it's > the best revision control system I've ever used, but 5% of the time the > most obtuse] > >> Overall it looks good > Thanks for taking a look. > >> - but it seems to leak GCC headers into the >> plugin API (via gcc-plugin.h and input.h inclusion). Thus, it >> lacks separating the plugin API headers from the plugin API implementation >> headers? > That's true. The big information "leak" happens inside > gcc-semiprivate-types.h, which defines the various structs that act like > pointers, each with a decl like this: > > struct gcc_something { > some_private_gcc_pointer_type inner; > }; > > It would be possible to make this more opaque like this: > > struct gcc_something { > struct some_private_gcc_struct *inner; > }; > > given that you then don't need a full definition of that inner struct > visible to users. Though location_t is leaked, and in this approach, > there isn't a way around that, I think.
See the reply by Roman. > >> Or maybe I'm not looking at the right place. >> I also dislike the use of GCC_PUBLIC_API, etc. - everything in >> the plugin API headers should be obviously public - and the implementation >> detail should be an implementation detail that should not need to care. > > I added that macro based on the example of other plugin/embedding > systems I've seen (e.g. Python), where it's handy to make that macro be > non-trivial on some platforms. See e.g. CPython's pyport.h: > http://hg.python.org/cpython/file/9599f091faa6/Include/pyport.h > where the macros PyAPI_FUNC and PyAPI_DATA have a 44-line definition, > handling various different compatibility cases. > > For example, GCC_PRIVATE_API could have some magic linker flag that lets > us automatically strip out those symbols so that they're not visible > externally after the compiler has been linked. That's for the implementation side - the public API side by definition has only declarations considered public. >> >> If this landed e.g. in GCC 4.8, would this be usable by other plugins? >> >> For sure. I'd say get the copyright stuff sorted out and start pushing >> things >> into the main GCC repository - where the obvious starting point is to >> get the makefile, configure and install parts correct. >> >> I'd concentrate on providing enough API for introspection, like simple >> things, counting basic-blocks, counting calls, etc. I'm not sure we >> want to expose gcc_gimple_walk_tree or gcc_gimple_print (or >> the gcc_printers for a start) - the latter would something that the >> python plugin >> would provide, resorting to introspecting the stmt itself. > FWIW the Python plugin already heavily uses gcc's pretty-printer code, > so that *is* something I'd want to wrap (but it's fairly easy to do so). Ok, fair enough. >> I also wonder about gcc_gimple_phi_upcast - why would you specialize >> PHI nodes but not any others? I'd have exposed gcc_gimple_code. >> In general the plugin API should provide exactly one (and the most flexible) >> way to do a thing (thus, not have gcc_gimple_assign_single_p) and >> the high-level consumers like the python plugin should provide >> nice-to-use interfaces. > > This touches on the biggest thing that's missing in the API: the > multitude of tree types, gimple statement types, and rtl types. All I > added were the "base classes", and gimple-phi is the only instance so > far in the API of a subclass. Well, one way is surely to view it as sort-of "classes", the other way is to view it as tagged structures which you can simply expose. Of course if there will be a day where somebody converts all of gimple and tree to a C++ class hierarchy the class way would be a better representation. > I would like the API to expose these (my plugin uses them to good > effect), but doing it in C is likely to be messy: lots of upcasting and > downcasting functions. I'd not do up/down-casting. I'd at most expose a RTTI-like interface, gcc_gimple_is_phi, etc.. Using an opaque "base" pointer in all functions should be ok (ok, you will miss some type safety - but so does GCC internally). > An alternative approach might be to bite the bullet and go to C++; maybe > something like this (caveat: I last did any C++ about a decade ago, no > idea if the following compiles): > > // Everything exposed in the public headers are pure virtual functions > // leading to abstract classes: no public data > namespace gcc { > > // Abstract base class for objects managed by GCC's garbage-collector > class gcmanaged { > public: > virtual void mark_in_use() const = 0; > }; > > class location : public gcmanaged { > // it isn't gcmanaged yet, but might become so > public: > virtual std::string get_filename() const = 0; > virtual int get_line() const = 0; > virtual int get_column() const = 0; > }; > > namespace gimple { > // enum to assist in RTTI by non-C++ code: > enum code { > INVALID_GIMPLE_CODE = -1, > GIMPLE_NOP = 0, > GIMPLE_ASSIGNMENT, > GIMPLE_SWITCH, > //etc; don't have to expose all of the regular gimple codes. > NUM_CODES, > }; > > class statement : public gcmanaged { > public: > virtual location *get_location() const = 0; > virtual enum code get_code() const = 0; > }; > > class nop : public statement { > }; > > class phi : public statement { > public: > virtual tree::ssaname* get_lhs() const = 0; > virtual std::list<tree::expr*, cfg::edge*> get_rhs() const = 0; > // or whatever > }; > }; > > namespace tree { > class declaration : public gcmanaged { > public: > virtual std::string get_name() const = 0; > virtual location* get_location() const = 0; > }; > > class funcdecl :: public declaration { > public: > virtual function* get_function() const = 0; > virtual std::list<parmdecl*> get_arguments() const = 0; > virtual resultdecl* get_result() const = 0; > // etc: accessor functions > }; > > class type : public gcmanaged { > // etc... > }; > > class integertype :: public type { > // etc... > }; > > // does there actually need to be a "tree" base class? I think > // the tcc_codes express the base classes, at a first approximation > // could even lose the "tree" concept altogether > }; > }; But that of course requires real objects to mediate between the plugin API and GCC internals - something I'd try to avoid hard. Objects should be identified by pointers to the original internal structures - that also makes it simple to eventually add a garbage collector friendly reference taking mechanism (by simply having a new internal GC root being a pointer-map, ptr:refcount and a plugin API that does gcc_push_ref () / gcc_pop_ref ()). > [though I'd greatly prefer to capitalize the classnames (sigh) :) ] > > The implementation would have entirely separate (private) headers, with > something like: > > class location_impl : public gcc::location { > private: > location_t loc; > > public: > // ctor: > location_impl(location_t loc) : loc(loc) {} > > // implement the public API: > void mark_in_use() const { > // empty: location_t currently isn't gc managed > } > > std::string get_filename() const; > int get_line() const; > int get_column() const; > }; > > class phi_impl : public gcc::gimple::phi { > private: > gimple stmt; // of correct subtype > > public: > // gcmanaged: > void mark_in_use() const; > > // ctor: > phi(gimple stmt); > > // accessors: > location *get_location() const; > enum code get_code() const; > > tree::ssaname *get_lhs() const; > std::list<tree::expr*, cfg::edge*> get_rhs() const; > }; > > That way the implementation is *entirely* hidden, with entirely separate > headers (though during prototyping I'd need an escape hatch). But it > does mean that the API is handing off pointers-to-pointers, rather than > just pointers (allocating lots of tiny wrapper objects, like the ones > above, so that the caller doesn't know the size, and so there can be a > vtable). > > Does this sound sane? No, requiring extra storage allocation will greatly complicate API use. Richard. > Long-term, perhaps these impl classes could perhaps *become* the > implementations - everything's hidden. > > Thanks again for looking at this > Dave >