Re: request for enhancement: compile, package and artifacts support for C++

Ittay Dror Mon, 28 Jul 2008 23:24:51 -0700

I merged the other email (ordering) and comments. My comments inline


Assaf Arkin wrote:

On Mon, Jul 28, 2008 at 2:42 AM, Ittay Dror <[EMAIL PROTECTED]> wrote:

Hi,

I'm working on adding C++ support to buildr. I already have a prototype that
builds libraries and executables in Linux. I'd like to share some of the
difficulties I had and request changes to buildr to accommodate C++ more
easily. (Right now, I've created a parallel route to that of building
Java-like code)

compile
========
overview
--------------------
the compile method in project returns a CompileTask that is generic and uses
a Compiler instance to do the actual compilation. In C++, compilation is
also dependency based (.o => .cpp, sometimes precompiling headers). Also,
the same code can produce several results (static and shared libraries, obj
files with debug, profiling, preprocessor defines turned on and off). [1]

there is the 'build' task, which is used as a stub to attach dependencies
to.

suggestion
---------------------
* there should be an array of compile tasks (as in packages)
* #compile should delegate the call to a factory method which returns a task
(again, as in packages)


Yes.  And I know a few people just waiting for the change to compile
multiple things in the same project, so here's another reason for
adding this feature.

But I have to warn you, it's not as simple as it looks, I took a stab
at it before and decided to downscale support to one compiler per
project.  It's worth doing because a lot of languages would benefit
from it, but that's also what makes it tricky.  I think it would be
easier to get C support working without it first, and separately work
on this feature and then improve C support using it.

How about this: classify compile commands with symbolic names, like
compile('java') or compile('c++:shared')? on bootstrap, the different
extensions can create compile tasks based on directory structure (so the
Java extension can see that the directory [:source, :main, :java] exists
and create compile('java') with some default values).


All compile tasks are prerequisites of 'build'

Then 'package :jar' can create a package that depends on
compile('java'), compile('groovy') or whatever makes sense to put in a
jar, as long as the compile tasks exist of course (not to create them if
they don't). (BTW, I have some issues with the lack of command-query
separation; normally when using a query method, I wouldn't want a task
to be created if it doesn't exist.)
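
A minimal sketch of the symbolic-name idea, written in plain Ruby. The
CompileRegistry class and its method names are hypothetical and not part
of buildr; the point is only that defining a task (a command) and looking
one up (a query) are separate, so a package can pick up whichever compile
tasks already exist without creating new ones.

```ruby
# Hypothetical registry; none of these names exist in buildr.
class CompileRegistry
  def initialize
    @tasks = {}
  end

  # Command: create the compile task for +name+, or return the existing one.
  def define(name, sources: [])
    @tasks[name] ||= { name: name, sources: sources }
  end

  # Query: look a task up without ever creating it.
  def lookup(name)
    @tasks[name]
  end

  # Names of all compile tasks; 'build' would depend on every one of them.
  def names
    @tasks.keys
  end

  # The subset of +wanted+ that actually exists, e.g. for 'package :jar'.
  def existing(*wanted)
    wanted.select { |name| @tasks.key?(name) }
  end
end

registry = CompileRegistry.new
registry.define('java', sources: ['src/main/java'])
registry.define('c++:shared', sources: ['src/main/cpp'])

# A jar package asks for java and groovy, but only java was defined.
jar_inputs = registry.existing('java', 'groovy')
```

Here jar_inputs holds only 'java', and asking about 'groovy' leaves the
registry unchanged.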

* generic pre-requisites (like 'resources') should either be tacked on
'build' (relying on order of prerequisites), or the compile task can be
defined to be a composite (that is, from the outside it is a single task,
but it can use other tasks to accomplish its job).


compile already is: resources is a prerequisite for compile, some
other tasks (e.g. byte code enhancing) are tacked on to compile by
enhancing it.

yes, but the compilation of the java family of languages is one task
(calling javac), while compiling c++ is several tasks: task per obj file
and task per link. so there's a chain of tasks already. having a generic
method receive a task from the factory method and make it depend on
'resources' won't do, since the lower level tasks should be the ones
that depend.

of course the factory method can create just one task that does all the
rest in its action (compile obj files and link), but i do want to use
tasks for the following reasons:

1. it makes the logic more like make, which will assist acceptance

2. it can use mechanisms in unix compilers to help make. specifically,
most (if not all) unix compilers have an option to spit out dependencies
of the source files on headers.

3. it reuses timestamp checking code in rake (and checksum based
recompilation, if rake ever implements it)

4. if rake will implement a job execution engine (like -j in make), then
structuring compilation by tasks will allow it to parallelize the execution.
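
To illustrate point 2: GCC, for example, prints a make-style rule for a
source file when given -MM. A rough sketch of turning such output into
task prerequisites follows. The helper name parse_make_deps is made up
for illustration, and the sample text is hand-written rather than real
compiler output. Paths containing colons or escaped spaces are not
handled.

```ruby
# Parse make-style dependency text (such as the rule GCC prints for -MM)
# into a hash of target => list of prerequisites.
# Hypothetical helper; not part of rake or buildr.
def parse_make_deps(text)
  deps = {}
  # Join continuation lines: a backslash immediately before a newline.
  text.gsub(/\\\r?\n/, ' ').each_line do |line|
    target, prereqs = line.split(':', 2)
    next if prereqs.nil? # skip blank or malformed lines
    deps[target.strip] = prereqs.split
  end
  deps
end

# Hand-written sample in the same shape as the compiler's output.
sample = "foo.o: foo.cpp foo.h \\\n  bar.h\n"
rules = parse_make_deps(sample)
```

Each entry in rules could then become a rake file task for the object
file, with the listed sources and headers as its prerequisites.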

but, i think the solution is easy: similar to the 'build' "pseudo task",
i can create a 'compile:prepare' pseudo task that depends on 'resources'
etc. then, the factory method needs only to depend on 'compile:prepare'
(the logic is that another extension can then add other things to do
before compile without needing to change the compile extensions)
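
A sketch of that wiring using Rake's own task API. The task names are
placeholders and the actions only record that they ran instead of calling
a compiler. In a real build the object tasks would be file tasks, and as
far as I know Rake treats a file task that depends on a plain task as
always out of date, so the pseudo task may need to be attached with some
care.

```ruby
require 'rake'

executed = []

Rake::Task.define_task('resources') { executed << 'resources' }

# Pseudo task with no action: other extensions can add prerequisites here
# without touching the compile extensions.
Rake::Task.define_task('compile:prepare' => ['resources'])

# One task per object file, each depending on the pseudo task.
# (Placeholder names; real ones would be the .o file paths.)
objects = %w[a b].map do |name|
  Rake::Task.define_task("obj:#{name}" => ['compile:prepare']) do
    executed << "obj:#{name}"
  end
end

# The link step depends on every object task.
Rake::Task.define_task('link' => objects.map(&:name)) { executed << 'link' }

Rake::Task['link'].invoke
```

Invoking 'link' runs 'resources' once, then each object task, then the
link action.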

package & artifacts
=========
overview
---------------
buildr has a cool concept that all dependencies (in 'compile.with') are
converted to tasks that are then simple rake dependencies. However, the
conversion is not generic enough. to compile C++ code against a dependency
one needs 2 paths: a folder containing headers and another containing
libraries. To put this in a repository, these need to be packaged into one
file. To use after pulling from the repository, one needs to unpack. So a
task representing a repository artifact is in fact an unzip task, that
depends on the 'Artifact' task to pull the package from a remote repository.


Let's take Java for example, let's say we have a task that depends on
the contents of another WAR.  Specifically the classes (in
WEB-INF/classes) and libraries (WEB-INF/lib).  A generic unzipping
artifact won't help much, you'll get the root path which is useless.
You need the classes path for one, and each file in the lib (pointing
to the directory itself does nothing interesting).  It won't work with
EAR either, when you unzip those, you end up with a WAR which you need
to unzip again.

But this hypothetical task that uses WAR could be smarter.  It
understands the semantics of the packages it uses, and all these
packages follow a common convention, so it only needs to unpack the
portions of the WAR it cares about, it knows how to construct the
relevant paths, one to class and one to every JAR inside the lib
directory.

I think the same analogy applies to C packages.  If by convention you
always use include and lib, you can unpack only the portion of the
package you need, find the relevant paths and use them appropriately.

(note: not sure i'm following you here. )

my current implementation creates classes that have methods to retrieve
the include paths, the library paths and the library names. I don't use
the task name, since it is useless (as you mentioned). so I have an
ExtractedRepoArtifact FileTask class that implements these methods by
relying on the structure of the package ('include' and 'lib'
directories), it depends on the Artifact class and its action is to
extract the artifact.

When given a project dependency, i return the build task which
implements the artifact methods mentioned above by returning the
[:source,:main,:include] and [:target, Platform.id, :lib] paths. It also
allows the user to add include paths (e.g., for generated files) which
are then both used for compilation and returned by the artifact methods.
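
The shape of that interface might look roughly like the following. This
is a plain-Ruby sketch, not the actual implementation: the constructor
arguments, the ProjectBuildArtifact name and the src/main/include and
target/<platform>/lib directory names are assumptions, and the real
classes are Rake tasks rather than plain objects.

```ruby
# Artifact extracted from a repository package laid out as include/ and lib/.
class ExtractedRepoArtifact
  attr_reader :lib_names

  def initialize(root, lib_names)
    @root = root # directory the package was extracted into
    @lib_names = lib_names
  end

  def include_paths
    [File.join(@root, 'include')]
  end

  def lib_paths
    [File.join(@root, 'lib')]
  end
end

# Artifact backed directly by another project's source and target dirs.
# Hypothetical name; the directory layout is an assumption.
class ProjectBuildArtifact
  attr_reader :lib_names

  def initialize(base, platform_id, lib_names)
    @base = base
    @platform_id = platform_id
    @lib_names = lib_names
    @extra_includes = []
  end

  # Extra include paths, e.g. for generated headers.
  def add_include_path(path)
    @extra_includes << path
    self
  end

  def include_paths
    [File.join(@base, 'src', 'main', 'include')] + @extra_includes
  end

  def lib_paths
    [File.join(@base, 'target', @platform_id, 'lib')]
  end
end

# Turn any mix of artifacts into -I, -L and -l style flags.
def compile_flags(artifacts)
  artifacts.flat_map(&:include_paths).map { |path| "-I#{path}" }
end

def link_flags(artifacts)
  artifacts.flat_map(&:lib_paths).map { |path| "-L#{path}" } +
    artifacts.flat_map(&:lib_names).map { |name| "-l#{name}" }
end

repo = ExtractedRepoArtifact.new('/tmp/deps/foo-1.0', ['foo'])
proj = ProjectBuildArtifact.new('bar', 'linux', ['bar'])
proj.add_include_path('bar/target/generated')

cflags = compile_flags([repo, proj])
ldflags = link_flags([repo, proj])
```

Because both classes answer the same three methods, the compile and link
steps never need to know where a dependency came from.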

furthermore, when building against another project, there is no need to pack
and unpack in the repository. one can simply use the artifacts produced in
the 'build' phase of the other project.


Yes.  Right now it points to the package, which gets invoked and so
packs everything, whether you need the packing or not.  You don't,
however, have to unpack it, if you know the packaging type you can be
smarter and go directly to the source.

but i don't want to pack if there's no use for it. speed is critical in
this project, since there's no eclipse to constantly compile code for
you, so developers need to run the build after each change. having it
pack unnecessarily wastes time.

finally, in C++ in many cases you rely on a system library.

in all cases the resulting dependency is two-fold: on include dir paths
and on library paths. note that these do not necessarily reside under a
shared folder. for example, a dependency on another project may depend on
two include folders: one just a folder in the source tree, the other of
generated files in the target directory

suggestion
-------------------
While usage of Buildr.artifacts is only as a utility method, so one can
easily write his own implementation and use that, I think it will be nice to
be able to get some reuse.

* when given a project, use it as is (not 'spec.packages'), or allow it to
return its artifacts ('spec.artifacts').


Yes.  Except we're missing that whole dependency layer (that's
something 1.4 will add).  Ideally the project would have dependency
lists it can populate (at least compile and runtime), and other
projects can get these dependency lists and pick what they want.  So
the compile dependency list would be the place to put headers and
libraries, without having to package them.  We don't have that right
now.

this is the purpose for the 'spec.artifacts' suggestion (that is, an
'artifacts' method in Project). maybe need to classify them similarly to
my suggestion for 'compile', so the Buildr.artifacts method receives a
'classifier' argument, whose value can be, for example, 'java' and
calls 'spec.artifacts(classifier)'. are we on the same page here?

* if a symbol, recursively call on the spec from the namespace
* if a struct, recursively call
* otherwise, classify the artifact and call a factory method to create it.
classification can be by packaging (e.g. jar). but actually, i don't have a
very good idea here. note that for c++, there needs to be a way of defining
an artifact to look in the system for include files and libraries (maybe
something like 'openssl:system'? - version and group ids are meaningless).
* the factory method can create different artifacts. for c++ there would be
RepositoryArtifact (downloads and unpacks), ProjectArtifact (short circuit
to the project's target and source directories) and SystemArtifact.
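
One possible shape for such a factory, as a plain-Ruby sketch. The three
struct names follow the list above, while the artifacts_for helper, the
use of a hash to stand in for a project and the sample spec strings are
assumptions made for illustration. Namespace lookup for symbols is left
out.

```ruby
RepositoryArtifact = Struct.new(:spec)    # downloads and unpacks
ProjectArtifact    = Struct.new(:project) # uses the project's own dirs
SystemArtifact     = Struct.new(:name)    # headers and libs from the system

# Classify a spec and build the matching artifact(s). Always returns an
# array so that nested lists of specs can be flattened recursively.
def artifacts_for(spec)
  case spec
  when Array
    spec.flat_map { |item| artifacts_for(item) }
  when Hash
    # Stand-in for "given a project"; a real version would take the project.
    [ProjectArtifact.new(spec.fetch(:project))]
  when /\A([^:]+):system\z/
    [SystemArtifact.new(Regexp.last_match(1))]
  when String
    [RepositoryArtifact.new(spec)]
  else
    raise ArgumentError, "unrecognised artifact spec: #{spec.inspect}"
  end
end

# Made-up specs for illustration only.
specs = ['org.example:foo:zip:1.0', 'openssl:system', { project: 'bar' }]
result = artifacts_for(specs)
```

Each plugin could register its own classification rule instead of the
fixed case statement used here.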

I think that the use of artifact namespaces can help here as it allows to
create a more verbose syntax for declaring artifacts, while still allowing
the user to create shorter names for them. (as an example in C++ it will
allow me to add to the artifact the list of flags to use when
compiling/linking with it, assuming they're not inherent to the artifact,
e.g. turn debug on). The factory method receives the artifact definition
(which can actually be defined by each plugin) and decides what to do with
it.


1.4 will have a better dependency mechanism, and one thing I looked at
is associating meta-data with each dependency.  So perhaps that would
address things like compiling/linking flags.

> ordering
> =========
> overview
> -------------------
> to support jni, one needs to first compile java classes, then run javah to
> generate headers and then compile c code that implements these headers. so
> the javah task should be able to specify it depends on the java compile
> task. this can't be by depending on all compile tasks of course or on
> 'build'.


Alternatively:

compile do |task|
  javah task.target
end

This will run javah each time the compiler runs.

but running each time is what i want to avoid. not only do i want to
avoid the invocation of 'javah', but when invoked it will change the
timestamp of the generated headers and so many source files will get
recompiled.

note that compiling a C/C++ source file is a much slower process than
compiling java.
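
One way to limit the timestamp problem, independent of how ordering is
solved, is to replace a generated header only when its content really
changed. A small sketch follows; write_if_changed is a hypothetical
helper, and javah itself is not invoked here (in practice it would
generate into a scratch location first).

```ruby
require 'tmpdir'

# Write +content+ to +path+ only when it differs from what is already
# there, so an unchanged file keeps its timestamp.
# Returns true if the file was written.
def write_if_changed(path, content)
  return false if File.exist?(path) && File.read(path) == content

  File.write(path, content)
  true
end

first = second = third = same_mtime = nil

Dir.mktmpdir do |dir|
  header = File.join(dir, 'Foo.h')
  first = write_if_changed(header, "/* v1 */\n")  # new file: written
  stamp = File.mtime(header)
  second = write_if_changed(header, "/* v1 */\n") # identical: skipped
  same_mtime = File.mtime(header) == stamp
  third = write_if_changed(header, "/* v2 */\n")  # changed: written
end
```

Sources that include an untouched header then stay up to date as far as
timestamp checks are concerned.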

suggestion
-------------------
when creating a compile task (whose name can be, as in the case of c++, the
result library name - to allow for dependency checking), also create a "for
ordering only" task with a symbolic name (e.g., 'java:compile') which
depends on the actual task. then other tasks can depend on that task
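
A sketch of the "for ordering only" task with Rake's task API. All task
names are placeholders and the actions only record their order; no
compiler or javah is run.

```ruby
require 'rake'

order = []

# The real compile task, named after its (placeholder) output.
Rake::Task.define_task('target/classes') { order << 'javac' }

# Ordering-only task with a symbolic name; it has no action of its own.
Rake::Task.define_task('java:compile' => ['target/classes'])

# javah depends on the symbolic name, not on 'build' or every compile task.
Rake::Task.define_task('javah' => ['java:compile']) { order << 'javah' }

# The C++ compile step in turn depends on the generated headers.
Rake::Task.define_task('c++:compile' => ['javah']) { order << 'g++' }

Rake::Task['c++:compile'].invoke
```

Other tasks only ever refer to 'java:compile', so the real task can be
renamed or split without touching them.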


And yes, you'll still need that if you want to run the C compiler
after the Java compiler, so I think the right thing to do would be to have
separate compile tasks.

I hope all this makes sense, and I'm looking forward to comments. I intend
to share the code once I'm finished.


Unfortunately, the last time I wrote C code was over ten years ago,
so my rustiness is showing.  I'm sure I missed some points because of
that.

I hope I cleared things up. I think it is worth investing in C/C++ as it is
a space where there are still no solutions (that i know of) that handle
module dependency.

To make sure it is clear, I'm not asking for the buildr team to
implement C/C++ building, I intend to do that, and have already made a
demo of it working, but I do want to ask for the infrastructure in
buildr to make it easier, since currently it looks like a "stepson".


Ittay

Assaf

Thank you,
Ittay


Notes:
[1] I don't consider linking a library as packaging. First, the obj files
are not used by themselves as in other languages. Second, packaging is
required to manage dependencies, because in order for project P to be built
against dependency D, D needs to contain both headers and libraries - this
is the package.

--
--
Ittay Dror <[EMAIL PROTECTED]>

