Re: [DISCUSS] Aggregator Plugins

Benjamin Bentmann Thu, 13 Aug 2009 13:51:19 -0700

[0] http://docs.codehaus.org/display/MAVEN/Aggregator+Plugins


1. Background

In Maven 2.x we have a boolean mojo annotation @aggregator with the following effects on the mojo execution:


Execution

For mojos executed directly from the CLI, the mojo will only be executed once and not once per project in the reactor. For mojos bound to a lifecycle phase, the mojo will be executed for each project where the lifecycle binding is present.


Dependency Resolution

If an aggregating mojo is annotated with @requiresDependencyResolution, the core will resolve the dependencies for all projects in the reactor and not just for the current project.


Forking

The annotation @execute on an aggregating mojo will fork the requested goal/phase on all projects in the reactor.

Besides, aggregating mojos often use the parameter expression reactorProjects or MavenSession.getSortedProjects() to get hold of all the projects in the reactor for processing.

The current design has some problems, especially if aggregators are bound into a lifecycle phase, so let's step back and look at what we want to support and how this might work.
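
To make the 2.x rules above concrete, here is a minimal toy model of them. It is only a sketch and not Maven's code: the class and method names are hypothetical, and attributing the single CLI execution to the first reactor project is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the Maven 2.x rules summarized above; this is not Maven's code.
class Aggregator2xModel {

    // Projects on which an @aggregator mojo is executed.
    static List<String> executions(List<String> reactor, Set<String> boundIn, boolean fromCli) {
        List<String> result = new ArrayList<>();
        if (fromCli) {
            // Direct CLI invocation: one execution for the whole reactor.
            // Assumption: we attribute it to the first reactor project.
            result.add(reactor.get(0));
        } else {
            // Lifecycle binding: one execution per project carrying the binding.
            for (String project : reactor) {
                if (boundIn.contains(project)) {
                    result.add(project);
                }
            }
        }
        return result;
    }

    // With @requiresDependencyResolution, each execution resolves the
    // dependencies of every reactor project, not just the current one.
    static List<String> resolutionScope(List<String> reactor) {
        return new ArrayList<>(reactor);
    }

    public static void main(String[] args) {
        List<String> reactor = Arrays.asList("parent", "module-a", "module-b");
        Set<String> boundIn = new HashSet<>(Arrays.asList("parent", "module-a"));
        System.out.println(executions(reactor, boundIn, true));
        System.out.println(executions(reactor, boundIn, false));
        System.out.println(resolutionScope(reactor));
    }
}
```

Note how the resolution scope is the full reactor regardless of where the mojo runs, which is the root of the lifecycle-binding problems discussed later.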



2. Use Cases

While we currently have only one annotation to request aggregation, we have at least two different use cases for it. The differences in these use cases as outlined next contribute to the problems we currently encounter with aggregation and its use.


Pre- and Post-build Hooks

Given a (multi-module) build, users might want to perform tasks before the lifecycle of the first project starts or after the lifecycle of the last project has completed. For instance, imagine a build like this:


1. Pre-build hook
2. Project 1
   a) validate
   b) (...)
   c) deploy
3. Project 2
   a) validate
   b) (...)
   c) deploy
4. Post-build hook

It's assumed that build hooks are implemented as regular mojos (with special annotations) and are introduced to a build via plugin executions defined in the POM. However, the <phase> element of such a plugin execution would have a slightly different meaning. Instead of saying "bind this mojo into lifecycle phase xyz" it should be interpreted as "if the build executes to phase xyz or beyond, register this mojo as a pre-/post-build hook".

A further mojo annotation could be introduced to enable the plugin author to control whether a post-build hook should be called regardless of whether the build failed before, i.e. to provide some finally-like cleanup.

An example use for a pre-build hook could be an Enforcer rule that checks the build environment before any of the projects start to build.
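
The hook semantics described in this section could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not a proposed implementation: the phase list is only a subset of the default lifecycle, and the names BuildHookSketch, registers, run and the alwaysRunPostHook flag (standing in for the suggested finally-like annotation) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BuildHookSketch {

    // Simplified subset of the default lifecycle, in execution order.
    static final List<String> PHASES =
            Arrays.asList("validate", "compile", "test", "package", "install", "deploy");

    // A hook is registered only if the build executes to its phase or beyond.
    static boolean registers(String hookPhase, String targetPhase) {
        int hook = PHASES.indexOf(hookPhase);
        int target = PHASES.indexOf(targetPhase);
        if (hook < 0 || target < 0) {
            throw new IllegalArgumentException("unknown phase");
        }
        return target >= hook;
    }

    // Returns a log of the build steps. failingProject may be null (no failure).
    static List<String> run(List<String> projects, String targetPhase, String hookPhase,
                            boolean alwaysRunPostHook, String failingProject) {
        List<String> log = new ArrayList<>();
        boolean hooked = registers(hookPhase, targetPhase);
        if (hooked) {
            log.add("pre-build hook");
        }
        boolean failed = false;
        for (String project : projects) {
            if (project.equals(failingProject)) {
                log.add(project + ": FAILED");
                failed = true;
                break;
            }
            log.add(project + ": validate.." + targetPhase);
        }
        // Finally-like behavior: the post hook may run even after a failure.
        if (hooked && (!failed || alwaysRunPostHook)) {
            log.add("post-build hook");
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> projects = Arrays.asList("project-1", "project-2");
        System.out.println(run(projects, "deploy", "package", false, null));
        System.out.println(run(projects, "compile", "package", false, null));
        System.out.println(run(projects, "deploy", "package", true, "project-1"));
    }
}
```

In this model a hook declared with phase "package" takes part in "mvn deploy" but is skipped entirely for "mvn compile", which is the reinterpretation of <phase> suggested above.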


Sub-module Summaries

A probably more common use case is to post-process output from child modules in order to produce some aggregated/summarized output. In terms of build steps, this would look like:


1. Child 1
   a) validate
   b) (...)
   c) deploy
2. Child 2
   a) validate
   b) (...)
   c) deploy
3. Aggregator
   a) validate
   b) (...)
   c) (aggregating mojo bound to e.g. package phase)
   d) (...)
   e) deploy

The important difference of such a summary mojo compared to a post-build hook is the interaction with the regular lifecycle. A summary mojo bound to, say, the "package" phase would be executed during this phase such that later phases like "install" or "deploy" of the current project's build have access to the output of the summary mojo.

Finally note that for the summary mojo to be able to aggregate the output from the child modules, the aggregator project needs to run after the child modules.

A concrete example for this type of aggregation is the production of aggregated API docs or other assembly-like output that should be attached to a project and installed/deployed alongside the main artifact. An aggregated site with summary reports is another example.


Scope for Aggregation

Orthogonal to the scenarios outlined above, we have to distinguish what part of a reactor build should be subject to aggregation. Consider the following multi-module hierarchy where the projects marked with (X) associate an aggregating mojo with their lifecycle:


Top-Level Aggregator POM T (X)
  Second-Level Aggregator POM S1 (X)
    Child A
    Child B
  Second-Level Aggregator POM S2 (X)
    Child C
    Child D

Running "mvn deploy" on the top-level aggregator POM could have the following effects:

1. Invoke the aggregating mojo in each project it is declared in. In detail the following three mojo executions would result for our example:
   a) Invoke aggregator on second-level POM S1, aggregating output from child A and B
   b) Invoke aggregator on second-level POM S2, aggregating output from child C and D
   c) Invoke aggregator on top-level POM, aggregating output from A, B, S1, C, D and S2

2. Invoke the aggregating mojo only in the top-most project it is declared in. For the example given, this would mean only one mojo execution by suppressing any other executions of the mojo in sub modules (of any depth):
   a) Invoke aggregator on top-level POM, aggregating output from A, B, S1, C, D and S2

Both styles have their supporting use cases. For a summary mojo that produces an aggregated assembly, the user might not want to skip this assembly step just because he invoked the build from a higher level of the project hierarchy where an even bigger assembly is produced. For a pre-build hook like a validation step on the other hand, it might be preferable to run only on the top-most project (e.g. for performance reasons).

To address this distinction in aggregation scope, we might start off with new mojo annotations like "@aggregator top-level|project" that plugin authors can use to indicate the desired operational mode. But it seems this ultimately demands a new POM element to enable the user to choose the mode that fits his intentions.

Compared to Maven 2.x, the first style of aggregation somewhat resembles the current behavior, i.e. the aggregating mojo being executed in each project it is encountered in. The major difference however is the order in which the individual projects are executed. For the common setup where the aggregator POM is also used as parent POM, it would be built ahead of the child modules in 2.x, making aggregation of child output impossible right now.

Also note that the second style of aggregation does not necessarily mean the aggregating mojo is only executed once per reactor build. Consider this variation of the above example where the aggregating mojo is only declared in S1:


Top-Level Aggregator POM T
  Second-Level Aggregator POM S1 (X)
    Child A
    Child B
  Second-Level Aggregator POM S2
    Child C
    Child D

When running Maven on the top-level project, it seems unintuitive to invoke the aggregating mojo on the entire reactor just because the user ran the build from a higher level of the project hierarchy where, however, the aggregating mojo is not declared. This would extend the effect of the aggregator to modules that are not sub modules of its declaring project S1. This is exactly one of the problems we have in Maven 2.x where an aggregating mojo bound to a lifecycle phase causes dependency resolution for the entire reactor although some modules haven't been built yet.
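
The two aggregation scopes discussed in this section can be sketched with a small tree walk. This is an illustrative model only: the Mode enum mirrors the tentative "@aggregator top-level|project" idea, and all class and method names are hypothetical rather than anything in Maven core.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AggregationScopeSketch {

    // PROJECT: run in every declaring project. TOP_LEVEL: run only in the
    // top-most declaring project of each branch of the hierarchy.
    enum Mode { PROJECT, TOP_LEVEL }

    static class Project {
        final String name;
        final boolean declaresAggregator;
        final List<Project> modules;

        Project(String name, boolean declaresAggregator, Project... modules) {
            this.name = name;
            this.declaresAggregator = declaresAggregator;
            this.modules = Arrays.asList(modules);
        }
    }

    // Projects on which the aggregating mojo executes, children before aggregator.
    static List<String> executions(Project root, Mode mode) {
        List<String> out = new ArrayList<>();
        collect(root, mode, false, out);
        return out;
    }

    private static void collect(Project p, Mode mode, boolean ancestorDeclares, List<String> out) {
        for (Project module : p.modules) {
            collect(module, mode, ancestorDeclares || p.declaresAggregator, out);
        }
        // In TOP_LEVEL mode an execution is suppressed if an ancestor declares the mojo too.
        boolean suppressed = mode == Mode.TOP_LEVEL && ancestorDeclares;
        if (p.declaresAggregator && !suppressed) {
            out.add(p.name);
        }
    }

    static Project example(boolean t, boolean s1, boolean s2) {
        return new Project("T", t,
                new Project("S1", s1, new Project("A", false), new Project("B", false)),
                new Project("S2", s2, new Project("C", false), new Project("D", false)));
    }

    public static void main(String[] args) {
        System.out.println(executions(example(true, true, true), Mode.PROJECT));
        System.out.println(executions(example(true, true, true), Mode.TOP_LEVEL));
        System.out.println(executions(example(false, true, false), Mode.TOP_LEVEL));
    }
}
```

For the first hierarchy this gives executions on S1, S2 and T in project mode and on T alone in top-level mode. For the variation where only S1 declares the mojo, both modes execute it on S1 only, so the aggregator never reaches S2, C or D.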


3. Realization

All the different use cases outlined above are things that we might want to support in future Maven versions. Yet we historically have only this single boolean "@aggregator" annotation that does not tell which use case a mojo is intended to serve. It appears though that the majority of aggregating mojos out there are meant to be summary mojos. Hence I propose the following behavior of Maven core:


Project Ordering

A project with packaging "pom" can serve both as a parent POM and as an aggregator POM. Inheritance belongs to the construction of the effective model and happens long before we reach the lifecycle executor and as such does not care about project order. Aggregation in the sense of a summary mojo however imposes a constraint on the order, namely that the project with the aggregating mojo needs to be built after its child modules. For this reason, the project sorter needs to be changed to mark an aggregator POM as a dependent of all its modules. This is contrary to the related article [1] and the current behavior of Maven 2.x. The hopefully few cases where users set up an aggregator POM to produce some artifact for consumption by sub modules would demand restructuring the build and moving the production of the artifact to a sub module of the aggregator.
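
The proposed ordering amounts to a topological sort in which each aggregator depends on its modules. The following is a minimal sketch of that idea, assuming the hierarchy is given as a map from aggregator name to module names. The names ProjectSorterSketch and buildOrder are hypothetical, and ordinary inter-module dependencies, which a real project sorter must also honor, are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ProjectSorterSketch {

    // modules maps an aggregator to its declared modules. Every aggregator is
    // treated as depending on all of its modules, so it is emitted after them.
    static List<String> buildOrder(Map<String, List<String>> modules, String root) {
        Set<String> done = new LinkedHashSet<>();
        visit(root, modules, done, new LinkedHashSet<String>());
        return new ArrayList<>(done);
    }

    private static void visit(String project, Map<String, List<String>> modules,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(project)) {
            return;
        }
        if (!inProgress.add(project)) {
            throw new IllegalStateException("cyclic module declaration at " + project);
        }
        List<String> children = modules.get(project);
        if (children != null) {
            for (String child : children) {
                visit(child, modules, done, inProgress);
            }
        }
        inProgress.remove(project);
        done.add(project); // post-order: modules first, aggregator last
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new LinkedHashMap<>();
        modules.put("T", Arrays.asList("S1", "S2"));
        modules.put("S1", Arrays.asList("A", "B"));
        modules.put("S2", Arrays.asList("C", "D"));
        System.out.println(buildOrder(modules, "T"));
    }
}
```

For the example hierarchy from the use case section this yields the order A, B, S1, C, D, S2, T, so every aggregator runs once the output of its modules is available.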


Dependency Resolution

A mojo flagged as "@aggregator" should no longer trigger dependency resolution for the entire reactor but only for the sub tree of the project hierarchy where the aggregating mojo is rooted. For a mojo invoked directly from the CLI, this effectively makes no difference compared to Maven 2.x. For mojos bound to the lifecycle, this prevents dependency resolution errors on modules that, due to the project order, can never be built in time for the aggregating mojo.


Forking

Just as with dependency resolution, an aggregating mojo should no longer fork the entire reactor but only the sub tree of the project hierarchy it is relevant for.


Project Retrieval

What remains unclear to me is how to handle the "reactorProjects" parameter expression in aggregating mojos. I am tempted to believe that those mojos don't really want all reactor projects but again only the sub tree of the project hierarchy they operate in. If this assumption proves sensible, it would fit the bill to change the semantics of the "reactorProjects" expression to only deliver the projects from the sub tree of the project hierarchy, thereby being in sync with the changes for dependency resolution and forking.

The obvious alternative is to leave "reactorProjects" as is and introduce a new expression "subProjects" or similar that only delivers the current project and all its (transitive) sub modules.
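
What a "subProjects"-style expression might deliver can be sketched as a plain sub tree collection. This is only an illustration of the intended semantics: the map-based hierarchy and the names SubProjectsSketch and subProjects are assumptions, and nothing here reflects an existing Maven API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SubProjectsSketch {

    // Returns the current project followed by all of its transitive sub modules.
    static List<String> subProjects(Map<String, List<String>> modules, String current) {
        List<String> result = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(current);
        while (!pending.isEmpty()) {
            String project = pending.pop();
            result.add(project);
            List<String> children = modules.get(project);
            if (children != null) {
                // Push in reverse so that modules come out in declaration order.
                for (int i = children.size() - 1; i >= 0; i--) {
                    pending.push(children.get(i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new LinkedHashMap<>();
        modules.put("T", Arrays.asList("S1", "S2"));
        modules.put("S1", Arrays.asList("A", "B"));
        modules.put("S2", Arrays.asList("C", "D"));
        System.out.println(subProjects(modules, "S1"));
        System.out.println(subProjects(modules, "T"));
    }
}
```

Evaluated for S1 this gives S1, A and B, whereas the 2.x "reactorProjects" expression would hand the mojo all seven projects of the example reactor. The same sub tree would also bound dependency resolution and forking under this proposal.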


Project Hierarchy Tree

Internally, the core will need to keep the tree of projects that forms the project hierarchy as determined by aggregation, i.e. via the <modules> section in the POM.


Pre-/Post Build Hooks

The details of this are left open for future design. Right now, I simply assume we will introduce new mojo annotations to mark those goals and distinguish them from the summary mojos that continue to use the existing "@aggregator" annotation.


4. Related Articles

[0] http://docs.codehaus.org/display/MAVEN/Atypical+Plugin+Use+Cases
[1] http://docs.codehaus.org/display/MAVEN/Deterministic+Lifecycle+Planning


Benjamin
