It seems to me that there is room for several different types of solution here.

Starting with the single-machine solution: I now understand that you could
start forking downstream builds straight after "compile" in a reactor build,
and maybe after "install" in other cases. In this scenario I think each module
depends on all upstream modules successfully reaching "install" before
proceeding to "deploy". I really think it's important to avoid leaking
artifacts that do not have their own (and all upstream) lifecycle requirements
fulfilled.

When it comes to clustering there may be several approaches: if you decide to
publish artifacts through "deploy" to any kind of repo, I believe these need
to have all lifecycle requirements met, which as far as I understand is
orthogonal to local out-of-order execution. Wouldn't it be feasible to
distribute the "local" and perhaps the "transacted local" repo inside the
cluster using network file sharing? One would still have to solve
serialization issues and how installed artifacts are used in a reactor build.
The clustering case seems like a much harder task than achieving full local
concurrency.

I did some fairly extensive measurements with my current build when I set up
concurrent Spring/JUnit testing: the lack of concurrency in classloading is
the most important reason why unit tests run slowly (classloading is strictly
a synchronized business until JDK 7). By running tests out of order on my
local unit-test build I am fairly certain I could reduce the run time of
"mvn clean install" to something much closer to
"mvn -Dmaven.test.skip=true clean install" (80 -> 25 seconds in my case).
This is even before I start parallelizing the individual modules.

I must confess that I have yet to see a build that really needs clustering for
any other reason than running tests or other individual tasks (javadoc, site,
etc.). I think I'd be inclined to just distribute those specific tasks in a
cluster. If you actually had a decent model of inter-lifecycle-phase
dependencies ("requiredForStarting" between phases), you could probably
achieve good results by keeping lifecycle execution centralized but
distributing plugin execution (a toy sketch of what I mean follows below).
I suppose I may be narrow-minded on this last one...

I will be starting to look at the DefaultLifecycleExecutor with out-of-order
execution in mind, and maybe dabble around a little.
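
To illustrate what I mean by a "requiredForStarting" model, here is a toy
sketch in plain Java. It has nothing to do with the actual
DefaultLifecycleExecutor code; the module names, phase list and class are all
made up for illustration. Each module runs its lifecycle on its own thread,
but only starts once every upstream module has reached "install", so upstream
test phases overlap with downstream compilation:

import java.util.List;
import java.util.Map;
import java.util.concurrent.*;

/**
 * Toy sketch of a "requiredForStarting" dependency between module lifecycles:
 * a downstream module starts as soon as all of its upstream modules have
 * reached a given phase (here "install"), while the upstream modules keep
 * running their remaining phases (tests etc.) concurrently.
 * All names are invented for illustration; this is not existing Maven API.
 */
public class RequiredForStartingSketch {

    static final List<String> PHASES = List.of(
            "validate", "compile", "package", "install",
            "test", "integration-test", "deploy");
    static final String REQUIRED_FOR_STARTING = "install";

    // Module name -> upstream modules it depends on (a tiny hard-coded reactor).
    static final Map<String, List<String>> UPSTREAM = Map.of(
            "core", List.of(),
            "service", List.of("core"),
            "webapp", List.of("core", "service"));

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(UPSTREAM.size());
        // One latch per module, released once that module has reached "install".
        Map<String, CountDownLatch> reachedInstall = new ConcurrentHashMap<>();
        UPSTREAM.keySet().forEach(m -> reachedInstall.put(m, new CountDownLatch(1)));

        for (String module : UPSTREAM.keySet()) {
            pool.submit(() -> {
                for (String upstream : UPSTREAM.get(module)) {
                    reachedInstall.get(upstream).await(); // wait for upstream "install"
                }
                for (String phase : PHASES) {
                    System.out.printf("[%s] %s%n", module, phase);
                    Thread.sleep(100); // stand-in for the real plugin executions
                    if (phase.equals(REQUIRED_FOR_STARTING)) {
                        // Downstream modules may start while "test" etc. still run here.
                        reachedInstall.get(module).countDown();
                    }
                }
                return null;
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}

The same structure would work if "requiredForStarting" were configurable per
phase instead of being hard-coded to "install".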
Kristian


On Fri, 20.11.2009 at 06:29 -0800, Dan Fabulich wrote:
> I've been meaning to reply to your earlier emails (it's been a busy week);
> to this I'll just say that moving the "test" phase after the "install"
> phase is a fascinating idea, which I personally like, but it seems like a
> big violation of the contract for the lifecycle, and I suspect it won't be
> popular. :-(
>
> I've long felt that there should be a phase for testing after "install"
> for similar reasons. This might be SLIGHTLY more popular since users
> would need to explicitly cause their tests to run during this phase.
>
> What about users doing multi-machine builds? Earlier this week I wrote
> that users desiring to do multi-machine parallelism should deploy their
> builds to a remote repository shared between the machines. Should their
> tests run post-deploy?
>
> -Dan
>
>
> Kristian Rosenvold wrote:
>
> > I've been thinking further about parallelity within maven. The proposed
> > solution to MNG-3004 achieves parallelity by analyzing inter-module
> > dependencies and scheduling parallel dependencies in parallel.
> >
> > A simple further evolution of this would be to collect and download all
> > external dependencies for all modules immediately.
> >
> > But this idea has been rummaging in my head while jogging for a week or so:
> >
> > Would it be possible to achieve super-parallelity by describing
> > relationships between phases of the build, and even reordering some of
> > the phases? I'll try to explain:
> >
> > Assume that you can add transactional ACID (or maybe just AID) abilities
> > towards the local repo for a full build. Simply put: all writes to the
> > local repo are done in a per-process-specific instance of the repo that
> > can be rolled back if the build fails (or pushed to the local repo if the
> > build is ok).
> >
> > If you do that you can re-order the life-cycle for most builds to be
> > something like this:
> >
> > validate
> > compile
> > package
> > install
> > test
> > integration-test
> > deploy
> >
> > Notice that I just moved all the "test" phases after the "install" phase.
> > Theoretically you could start any subsequent modules immediately after
> > "install" is done. Running of tests is really the big killer in most
> > multi-module projects I see.
> >
> > Since your commit "push" towards the local repo only happens at the very
> > end of the build, you will not publish artifacts when tests are failing
> > (at least not project output artifacts).
> >
> > You could actually make this a generic model that describes different
> > kinds of dependencies between lifecycle phases of different modules. The
> > dependency I immediately see is "requiredForStarting" - which could be
> > interpreted as meaning that any upstream dependencies must have reached
> > at least that phase before the phase can be started for this project.
> > I'm not sure if there's any value in a generic model, but my perspective
> > may be limited to what I see on a daily basis.
> >
> > Would this be feasible?
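
PS: to make the "transacted local repo" idea above a bit more concrete, here
is a rough sketch of the kind of thing I have in mind. It is purely
illustrative Java; the class and method names are invented and this is not
how the current local repository code works. All installs during a build go
to a private staging area and are only promoted to the real local repo once
the whole build, tests included, has succeeded:

import java.io.IOException;
import java.nio.file.*;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

/**
 * Rough sketch of a "transacted" local repository: all installs during one
 * build go into a private staging directory and are only promoted into the
 * real local repo if the whole build (including tests) succeeded.
 * Purely illustrative; not how Maven's local repository actually works.
 */
public class TransactedLocalRepoSketch {

    private final Path realRepo;
    private final Path staging;

    public TransactedLocalRepoSketch(Path realRepo) throws IOException {
        this.realRepo = realRepo;
        this.staging = Files.createTempDirectory("repo-tx-" + UUID.randomUUID());
    }

    /** "install" writes only into the per-build staging area. */
    public void install(String relativePath, byte[] artifact) throws IOException {
        Path target = staging.resolve(relativePath);
        Files.createDirectories(target.getParent());
        Files.write(target, artifact);
    }

    /** Promote everything to the real local repo once the build is known good. */
    public void commit() throws IOException {
        try (Stream<Path> files = Files.walk(staging)) {
            for (Path source : (Iterable<Path>) files.filter(Files::isRegularFile)::iterator) {
                Path target = realRepo.resolve(staging.relativize(source).toString());
                Files.createDirectories(target.getParent());
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }

    /** Throw the staged artifacts away if the build failed. */
    public void rollback() throws IOException {
        try (Stream<Path> files = Files.walk(staging)) {
            files.sorted(Comparator.reverseOrder())
                 .forEach(p -> p.toFile().delete());
        }
    }
}

A reactor build would of course have to resolve against the staging area
before the real repo, and sharing that staging area over a network file
system is exactly where the serialization issues mentioned above would
show up.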
