I've looked over the code and thought a bit further about the constraints involved, and given that:
- Multi-module reactor builds are the only interesting targets of multithreading.
- Reactor builds do not use the "install" output of their upstream dependencies
  (I was not aware of that ;)

You do not have to re-order anything at all. An implementation could just:

A) Immediately fork 1 thread per module for all modules.
B) For the phases compile, install and deploy, a given module can only proceed
   when all its upstream dependencies have completed the same phase.

There's still a chance of leaking artifacts to the local repository if an
upstream deploy fails after install, and the general idea of a transacted repo
would still be nice for staying consistent.

I'm still a bit unsure about B) above; it may be a bit limiting in terms of
other usage scenarios. I'm also a bit unsure how that'd fit in with all the
other activities in the lifecycle.

An alternative would be to make a declarative representation of phase
interdependencies that could express multiple types of concurrency
interdependencies. (But I consistently only see one dependency type -
upstreamMustFinishBeforeThisCanStart...?)

Would it float?

Kristian

On Sat, 21.11.2009 at 11:40 +0000, Stephen Connolly wrote:
> In m3 (which is what we are talking about) AFAIK we can have a
> listener that waits for the end of the start of the deploy phase
> and/or the end of execution.
>
> With a customized install plugin, we could just install to the
> "transaction" repository. The listener can then block until the
> criteria have been met (allowing other modules to progress). That would
> achieve what you're after... namely, produce the artifacts for
> consumption by the other modules before running test and
> integration-test.
> Once the criteria have been met, we either fail the
> module or we move the artifacts from the "transactional" local repo to
> the real local repo and allow the lifecycle to continue.
>
> -Stephen
>
> 2009/11/21 Kristian Rosenvold <[email protected]>:
> > I seem to understand that there's room for several different
> > types of solution here;
> >
> > Starting with the single-machine solution; I now understand that
> > you could start forking downstream builds straight after
> > compile in a reactor build, maybe after install in other cases.
> >
> > In this scenario I think each module is dependent on all upstream
> > modules successfully achieving "install" before proceeding to "deploy".
> > I really think it's important to avoid leaking artifacts that do not
> > have their own (and all upstream) lifecycle requirements fulfilled.
> >
> > When it comes to clustering there may be several approaches:
> > If you decide to publish artifacts through "deploy" to any kind
> > of repo I believe these need to have all lifecycle requirements met,
> > which to my current understanding seems orthogonal to local out-of-order
> > execution.
> >
> > Wouldn't it be feasible to distribute the "local" and perhaps
> > "transacted local" repo inside the cluster using network
> > file sharing? One would still have to solve serialization issues
> > and using installed artifacts in a reactor build..?
> >
> > The clustering case seems like a much harder task than achieving
> > full local concurrency. I did some fairly extensive measurements
> > with my current build when I set up concurrent spring/junit testing:
> >
> > Missing concurrency in classloading is the most important reason
> > why unit tests run slowly (classloading is strictly a synchronized
> > business until jdk7).
> > By running tests out of order on my local
> > unit test build I am fairly certain I could reduce run-time
> > for "mvn clean install" to something much closer to "mvn
> > -Dmaven.test.skip=true clean install" (80->25 seconds in my case).
> > This is even before I start parallelizing the individual modules.
> >
> > I must confess that I've yet to see a build that really needs
> > clustering for any other reason than running tests or other individual
> > tasks (javadoc, site etc). I think I'd be inclined to just distribute
> > those specific tasks in a cluster. If you actually had a decent model of
> > inter-lifecycle phase dependencies (requiredForStarting between phases),
> > you could probably achieve good results by keeping lifecycle execution
> > centralized but distributing plugin execution?
> >
> > I suppose I may be narrow-minded on this last one...
> >
> > I will be starting to look at the DefaultLifeCycleExecutor with thoughts
> > of out-of-order execution, maybe dabble around a little.
> >
> > Kristian
> >
> > On Fri, 20.11.2009 at 06:29 -0800, Dan Fabulich wrote:
> >> I've been meaning to reply to your earlier emails (it's been a busy week);
> >> to this I'll just say that moving the "test" phase after the "install"
> >> phase is a fascinating idea, which I personally like, but it seems like a
> >> big violation of the contract for the lifecycle, and I suspect it won't be
> >> popular. :-(
> >>
> >> I've long felt that there should be a phase for testing after "install"
> >> for similar reasons. This might be SLIGHTLY more popular since users
> >> would need to explicitly cause their tests to run during this phase.
> >>
> >> What about users doing multi-machine builds? Earlier this week I wrote
> >> that users desiring to do multi-machine parallelism should deploy their
> >> builds to a remote repository shared between the machines. Should their
> >> tests run post-deploy?
> >>
> >> -Dan
> >>
> >>
> >> Kristian Rosenvold wrote:
> >>
> >> > I've been thinking further about parallelism within Maven. The proposed
> >> > solution to MNG-3004 achieves parallelism by analyzing inter-module
> >> > dependencies and scheduling independent modules in parallel.
> >> >
> >> > A simple further evolution of this would be to collect and download all
> >> > external dependencies for all modules immediately.
> >> >
> >> > But this idea has been rummaging in my head while jogging for a week or
> >> > so:
> >> >
> >> > Would it be possible to achieve super-parallelism by describing
> >> > relationships between phases of the build, and even reordering some of
> >> > the phases? I'll try to explain:
> >> >
> >> > Assume that you can add transactional ACID (or maybe just AID) abilities
> >> > towards the local repo for a full build. Simply put: all writes to a
> >> > local repo are done in a per-process-specific instance of the repo, which
> >> > can be rolled back if the build fails (or pushed to the local repo if
> >> > the build is ok).
> >> >
> >> > If you do that you can re-order the life-cycle for most builds to be
> >> > something like this:
> >> >
> >> > validate
> >> > compile
> >> > package
> >> > install
> >> > test
> >> > integration-test
> >> > deploy
> >> >
> >> > Notice that I just moved all the "test" phases after the "install" phase.
> >> > Theoretically you could start any subsequent modules immediately after
> >> > "install" is done. Running of tests is really the big killer in most
> >> > multi-module projects I see.
> >> >
> >> > Since your commit "push" towards the local repo only happens at the very
> >> > end of the build, you will not publish artifacts when tests are failing
> >> > (at least not project output artifacts).
> >> >
> >> > You could actually make this a generic model that describes different
> >> > kinds of dependencies between lifecycle phases of different modules. The
> >> > dependency I immediately see is "requiredForStarting" - which could be
> >> > interpreted as meaning that any upstream dependencies must have reached
> >> > at least that phase before the phase can be started for this project.
> >> > I'm not sure if there's any value in a generic model, but my perspective
> >> > may be limited to what I see on a daily basis.
> >> >
> >> > Would this be feasible?
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
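
For reference, the "transacted local repo" idea from the quoted proposal could look roughly like the sketch below. It is only an illustration under assumptions: the class TransactedLocalRepo and its install/commit/rollback methods are hypothetical names, not part of Maven, and commit is a plain file-by-file move rather than a truly atomic operation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: every write goes to a per-build staging directory
// and only reaches the real local repo on commit().
class TransactedLocalRepo {
    private final Path realRepo;
    private final Path staging;

    TransactedLocalRepo(Path realRepo) {
        try {
            this.realRepo = Files.createDirectories(realRepo);
            this.staging = Files.createTempDirectory("staging-repo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "install": the artifact is written to the staging repo only.
    void install(String relativePath, byte[] content) {
        try {
            Path target = staging.resolve(relativePath);
            Files.createDirectories(target.getParent());
            Files.write(target, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Build succeeded: move all staged files into the real repo.
    void commit() {
        try {
            List<Path> staged;
            try (Stream<Path> files = Files.walk(staging)) {
                // Collect first so we do not move files while still walking.
                staged = files.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            for (Path src : staged) {
                Path dest = realRepo.resolve(staging.relativize(src).toString());
                Files.createDirectories(dest.getParent());
                Files.move(src, dest, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        rollback(); // removes the now-empty staging directories
    }

    // Build failed: discard everything that was staged.
    void rollback() {
        if (!Files.exists(staging)) {
            return;
        }
        try (Stream<Path> files = Files.walk(staging)) {
            // Reverse order deletes children before their parent directories.
            for (Path p : files.sorted(Comparator.reverseOrder()).collect(Collectors.toList())) {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A build would call install(...) during the lifecycle and then exactly one of commit() or rollback() at the very end, so a failing test never leaves project artifacts in the real local repo.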

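
The A)/B) scheme at the top of this message (one thread per module, with compile, install and deploy gated on the upstream modules having finished the same phase) could be sketched roughly as below. This is a minimal illustration, not Maven code: PhaseGatedReactor, the phase list and the module names are assumptions made for the example, each phase merely records that it ran, and failure handling (what downstream modules do when an upstream phase fails) is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of "fork one thread per module, gate some phases".
class PhaseGatedReactor {
    // Simplified lifecycle; only the phases in GATED wait for upstream modules.
    static final List<String> PHASES = List.of("compile", "test", "install", "deploy");
    static final Set<String> GATED = Set.of("compile", "install", "deploy");

    // done.get(module).get(phase) opens once that module has finished that phase.
    private final Map<String, Map<String, CountDownLatch>> done = new ConcurrentHashMap<>();
    private final Map<String, List<String>> upstream;
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    // upstream maps each module name to the modules it depends on.
    PhaseGatedReactor(Map<String, List<String>> upstream) {
        this.upstream = upstream;
        for (String module : upstream.keySet()) {
            Map<String, CountDownLatch> perPhase = new HashMap<>();
            for (String phase : PHASES) {
                perPhase.put(phase, new CountDownLatch(1));
            }
            done.put(module, perPhase);
        }
    }

    // A) Immediately fork one thread per module, then wait for all of them.
    void build() {
        List<Thread> threads = new ArrayList<>();
        for (String module : upstream.keySet()) {
            Thread t = new Thread(() -> runModule(module), module);
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("build interrupted", e);
        }
    }

    // B) A gated phase starts only after every upstream module finished it.
    private void runModule(String module) {
        try {
            for (String phase : PHASES) {
                if (GATED.contains(phase)) {
                    for (String dep : upstream.get(module)) {
                        done.get(dep).get(phase).await();
                    }
                }
                log.add(module + ":" + phase); // stand-in for real plugin execution
                done.get(module).get(phase).countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new HashMap<>();
        modules.put("core", List.of());
        modules.put("app", List.of("core"));
        PhaseGatedReactor reactor = new PhaseGatedReactor(modules);
        reactor.build();
        System.out.println(reactor.log);
    }
}
```

In this sketch "test" is deliberately ungated, so a downstream module can run its tests while an upstream module is still testing; only compile, install and deploy wait for upstream. A declarative model of phase interdependencies would replace the hard-coded GATED set.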