On Wed, 2010-01-06 at 18:36 -0800, Dan Fabulich wrote:
> Kristian Rosenvold wrote:
>
> > In this process I removed your original implementation, simply because
> > it allowed me to work freely in simplifying my own implementation (and I
> > truly believe I managed to make some good simplifications). I also
> > considered that I'd re-add your implementation as a third strategy
> > when/if needed - it'll only take me an hour or two.
>
> If it'd only take you an hour or two, I suggest we try it. I can take a
> crack at it if you're busy, but I'm not quite seeing where to plug in at
> the moment.
>
> > as far as I can see it will work for every single project right out of
> > the box
>
> Yes, that would be nice :-)

I will re-add your stuff, and I will also set it up to use my output
demultiplexer that causes output to appear in "normal" order. We'll just
try to close this (hugely interesting IMO!) discussion first. "An hour or
two" usually maps to 4-5 hours real time, so I want to try aiming for the
best solution.

> > As for "weave" mode:
>
> > Each ExecutionPlanItem now potentially has a "schedule" attached to it,
> > that describes external requirements that must be met before the item
> > can be executed. This schedule is determined by a declarative
> > representation attached to class "DefaultLifecycles" (method
> > DefaultLifecycles.createExecutionPlanItem).
>
> That sounds more aggressive than what I was imagining. I was only
> imagining splitting the work into lifecycle phases; if I understand you
> correctly, you're splitting to individual mojo executions, which is much
> more fine-grained.

Actually I'm saying that the schedule controls this splitting. We could
support multiple (even user-defined) schedules. See the earlier mail I
sent today.

> So I was imagining running the entire compile phase of project X
> sequentially in a thread, concurrent with the compile phase of project Y;
> dependent project Z wouldn't begin compiling until X and Y had finished
> compiling, but X and Y could begin testing while Z was compiling.

Yes, this is what I initially did. And it worked - it still works. This is
what the current solution still does, but I got a little greedy - maybe
too much, but that's easily fixable ;)

There is one very important restriction I ended up with: in all normal
scheduling cases each single module builds on 1 thread only. The only
thing that can happen to this thread is that it may wait for some upstream
ExecutionPlanItems to complete. Which (if any) ExecutionPlanItem you're
waiting for is controlled by the ProjectDependencyGraph *and* the schedule
in use.

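The X/Y/Z example and the one-thread-per-module restriction can be sketched roughly as follows. This is only an illustration under assumptions, not the actual implementation: the class WeaveSketch, the latch-per-phase bookkeeping and the rule "compile waits for upstream compile" are all made up for the example, and nothing here uses real Maven classes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the Maven code base.
class WeaveSketch {
    static final List<String> PHASES = List.of("compile", "test");

    // Module -> upstream modules. Z depends on X and Y, as in the example.
    static final Map<String, List<String>> UPSTREAM = Map.of(
            "X", List.<String>of(),
            "Y", List.<String>of(),
            "Z", List.of("X", "Y"));

    /** Runs every module on its own thread and returns the order in which phases ran. */
    static List<String> run() {
        // One latch per (module, phase); it opens when that phase has finished.
        Map<String, CountDownLatch> done = new ConcurrentHashMap<>();
        for (String module : UPSTREAM.keySet()) {
            for (String phase : PHASES) {
                done.put(module + ":" + phase, new CountDownLatch(1));
            }
        }

        List<String> log = new CopyOnWriteArrayList<>();
        // One thread per module, so a waiting module never starves another.
        ExecutorService pool = Executors.newFixedThreadPool(UPSTREAM.size());
        for (String module : UPSTREAM.keySet()) {
            pool.execute(() -> {
                try {
                    for (String phase : PHASES) {
                        // Assumed schedule: only "compile" waits, and only for
                        // "compile" in the upstream modules.
                        if (phase.equals("compile")) {
                            for (String up : UPSTREAM.get(module)) {
                                done.get(up + ":compile").await();
                            }
                        }
                        log.add(module + ":" + phase); // stand-in for running the phase
                        done.get(module + ":" + phase).countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("build did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The exact interleaving differs from run to run; the only guarantees are that Z:compile is logged after X:compile and Y:compile, and that each module runs its own phases in order. X and Y are free to reach "test" while Z is still waiting or compiling.
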
> It seems like you're splitting the phase up into pieces, too; so
> individual mojos of project X could run concurrently with each other,
> which I was *not* imagining.

No, not for the normal case. Actually you *can* make this happen if you
enable "force lifecycle violations", which will permit you to fork a
single "something" (mojo or all the mojos in a phase) as a thread of its
own. I am very unsure if this mode should be included at all, especially
since I have been unable to get it to give me significantly "more"
concurrency. (I think this is because at the place I tried this, we are
already saturating all the CPU there is - I was trying surefire in the
unit test phase.)

> The problem with splitting down to the execution item is that then you
> need to know the dependencies between execution items (if any); the
> dependencies need to be expressed in mojo metadata, etc.
> But if we only split down to lifecycle phase, well, we know the ordering
> of lifecycle phases, and as long as we run them in that order, we're
> guaranteed correctness, and we don't have to add mojo metadata, right?

Yes. I think that any successful "weave" implementation needs to be solved
mostly without additional mojo metadata. Even though I can fantasize about
proxying parts of the Maven model, that's mostly "maven 5.0" territory ;)

> (Well, except maybe declaring that certain mojos need to be
> synchronized...?)

As I already said, the current schedule describes the "synchronized"
metadata-aspect of selected mojos; I needed this because they had
non-thread-safe interactions with parts of the Maven model. So already
there I lost it ;)

I have tried to make an implementation that reduces this problem to
describing the semantics of the "waiting" dependencies in a proper way.
When, and what, should control if a phase or mojo is allowed to start
execution?

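One way to picture that "is this phase allowed to start?" question is a small declarative table plus a predicate. The sketch below is an assumption-laden illustration only: the class ScheduleSketch, the WAITS_FOR table, the canStart method and the "module:phase" string format are all hypothetical and do not reflect how DefaultLifecycles actually encodes its schedule.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the rule format and all names are assumptions.
class ScheduleSketch {
    // Phase -> the phase that must be complete in every upstream module
    // before this phase may start. Phases without an entry never wait.
    static final Map<String, String> WAITS_FOR = Map.of(
            "compile", "compile",
            "test", "compile");

    // Module -> upstream modules, taken from the project dependency graph.
    static final Map<String, List<String>> UPSTREAM = Map.of(
            "X", List.<String>of(),
            "Y", List.<String>of(),
            "Z", List.of("X", "Y"));

    /**
     * Returns true if the given phase of the given module may start.
     * "completed" holds entries such as "X:compile".
     */
    static boolean canStart(String module, String phase, Set<String> completed) {
        String required = WAITS_FOR.get(phase);
        if (required == null) {
            return true; // no rule: only the module's own phase order applies
        }
        for (String up : UPSTREAM.get(module)) {
            if (!completed.contains(up + ":" + required)) {
                return false; // an upstream module has not finished the required phase
            }
        }
        return true;
    }
}
```

With this table, canStart("Z", "compile", Set.of("X:compile")) is false because Y has not compiled yet, while canStart("X", "test", Set.of()) is true because X has no upstream modules. Swapping in a different table would give a different schedule without touching the predicate.
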
> To put names on this, I think there are three "granularities" under
> consideration:
>
> * Project granularity (my first attempt in the MNG-3004 branch)
> * Phase granularity (my intended description of "weave" mode)
> * Mojo granularity (your highly concurrent implementation)

I think we should stick to the first two, because they describe
significantly different directions of execution. As I hope I made clear, I
am dubious about the effect/value of letting a single module be built by
more than one thread. But I just had to try ;)

To be any better than "Project Granularity", weave mode needs to be
running in different phases for different modules concurrently. Otherwise
it degrades to become almost the same as "Project Granularity". From your
example above, X and Y are testing while the downstream Z is still in
compile.

> (In practice, I think you would have something very close to phase
> granularity just by assuming that every mojo was "output dependent." Or
> am I misunderstanding?)

You're understanding it ;) The only adjustment that seems to be needed is
to make the grammar for expressing the dependencies a bit broader.
Currently you can only say "compile" is outputDependenant upon itself,
meaning it'll wait for "compile" in all upstream projects to finish before
proceeding. We also need to be able to specify the explicit target of the
dependency, so you could say "test" is outputDependant on "compile" in all
upstream modules. If you additionally add an outputDependency on itself to
ALL subsequent phases, I think that should solve most of the problems with
this model.

Which brings me to another high-level concern here: all I *really*
want/need is to run test in the exact manner you are describing. In
reality I'm not sure if the later-phase concurrencies (war, install etc.)
provide any real value, and some may even have negative contributions in
some cases (I/O thrashing due to concurrency comes to mind).
I run my builds on a ramdisk in Linux, so I don't suffer, but those poor
Windows users with their crappy IO and virus controls are not as lucky.

I am tempted to reduce/trim the schedule descriptions to fit just this ONE
use case perfectly. I'd still be running massively concurrent /until/ the
compile phase, but after that just concurrently schedule the tests (as Dan
describes above), and immediately after that just proceed with regular
sequential reactor building. I haven't used the shade plugin, but would
that be needed in the test phase of subsequent modules?

> My long-term goal is that Maven should run by default in "concurrent" mode
> where threads = 1; optionally, users can crank up the number of threads
> *without* changing the execution strategy.

Good long-term goal. Maybe even default to threads = numCores;

Kristian