Re: MNG-5896: Speed up builds by parallel POM downloads
Thank you all for your feedback so far.

So it seems we'll have to wait and see what happens with Aether first... That's an interesting topic of its own. I'll start a new thread about that.

Regards,
Harald

2015-12-27 22:56 GMT+01:00 Milos Kleint:
> [...]
Re: MNG-5896: Speed up builds by parallel POM downloads
Nice work! Looking forward to having this integrated into the Maven codebase.

We've solved the same problem internally by developing a service plus a Maven extension that downloads the Maven POM files in bulk (as one or more zips). We use Bamboo AWS elastic agents a lot, and on a clean local repository this can save a minute or two for large builds when your Nexus proxy is also hosted in AWS, and even more when connecting to Central.

Milos

On Sat, Dec 26, 2015 at 9:39 PM, Harald Wellmann wrote:
> [...]
Re: MNG-5896: Speed up builds by parallel POM downloads
This is interesting. I think the plan is to move Aether under the Maven project (at least, I was asked to grant my contributions to Aether to allow dual licensing under ASL on the basis of moving the code here... still not sure what the status is), so I suspect that until that gets resolved, Aether will be a fixed point.

Once it gets resolved, this sounds like a good contribution, but I am on my phone so cannot say for sure. Keep an eye out for news of Aether's code landing here/elsewhere and re-ping at that point if we have forgotten ;-)

On Saturday 26 December 2015, Harald Wellmann wrote:
> [...]

--
Sent from my phone
MNG-5896: Speed up builds by parallel POM downloads
When building a project with dependencies not yet available in the local repository, I noticed that Maven first downloads the dependency POMs sequentially and then proceeds to download the dependency JARs with up to 5 threads in parallel. This is not optimal when the POMs are served by a high-latency repository manager.

There wasn't a lot of feedback on my enhancement request [1] or my original StackOverflow question [2], so I started digging into the source code and ended up with a patch [3].

The patch only affects Aether, not Maven Core, but since Aether is not the top-level project from the end user perspective and doesn't appear to be very active, I thought I'd better contact this mailing list first.

The basic idea of the patch is a clean separation of POM downloading from POM processing in DefaultDependencyCollector. Once these steps are separated, it is possible to download dependency POMs asynchronously and in parallel, while still processing the POM models sequentially to build the dependency graph in the correct order.

Since DefaultDependencyCollector holds a lot of global state and has rather long methods, I started by refactoring this class into smaller chunks to make the underlying logic more transparent. That's why the patch looks a bit large, but essentially it only affects a single original class.

I did a local build of Maven 3.4.0-SNAPSHOT using Aether 1.1.0-SNAPSHOT with my patch, with all tests passing.

I also ran maven-integration-testing on this patched Maven 3.4.0-SNAPSHOT, with no new tests failing. (There is just one test which has been broken since a recent change on trunk, see [4].)

Thanks for reading this far - it would be great if someone would take the time to look into the issue and the patch, and advise on how to proceed.
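The separation of downloading from processing described above can be illustrated with a minimal Java sketch. This is not code from the patch and uses no Aether or Maven APIs: the class ParallelPomFetchSketch, the methods downloadPom and collect, and the simulated 50 ms latency are all hypothetical. It also simplifies by handling a single, flat list of dependencies, whereas a real collector discovers further dependencies while it processes each POM.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names are Aether or Maven APIs.
class ParallelPomFetchSketch {

    // Stand-in for fetching one POM from a high-latency repository.
    static String downloadPom(String coordinates) throws InterruptedException {
        Thread.sleep(50); // simulated network latency (assumption)
        return "<project><!-- " + coordinates + " --></project>";
    }

    // Starts all downloads up front, then consumes the results strictly in
    // declaration order so that processing stays sequential and deterministic.
    static List<String> collect(List<String> dependencies, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Phase 1: submit every download. LinkedHashMap preserves the
            // order in which the dependencies were declared.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String dep : dependencies) {
                Callable<String> task = () -> downloadPom(dep);
                pending.put(dep, pool.submit(task));
            }
            // Phase 2: process one POM at a time. get() blocks only until
            // that particular download has finished.
            List<String> processed = new ArrayList<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                String pom = entry.getValue().get();
                processed.add(entry.getKey() + " (" + pom.length() + " chars)");
            }
            return processed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> deps = List.of("g:a:1.0", "g:b:2.0", "g:c:3.0");
        for (String line : collect(deps, 5)) {
            System.out.println(line);
        }
    }
}
```

The point of the two phases is that the waiting time for the downloads overlaps, while the order in which results are consumed is fixed by the input list and not by which download happens to finish first.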
[1] https://issues.apache.org/jira/browse/MNG-5896
[2] http://stackoverflow.com/questions/32299902/parallel-downloads-of-maven-artifacts
[3] https://github.com/hwellmann/aether-core/commit/cdab4c40094ccf621370647f83ecda54684066ce
[4] https://builds.apache.org/job/maven-3.3-release-status-test-linux/lastCompletedBuild/testReport/

Regards,
Harald

To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
For additional commands, e-mail: dev-h...@maven.apache.org