Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
After learning from gitflow-incremental-builder how to remove modules from the mavenSession when we want to skip them, I implemented a "version as a hash of sources and dependency tree" solution: https://github.com/avodonosov/hashver-maven-plugin

It relies on using property expressions as versions. A build extension loads the values for those properties from a file. This can be a file with "normal" Maven versions maintained manually in source control, or a file generated by the "hashver" mojo when the user wants to use hash versions and avoid building unchanged modules. The extension allows skipping a module's build if an artifact of the same version already exists.

Using property expressions for versions has two drawbacks:

1. Calculating the hash versions has to be a separate Maven invocation. It can't be done in the same Maven session as the main goal, because calculating the hash versions requires building the dependency tree, and after the dependency tree is built it's impossible (I believe) to inject the newly calculated versions into the Maven structures so that they are in effect when the main goal is executed.
2. The approach is a bit intrusive: the user needs to adjust their pom.xml files to use property expressions in place of versions.

The advantage of this approach is that Maven downloads the artifacts of skipped modules automatically if such a module is a dependency of a changed module.

If it were possible to annotate artifacts with the "src hash" using some attribute other than the version, and to hook into Maven's artifact download logic to find artifacts by this attribute, then generation of hash versions could be done in the same session as the main goal, and the solution could probably be applied without any modification of the user's pom.

I noticed that when uploading snapshot artifacts, Maven adds a timestamp (and a sequential "build number"?) to the artifact name.
And when resolving artifacts, a special versionResolver object is invoked, which for a SNAPSHOT sends a metadata request to the repository to retrieve the real, timestamped name of the artifact corresponding to the snapshot. This logic could be adjusted to use the "src hash" instead of the timestamp, and to look up in the remote repo an artifact matching the current module's "src hash". Any advice on how to do that?

In the long run, I believe it is desirable for Maven to natively support the notion of a "build inputs hash" for modules and artifacts. It allows significant build time savings while still being deterministic and stable.

05.02.2020, 01:26, "Jason Young":
> Good questions. First of all, this plugin is CI-agnostic, but it does require the project to exist in a `git` repository, whether that is in CI or on your machine. Check the github page I linked to for more instructions on how it determines what projects in a reactor are considered "changed" and need to be built versus which are not changed and will be omitted from the reactor.
>
> In every Maven build, every dependency is checked this way:
>
> 1. If it is a project in the reactor, use the artifact of that project.
> 2. Otherwise, if the artifact is in the local repo, use that artifact.
> 3. Last resort: Download from the remote repository.
>
> There are some other rules omitted above, e.g. when to download a fresh SNAPSHOT artifact based on your chosen snapshot policy, etc., but that's the gist of it: Maven will obtain the artifact if it is not present, no further configuration needed.
>
> E.g. let's say you have one project that names 2 other projects A and B as submodules, and A depends on B. If you run `mvn install -pl A` (NOT SURE about that syntax), then Maven will look for B.jar from your local repo, and resort to checking your remote repo (e.g. Maven Central) if it's not there. But if you omit the `-pl A` part, Maven will build B, then build A using B.jar.
>
> Essentially, the plugin I linked to determines the project list based on what has changed and what has not. Maven then decides whether to use B/target/B.jar, ~/.m2/repository/.../B.jar, or to look to Maven Central.
>
> HTH.
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Good questions. First of all, this plugin is CI-agnostic, but it does require the project to exist in a `git` repository, whether that is in CI or on your machine. Check the github page I linked to for more instructions on how it determines what projects in a reactor are considered "changed" and need to be built versus which are not changed and will be omitted from the reactor.

In every Maven build, every dependency is checked this way:

1. If it is a project in the reactor, use the artifact of that project.
2. Otherwise, if the artifact is in the local repo, use that artifact.
3. Last resort: Download from the remote repository.

There are some other rules omitted above, e.g. when to download a fresh SNAPSHOT artifact based on your chosen snapshot policy, etc., but that's the gist of it: Maven will obtain the artifact if it is not present, no further configuration needed.

E.g. let's say you have one project that names 2 other projects A and B as submodules, and A depends on B. If you run `mvn install -pl A` (NOT SURE about that syntax), then Maven will look for B.jar from your local repo, and resort to checking your remote repo (e.g. Maven Central) if it's not there. But if you omit the `-pl A` part, Maven will build B, then build A using B.jar.

Essentially, the plugin I linked to determines the project list based on what has changed and what has not. Maven then decides whether to use B/target/B.jar, ~/.m2/repository/.../B.jar, or to look to Maven Central.

HTH.

On Tue, Feb 4, 2020 at 4:08 PM Anton Vodonosov wrote:
> 04.02.2020, 23:32, "Jason Young":
> > Not what you're looking for, but maybe useful: We use one plugin that will skip whole projects that have not changed WRT a given Git branch: https://github.com/vackosar/gitflow-incremental-builder. With careful configuration, this is an effective shortcut without sacrificing repeatability.
>
> How do you use it? I mean if you need to start full system, (locally or deploying it to a qa server), but the plugin has only built changed modules, how do you download the rest?
>
> Do you use this plugin in CI?
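The three-step lookup order listed in this message can be sketched in a few lines of Python. This is only an illustration of the order of checks, not Maven's resolver: the function name `resolve_artifact`, the plain dictionaries standing in for the reactor, the local repo and the remote repo, and the coordinate strings are all hypothetical. Snapshot update policies and the other omitted rules are ignored.

```python
def resolve_artifact(coords, reactor, local_repo, remote_repo):
    """Return (source, location) for an artifact, checking in Maven's order.

    All three repositories are plain dicts here (an assumption made for
    illustration): coordinates -> location of the artifact file.
    """
    # 1. A project that is part of the current reactor wins.
    if coords in reactor:
        return ("reactor", reactor[coords])
    # 2. Otherwise use the copy in the local repository, if any.
    if coords in local_repo:
        return ("local", local_repo[coords])
    # 3. Last resort: download from the remote repository. Maven keeps the
    #    downloaded file in the local repository for later builds.
    if coords in remote_repo:
        local_repo[coords] = remote_repo[coords]
        return ("remote", remote_repo[coords])
    raise LookupError(f"artifact not found anywhere: {coords}")
```

With A in the reactor and B only in the local repo, building A alone would take B from the second step, which matches the `-pl A` example in the message.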
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
04.02.2020, 23:32, "Jason Young":
> Not what you're looking for, but maybe useful: We use one plugin that will skip whole projects that have not changed WRT a given Git branch: https://github.com/vackosar/gitflow-incremental-builder. With careful configuration, this is an effective shortcut without sacrificing repeatability.

How do you use it? I mean, if you need to start the full system (locally, or by deploying it to a QA server), but the plugin has only built the changed modules, how do you download the rest?

Do you use this plugin in CI?

-
To unsubscribe, e-mail: users-unsubscr...@maven.apache.org
For additional commands, e-mail: users-h...@maven.apache.org
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
It seems Maven itself never omits re-doing anything except for downloading artifacts from a remote repository. The command you give to Maven and the configuration of your projects dictate what it will do, no matter what happened in the previous build.

You _can_ omit projects in a multi-module project by manually specifying which projects to run: https://blog.sonatype.com/2009/10/maven-tips-and-tricks-advanced-reactor-options/

Not what you're looking for, but maybe useful: We use one plugin that will skip whole projects that have not changed WRT a given Git branch: https://github.com/vackosar/gitflow-incremental-builder. With careful configuration, this is an effective shortcut without sacrificing repeatability.

Gradle advertises as a feature that it will not rebuild if rebuilding is not required, or something to that effect. I assume some configuration is sometimes required.

On Tue, Feb 4, 2020 at 1:57 PM Anton Vodonosov wrote:
> Ha, only after completing the script (even though a slow one) I discovered that maven rebuilds modules even if an artifact of the same version already exists in artifact repository.
>
> I hoped maven, in case a non -SNAPSHOT artifact found in an artifact repository will just use it and won't build the same version of a module again.
>
> Is there a way to tell maven to do so?
>
> Best regards,
> - Anton
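For readers who script their builds, the manual project selection mentioned above might be wrapped like this. It is a small sketch in Python that only assembles a command line; the helper name `reactor_command` is hypothetical, while `-pl` (`--projects`), `-am` (`--also-make`) and `-amd` (`--also-make-dependents`) are the standard Maven reactor options.

```python
import shlex


def reactor_command(goal, projects, also_make=False, also_make_dependents=False):
    """Build an argument list that restricts the reactor to some projects."""
    if not projects:
        raise ValueError("at least one project is required")
    args = ["mvn", goal, "-pl", ",".join(projects)]
    if also_make:
        args.append("-am")   # also build what the listed projects depend on
    if also_make_dependents:
        args.append("-amd")  # also build what depends on the listed projects
    return args


# Show the command instead of running it.
print(shlex.join(reactor_command("install", ["A"])))
```

The list can be passed to `subprocess.run` as-is; printing it first makes it easy to check what would be executed.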
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Ha, only after completing the script (even though a slow one) did I discover that Maven rebuilds modules even if an artifact of the same version already exists in the artifact repository.

I had hoped that Maven, when a non-SNAPSHOT artifact is found in an artifact repository, would just use it and not build the same version of the module again.

Is there a way to tell Maven to do so?

Best regards,
- Anton
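The behaviour hoped for here (reuse an existing artifact instead of rebuilding the same version) can be expressed as a small planning step outside Maven. The Python sketch below is not an existing Maven feature and not code from any plugin; `plan_build`, the dictionary of module versions and the set of published artifacts are hypothetical stand-ins for information a script or extension would have to collect itself.

```python
def plan_build(module_versions, published):
    """Split modules into (to_build, to_reuse).

    module_versions: dict of module name -> version (hypothetical input)
    published: set of (module name, version) pairs already in the repository
    """
    to_build, to_reuse = [], []
    for module, version in sorted(module_versions.items()):
        if (module, version) in published:
            to_reuse.append(module)   # this exact version was built before
        else:
            to_build.append(module)   # no artifact for this version yet
    return to_build, to_reuse
```

This only makes sense for versions that identify the build inputs, which is why it pairs naturally with hash-based versions rather than with SNAPSHOTs.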
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Thomas Broyer, Enrico Olivelli,

I consider the whole directory where the module's pom.xml resides, excluding the target/ dir, as the input, and the final module artifacts as the output. Even if some plugins allow sources outside the pom.xml's directory (out of curiosity, is that possible?), it is an acceptable restriction on project structure, IMO.

The version hash approach I described may cause some redundant work. For example, if only dependencies changed, most often only re-testing is needed; re-compilation and re-packaging are not necessary (unless your dependency generates or instruments code at compilation time, or you package an uberjar or war, which includes dependency artifacts).

But for stability it is better to compromise and accept some redundant work than to go into such complexities as the intermediate results of individual plugins, distinguishing test and prod sources, or types of dependencies. Even the simplest hash versioning can potentially give significant speedups for large multi-module projects.

I've spent this weekend trying to create a script for such version hashes, but haven't completed it (yet) due to various obstacles in Maven's behaviour: it is impossible to use a property expression in the project/version element, and the dependency:tree goal requires an artifact of the current version to be present in the ~/.m2 folder, although it doesn't look into the artifact content, so it can even be a fake artifact. Maybe someday I'll make more progress.

Best regards,
- Anton
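The input definition above (everything in the module directory except target/) and the hash-of(hash-of(module sources) + hashes of all dependencies) formula from earlier in the thread could be sketched as follows. This is a minimal Python illustration, not the unfinished script mentioned in the message; the function names, the choice of SHA-256 and the way paths and contents are fed into the digest are all assumptions.

```python
import hashlib
from pathlib import Path


def source_hash(module_dir, excluded_dirs=("target",)):
    """Hash every file under module_dir, skipping excluded top-level entries."""
    module_dir = Path(module_dir)
    digest = hashlib.sha256()
    files = sorted(p for p in module_dir.rglob("*") if p.is_file())
    for path in files:
        rel = path.relative_to(module_dir)
        if rel.parts[0] in excluded_dirs:
            continue  # build output is not an input
        # Include the relative path so that renames change the hash too.
        digest.update(rel.as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def hash_version(own_source_hash, dependency_versions):
    """Combine a module's source hash with its dependencies' hash versions."""
    digest = hashlib.sha256(own_source_hash.encode("utf-8"))
    for dep in sorted(dependency_versions):  # order must not matter
        digest.update(b"\0" + dep.encode("utf-8"))
    return digest.hexdigest()
```

Because each dependency's version is itself a hash of its sources and dependencies, a change deep in the tree propagates to every dependent module's version.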
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
03.02.2020, 00:15, "Enrico Olivelli":
> (Apologies for top posting)
>
> This thread is about a bunch of requested features (cache and parallel executions of mojos) that we have been discussing on dev@ mailing list. As said in this thread the first show stopper for Maven is that we do not have a clear definition of input and outputs for each plugin. There are proposals but actually no one is spending actively engineering time on these topics.
>
> Any help is very appreciated, please join us on dev@
>
> Best regards
> Enrico

Which threads on the dev list do you recommend following in regard to this subject?
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
(Apologies for top posting)

This thread is about a bunch of requested features (caching and parallel execution of mojos) that we have been discussing on the dev@ mailing list. As said in this thread, the first show stopper for Maven is that we do not have a clear definition of inputs and outputs for each plugin. There are proposals, but no one is actively spending engineering time on these topics.

Any help is much appreciated, please join us on dev@

Best regards
Enrico

On Sun, 2 Feb 2020, 18:57, Thomas Broyer wrote:
> On Sun, 2 Feb 2020 at 17:48, Anton Vodonosov wrote:
> > Hello.
> >
> > In order to speed up the build of a multi-module project, I'd like to reuse artifacts of modules that haven't changed. Manual versioning is tedious and error-prone.
> >
> > Is it possible to automatically assign versions to modules computed as a hash-of( hash-of(module sources) + hashes of all dependencies)?
>
> Please define modules sources? Hint: you can't, at least not without knowing how all plugins work. Gradle Enterprise tries to have such knowledge fwiw to solve this exact issue.
>
> Also, you'll probably want to include system properties (or at least Maven properties) and some environment information (e.g. which JDK) in the hash.
>
> > In this approach, every change in code will modify such hash-based version of all dependent modules automatically.
> >
> > This would be similar to Nix package manager.
> >
> > How to do that in maven?
>
> As said above, you could try Gradle Enterprise. Takari had something in the works too a few years ago.
> …or if that's really problematic for you, then migrate to another build tool, such as Gradle or Bazel.
>
> > Best regards,
> > - Anton
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
On Sun, 2 Feb 2020 at 17:48, Anton Vodonosov wrote:
> Hello.
>
> In order to speed up the build of a multi-module project, I'd like to reuse artifacts of modules that haven't changed. Manual versioning is tedious and error-prone.
>
> Is it possible to automatically assign versions to modules computed as a hash-of( hash-of(module sources) + hashes of all dependencies)?

Please define module sources? Hint: you can't, at least not without knowing how all plugins work. Gradle Enterprise tries to have such knowledge, fwiw, to solve this exact issue.

Also, you'll probably want to include system properties (or at least Maven properties) and some environment information (e.g. which JDK) in the hash.

> In this approach, every change in code will modify such hash-based version of all dependent modules automatically.
>
> This would be similar to Nix package manager.
>
> How to do that in maven?

As said above, you could try Gradle Enterprise. Takari had something in the works too a few years ago.
…or if that's really problematic for you, then migrate to another build tool, such as Gradle or Bazel.

> Best regards,
> - Anton
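The suggestion to fold properties and environment information into the hash could look roughly like this in Python. It is a sketch under assumptions: `inputs_hash` is a hypothetical name, which properties are relevant is left to the caller, and the JDK is represented by a plain version string.

```python
import hashlib


def inputs_hash(source_hash, properties, jdk_version):
    """Hash the sources together with selected properties and the JDK version."""
    digest = hashlib.sha256()
    digest.update(f"sources={source_hash}\n".encode("utf-8"))
    # Sort the keys so that the order in which properties were set is irrelevant.
    for key in sorted(properties):
        digest.update(f"prop:{key}={properties[key]}\n".encode("utf-8"))
    digest.update(f"jdk={jdk_version}\n".encode("utf-8"))
    return digest.hexdigest()
```

The trade-off is the usual one: every extra input makes reuse safer but also makes cache hits rarer, for example when two developers use different JDK patch releases.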
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Hello Anton,

If I understand you correctly, you do not want to build all modules of the Maven build, only those that have changed.

As long as you are building in a development shell or IDE, the result of the last build is still there and Maven will only recompile sources that are newer than their classes in target. So doing an "mvn install" again will be much faster, since it will only compile what has changed since the last build.

When running on a CI, however, each run has to be treated as a clean build. As the "maven-ci-friendly.html" page points out, "mvn clean install ..." is always done. This is the only safe thing to do. A CI can have many executor hosts that it delegates jobs to. Two CI runs of the same thing are not necessarily run on the same machine! Even if run on the same machine, different jobs get different, personal and temporary areas on disk for that job. In other words, you cannot rely on a previous build being available. Every build is from scratch.

But assuming that your build does push built artifacts to a repository somewhere (not ~/.m2/repository, but a Nexus / Artifactory / Bintray / ...), and that the build does look for artifacts in that repository, then you can make a separate CI configuration that only builds the module or modules you know have been modified, depending on the flexibility of the CI used. That is, instead of building from the root module, trigger several builds, one for each submodule you want to rebuild. Depending on the CI used, it might require one config per submodule.

The configuration of such a build will have to be changed each time you want to rebuild just some submodules. It cannot be an automatic job triggered by a new commit! You will have to configure it and run it manually every time.
Best Regards,
Tommy

From: Anton Vodonosov
Reply-To: Maven Users List
Date: 2 February 2020 at 16:10:44
To: Maven Users List, i...@soebes.de
Cc: Konrad Windszus
Subject: Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

I want, for unchanged parts of the project, to reuse artifacts produced by previous builds, and only rebuild the changed parts.

Imagine a project with hundreds of modules stored in a single git repository, whose full build with tests takes 3 hours. A developer creates a ticket branch, changes couple lines and pushes the branch to the repository. CI build starts.

I wish at this point only parts affected by the change to be rebuild. And artifacts for unaffected modules simply be fetched from artifacts repository (because previous builds placed them there).

Speaking of http://maven.apache.org/maven-ci-friendly.html, if that means incorporating git commit into the version of all modules, then all modules will be rebuilt in the above scenario, even unaffected ones, because the new branch has a new git commit. Is it correct?

Best regards,
- Anton
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
I want, for unchanged parts of the project, to reuse artifacts produced by previous builds, and only rebuild the changed parts.

Imagine a project with hundreds of modules stored in a single git repository, whose full build with tests takes 3 hours. A developer creates a ticket branch, changes a couple of lines and pushes the branch to the repository. The CI build starts.

At this point I want only the parts affected by the change to be rebuilt, and the artifacts for unaffected modules simply to be fetched from the artifact repository (because previous builds placed them there).

Speaking of http://maven.apache.org/maven-ci-friendly.html, if that means incorporating the git commit into the version of all modules, then all modules will be rebuilt in the above scenario, even unaffected ones, because the new branch has a new git commit. Is that correct?

Best regards,
- Anton
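"Parts affected by the change" in the scenario above means the changed modules plus everything that depends on them, directly or transitively. A minimal Python sketch of that computation follows; the function name and the dictionary-based dependency graph are hypothetical, and in a real build the graph would come from the poms.

```python
from collections import deque


def affected_modules(changed, depends_on):
    """Return the changed modules plus all their transitive dependents.

    changed: iterable of module names whose sources changed
    depends_on: dict of module name -> list of modules it depends on
    """
    # Invert the graph: for each module, who depends on it?
    dependents = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    affected = set(changed)
    queue = deque(affected)
    while queue:  # breadth-first walk over the inverted graph
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

Every module outside the returned set has unchanged inputs, so its artifact from a previous build could in principle be fetched instead of rebuilt.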
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Hi,

On 01.02.20 16:08, Anton Vodonosov wrote:
> Hello.
>
> In order to speed up the build of a big multi-module project, I'd like to reuse the artifacts of modules that haven't changed. Manual versioning is tedious and error-prone.

Can you explain in more detail what exactly you mean and what kind of problem you have? Best would be an example project which shows the issues...

Kind regards
Karl Heinz Marbaise

> Is it possible to automatically assign versions so that versions only change if module sources or dependencies change? In result, only those modules will be recompiled and retested, and all other modules will be reused from artifact repository.
>
> For example, this could be done if versions are computed as hash-of( hash-of(module sources) + hashes of all dependencies).
>
> In this approach, every change in code will modify such hash-based version of all dependent modules automatically.
>
> This would be similar to Nix package manager.
>
> How such things (has based or to do that in maven?
>
> Best regards,
> - Anton
Re: versioning by hashes to speedup multi-module build (a'la nix package manager)
Hi,

just look at http://maven.apache.org/maven-ci-friendly.html.

Konrad

On 01.02.2020 at 16:08, Anton Vodonosov wrote:
> Hello.
>
> In order to speed up the build of a big multi-module project, I'd like to reuse the artifacts of modules that haven't changed. Manual versioning is tedious and error-prone.
>
> Is it possible to automatically assign versions so that versions only change if module sources or dependencies change? In result, only those modules will be recompiled and retested, and all other modules will be reused from artifact repository.
>
> For example, this could be done if versions are computed as hash-of( hash-of(module sources) + hashes of all dependencies).
>
> In this approach, every change in code will modify such hash-based version of all dependent modules automatically.
>
> This would be similar to Nix package manager.
>
> How such things (has based or to do that in maven?
>
> Best regards,
> - Anton