Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-07-21 Thread Anton Vodonosov
After learning from gitflow-incremental-builder how to remove
modules from the mavenSession if we want to skip them,
I implemented a "version as a hash of sources and dependency
tree" solution:
https://github.com/avodonosov/hashver-maven-plugin

It relies on using property expressions as versions.
A build extension loads values for those properties from a file.
This can be a file with "normal" maven versions maintained
manually in source control, or a file generated by the "hashver"
mojo when the user wants to use the hash versions and avoid
building unchanged modules. The extension allows skipping
a module's build if an artifact of the same version already exists.
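
As an illustration only, the skip decision can be sketched in Python.
The file format (name=value lines), the property names such as
moduleA.version, and the functions load_versions and modules_to_skip
are assumptions made for this sketch, not the plugin's actual format
or API; the project page describes the real behaviour.

```python
def load_versions(text):
    """Parse 'name=value' lines, ignoring blank lines and '#' comments.

    The format is an assumption made for this sketch.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        versions[key.strip()] = value.strip()
    return versions


def modules_to_skip(module_version_property, versions, existing_artifacts):
    """Return the modules whose (module, version) artifact already exists.

    module_version_property: module name -> name of the property used as
        its version in pom.xml (hypothetical naming).
    versions: property name -> value, as returned by load_versions.
    existing_artifacts: set of (module, version) pairs already available.
    """
    return sorted(
        module
        for module, prop in module_version_property.items()
        if (module, versions[prop]) in existing_artifacts
    )
```

With hash versions in the file, a module is skipped exactly when an
artifact carrying the same hash has been built before.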

Using property expressions for versions has two drawbacks:
1. Calculation of hash versions has to be a separate maven
invocation; it can't be done in the same maven session
as the main goal, because calculating the hash versions
requires building the dependency tree, and after the dependency
tree is built it's impossible (I believe) to inject newly calculated
versions into the maven structures so that they are in effect
when the main goal is executed.
2. This approach is a bit intrusive: the user needs to adjust
their pom.xml files to use property expressions in place
of versions.

The advantage of this approach is that maven downloads
the artifact of a skipped module automatically if that module
is a dependency of a changed module.

If it were possible to annotate artifacts with the "src hash"
using some attribute other than the version, and to hook into
maven's artifact download logic to find artifacts by this
attribute, then generation of hash versions could be done
in the same session as the main goal, and probably
the solution could be applied without any modifications
of the user's pom.

I noticed that when uploading snapshot artifacts maven
adds a timestamp (and a sequential "build number"?)
to the artifact name. And when resolving artifacts,
a special versionResolver object is invoked, which
for a SNAPSHOT sends a metadata request
to the repository to retrieve the real, timestamped
name of the artifact corresponding to the snapshot.
This logic could be adjusted to use the "src hash"
instead of the timestamp, and to look up in the remote
repo an artifact matching the current module's
"src hash". Any advice on how to do that?

In the long run, I believe it is desirable for maven
to natively support the notion of a "build inputs hash"
for modules and artifacts. It allows significant build
time savings while remaining deterministic and stable.



Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-04 Thread Jason Young
Good questions. First of all, this plugin is CI-agnostic, but it does
require the project to exist in a `git` repository, whether that is in CI
or on your machine. Check the github page I linked to for more instructions
on how it determines what projects in a reactor are considered "changed"
and need to be built versus which are not changed and will be omitted from
the reactor.

In every Maven build, every dependency is checked this way:

   1. If it is a project in the reactor, use the artifact of that project.
   2. Otherwise, if the artifact is in the local repo, use that artifact.
   3. Last resort: Download from the remote repository.

There are some other rules omitted above, e.g. when to download a fresh
SNAPSHOT artifact based on your chosen snapshot policy, etc., but that's
the gist of it: Maven will obtain the artifact if it is not present, no
further configuration needed.
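
The lookup order above can be sketched as follows. This is a
simplified model, not Maven's resolver code: the dictionaries standing
in for the reactor and the repositories, and the function name
resolve, are assumptions made for illustration, and snapshot update
policies are left out.

```python
def resolve(artifact, reactor, local_repo, remote_repo):
    """Return (source, location) for an artifact, trying the reactor
    first, then the local repository, then the remote repository.

    All three arguments are plain dicts (artifact id -> location) used
    as stand-ins for the real structures.
    """
    if artifact in reactor:
        return ("reactor", reactor[artifact])
    if artifact in local_repo:
        return ("local", local_repo[artifact])
    if artifact in remote_repo:
        # A downloaded artifact is kept in the local repo for later builds.
        local_repo[artifact] = remote_repo[artifact]
        return ("remote", remote_repo[artifact])
    raise LookupError("cannot resolve " + artifact)
```

Removing a project from the reactor therefore only changes where its
artifact comes from; the dependent build still gets it as long as one
of the repositories has it.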

E.g. let's say you have one project that names 2 other projects A and B as
submodules, and A depends on B. If you run `mvn install -pl A` (NOT SURE
about that syntax), then Maven will look for B.jar from your local repo,
and resort to checking your remote repo (e.g. Maven Central) if it's not
there. But if you omit the `-pl A` part, Maven will build B, then build A
using B.jar.

Essentially, the plugin I linked to determines the project list based on
what has changed and what has not. Maven then decides whether to use
B/target/B.jar, ~/.m2/repository/.../B.jar, or to look to Maven Central.

HTH.



Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-04 Thread Anton Vodonosov



04.02.2020, 23:32, "Jason Young" :
>
> Not what you're looking for, but maybe useful: We use one plugin that will
> skip whole projects that have not changed WRT a given Git branch:
> https://github.com/vackosar/gitflow-incremental-builder. With careful
> configuration, this is an effective shortcut without sacrificing
> repeatability.

How do you use it? I mean, if you need to start the full system
(locally or by deploying it to a QA server),
but the plugin has only built the changed modules,
how do you download the rest?

Do you use this plugin in CI?

-
To unsubscribe, e-mail: users-unsubscr...@maven.apache.org
For additional commands, e-mail: users-h...@maven.apache.org



Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-04 Thread Jason Young
It seems Maven itself never omits re-doing anything except for downloading
artifacts from a remote repository. The command you give to Maven and the
configuration of your projects dictate what it will do, no matter what
happened in the previous build. You _can_ omit projects in a multi-module
build by manually specifying which projects to run:
https://blog.sonatype.com/2009/10/maven-tips-and-tricks-advanced-reactor-options/

Not what you're looking for, but maybe useful: We use one plugin that will
skip whole projects that have not changed WRT a given Git branch:
https://github.com/vackosar/gitflow-incremental-builder. With careful
configuration, this is an effective shortcut without sacrificing
repeatability.

Gradle advertises as a feature that it will not rebuild if rebuilding is
not required, or something to that effect. I assume some configuration
is sometimes required.



Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-04 Thread Anton Vodonosov
Ha, only after completing the script (even though a slow one)
I discovered that maven rebuilds modules even if
an artifact of the same version already exists in the artifact
repository.

I hoped that maven, when a non-SNAPSHOT artifact is
found in an artifact repository, would just use it
and not build the same version of the module again.

Is there a way to tell maven to do so?

Best regards,
- Anton




Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-03 Thread Anton Vodonosov
Thomas Broyer, Enrico Olivelli,

I consider the whole directory where the module's
pom.xml resides, excluding the target/ dir,
as the input, and the final module artifacts as
the output.
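
A minimal sketch of such an input hash in Python, assuming SHA-1 (any
stable hash would do) and a hypothetical function name
module_source_hash. It only skips the top-level target/ directory and
does not handle nested child modules inside a parent's directory.

```python
import hashlib
import os


def module_source_hash(module_dir, excluded_dirs=("target",)):
    """Return a SHA-1 hex digest over every file under module_dir,
    skipping the excluded top-level directories (the build output)."""
    digest = hashlib.sha1()
    top = os.path.abspath(module_dir)
    for root, dirs, files in os.walk(top):
        if root == top:
            # Prune build output so os.walk never descends into it.
            dirs[:] = [d for d in dirs if d not in excluded_dirs]
        dirs.sort()  # fixed traversal order makes the hash reproducible
        for name in sorted(files):
            path = os.path.join(root, name)
            relative = os.path.relpath(path, top).replace(os.sep, "/")
            # Hash the relative path too, so renames change the result.
            digest.update(relative.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()
```

Changing anything under target/ leaves the hash unchanged, while
editing, adding or renaming any other file in the module changes it.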

Even if some plugins allow sources outside the
pom.xml's directory (out of curiosity, is that possible?),
requiring all sources to live under that directory is
an acceptable restriction on project structure, IMO.

The version hash approach I described may cause
some redundant work. For example, if only dependencies
changed, most often only re-testing is needed;
re-compilation and re-packaging are not necessary
(unless your dependency generates or instruments
code at compilation time, or you package an uberjar
or war, which includes dependency artifacts).
But for stability it is better to compromise,
accepting some redundant work, than to go into such
complexities as intermediate results of individual
plugins, distinguishing test and prod sources, and
types of dependencies. Even the simplest hash
versioning can potentially give significant
speedups for large multi-module projects.
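
The simplest form of the scheme from earlier in this thread,
hash-of( hash-of(module sources) + hashes of all dependencies ), could
look roughly like this in Python. The function name hash_versions and
the two input dictionaries are assumptions for the sketch; it covers
only dependencies between modules of the same project, and SHA-1 is
an arbitrary choice.

```python
import hashlib


def hash_versions(source_hashes, dependencies):
    """Compute a version hash per module.

    source_hashes: module -> hash of that module's own sources.
    dependencies: module -> list of in-project modules it depends on.
    Returns module -> hash of (own source hash + dependency versions).
    """
    versions = {}

    def version_of(module, in_progress=()):
        if module in versions:
            return versions[module]
        if module in in_progress:
            raise ValueError("dependency cycle at " + module)
        digest = hashlib.sha1(source_hashes[module].encode("utf-8"))
        # Sort dependencies so declaration order does not affect the hash.
        for dep in sorted(dependencies.get(module, ())):
            dep_version = version_of(dep, in_progress + (module,))
            digest.update(("\n" + dep + "=" + dep_version).encode("utf-8"))
        versions[module] = digest.hexdigest()
        return versions[module]

    for module in source_hashes:
        version_of(module)
    return versions
```

A change in one module's sources changes its own version and the
version of everything that depends on it, directly or transitively,
while unrelated modules keep their versions.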

I've spent this weekend trying to create a script
for such version hashes, but haven't completed
it (yet) due to various obstacles in maven
behaviour (it is impossible to use a property expression
in the project/version element; the dependency:tree
goal requires an artifact of the current version
to be present in the ~/.m2 folder, although
it doesn't look into the artifact content,
so it can even be a fake artifact).

Maybe someday I'll have more progress.


Best regards,
- Anton





Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Anton Vodonosov



03.02.2020, 00:15, "Enrico Olivelli" :
> (Apologies for top-posting.)
>
> This thread is about a bunch of requested features (caching and parallel
> execution of mojos) that we have been discussing on the dev@ mailing list.
> As said in this thread, the first showstopper for Maven is that we do not
> have a clear definition of inputs and outputs for each plugin.
> There are proposals, but currently no one is actively spending engineering
> time on these topics.
>
> Any help is much appreciated; please join us on dev@.
>
> Best regards
> Enrico

Which threads on the dev list do you recommend following
on this subject?





Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Enrico Olivelli
(Apologies for top-posting.)

This thread is about a bunch of requested features (caching and parallel
execution of mojos) that we have been discussing on the dev@ mailing list.
As said in this thread, the first showstopper for Maven is that we do not
have a clear definition of inputs and outputs for each plugin.
There are proposals, but currently no one is actively spending engineering
time on these topics.

Any help is much appreciated; please join us on dev@.

Best regards
Enrico





Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Thomas Broyer
On Sun, Feb 2, 2020 at 17:48, Anton Vodonosov wrote:

> Hello.
>
> In order to speed up the build of a multi-module project, I'd like to
> reuse artifacts of modules that haven't changed.
> Manual versioning is tedious and error-prone.
>
> Is it possible to automatically assign versions to modules computed as a
> hash-of( hash-of(module sources) + hashes of all dependencies)?
>

Please define "module sources"?
Hint: you can't, at least not without knowing how all plugins work. Gradle
Enterprise tries to have such knowledge, FWIW, to solve this exact issue.

Also, you'll probably want to include system properties (or at least Maven
properties) and some environment information (e.g. which JDK) in the hash.
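
A rough Python sketch of folding such environment information into a
fingerprint that can then be mixed into a module's hash. Which
properties matter is project-specific; the function name
environment_fingerprint and its inputs are assumptions for
illustration.

```python
import hashlib


def environment_fingerprint(jdk_version, properties):
    """Return a SHA-1 hex digest of the JDK version and the selected
    build properties (a dict of name -> value)."""
    digest = hashlib.sha1()
    digest.update(("jdk=" + jdk_version + "\n").encode("utf-8"))
    # Sort keys so the order in which properties were set is irrelevant.
    for key in sorted(properties):
        digest.update((key + "=" + properties[key] + "\n").encode("utf-8"))
    return digest.hexdigest()
```

Two builds with the same sources but a different JDK or a different
property value then get different fingerprints, so their artifacts
are not treated as interchangeable.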

> In this approach, every change in code will modify such hash-based version
> of all dependent modules automatically.
>
> This would be similar to Nix package manager.
>
> How to do that in maven?
>

As said above, you could try Gradle Enterprise. Takari had something in the
works too a few years ago.
…or if that's really problematic for you, then migrate to another build
tool, such as Gradle or Bazel.


> Best regards,
> - Anton


Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Tommy Svensson
Hello Anton,

If I understand you correctly, you do not want to build all modules of the
maven build, only those that have changed.

As long as you are building in a development shell or IDE, the result of the
last build is still there and maven will only recompile sources that are newer
than their classes in target. So doing an "mvn install" again will be much
faster, since it will only compile what has changed since the last build.

When running on a CI, however, each run has to be treated as a clean build. As
the "maven-ci-friendly.html" page points out, "mvn clean install ..." is always
done. This is the only safe thing to do. A CI can have many executor hosts that
it delegates jobs to. Two CI runs of the same thing are not necessarily run on
the same machine! Even if run on the same machine, different jobs get
different, personal and temporary areas on disk. In other words, you cannot
rely on a previous build being available. Every build is from scratch.

But assuming that your build does push built artifacts to a repository
somewhere (not ~/.m2/repository, but a Nexus / Artifactory / Bintray / ...),
and that the build does look for artifacts in that repository, then you can
make a separate CI configuration that only builds the module or modules you
know have been modified, depending on the flexibility of the CI used. That is,
instead of building from the root module, trigger several builds, one for each
submodule you want to rebuild. Depending on the CI used, it might require one
config per submodule. The configuration of such a build will have to be changed
each time you want to rebuild just some submodules. It cannot be an automatic
job triggered by a new commit! You will have to configure it and run it
manually every time.

Best Regards,
Tommy




Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Anton Vodonosov
I want, for unchanged parts of the project, to reuse artifacts
produced by previous builds, and only rebuild the changed parts.

Imagine a project with hundreds of modules stored in a single
git repository, whose full build with tests takes 3 hours.

A developer creates a ticket branch, changes a couple of lines and
pushes the branch to the repository.

CI build starts.

At this point I want only the parts affected by the change to be rebuilt,
and the artifacts for unaffected modules simply to be fetched from the
artifact repository (because previous builds placed them there).

Speaking of http://maven.apache.org/maven-ci-friendly.html,
if that means incorporating git commit into the version of all modules,
then all modules will be rebuilt in the above scenario, 
even unaffected ones, because the new branch has a new git commit.

Is it correct?

Best regards,
- Anton




Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-02 Thread Karl Heinz Marbaise

Hi,

On 01.02.20 16:08, Anton Vodonosov wrote:

> Hello.
>
> In order to speed up the build of a big multi-module project,
> I'd like to reuse the artifacts of modules that haven't changed.
> Manual versioning is tedious and error-prone.

Can you explain in more detail what exactly you mean and what kind of
problem you have? Best would be an example project which shows the
issues...

Kind regards
Karl Heinz Marbaise

> Is it possible to automatically assign versions so that
> versions only change if module sources or dependencies change?
> In result, only those modules will be recompiled and retested,
> and all other modules will be reused from artifact repository.
>
> For example, this could be done if versions are computed as
> hash-of( hash-of(module sources) + hashes of all dependencies).
>
> In this approach, every change in code will modify such
> hash-based version of all dependent modules automatically.
>
> This would be similar to Nix package manager.
>
> How to do that in maven?
>
> Best regards,
> - Anton






Re: versioning by hashes to speedup multi-module build (a'la nix package manager)

2020-02-01 Thread Konrad Windszus
Hi,
just look at http://maven.apache.org/maven-ci-friendly.html.
Konrad


> On 01.02.2020 at 16:08, Anton Vodonosov wrote:
> 
> Hello.
> 
> In order to speed up the build of a big multi-module project,
> I'd like to reuse the artifacts of modules that haven't changed.
> Manual versioning is tedious and error-prone.
> 
> Is it possible to automatically assign versions so that
> versions only change if module sources or dependencies change?
> In result, only those modules will be recompiled and retested,
> and all other modules will be reused from artifact repository.
> 
> For example, this could be done if versions are computed as
> hash-of( hash-of(module sources) + hashes of all dependencies).
> 
> In this approach, every change in code will modify such
> hash-based version of all dependent modules automatically.
> 
> This would be similar to Nix package manager.
> 
> How to do that in maven?
> 
> Best regards,
> - Anton  