One of the things I feel is necessary to grow Maven in the modelVersion
5.0.0 world is to start taking account of architecture specific
artifacts.
Currently, the Maven repository layout does not handle architecture
specific dependencies well.
So, for example:
Say I have a foo.jar that depends on a native library... bar.dll /
libbar.so / etc
Ideally we'd like to say that foo just depends on bar...
A consumer of foo that is running on, say, my local machine could then
see that I am running on os-x-x86_64 and, because I want to run tests,
it would look for bar with the architecture of `os-x-x86_64` to get the
native library for me
When I am building the installer for Windows on my OS X machine (using,
say, .NET and the WiX toolchain) the corresponding Maven plugin (which
does not exist yet) could request the win-x86 architecture of the
dependency, and the rpm plugin could request the linux-ppc, linux-arm64,
linux-x86 and linux-x86_64 artifacts in order to produce the
corresponding rpm architecture artifacts
So when I think about this concept... I feel it is important that we
find a way to introduce the architectureId into the GACVT of the
repository.
When we do this, to my mind, we need to be mindful that modelVersion
4.0.0 consumers would like to be able to consume these architecture
specific dependencies also... and the 4.0.0 GAV constraints will
constrain the possible solutions that we can pick if we value letting
4.0.0 consumers access these architecture specific artifacts via the
`default` layout we currently employ for the maven repository.
So first things first... our current `default` layout transforms the
GroupId:ArtifactId:Version:Classifier:Type into a repository URL of
`${groupId.replace('.','/')}/${artifactId}/${version}/${artifactId}-${version}${classifier==null?'':'-'+classifier}.${type}`
If we want to add architectureId into that URL path and still have that
resolvable by GAVCT at a modelVersion 4.0.0 coordinate, we are basically
left with stuffing the architectureId into one of the existing
components...
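
For reference, the `default` layout mapping described above can be sketched in a few lines of Python. This is only an illustration of the path template; the function name `default_layout_path` and its parameter names are made up for this example and are not part of any Maven API.

```python
from typing import Optional


def default_layout_path(group_id: str, artifact_id: str, version: str,
                        type_: str, classifier: Optional[str] = None) -> str:
    """Map modelVersion 4.0.0 coordinates to a path in the `default` layout.

    Hypothetical helper for illustration only.
    """
    # Literal replacement of every '.' in the groupId (no regular expression).
    group_path = group_id.replace(".", "/")
    # The classifier is optional; an empty string is treated like None here,
    # which is an assumption of this sketch.
    suffix = f"-{classifier}" if classifier else ""
    return (f"{group_path}/{artifact_id}/{version}/"
            f"{artifact_id}-{version}{suffix}.{type_}")


print(default_layout_path("com.example", "foobar", "1.0", "jar"))
print(default_layout_path("com.example", "foobar", "1.0", "jar", "sources"))
```

There is no slot in that template for an architecture, which is why the architectureId has to ride along inside one of the existing components.
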
Now when we think about an architecture specific artifact, the first
thing that comes to mind is that each architecture specific artifact
most likely has different dependencies... hopefully the .pdt file (that
would be deployed at the GAV without an architecture... modulo
multi-machine builds) would provide the architecture specific dependency
trees so that modelVersion 5.0.0 aware consumers would - just naturally -
be aware of those differences in dependencies
But - if we want to give the modelVersion 4.0.0 consumers our best
effort - we probably need to give each architectureId its own
modelVersion 4.0.0 pom.
In other words, I do not think we should try to munge the architectureId
into either classifier or type, as both of those would force all the
architecture specific artifacts to be viewed as having the same
dependencies in the modelVersion 4.0.0 world (they would all share a
single pom)
So that leaves us with groupId, artifactId and version...
I personally think version is a non-runner. In modelVersion 4.0.0 you
can only depend on one version of a dependency at a time... version
ranges would become completely and utterly unusable (never mind that
they are unusable now)... plus my gut tells me that it would be a total
mess!
So that leaves groupId and artifactId... our choices basically boil
down to
legacyGroupId == '${groupId}'; legacyArtifactId == '${architectureId}.${artifactId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${architectureId}-${artifactId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}.${architectureId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}-${architectureId}'
legacyGroupId == '${groupId}.${architectureId}'; legacyArtifactId == '${artifactId}'
legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId == '${architectureId}'
I personally think that the ones that place `architectureId` lexically
before `artifactId` are not "right"... the most important coordinate is
the groupId, the next most is the artifactId, then the architecture,
then the version, etc
So to my mind that leaves us with:
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}.${architectureId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}-${architectureId}'
legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId == '${architectureId}'
Now when we look at how, say, a modelVersion 4.0.0 consumer would use
these dependencies... the variant where we shift the artifactId into the
groupId would mean that you would end up with loads of `linux-arm`
"legacyArtifactId" dependencies in your modelVersion 4.0.0 consumer...
which would presumably be ugly (just like now if you have two matching
`artifactId` dependencies in your .war, which forces us to disambiguate
by prefixing the groupId when copying into WEB-INF/lib)... so I am going
to reject that one also.
The convention seems to be that the artifactId does not contain a `.`,
with most artifacts that I am aware of using `-` as the separator...
this could be used to argue either way... my preference is to run with
`-` as the separator... though I am open to using `.` to provide a
convention that architecture is distinguished using a `.`
So how would this work...
Ok, I have my foobar project that builds a .jar and the native libraries
that are required by that .jar
So from the reactor for that project we want to deploy
com.example:foobar:::1.0:pom (the legacy pom for the .jar to allow modelVersion 4.0.0 consumption of the jar)
com.example:foobar:::1.0:pdt (the modern project dependency trees for all attached artifacts)
com.example:foobar:::1.0:jar (the jar)
com.example:foobar::javadoc:1.0:jar (the javadoc jar)
com.example:foobar::sources:1.0:jar (the source jar)
com.example:foobar:win_x86::1.0:pom (the legacy pom for the 32-bit DLL)
com.example:foobar:win_x86::1.0:dll (the 32-bit DLL... alternatively the type might be `native-library` or `lib` but let's assume DLL)
com.example:foobar:win_x86_64::1.0:pom (the legacy pom for the 64-bit DLL)
com.example:foobar:win_x86_64::1.0:dll (the 64-bit DLL)
com.example:foobar:osx_x86_64::1.0:pom (the legacy pom for the 64-bit OS-X .dylib)
com.example:foobar:osx_x86_64::1.0:dylib (the 64-bit .dylib... alternatively the type might be `native-library` or `lib` but let's assume dylib)
com.example:foobar:elf_arm::1.0:pom (the legacy pom for the linux ARM .so)
com.example:foobar:elf_arm::1.0:so (the ARM .so ... alternatively the type might be `native-library` or `lib` but let's assume so)
com.example:foobar:elf_x86::1.0:pom (the legacy pom for the linux x86 32-bit .so)
com.example:foobar:elf_x86::1.0:so (the x86 32-bit .so)
com.example:foobar:elf_x86_64::1.0:pom (the legacy pom for the linux x86 64-bit .so)
com.example:foobar:elf_x86_64::1.0:so (the x86 64-bit .so)
My main build machine cannot cross-compile for PPC or ARM64... so we
have two other build machines that will want to produce the extra
architecture specific artifacts...
com.example:foobar:elf_ppc::1.0:pom (the legacy pom for the linux PPC .so)
com.example:foobar:elf_ppc::1.0:so (the PPC .so)
and
com.example:foobar:elf_arm_64::1.0:pom (the legacy pom for the linux ARM 64-bit .so)
com.example:foobar:elf_arm_64::1.0:so (the ARM 64-bit .so)
In order to accommodate delayed deployment, I am going to suggest that
the PPC and ARM64 deployments should publish their *supplemental* pdts
at their coordinates, e.g.
com.example:foobar:elf_ppc::1.0:pdt (the supplemental project dependency trees for the PPC reactor artifacts)
and
com.example:foobar:elf_arm_64::1.0:pdt (the supplemental project dependency trees for the ARM64 reactor artifacts)
So ultimately we would end up with the following files being deployed
(in three "atomic" deployments):
com/example/foobar/1.0/foobar-1.0.pom
com/example/foobar/1.0/foobar-1.0.pdt
com/example/foobar/1.0/foobar-1.0.jar
com/example/foobar/1.0/foobar-1.0-javadoc.jar
com/example/foobar/1.0/foobar-1.0-sources.jar
com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.pom
com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.dll
com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.pom
com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.dll
com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.pom
com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.dylib
com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.pom
com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.so
com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.pom
com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.so
com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.pom
com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.so
com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pom
com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pdt
com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.so
com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom
com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt
com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so
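
The mapping from the architecture aware coordinates to the files listed above can be sketched as follows. This is a rough Python illustration assuming the `${artifactId}-${architectureId}` option wins; `legacy_artifact_id` and `repository_path` are hypothetical names for this example, not existing Maven functions.

```python
from typing import Optional


def legacy_artifact_id(artifact_id: str,
                       architecture_id: Optional[str] = None,
                       separator: str = "-") -> str:
    """Fold the architectureId into the modelVersion 4.0.0 artifactId."""
    if not architecture_id:
        # Architecture independent artifacts keep their artifactId unchanged.
        return artifact_id
    return f"{artifact_id}{separator}{architecture_id}"


def repository_path(group_id: str, artifact_id: str, version: str, type_: str,
                    architecture_id: Optional[str] = None,
                    classifier: Optional[str] = None) -> str:
    """Path in the existing `default` layout for an architecture aware coordinate."""
    legacy_id = legacy_artifact_id(artifact_id, architecture_id)
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    return f"{group_path}/{legacy_id}/{version}/{legacy_id}-{version}{suffix}.{type_}"


# com.example:foobar:elf_arm_64::1.0:so
print(repository_path("com.example", "foobar", "1.0", "so", "elf_arm_64"))
# com.example:foobar::sources:1.0:jar
print(repository_path("com.example", "foobar", "1.0", "jar", classifier="sources"))
```

Passing `separator="."` to `legacy_artifact_id` gives the alternative `.` convention, so switching between the two surviving options is a one character change.
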
When a modelVersion 5.0.0 consumer does something like:
compile: {
dependencies: ["com.example:foobar:1.0:jar"]
}
test: {
dependencies: ["org.junit:junit:5.0:jar"]
}
and wants to run its tests on linux ARM64, it will start by resolving
`com/example/foobar/1.0/foobar-1.0.pdt`. This will give it the dependency
tree of the `.jar`, which will declare an architecture dependent native
library dependency (somehow or other... this is why we may use
`native-library` as the "type"). Because it knows that it is running on
the ARM64 architecture, it will then know that it needs
`com.example:foobar:elf_arm_64::1.0:so`. Since this is not available in
the `com/example/foobar/1.0/foobar-1.0.pdt` trees, it will then attempt
to download `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt`.
If that exists, it will use that tree... if it doesn't exist... we fail
the build (technically we could fall back to checking for
`com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom` and
`com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so` before
failing the build... but as we have resolved
`com/example/foobar/1.0/foobar-1.0.pdt`, we know the artifacts were
produced by a 5.0.0 aware producer, so that fallback should not be
necessary)
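
The lookup order described above can be sketched as follows. This is a loose Python illustration, not a proposal for the real resolver: the .pdt schema is not defined here, so the sketch stands in for it with a hypothetical `fetch` callable that returns the list of architectureIds a .pdt covers, or `None` when the file does not exist. All function names are made up for the example.

```python
from typing import Callable, List, Optional


def pdt_path(group_id: str, artifact_id: str, version: str,
             architecture_id: Optional[str] = None) -> str:
    """Path of the main .pdt, or of a supplemental one when an architecture is given."""
    legacy_id = f"{artifact_id}-{architecture_id}" if architecture_id else artifact_id
    return f"{group_id.replace('.', '/')}/{legacy_id}/{version}/{legacy_id}-{version}.pdt"


def find_pdt(group_id: str, artifact_id: str, version: str, architecture_id: str,
             fetch: Callable[[str], Optional[List[str]]]) -> str:
    """Return the path of the .pdt that describes the requested architecture."""
    main = pdt_path(group_id, artifact_id, version)
    covered = fetch(main)
    if covered is None:
        raise LookupError(f"{main} not found")
    if architecture_id in covered:
        return main  # the main deployment already covers this architecture
    supplemental = pdt_path(group_id, artifact_id, version, architecture_id)
    if fetch(supplemental) is not None:
        return supplemental  # published later by another build machine
    # No fallback to the legacy .pom/.so: fail the build.
    raise LookupError(f"no dependency tree for {architecture_id}")


# A dict stands in for the remote repository (assumed contents).
repo = {
    "com/example/foobar/1.0/foobar-1.0.pdt": ["win_x86", "elf_x86_64"],
    "com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt": ["elf_arm_64"],
}
print(find_pdt("com.example", "foobar", "1.0", "elf_arm_64", repo.get))
```

Asking the same fake repository for `elf_ppc` raises `LookupError`, which corresponds to failing the build when the supplemental .pdt has not been deployed yet.
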
A modelVersion 4.0.0 consumer is not really going to be able to have as
flexible a build... but at least they can - through declarations such as
<dependency>
  <groupId>com.example</groupId>
  <artifactId>foobar-elf_arm_64</artifactId>
  <version>1.0</version>
  <type>so</type>
</dependency>
grab the .so to bundle into a .zip or installer, and if they want to
write a pom with architecture based profile activation injecting test
scoped dependencies they can do that also
WDYT?
If anyone has any experience from the NMaven experiments, or learnings
from .deb or .rpm attempts to solve architecture dependent artifacts
mixed with noarch artifacts... please step forward and join the
discussion.
-Stephen
Notes:
1. I am not saying what conventions will be used to define the
`architectureId` values here
2. I am not discussing the schema for the .pdt files here... other than
the general principle that they will contain multiple dependency trees
for each artifact produced by the project
3. I am not discussing how a modelVersion 5.0.0 build would be invoked
or detect that it should just do the PPC deployment
4. This proposal does not include the new metadata schema that we would
likely require to assist with such a deployment format
5. I am not discussing or proposing a modelVersion 5.0.0 schema here...
I use a non-XML format to help people mentally disassociate thinking
about the architectureId specific things from the current 4.0.0 way of
doing things