Re: [DISCUSS] Incorporating an ArchitectureId into the GAVCT of the repository

Tibor Digana Fri, 02 Sep 2016 15:14:23 -0700

@Stephen
I know that you do not want too big a change, like maven1->maven2, but
one way or another XML is strictly defined by an XSD.
We must accept that fact if we want a format with more freedom, and
therefore I would prefer code or an interface instead of XML, for example a
Groovy script:
*dependencies: ["com.example:foobar:1.0:jar"]*
It is nice that we sit between Nexus, Maven and the users. The advantage is
that we define best practices for the users group, but the penalty is that
the Maven group takes this on its shoulders and the same group of Maven devs
has to spend new effort. Maybe it is better to give more freedom to users
and Nexus, and have Maven act as a deployer tool able to check the POM
semantics and syntax, credentials, etc.


>>So that will not scale and prevents mirroring
Why not?
All I wanted to tell you is that the POM can be polymorphic. The POM can be
almost the same, or extended without extending the schema, but its meaning
would differ depending on the architectureId. The classifier already has no
directory of its own in the URL path, yet it alters the binary of the same
groupId:artifactId:version. Classifier or architecture is the same problem
for me.
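
To make the path discussion concrete, here is a minimal Java sketch of the
`default` layout mapping described in the quoted proposal, with the
architectureId folded into the artifactId using `-` as the separator. The
class and method names (LegacyLayout, legacyArtifactId, path) are
hypothetical and only illustrate the idea; nothing like this exists in Maven
today.

```java
/** Sketch only: maps coordinates plus a hypothetical architectureId to a repository path. */
final class LegacyLayout {
    private LegacyLayout() {}

    /** Folds the architectureId into the artifactId with '-' (one option from the proposal). */
    static String legacyArtifactId(String artifactId, String architectureId) {
        return (architectureId == null || architectureId.isEmpty())
                ? artifactId
                : artifactId + "-" + architectureId;
    }

    /** Builds the path the same way the current default layout does for G:A:V:C:T. */
    static String path(String groupId, String artifactId, String architectureId,
                       String classifier, String version, String type) {
        String legacyId = legacyArtifactId(artifactId, architectureId);
        // The classifier only changes the file name, never the directory.
        String suffix = (classifier == null || classifier.isEmpty()) ? "" : "-" + classifier;
        return groupId.replace('.', '/') + "/" + legacyId + "/" + version + "/"
                + legacyId + "-" + version + suffix + "." + type;
    }
}
```

With these assumptions the architectureId changes the directory while the
classifier changes only the file name, which is why each architecture can
carry its own modelVersion 4.0.0 pom.
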
It should scale because the point is to keep the old POM structure and have
the REST server store one POM per architectureId. The same could already be
done with the classifier. Another advantage concerns storage: currently it
is pure web content, but nowadays storage is a key-value or NoSQL database.
We should give the remote repositories more freedom in how the REST API is
implemented, unlike plain web content. I believe the Nexus team already
understands this, and maybe internally it is no longer pure web content.
A few years from now we or Nexus may want to add a "cloudId", similar to the
"architectureId" you introduced. Again the pair in the database would be
key=value, i.e. architectureId&cloudId=POM; this is just an example.
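
As a rough illustration of that key=value idea, the Java sketch below keeps
one POM per combination of coordinates, architectureId and cloudId in an
in-memory map. Everything here is an assumption made for the example: the
PomStore class, the key format and the cloudId dimension are hypothetical,
and a real repository manager would use its own storage and API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Sketch only: an in-memory stand-in for a key-value store holding one POM per variant. */
final class PomStore {
    private final Map<String, String> poms = new HashMap<>();

    /** Hypothetical key format: coordinates plus the variant dimensions. */
    static String key(String gav, String architectureId, String cloudId) {
        return gav + "?architectureId=" + orEmpty(architectureId)
                + "&cloudId=" + orEmpty(cloudId);
    }

    private static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    /** Stores the POM text for one variant of the given coordinates. */
    void put(String gav, String architectureId, String cloudId, String pom) {
        poms.put(key(gav, architectureId, cloudId), pom);
    }

    /** Looks up the POM for exactly this variant; empty if none was deployed. */
    Optional<String> find(String gav, String architectureId, String cloudId) {
        return Optional.ofNullable(poms.get(key(gav, architectureId, cloudId)));
    }
}
```

Adding another dimension later would only extend the key, not the POM
schema, which is the point I am trying to make.
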




On Fri, Sep 2, 2016 at 8:06 PM, Stephen Connolly <
[email protected]> wrote:

> On Friday 2 September 2016, Tibor Digana <[email protected]>
> wrote:
>
> > @Stephen
> > IIUC the 3rd party artifact is a platform (architecture) specific
> > dependency for another project. Thus the 3rd party artifact can be a
> > tree of dependencies for my project.
> > I imagine some mapping between _architecture_ and _dependency_tree_.
> >
> > >>we need to move Maven forward
> > RESTful Maven is my answer.
> > I may not want to read the documentation of a 3rd party project.
> > Instead, let Maven retrieve a list of architectures from all
> > dependencies via a REST service:
> > https://repo1.maven.org/maven2/*rest/api*/*query-architectures*
> > /<fully_qualified_artifact_path>
>
>
> So that will not scale and prevents mirroring
>
> Better is to store metadata and let the client retrieve the metadata and
> parse the query itself
>
>
> > The response will contain the list of architectures:
> > linux-ppc, linux-arm64, linux-x86.
> > If the consumer's project (POM) requires another architecture, say
> > linux-x86_64, the build fails.
> > Suppose the required architecture matches linux-arm64; the next query
> > will ask for versions:
> > https://repo1.maven.org/maven2/*rest/api*/*query-versions*
> > /<fully_qualified_artifact_path>
> > Finally, query the artifact binary:
> > https://repo1.maven.org/maven2/*rest/api*/*query-artifact-binary*
> > /<fully_qualified_artifact_path>
> >
> >
> That is getting towards irreproducible builds... I don't think we want to
> end up there
>
> > Now, you do not need to specify the pattern by which the artifact is
> > described, groupId:artifactId::::.
> > That is because the REST server provides Maven with all the operations
> > needed to query a concrete binary from the repo.
> > The deployer of the POM can specify the artifact description. Today it is
> > architectureId being added, but in a few years it will be something else.
> > Look: if you deploy an artifact to *https://repo1.maven.org/maven2* via
> > *rest/api*, the Nexus server will check that the POM is compliant with
> > modelVersion 4.0, given *maven2* in the path.
> > This way we separate artifact POMs that are not compatible, we let Maven
> > break backwards compatibility without breaking old repositories, the old
> > Maven code line and the old layout, and finally it is the responsibility
> > of the *consumer* which default URL of Maven Central is specified in his
> > settings.xml, and therefore which modelVersion as well.
> >
> >
> Been there with maven1->maven2
>
> Consensus is "let's not do that again"
>
>
> > So the next generation 5.0 would go to *https://repo1.maven.org/maven5*.
> >
> > WDYT?
> >
> >
> >
> >
> >
> > On Thu, Sep 1, 2016 at 12:07 PM, Stephen Connolly <
> > [email protected]> wrote:
> >
> > > One of the things I feel is necessary to grow Maven in the modelVersion
> > > 5.0.0 world is to start taking account of architecture specific
> > artifacts.
> > >
> > > Currently, the Maven repository layout does not handle architecture
> > > specific dependencies well.
> > >
> > > So, for example:
> > >
> > > Say I have a foo.jar that depends on a native library... bar.dll /
> > > libbar.so / etc
> > >
> > > Ideally we'd like to say that foo just depends on bar...
> > >
> > > A consumer of foo that is running on, say, my local machine could then
> > > see that I am running on `os-x-x86_64` and, because I want to run
> > > tests, it would look for bar with the architecture of `os-x-x86_64` to
> > > get the native library for me
> > >
> > > When I am building the installer for windows on my os-x machine (using,
> > > say, .NET and the WiX toolchain) the corresponding (future, does not
> > > exist yet) maven plugin could request the win-x86 architecture of the
> > > dependency and the rpm plugin could request the linux-ppc, linux-arm64,
> > > linux-x86 and linux-x86_64 artifacts in order to produce the
> > > corresponding rpm architecture artifacts
> > >
> > > So when I think about this concept... I feel it is important that we
> > > find a way to introduce the architectureId into the GAVCT of the
> > > repository.
> > >
> > > When we do this, to my mind, we need to be mindful that modelVersion
> > 4.0.0
> > > consumers would like to be able to consume these architecture specific
> > > dependencies also... and the 4.0.0 GAV constraints will constrain the
> > > possible solutions that we can pick if we value letting 4.0.0 consumers
> > > access these architecture specific artifacts via the `default` layout
> we
> > > currently employ for the maven repository.
> > >
> > > So the first things first... our current `default` layout transforms
> the
> > > GroupId:ArtifactId:Version:Classifier:Type into a repository URL of
> > >
> > > `${groupId.replace('.','/')}/${artifactId}/${version}/${artifactId}-${version}${classifier==null?'':'-'+classifier}.${type}`
> > >
> > > If we want to add architectureId into that URL Path and still have that
> > > resolvable by GAVCT at a modelVersion 4.0.0 coordinate, we are
> basically
> > > left with stuffing the architectureId into one of the existing
> > > components...
> > >
> > > Now when we think about an architecture specific artifact, the first
> > thing
> > > that comes to mind is that each architecture specific artifact most
> > likely
> > > has different dependencies... hopefully the .pdt file (that would be
> > > deployed at the GAV without an architecture... modulo multi-machine
> > builds)
> > > would provide the architecture specific dependency trees so that
> > > modelVersion 5.0.0 aware consumers would - just naturally - be aware of
> > > those differences in dependencies
> > >
> > > But - if we want to give the modelVersion 4.0.0 consumers our best
> > > effort - we probably need to give each architectureId its own
> > > modelVersion 4.0.0 pom.
> > >
> > > In other words, I do not think we should try to munge the
> architectureId
> > > into either classifier or type as both of those would force the
> > > dependencies to be viewed as having the same dependencies in the
> > > modelVersion 4.0.0 world
> > >
> > > So that leaves us with groupId, artifactId and version...
> > >
> > > I personally think version is a non-runner. In modelVersion 4.0.0 you
> can
> > > only depend on one version of a dependency at a time... version ranges
> > > would become completely and utterly unusable (never mind that they are
> > > unusable now)... plus my gut tells me that it would be a total mess!
> > >
> > > So that leaves groupId and artifactId... our choices basically boil
> down
> > to
> > >
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${architectureId}.${artifactId}'
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${architectureId}-${artifactId}'
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${artifactId}.${architectureId}'
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${artifactId}-${architectureId}'
> > > legacyGroupId == '${groupId}.${architectureId}'; legacyArtifactId ==
> > > '${artifactId}'
> > > legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId ==
> > > '${architectureId}'
> > >
> > > I personally think that the ones that place `architectureId` lexically
> > > before `artifactId` are not "right"... the most important coordinate is
> > the
> > > groupId, the next most is the artifactId, then the architecture, then
> the
> > > version, etc
> > >
> > > So to my mind that leaves us with:
> > >
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${artifactId}.${architectureId}'
> > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > '${artifactId}-${architectureId}'
> > > legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId ==
> > > '${architectureId}'
> > >
> > > Now when we look at how, say, a modelVersion 4.0.0 consumer would use
> > these
> > > dependencies... the variant where we shift the artifactId into the
> > groupId
> > > would mean that you would end up with loads of `linux-arm`
> > > "legacyArtifactId" dependencies in your modelVersion 4.0.0 consumer...
> > > which would presumably be ugly (just like now if you have two matching
> > > `artifactId` dependencies in your .war which forces us to disambiguate
> by
> > > prefixing the groupId when copying into WEB-INF/lib)... so I am going
> to
> > > reject that one also.
> > >
> > > The convention seems to be that the artifactId does not contain a `.`
> > with
> > > most artifacts that I am aware of using `-` as the separator... this
> > could
> > > be used to argue either way... my preference is to run with `-` as the
> > > separator... though I am open to using `.` to provide a convention that
> > > architecture is distinguished using a `.`
> > >
> > > So how would this work...
> > >
> > > Ok, I have my foobar project that builds a .jar and the native
> libraries
> > > that are required by that .jar
> > >
> > > So from the reactor for that project we want to deploy
> > >
> > > com.example:foobar:::1.0:pom (the legacy pom for the .jar to allow
> > > modelVersion 4.0.0 consumption of the jar)
> > > com.example:foobar:::1.0:pdt (the modern project dependency trees for
> all
> > > attached artifacts)
> > > com.example:foobar:::1.0:jar (the jar)
> > > com.example:foobar::javadoc:1.0:jar (the javadoc jar)
> > > com.example:foobar::sources:1.0:jar (the source jar)
> > > com.example:foobar:win_x86::1.0:pom (the legacy pom for the 32-bit
> DLL)
> > > com.example:foobar:win_x86::1.0:dll (the 32-bit DLL... alternatively
> the
> > > type might be `native-library` or `lib` but let's assume DLL)
> > > com.example:foobar:win_x86_64::1.0:pom (the legacy pom for the 64-bit
> > DLL)
> > > com.example:foobar:win_x86_64::1.0:dll (the 64-bit DLL)
> > > com.example:foobar:osx_x86_64::1.0:pom (the legacy pom for the 64-bit
> > OS-X
> > > .dylib)
> > > com.example:foobar:osx_x86_64::1.0:dylib (the 64-bit .dylib...
> > > alternatively the type might be `native-library` or `lib` but let's
> > assume
> > > dylib)
> > > com.example:foobar:elf_arm::1.0:pom (the legacy pom for the linux ARM
> > .so)
> > > com.example:foobar:elf_arm::1.0:so (the ARM .so ... alternatively the
> > type
> > > might be `native-library` or `lib` but let's assume so)
> > > com.example:foobar:elf_x86::1.0:pom (the legacy pom for the linux x86
> > > 32-bit .so)
> > > com.example:foobar:elf_x86::1.0:so (the x86-32-bit .so)
> > > com.example:foobar:elf_x86_64::1.0:pom (the legacy pom for the linux
> x86
> > > 64-bit .so)
> > > com.example:foobar:elf_x86_64::1.0:so (the x86 64-bit .so)
> > >
> > > My main build machine cannot cross-compile for PPC or ARM64... so we
> have
> > > two other build machines that will want to produce the extra
> architecture
> > > specific artifacts...
> > >
> > > com.example:foobar:elf_ppc::1.0:pom (the legacy pom for the linux PPC
> > .so)
> > > com.example:foobar:elf_ppc::1.0:so (the PPC .so)
> > >
> > > and
> > >
> > > com.example:foobar:elf_arm_64::1.0:pom (the legacy pom for the linux
> ARM
> > > 64-bit .so)
> > > com.example:foobar:elf_arm_64::1.0:so (the ARM 64-bit .so)
> > >
> > > In order to accommodate delayed deployment, I am going to suggest that
> > the
> > > PPC and ARM64 deployments should publish their *supplemental* pdts at
> > their
> > > coordinates, e.g.
> > >
> > > com.example:foobar:elf_ppc::1.0:pdt (the supplemental project
> > > dependency trees for the PPC reactor artifacts)
> > >
> > > and
> > >
> > > com.example:foobar:elf_arm_64::1.0:pdt (the supplemental project
> > > dependency trees for the ARM64 reactor artifacts)
> > >
> > > So ultimately we would end up with the following files being deployed
> (in
> > > three "atomic" deployments):
> > >
> > > com/example/foobar/1.0/foobar-1.0.pom
> > > com/example/foobar/1.0/foobar-1.0.pdt
> > > com/example/foobar/1.0/foobar-1.0.jar
> > > com/example/foobar/1.0/foobar-1.0-javadoc.jar
> > > com/example/foobar/1.0/foobar-1.0-sources.jar
> > > com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.pom
> > > com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.dll
> > > com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.pom
> > > com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.dll
> > > com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.pom
> > > com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.dylib
> > > com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.pom
> > > com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.so
> > > com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.pom
> > > com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.so
> > > com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.pom
> > > com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.so
> > >
> > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pom
> > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pdt
> > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.so
> > >
> > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom
> > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt
> > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so
> > >
> > > When a modelVersion 5.0.0 consumer does something like:
> > >
> > > compile: {
> > >   dependencies: ["com.example:foobar:1.0:jar"]
> > > }
> > > test: {
> > >   dependencies: ["org.junit:junit:5.0:jar"]
> > > }
> > >
> > > and wants to run its tests on linux ARM64, it will start by resolving
> > > `com/example/foobar/1.0/foobar-1.0.pdt`. This will give it the
> > > dependency tree of the `.jar`, which will declare an architecture
> > > dependent native library dependency (somehow or other... this is why we
> > > may use `native-library` as the "type"). Because it knows that it is
> > > running on the ARM64 architecture, it will then know that it needs
> > > `com.example:foobar:elf_arm_64::1.0:so`. Since this is not available in
> > > the `com/example/foobar/1.0/foobar-1.0.pdt` trees, it will then attempt
> > > to download
> > > `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt`. If that
> > > exists, it will use that tree... if it doesn't exist... we fail the
> > > build (technically we could fall back to checking for
> > > `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom` and
> > > `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so` before
> > > failing the build... but we know the artifacts were produced by a 5.0.0
> > > aware producer, as we have `com/example/foobar/1.0/foobar-1.0.pdt`
> > > resolved)
> > >
> > > A modelVersion 4.0.0 consumer is not really going to be able to have as
> > > flexible a build... but at least they can - through declarations such
> as
> > >
> > > <dependency>
> > >   <groupId>com.example</groupId>
> > >   <artifactId>foobar-elf_arm_64</artifactId>
> > >   <version>1.0</version>
> > >   <type>so</type>
> > > </dependency>
> > >
> > > grab the .so to bundle into a .zip or installer and if they want to
> > write a
> > > pom with architecture based profile activation injecting test scoped
> > > dependencies they can do that also
> > >
> > > WDYT?
> > >
> > > If anyone has any experience from the NMaven experiments, or learnings
> > from
> > > .deb or .rpm attempts to solve architecture dependent artifacts mixed
> > with
> > > noarch artifacts... please step forward and join the discussion.
> > >
> > > -Stephen
> > >
> > > Notes:
> > >
> > > 1. I am not saying what conventions will be used to define the
> > > `architectureId` values here
> > > 2. I am not discussing the schema for the .pdt files here... other than
> > > the general principle that they will contain multiple dependency trees
> > > for each artifact produced by the project
> > > 3. I am not discussing how a modelVersion 5.0.0 build would be invoked
> or
> > > detect that it should just do the PPC deployment
> > > 4. This proposal does not include the new metadata schema that we would
> > > likely require to assist with such a deployment format
> > > 5. I am not discussing or proposing a modelVersion 5.0.0 schema
> here... I
> > > use a non-XML format to help people mentally disassociate thinking
> about
> > > the architectureId specific things from the current 4.0.0 way of doing
> > > things
> > >
> >
> >
> >
> > --
> > Cheers
> > Tibor
> >
>
>
> --
> Sent from my phone
>



-- 
Cheers
Tibor
