On 11/05/2009, at 11:12 AM, Brian Fox wrote:
It's time to start looking at the problems with the current 2.x resolution
scheme as it specifically relates to repository declaration and discovery.
Sorry for the delay in responding to this, I'm still catching up on May.
I think the first few sections are accurate and complete.
For requirements:
1. maintain the ability for a user to checkout your code and run mvn
install and have it work with no prior setup on their part.
+1
2. be able to depend on some jar and not worry about any
repositories required for transitive resolution (ie discover the
repositories transitively as dependencies are processed) (this is
controversial and may be eliminated. First it contributes to the
Problem #4 above in that SAT can't be done on a bounded list of
repositories. It also doesn't work normally behind a repository
manager because the list of repos is usually controlled in the repo
manager and thus autodiscovery is intentionally blocked, usually via
a mirrorOf * to circumvent the repos maven finds in the poms.)
I think we can achieve this in a way that is compatible with repo
managers, depending on the solution (see below)
If we have this though, we need to add a new requirement:
5. builds should be able to add their own alternative versions for
artifacts (eg, see xwiki's build that provides a lot of custom
versions of standard things), without affecting other builds. So in
this case, they would use a custom version to ensure within their
build it can override others and contribute to ranges, but its
existence in a local repository shouldn't affect other builds.
3. be able to separate the dependencies needed by maven plugins from
those needed by the build. This means not only where they are
resolved from, but also how they are stored locally to prevent cross-
contamination.
I think I would reword this. I can understand wanting to locate
plugins separately, and for their repos/deps not to affect the rest of
the build, but I'm not sure why local storage matters. A dependency
junit:junit:3.8.1 used in a plugin should be the same as that used in
a project. Perhaps an alternate/additional requirement is "3. a given
artifact coordinate must always resolve to an identical artifact across
a build".
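One way to read that reworded requirement is as a check a build could run over everything it resolved, whether for a plugin or for the project itself. This is only a minimal sketch: the record layout (coordinate, consumer, content) and the function name are assumptions for illustration, not anything Maven provides.

```python
import hashlib
from collections import defaultdict


def find_inconsistent(resolved):
    """Return coordinates that resolved to more than one distinct artifact.

    `resolved` is a hypothetical list of (coordinate, consumer, content)
    tuples, where content is the raw bytes of the resolved file.
    """
    digests = defaultdict(set)
    for coordinate, _consumer, content in resolved:
        digests[coordinate].add(hashlib.sha1(content).hexdigest())
    return sorted(c for c, seen in digests.items() if len(seen) > 1)


# Illustrative data only: the same junit bytes reach both the project and
# a plugin, while a second coordinate resolves to two different files.
resolved = [
    ("junit:junit:3.8.1", "project", b"junit-bytes"),
    ("junit:junit:3.8.1", "plugin", b"junit-bytes"),
    ("org.example:lib:1.0", "project", b"lib-bytes"),
    ("org.example:lib:1.0", "plugin", b"patched-lib-bytes"),
]
print(find_inconsistent(resolved))
```

Under this reading, local storage layout matters only to the extent that it lets two different files hide behind one coordinate.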
4. Repository identification: at this point we are pretty much in
agreement that the URL should be the unique identifier for a
repository. People who care about what they are publishing either
need to use canonical repositories like Maven central or need to
guarantee the existence of the repositories or have decent pointers.
In a fully distributed system, the relocation mechanism we have does
not work without a master to manage relocations.
This is a solution, not a requirement :) I think it's clear we need a
unique identifier. A URI is a good way to do that, but we need to
accommodate that repositories will move too (This was a problem listed
earlier). Depending on how we solve the above, it may become less of
an issue. So perhaps reword as "repositories must be uniquely
identifiable and able to be relocated to a new location over time
without affecting existing builds".
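To make the split between identity and location concrete, here is a rough sketch in which the identifier a build refers to never changes and a separate table records where the repository lives now. The relocation table and the example.org/example.net URLs are assumptions made up for illustration.

```python
def current_location(identifier, relocations):
    """Follow relocation records from a repository identifier to where it lives now.

    `relocations` is a hypothetical mapping of old location -> new location.
    The identifier itself never changes, so existing builds keep referring to it.
    """
    seen = {identifier}
    location = identifier
    while location in relocations:
        location = relocations[location]
        if location in seen:
            raise ValueError("relocation loop for " + identifier)
        seen.add(location)
    return location


# Placeholder URLs: the repository has moved twice since builds first used it.
relocations = {
    "http://repo.example.org/maven2": "http://repo2.example.org/maven2",
    "http://repo2.example.org/maven2": "https://repo.example.net/releases",
}
print(current_location("http://repo.example.org/maven2", relocations))
print(current_location("http://other.example.com/repo", relocations))
```

Who maintains such a table is exactly the open question raised above about needing a master to manage relocations; the sketch does not answer it.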
I'd then break out artifact relocation as separate requirements:
6. relocating an artifact to a different coordinate must be possible
even if that is on a different repository
Stemming from the location I'd add:
7. repositories must be able to be mirrored to different locations, and
the user must be able to select a closer, identical repository of their choice.
Also, probably implied but worth stating:
8. all discovery must be possible without a repository manager
installed (though using one can improve the ability to route requests
differently)
And finally, maybe implied but worth being explicit about:
9. must work for locating parent projects (this will start giving us
better ways to deal with the chicken/egg problem and auto-versioning)
Turning to solutions since it has been a while now... here's some
starting points.
I'm tossing around two alternatives in my head:
1) using the repository as the start of the namespace (ie,
http://repo1.maven.org/maven2/junit/junit/3.8.1/junit-3.8.1.jar
is different to
http://repo.otherproject.com/junit/junit/3.8.1/junit-3.8.1.jar),
where the repository contributes to the "version" of the artifact,
but is considered the same group/artifact ID for the purpose of
resolution. Note that this is just for identification; location needs
to be separate.
2) considering group/artifact ID to be globally unique and repository
can be derived from that
I'm leaning towards (2) as it's a shorter notation and easier to
understand. Under (1), we'd probably need to be able to add the
repository to a dependency element (perhaps with a shorthand notation
defined in the pom or its parent).
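The difference between the two alternatives can be sketched as two identity types that share one resolution key. The type and function names are hypothetical; the two junit URLs are the ones used in alternative (1), trimmed to their repository roots.

```python
from collections import namedtuple

# Alternative (1): the repository is part of what identifies the artifact.
RepoScoped = namedtuple("RepoScoped", "repository group artifact version")
# Alternative (2): group/artifact ID is globally unique on its own.
Global = namedtuple("Global", "group artifact version")


def resolution_key(dep):
    """Both alternatives resolve against the same group/artifact pair."""
    return (dep.group, dep.artifact)


central = RepoScoped("http://repo1.maven.org/maven2", "junit", "junit", "3.8.1")
other = RepoScoped("http://repo.otherproject.com", "junit", "junit", "3.8.1")
plain = Global("junit", "junit", "3.8.1")

# Under (1) the two junit artifacts are distinct identities...
print(central != other)
# ...but all three are treated as the same group/artifact ID during resolution.
print(resolution_key(central) == resolution_key(other) == resolution_key(plain))
```

The extra `repository` field in the first type is the part that would have to surface in the dependency element under (1).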
Either way, the resolution mechanism should not be affected by the
repositories used. A given set of artifacts should always
resolve the same way. The versions available to a range calculation
will alter depending on the available repositories, but these should
all be known up front in the build. I don't think we need to deal with
how version ranges are calculated / made reproducible here (that's
being separately dealt with), as long as the above requirements are
met with respect to the repositories used for it.
To accommodate this, I think the repositories in the POM should become
constrained to locating metadata for a certain set of artifacts, so
they can be used to expand reach through resolution, but do not affect
anything already encountered, and do not affect resolution outside the
current project. As long as the revised (3) above holds, this will be
reproducible.
Given 1), 2), 3), and 5), I think a delegating structure for locating
an artifact is the way to go. That is, specifying *only* the
<dependency> element is enough for a build to locate an artifact, and
always get the same one. The advantages are significant: less
configuration/easier set up for new repositories, simpler resolution
logic, faster resolution as it never needs to search multiple
repositories. The delegation needs to go right down to the version
level (snapshots in one repo, releases in another). Then the downside
is loss of control (if we point javax to the download.java.net repos
automatically, we have to live with that doing dodgy stuff in that
namespace like bad POMs or changing released artifacts, or just being
down).
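As a rough illustration of delegation down to the version level, the sketch below picks exactly one repository per artifact from a table keyed by group prefix, with separate entries for releases and snapshots. The table format, the function name, and the example.net/example.org URLs are assumptions; only the central URL appears earlier in this thread.

```python
def delegate(group_id, version, delegations):
    """Pick the single repository responsible for group_id at this version.

    The longest matching group prefix wins, and the release/snapshot split
    is applied at each level, so no search across repositories is needed.
    """
    kind = "snapshot" if version.endswith("-SNAPSHOT") else "release"
    parts = group_id.split(".")
    for n in range(len(parts), -1, -1):
        prefix = ".".join(parts[:n])  # n == 0 gives "", the root entry
        repo = delegations.get(prefix, {}).get(kind)
        if repo:
            return repo
    raise LookupError("no delegation for %s (%s)" % (group_id, kind))


# Hypothetical delegation table with placeholder URLs.
delegations = {
    "": {"release": "http://repo1.maven.org/maven2"},
    "javax": {"release": "http://javax.example.net/releases"},
    "org.example": {
        "release": "http://repo.example.org/releases",
        "snapshot": "http://repo.example.org/snapshots",
    },
}
print(delegate("javax.servlet", "2.5", delegations))
print(delegate("org.example.app", "1.0-SNAPSHOT", delegations))
print(delegate("junit", "3.8.1", delegations))
```

The loss of control shows up here as the `javax` row: whoever owns that entry decides what every build gets for that namespace.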
I think this can be overcome by layers of routing rules. So, if
central becomes the source of pointers to artifacts, then a project
can add a repository to locate *missing* ones (not override existing)
as described above, then a user can *alter* routes from their
settings.xml. A common one for this will be * -> repository manager,
but you could have others whether you are using a repo manager or not.
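The layering could look something like the following sketch: central supplies the base pointers, a project may only fill in routes that are missing, and user settings may alter any route. The function, the glob-style user patterns, and the example URLs are all assumptions for illustration, not an existing settings.xml format.

```python
import fnmatch


def route(group, central, project, user):
    """Resolve a group ID to a repository through three layers of rules.

    `central` and `project` map group IDs to repository URLs; `user` maps
    glob patterns to repository URLs. All three are hypothetical structures.
    """
    # User settings may alter any route; the first matching pattern wins.
    for pattern, repo in user.items():
        if fnmatch.fnmatchcase(group, pattern):
            return repo
    # Central is the source of pointers; a project only adds missing ones.
    routes = dict(project)
    routes.update(central)
    if group not in routes:
        raise LookupError("no route for " + group)
    return routes[group]


central = {"junit": "http://repo1.maven.org/maven2"}
project = {
    "junit": "http://repo.otherproject.com",  # ignored: central already routes junit
    "org.example.custom": "http://repo.example.org/releases",
}
everything_via_manager = {"*": "http://repoman.example.local/all"}

print(route("junit", central, project, {}))
print(route("org.example.custom", central, project, {}))
print(route("junit", central, project, everything_via_manager))
```

The `*` entry plays the role of the common "everything goes to the repository manager" route, and the same mechanism works with narrower patterns when no repository manager is in use.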
As for local storage, which was mentioned in the requirements, I'm
still in favour of this or similar:
http://docs.codehaus.org/display/MAVEN/Local+repository+separation
The important part here is that metadata is separated from artifacts
and local installations are only used when you intended them to be.
Anyway, just a starting point for discussion, if we can agree on some
of the fundamentals I'm sure we can build up a more complete solution.
Cheers,
Brett