gnodet commented on code in PR #197: URL: https://github.com/apache/maven-resolver/pull/197#discussion_r1002523689
##########
src/site/markdown/remote-repository-filtering.md:
##########
@@ -0,0 +1,70 @@
+# Remote Repository Filtering
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+A new Maven Resolver feature that allows filtering of Artifact by RemoteRepository based on various (extensible)
+criteria.
+
+## Why?
+
+Remote Repository Filtering (RRF) is a long asked feature of Maven, and plays huge role when your build uses
+several remote repositories. In such cases Maven "searches" the ordered list (effective POM) of remote repositories,
+and artifact gets resolved using "first wins" strategy. This have several implications:
+
+* your build gets slower, as if your artifact is in Nth repository, Maven must make N-1 requests that will result in
+  404 Not Found only to get to Nth repository to finally get the artifact.
+* you build "leaks" artifact requests, as those repositories are asked for artifacts, that does not (or worse,
+  cannot) have them. Still, those remote repository operators do get your requests in access logs.
+* to "simplify" things, users tend to use MRM "group" (or "virtual") repositories, that causes data loss on
+  Maven Project side (project loses artifact origin information) and ends up in disasters, as at the end these
+  "super-uber groups" grow uncontrollably, their member count become uncontrollabble (as new members are being
+  added as time passes), or created groups count grows uncontrollably, and project start loosing the knownled
+  about their required remote repositories, needed to (re)build a project, hence these projects become
+  unbuildable without the MRM, projects become bound to MRM.
+
+So Maven by default gets slower as remote repositories are added, leaks your own build informations to remote
+repository operators, and current solutions offered to solve this problem just end up in disasters (most often).
+
+## What it is?
+
+Imagine you can instruct Maven which repository can contain what artifact? Instead of "round robin" searching
+for artifacts in remote repositories, Maven could be instructed in controlled way to directly reach only the
+needed remote repository.
+
+With RRF, Maven build does NOT have to slow down with new remote repositories added, and will not leak either
+build information anywhere, as it will get things from where they should be get from.
+
+## What it is not?
+
+When it solely comes to dependencies, don't forget
+[maven-enforcer-plugin](https://maven.apache.org/enforcer/enforcer-rules/bannedDependencies.html) rules that are doing
+exactly that. RRF is NOT an alternative means to these enforcer rules, they are alternative tools to make your build
+more faster and more private, optimized, without loosing build information (remote repositories should be in POM).
+
+## Maven Central is special
+
+Maven Central (MC) repository is special in this respect, as Maven will always try to get things from here, as your build,
+plugins, plugin dependencies, extension, etc will most often come from here. While you CAN filter MC, filtering MC is
+most often a bad idea (filtering, as in "limiting what can come from it"). On other hand, MC itself offers helps
+to prevent request leakage to it (publishes available prefixes, see below).
+
+So, **most often** limiting "what can be fetched" from MC is a bad idea, it **can be done** but in very very cautious way,
+as otherwise you risk your build. RRF does not distinguish the "context" of an artifact, it merely filters them out
+by {artifact, remoteRepository) pair, and by limiting MC you can easily get into state where you break your build (as
+plugin depends on filtered artifact).

Review Comment:
   Would it be possible to add a `How?` section briefly explaining how to set up RRF ?
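
For readers of this thread, here is a rough conceptual sketch of the pair-based filtering the quoted document describes: a decision is made per {artifact, remoteRepository} pair, and the ordered repository list is narrowed before any request is sent. Everything in it is an assumption made for illustration: the class `RrfSketch`, its methods, the repository ids and the groupId-prefix rule are hypothetical and are not the Maven Resolver API or its configuration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only; hypothetical names, not the Maven Resolver API. */
public class RrfSketch {

    /** Hypothetical rule set: repository id mapped to the groupId prefixes it may serve. */
    private final Map<String, List<String>> allowedPrefixes = new LinkedHashMap<>();

    public RrfSketch allow(String repositoryId, String... groupIdPrefixes) {
        allowedPrefixes.put(repositoryId, List.of(groupIdPrefixes));
        return this;
    }

    /** Decides on the {artifact, remoteRepository} pair alone; there is no notion of "context". */
    public boolean accept(String groupId, String repositoryId) {
        List<String> prefixes = allowedPrefixes.get(repositoryId);
        if (prefixes == null) {
            return true; // assumption: repositories without a rule stay unfiltered
        }
        for (String prefix : prefixes) {
            if (groupId.equals(prefix) || groupId.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    /** Keeps the declared repository order and drops repositories the filter rejects. */
    public List<String> candidates(String groupId, List<String> orderedRepositoryIds) {
        List<String> result = new ArrayList<>();
        for (String repositoryId : orderedRepositoryIds) {
            if (accept(groupId, repositoryId)) {
                result.add(repositoryId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RrfSketch filter = new RrfSketch()
                .allow("corp-releases", "com.example")
                .allow("vendor-repo", "org.vendor");
        List<String> repos = List.of("corp-releases", "vendor-repo", "central");

        System.out.println(filter.candidates("org.vendor.lib", repos));  // [vendor-repo, central]
        System.out.println(filter.candidates("com.example.app", repos)); // [corp-releases, central]
    }
}
```

In this sketch `central` carries no rule, so it is never filtered, which mirrors the document's caution about limiting Maven Central. A request for `org.vendor.lib` skips `corp-releases` entirely, so that repository neither costs a 404 round trip nor sees the request in its access logs.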