@Team, please share your insights on this issue.

On 2021/07/01 09:11:12, wecai <cw8...@126.com> wrote:
> We have a large dependency which has 300+ transitive dependencies; let's
> call it BigDep1.
>
> We have a large number of libraries that depend on BigDep1. We may add
> exclusions when we use these libraries in our project:
>
> <dependency>
>   <groupId>com.company...</groupId>
>   <artifactId>Lib1</artifactId>
>   <exclusions>
>     <exclusion>
>       <groupId>some_group_id</groupId>
>       <artifactId>some_artifact_id</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
>
> It took a long time and a huge amount of memory to build the project. We
> saw that BigDep1 was resolved thousands of times without a single hit in
> the in-memory cache.
>
> From the code, we can see that Maven tries to load the resolved result of
> BigDep1 from the cache, but when debugging, the lookup always failed.
> The key is determined by the GAV, repositories, childSelector,
> childManager, childTraverser and childFilter, which means the exclusions
> are effectively part of the key.
> https://github.com/apache/maven-resolver/blob/master/maven-resolver-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java#L504
>
>     Object key =
>         args.pool.toKey( d.getArtifact(), childRepos, childSelector,
>                          childManager, childTraverser, childFilter );
>
>     List<DependencyNode> children = args.pool.getChildren( key );
>     // Always null here, so the children are recalculated and saved to the
>     // cache again, which takes a long time and consumes a lot of memory.
>     if ( children == null )
>     {
>         args.pool.putChildren( key, child.getChildren() );
>
>         args.nodes.push( child );
>
>         process( args, results, descriptorResult.getDependencies(), childRepos,
>                  childSelector, childManager, childTraverser, childFilter );
>
>         args.nodes.pop();
>     }
>
> Let me use a simple pattern to describe the problem:
>
> Lib1 -> BigDep1
> Lib2 -> Lib3 (has exclusion) -> BigDep1
> Lib4 -> Lib2
> ...
>
> Now in our project, we use the libraries Lib1, Lib2 and Lib4, with
> exclusions:
>
> Project -> Lib1
> Project -> Lib2
> Project -> Lib4 (has exclusion)
>
> Here is how Maven resolves the dependencies today:
>
> Maven starts to resolve Lib1 (Lib1 -> BigDep1). It first resolves BigDep1
>   and caches BigDep1 in memory.
> Maven starts to resolve Lib2 (Lib2 -> Lib3 (has exclusion) -> BigDep1).
>   Because Lib3 has an exclusion, Maven cannot load BigDep1 from the cache
>   and calculates BigDep1 again.
> Maven starts to resolve Lib4 (has exclusion)
>   (Lib4 (has exclusion) -> Lib2 -> Lib3 -> BigDep1). Because Lib4 has an
>   exclusion, Maven cannot load Lib2, Lib3 or BigDep1 from the cache, so all
>   of them are recalculated.
>
> I'm wondering whether we could use the GAV alone as the cache key and apply
> the exclusions later. Maven could then resolve the dependencies this way:
>
> Maven starts to resolve Lib1. It first resolves BigDep1 and caches it
>   using BigDep1's GAV as the key.
> Maven starts to resolve Lib2 (Lib2 -> Lib3 (has exclusion) -> BigDep1).
>   Maven gets BigDep1 from the cache, then calculates Lib3 without applying
>   the exclusion and caches the result under Lib3's GAV.
> When Maven comes to resolve Lib2 itself, it applies Lib3's exclusion to
>   Lib3, adds the filtered Lib3 as a child of Lib2, and then caches Lib2.
> Maven starts to resolve Lib4 (has exclusion)
>   (Lib4 (has exclusion) -> Lib2 -> Lib3 -> BigDep1). Maven gets Lib2 from
>   the cache, then calculates Lib4 without applying the exclusion and caches
>   Lib4.
> When Maven comes to resolve the current project, it applies Lib4's
>   exclusion, adds the filtered Lib4 as a child of the Project module, and
>   then caches the Project's resolved result.
>
> Does this make sense?
>
> This means every library's resolved result is cached under its own GAV.
> Only a dependent needs to load the result from the cache and apply its
> exclusions, if any.
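
To make the cache behaviour described above concrete for the team, here is a minimal sketch. It is not the maven-resolver DataPool: `CacheKeyDemo`, `resolve`, `missesForBigDep1` and the coordinates are names I made up for illustration, and the set of inherited exclusions only stands in for the state carried by childSelector. It just shows that the same GAV reached through three different exclusion contexts produces three distinct keys when the exclusions are part of the key, and a single key when only the GAV is used.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: this is NOT the maven-resolver DataPool. It models a
// cache whose key is either (GAV + inherited exclusions) or the GAV alone.
class CacheKeyDemo {
    private final boolean exclusionsInKey;
    private final Map<Object, String> cache = new HashMap<>();
    int misses = 0;

    CacheKeyDemo(boolean exclusionsInKey) {
        this.exclusionsInKey = exclusionsInKey;
    }

    // Looks up the (pretend) resolved children of a GAV.
    // Every miss stands for one full re-collection of the subtree.
    String resolve(String gav, Set<String> inheritedExclusions) {
        Object key = exclusionsInKey
                ? Arrays.asList(gav, new HashSet<>(inheritedExclusions))
                : gav;
        String cached = cache.get(key);
        if (cached == null) {
            misses++;
            cached = "children-of-" + gav;
            cache.put(key, cached);
        }
        return cached;
    }

    // Replays the three paths from the quoted message that reach BigDep1.
    static int missesForBigDep1(boolean exclusionsInKey) {
        CacheKeyDemo pool = new CacheKeyDemo(exclusionsInKey);
        String bigDep1 = "com.example:BigDep1:1.0"; // hypothetical coordinates
        Set<String> none = Collections.emptySet();
        Set<String> viaLib3 = new HashSet<>(Arrays.asList("g:a1"));
        Set<String> viaLib4 = new HashSet<>(Arrays.asList("g:a1", "g:a2"));
        pool.resolve(bigDep1, none);    // Project -> Lib1 -> BigDep1
        pool.resolve(bigDep1, viaLib3); // Project -> Lib2 -> Lib3 (exclusion) -> BigDep1
        pool.resolve(bigDep1, viaLib4); // Project -> Lib4 (exclusion) -> Lib2 -> Lib3 -> BigDep1
        return pool.misses;
    }

    public static void main(String[] args) {
        System.out.println("exclusions in key: " + missesForBigDep1(true) + " misses");
        System.out.println("GAV-only key:      " + missesForBigDep1(false) + " misses");
    }
}
```

Running `main` reports three misses with the composite key and one with the GAV-only key, which matches the pattern Eric describes: every new exclusion context forces BigDep1 to be collected again.
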
>
> Thanks,
> Eric
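
For discussion, here is a rough sketch of the proposal as I read it: cache each artifact's subtree under its coordinates only, and apply an edge's exclusions when the cached subtree is attached to its parent. All types here (`LateExclusionDemo`, `Dep`, `Node`, `prune`) are hypothetical and are not maven-resolver classes. The sketch assumes an acyclic graph and ignores versions, scopes and dependencyManagement, so it does not establish that late pruning yields the same graph as the current selectors in every case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: hypothetical types, not the maven-resolver API.
// Assumes an acyclic graph; ignores versions, scopes and dependencyManagement.
class LateExclusionDemo {
    // A declared dependency edge: target coordinates plus the exclusions on that edge.
    static final class Dep {
        final String ga;
        final Set<String> exclusions;
        Dep(String ga, String... exclusions) {
            this.ga = ga;
            this.exclusions = new HashSet<>(Arrays.asList(exclusions));
        }
    }

    // A resolved node: coordinates plus its resolved children.
    static final class Node {
        final String ga;
        final List<Node> children;
        Node(String ga, List<Node> children) {
            this.ga = ga;
            this.children = children;
        }
    }

    final Map<String, List<Dep>> declared = new HashMap<>(); // stands in for reading POMs
    final Map<String, Node> cache = new HashMap<>();         // keyed by coordinates only
    int collections = 0;                                      // subtrees actually built

    // Resolves an artifact as its own POM declares it, independent of who asks for it.
    Node resolve(String ga) {
        Node cached = cache.get(ga);
        if (cached != null) {
            return cached;
        }
        collections++;
        List<Node> children = new ArrayList<>();
        for (Dep dep : declared.getOrDefault(ga, Collections.emptyList())) {
            // Exclusions are applied while attaching the child, after the cache lookup.
            children.add(prune(resolve(dep.ga), dep.exclusions));
        }
        Node node = new Node(ga, children);
        cache.put(ga, node);
        return node;
    }

    // Returns a pruned copy so the cached subtree is never modified.
    static Node prune(Node node, Set<String> excluded) {
        if (excluded.isEmpty()) {
            return node;
        }
        List<Node> kept = new ArrayList<>();
        for (Node child : node.children) {
            if (!excluded.contains(child.ga)) {
                kept.add(prune(child, excluded));
            }
        }
        return new Node(node.ga, kept);
    }

    // Collects every coordinate reachable from the node, for inspection.
    static Set<String> reachable(Node node, Set<String> out) {
        out.add(node.ga);
        for (Node child : node.children) {
            reachable(child, out);
        }
        return out;
    }

    // The graph from the quoted message; t1 and t2 are made-up transitive dependencies.
    static LateExclusionDemo sample() {
        LateExclusionDemo d = new LateExclusionDemo();
        d.declared.put("BigDep1", Arrays.asList(new Dep("t1"), new Dep("t2")));
        d.declared.put("Lib1", Arrays.asList(new Dep("BigDep1")));
        d.declared.put("Lib3", Arrays.asList(new Dep("BigDep1")));
        d.declared.put("Lib2", Arrays.asList(new Dep("Lib3", "t1")));
        d.declared.put("Lib4", Arrays.asList(new Dep("Lib2")));
        d.declared.put("Project", Arrays.asList(
                new Dep("Lib1"), new Dep("Lib2"), new Dep("Lib4", "t2")));
        return d;
    }

    public static void main(String[] args) {
        LateExclusionDemo d = sample();
        Node project = d.resolve("Project");
        System.out.println("subtrees built: " + d.collections);
        System.out.println("reachable: " + reachable(project, new HashSet<>()));
    }
}
```

In this sample each of the eight artifacts is built exactly once, including BigDep1, even though it is reached through three paths with different exclusions. The price is that `prune` copies part of the subtree at each edge that carries exclusions, so the cost moves from re-collection to copying; whether that trade is a net win for a 300+ node BigDep1 would need measuring.
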