Re: build q: getting the full DAG or all maven dependencies
Does the hadoop-dist package help? It does the packaging for Hadoop. IIRC it lists all the projects so that the dist step kicks in after everything is built [1], and the scripts referenced in its pom do the packaging work.

For protobuf, I think the yarn modules don't define a scope, so it defaults to compile. Setting the scope in the parent pom [2] should help; I tried it locally and it did. Otherwise you need to define the scope in every POM which uses protobuf...

-Ayush

[1] https://github.com/apache/hadoop/blob/trunk/hadoop-dist/pom.xml#L32
[2] https://github.com/apache/hadoop/commit/c373d3fa39013e8a8f6a6122c3ca230b4aa10abe

On Tue, 13 Feb 2024 at 23:37, Steve Loughran wrote:

> It does, but I'm not sure there is a single module where you can ask for
> it and get the full list.
>
> For the verification project I've got, I may declare more poms as
> dependencies so I can do the aggregate scan there. This would also let me
> run mvn dependency:tree -Dverbose, save the output to a file and see what
> is there. It would also let us define lists of libraries we don't want in
> distributions.
Re: build q: getting the full DAG or all maven dependencies
I tried running it at the root project (hadoop) and got a meaningful dependency tree. It prints an exhaustive, transitive tree of dependencies. As for log4j with your patch, I see two ways log4j is introduced:

- log4j -> hadoop-common@2.8.5 -> hadoop-yarn-server-timelineservice-hbase-tests
- log4j -> solr:solr-core@8.11.2 -> hadoop-yarn-applications-catalog-webapp:war

That said, both are test-scoped. I'm not sure why we're packaging test-only dependencies into the hadoop distro. Is this a known issue?

Sangjin

On Tue, Feb 13, 2024 at 10:07 AM Steve Loughran wrote:

> It does, but I'm not sure there is a single module where you can ask for
> it and get the full list.
>
> For the verification project I've got, I may declare more poms as
> dependencies so I can do the aggregate scan there. This would also let me
> run mvn dependency:tree -Dverbose, save the output to a file and see what
> is there. It would also let us define lists of libraries we don't want in
> distributions.
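Chains like the two log4j paths listed in this message can be recovered from a saved dependency tree by tracking indentation. The following is a minimal sketch, not the method used in the thread: it assumes plain (non-verbose) text output from mvn dependency:tree, where each nesting level adds three characters of tree drawing, and the function name paths_to is hypothetical.

```python
import re

# Assumption: each line looks like "[INFO] |  \- group:artifact:type:version:scope"
# and every nesting level adds exactly three characters ("|  ", "+- ", "\- ", "   ").
LINE = re.compile(r"^(?:\[INFO\] )?(?P<indent>[|+\\\- ]*)(?P<coord>[^\s|+\\-]\S*)")


def paths_to(lines, needle):
    """Return every root-to-dependency chain ending at `needle` ("group:artifact")."""
    stack = []   # coordinates from the module root down to the current line
    found = []
    for line in lines:
        match = LINE.match(line)
        # Maven coordinates have at least three colons; skip other log lines.
        if not match or match.group("coord").count(":") < 3:
            continue
        depth = len(match.group("indent")) // 3
        del stack[depth:]                 # drop siblings and deeper entries
        stack.append(match.group("coord"))
        if match.group("coord").startswith(needle + ":"):
            found.append(list(stack))
    return found
```

Calling paths_to(open("tree.txt"), "log4j:log4j") on output redirected from a root-level run (tree.txt is a placeholder name) would return one list per occurrence, with the owning module first and the log4j coordinate last, including its scope.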
Re: build q: getting the full DAG or all maven dependencies
It does, but I'm not sure there is a single module where you can ask for it and get the full list.

For the verification project I've got, I may declare more poms as dependencies so I can do the aggregate scan there. This would also let me run mvn dependency:tree -Dverbose, save the output to a file and see what is there. It would also let us define lists of libraries we don't want in distributions.

On Mon, 12 Feb 2024 at 18:03, Sangjin Lee wrote:

> Does the maven dependency plugin help? I might try mvn dependency:tree and
> see if it takes you somewhere.
>
> Sangjin
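The idea of saving the dependency output to a file and checking it against a list of unwanted libraries could be sketched roughly as below. This is an illustration under stated assumptions, not an existing Hadoop tool: the deny-list entries, the Maven coordinates chosen for protobuf and log4j, and the function names are all hypothetical, and the parser assumes plain (non-verbose) mvn dependency:tree text output.

```python
import re

# Hypothetical deny-list: "group:artifact" -> unwanted version prefix.
# The thread only names protobuf 2.5 and log4j 1.x; the coordinates are assumptions.
BANNED = {
    "com.google.protobuf:protobuf-java": "2.5",
    "log4j:log4j": "1.",
}

# Strips an optional "[INFO] " prefix plus tree-drawing characters ("|", "+-", "\-").
TREE_PREFIX = re.compile(r"^(?:\[INFO\]\s)?[|+\\\-\s]*")


def parse_coordinate(line):
    """Return (group, artifact, version, scope), or None for non-dependency lines."""
    parts = TREE_PREFIX.sub("", line.rstrip()).split(":")
    if len(parts) == 5:        # group:artifact:type:version:scope
        group, artifact, _type, version, scope = parts
    elif len(parts) == 6:      # group:artifact:type:classifier:version:scope
        group, artifact, _type, _classifier, version, scope = parts
    else:
        return None
    if " " in group or not scope.strip():
        return None            # an ordinary log line that happens to contain colons
    return group, artifact, version, scope.split()[0]


def find_banned(lines, banned=BANNED):
    """List (group:artifact, version, scope) for every deny-listed dependency seen."""
    hits = []
    for line in lines:
        parsed = parse_coordinate(line)
        if parsed is None:
            continue
        group, artifact, version, scope = parsed
        prefix = banned.get(f"{group}:{artifact}")
        if prefix is not None and version.startswith(prefix):
            hits.append((f"{group}:{artifact}", version, scope))
    return hits
```

Run against output redirected to a file, for example find_banned(open("tree.txt")) where tree.txt is a placeholder name, this reports each unwanted library together with its scope, which helps separate test-only entries from ones that would ship.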
Re: build q: getting the full DAG or all maven dependencies
Does the maven dependency plugin help? I might try mvn dependency:tree and see if it takes you somewhere.

Sangjin

On Mon, Feb 12, 2024 at 9:50 AM Steve Loughran wrote:

> How can we work out the entire DAG of dependencies in a hadoop distro?
>
> I'm asking as there are things in 3.4.0 that we shouldn't need (protobuf
> 2.5), and when I add the PR to move off log4j 1.2.17 to reload4j, I still
> find a log4j jar in the yarn timeline lib dir
> https://github.com/apache/hadoop/pull/6547
>
> See HADOOP-19074 for a list of what is in 3.4.0 RC0, which predates the new
> shaded jar.
build q: getting the full DAG or all maven dependencies
How can we work out the entire DAG of dependencies in a hadoop distro?

I'm asking as there are things in 3.4.0 that we shouldn't need (protobuf 2.5), and when I add the PR to move off log4j 1.2.17 to reload4j, I still find a log4j jar in the yarn timeline lib dir
https://github.com/apache/hadoop/pull/6547

See HADOOP-19074 for a list of what is in 3.4.0 RC0, which predates the new shaded jar.