[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809761#comment-17809761
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

Hexiaoqiao commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1905390687

   Committed to trunk. Thanks @JiaLiangC and @steveloughran .




> Parallel Maven Build Support for Apache Hadoop
> --
>
> Key: HADOOP-19019
> URL: https://issues.apache.org/jira/browse/HADOOP-19019
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Reporter: caijialiang
> Priority: Major
> Labels: pull-request-available
> Attachments: patch11-HDFS-17287.diff
>
>
> Why compilation is slow: Hadoop has many modules, and because they cannot be 
> compiled in parallel, the build is slow. The first compilation of Hadoop can 
> take several hours, and even with dependencies already in the local Maven 
> repository, a subsequent compilation can still take close to 40 minutes.
> How to solve it: Use {{mvn dependency:tree}} and {{maven-to-plantuml}} to 
> investigate the dependency issues that prevent parallel compilation.
>  * Investigate the dependencies between project modules.
>  * Analyze the dependencies in multi-module Maven projects.
>  * Download {{maven-to-plantuml}}:
>  
> {{wget https://github.com/phxql/maven-to-plantuml/releases/download/v1.0/maven-to-plantuml-1.0.jar}}
>  * Generate a dependency tree:
>  
> {{mvn dependency:tree > dep.txt}}
>  * Generate a UML diagram from the dependency tree:
>  
> {{java -jar maven-to-plantuml-1.0.jar --input dep.txt --output dep.puml}}
> For more information, visit: [maven-to-plantuml GitHub 
> repository|https://github.com/phxql/maven-to-plantuml/tree/master].
>  
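As a rough complement to the diagram step in the quoted text, the saved dependency tree can also be reduced to a plain list of module-to-module edges. The sketch below assumes the default text output of {{mvn dependency:tree}} (lines prefixed with "[INFO] ", direct dependencies introduced by "+- " or "\- "). The function name, the group id "org.example" and the module names in the sample are hypothetical and only serve the illustration.

```python
import re
from typing import Dict, Optional, Set

# A module's own coordinates: group:artifact:packaging:version, with no tree prefix.
ROOT = re.compile(r"^\[INFO\] ([\w.-]+):([\w.-]+):[\w.-]+:[\w.-]+$")
# A direct dependency: "+- " or "\- " immediately after the "[INFO] " prefix.
DIRECT = re.compile(r"^\[INFO\] [+\\]- ([\w.-]+):([\w.-]+):")


def direct_module_deps(tree_text: str, group: str) -> Dict[str, Set[str]]:
    """Map each module to the direct dependencies that share the given group id."""
    edges: Dict[str, Set[str]] = {}
    current: Optional[str] = None
    for raw_line in tree_text.splitlines():
        line = raw_line.rstrip()
        root = ROOT.match(line)
        if root:
            # Start of a new module's tree; record it even if it has no in-project deps.
            current = root.group(2)
            edges.setdefault(current, set())
            continue
        dep = DIRECT.match(line)
        if dep and current is not None and dep.group(1) == group:
            edges[current].add(dep.group(2))
    return edges


# Hypothetical excerpt in the shape of dep.txt (not real Hadoop output).
sample = r"""
[INFO] org.example:module-a:jar:1.0
[INFO] +- org.example:module-b:jar:1.0:compile
[INFO] |  \- com.other:lib-x:jar:2.0:compile
[INFO] \- com.other:lib-y:jar:3.1:compile
[INFO] org.example:module-b:jar:1.0
"""

print(direct_module_deps(sample, "org.example"))
```

Only direct, same-group edges are kept here, since those are the ones that constrain the order in which modules can be built.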
> *Hadoop Parallel Compilation Submission Logic*
>  # Reasons for Parallel Compilation Failure
>  * In sequential compilation, modules are compiled one by one in module 
> order, so every module is already built by the time a later module needs it, 
> and no errors occur.
>  * In parallel compilation, modules are built concurrently, and the build 
> order is determined by the declared inter-module dependencies. If Module A 
> depends on Module B, then Module B is compiled before Module A; modules with 
> no declared dependency between them may be built at the same time.
> When Hadoop is compiled in parallel, for example when compiling 
> {{hadoop-yarn-project}}, the declared dependencies between modules are 
> correct. The issue arises during the dist packaging stage, because {{dist}} 
> packages all other compiled modules.
> *Behavior of {{hadoop-yarn-project}} in Serial Compilation:*
>  * Serial compilation builds the modules listed in the pom one by one, in 
> sequence. After all modules are compiled, it builds 
> {{hadoop-yarn-project}}. During the {{prepare-package}} phase, 
> {{maven-assembly-plugin}} is executed for packaging. All packages are 
> repackaged according to the descriptor 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}}.
> *Behavior of {{hadoop-yarn-project}} in Parallel Compilation:*
>  * Parallel compilation builds modules according to the dependency order 
> among them. Modules that do not declare dependencies on each other through 
> {{dependency}} are built in parallel. According to the dependency 
> definitions in the pom of {{hadoop-yarn-project}}, its dependencies are built 
> first, and then {{hadoop-yarn-project}} itself is built and runs its 
> {{maven-assembly-plugin}}.
>  * However, not all of the files needed for packaging in 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} 
> come from modules listed in the {{dependency}} entries of 
> {{hadoop-yarn-project}}. As a result, when {{hadoop-yarn-project}} is built 
> and {{maven-assembly-plugin}} runs, some required modules may not have been 
> built yet, which leads to errors in parallel compilation.
> *Solution:*
>  * The solution is relatively straightforward: collect all modules 
> referenced by 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} 
> and declare them as dependencies in the pom of {{hadoop-yarn-project}}.
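
The failure mode described in the quoted text can be sketched with a small model. This is not Maven's scheduler, only an illustration under a simplifying assumption: a parallel build guarantees that a module's declared dependencies (direct and transitive) are finished before the module starts, and guarantees nothing about other modules. The function name and the modules "module-a", "module-b" and "module-c" are hypothetical.

```python
from typing import Dict, Set


def may_be_unbuilt(deps: Dict[str, Set[str]], module: str, needed: Set[str]) -> Set[str]:
    """Return the modules in `needed` that are not reachable from `module`
    through declared dependencies, so a parallel build does not guarantee
    they are finished when `module` starts packaging."""
    guaranteed: Set[str] = set()
    stack = list(deps.get(module, set()))
    while stack:
        current = stack.pop()
        if current not in guaranteed:
            guaranteed.add(current)
            stack.extend(deps.get(current, set()))
    return needed - guaranteed


# Declared dependencies before the fix (hypothetical module names).
declared = {
    "hadoop-yarn-project": {"module-a"},
    "module-a": {"module-c"},
    "module-b": set(),
    "module-c": set(),
}
# Modules whose output the assembly descriptor packages (hypothetical).
assembly_inputs = {"module-a", "module-b", "module-c"}

print(sorted(may_be_unbuilt(declared, "hadoop-yarn-project", assembly_inputs)))
# → ['module-b']

# The fix: declare every assembly input as a dependency of the packaging module.
fixed = dict(declared)
fixed["hadoop-yarn-project"] = declared["hadoop-yarn-project"] | {"module-b"}
print(sorted(may_be_unbuilt(fixed, "hadoop-yarn-project", assembly_inputs)))
# → []
```

In a serial build the same gap goes unnoticed, because every module has already been built by the time the packaging module runs.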



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809760#comment-17809760
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

Hexiaoqiao merged PR #6373:
URL: https://github.com/apache/hadoop/pull/6373







[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809216#comment-17809216
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

Hexiaoqiao commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1902975164

   If there are no other concerns, I will check this PR into trunk shortly. 
@steveloughran 







[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809161#comment-17809161
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

steveloughran commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1902729436

   Who is going to merge this? @Hexiaoqiao?







[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807721#comment-17807721
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

hadoop-yetus commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1895784931

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 50s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  18m 15s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  16m 39s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  mvnsite  |   4m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 49s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 20s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  shadedclient  | 135m 36s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 52s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  17m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m  9s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  16m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  mvnsite  |   4m 42s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 51s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 27s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  shadedclient  |  48m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 241m 44s |  |  hadoop-yarn-project in the patch passed.  |
   | +1 :green_heart: |  unit  | 162m 20s |  |  hadoop-mapreduce-project in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 633m 12s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6373 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint |
   | uname | Linux 08a9c77cafe4 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b9656c76142a19c5a1b71fcb91dd75d404f00c0a |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/3/testReport/ |
   | Max. process+thread count | 2702 (vs. ulimit of 5500) |
   | modules | C: hadoop-yarn-project hadoop-mapreduce-project U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/3/console |
   | versions | git=2.25.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   This 

[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807516#comment-17807516
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

JiaLiangC commented on code in PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#discussion_r1454392094


##
hadoop-yarn-project/pom.xml:
##
@@ -90,6 +91,56 @@
       <artifactId>hadoop-yarn-applications-catalog-webapp</artifactId>
       <type>war</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>

Review Comment:
   @steveloughran all these dependencies have already been changed to provided scope.








[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807511#comment-17807511
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

Hexiaoqiao commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1894833569

   Great! Thanks @JiaLiangC. Let's wait and see if any more folks would like to 
give another review here.




> Parallel Maven Build Support for Apache Hadoop
> --
>
> Key: HADOOP-19019
> URL: https://issues.apache.org/jira/browse/HADOOP-19019
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: caijialiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch11-HDFS-17287.diff
>
>
> The reason for the slow compilation: The Hadoop project has many modules, and 
> the inability to compile them in parallel results in a slow process. For 
> instance, the first compilation of Hadoop might take several hours, and even 
> with local Maven dependencies, a subsequent compilation can still take close 
> to 40 minutes, which is very slow.
> How to solve it: Use {{mvn dependency:tree}} and {{maven-to-plantuml}} to 
> investigate the dependency issues that prevent parallel compilation.
>  * Investigate the dependencies between project modules.
>  * Analyze the dependencies in multi-module Maven projects.
>  * Download {{{}maven-to-plantuml{}}}:
>  
> {{wget 
> [https://github.com/phxql/maven-to-plantuml/releases/download/v1.0/maven-to-plantuml-1.0.jar]}}
>  * Generate a dependency tree:
>  
> {{mvn dependency:tree > dep.txt}}
>  * Generate a UML diagram from the dependency tree:
>  
> {{java -jar maven-to-plantuml.jar --input dep.txt --output dep.puml}}
> For more information, visit: [maven-to-plantuml GitHub 
> repository|https://github.com/phxql/maven-to-plantuml/tree/master].
>  
> *Hadoop Parallel Compilation Submission Logic*
>  # Reasons for Parallel Compilation Failure
>  * 
>  ** In sequential compilation, as modules are compiled one by one in order, 
> there are no errors because the compilation follows the module sequence.
>  ** However, in parallel compilation, all modules are compiled 
> simultaneously. The compilation order during multi-module concurrent 
> compilation depends on the inter-module dependencies. If Module A depends on 
> Module B, then Module B will be compiled before Module A. This ensures that 
> the compilation order follows the dependencies between modules.
> But when Hadoop compiles in parallel, for example, compiling 
> {{{}hadoop-yarn-project{}}}, the dependencies between modules are correct. 
> The issue arises during the dist package stage. {{dist}} packages all other 
> compiled modules.
> *Behavior of {{hadoop-yarn-project}} in Serial Compilation:*
>  * 
>  ** In serial compilation, it compiles modules in the pom one by one in 
> sequence. After all modules are compiled, it compiles 
> {{{}hadoop-yarn-project{}}}. During the {{prepare-package}} stage, the 
> {{maven-assembly-plugin}} plugin is executed for packaging. All packages are 
> repackaged according to the description in 
> {{{}hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml{}}}.
> *Behavior of {{hadoop-yarn-project}} in Parallel Compilation:*
>  ** A parallel build orders modules by their declared dependencies. Modules 
> that do not declare each other as a {{dependency}} are built in parallel. 
> Maven therefore builds only the dependencies declared in the pom of 
> {{hadoop-yarn-project}} first, then builds {{hadoop-yarn-project}} itself 
> and runs its {{maven-assembly-plugin}}.
>  ** However, not every module whose files are packaged by 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} is 
> declared as a {{dependency}} of {{hadoop-yarn-project}}. When 
> {{maven-assembly-plugin}} runs for {{hadoop-yarn-project}}, some required 
> modules may not have been built yet, which causes the parallel build to 
> fail.
> *Solution:*
>  ** Collect every module referenced in 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} 
> and declare each one as a dependency in the pom of {{hadoop-yarn-project}}.
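
The fix described above can be sketched as a pom fragment. This is only an illustration under assumptions, not the committed patch: the two artifact names are taken from the PR #6373 review excerpts quoted elsewhere in this thread, the {{provided}} scope follows the reviewer's suggestion, and the actual change may declare more modules or use different settings.

```xml
<!-- Illustrative fragment for hadoop-yarn-project/pom.xml, placed inside <dependencies>. -->
<!-- Assumption: each module packaged by hadoop-yarn-dist.xml gets one entry like these, -->
<!-- so the parallel builder finishes them before maven-assembly-plugin runs. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-applications-unmanaged-am-launcher</artifactId>
  <version>${project.version}</version>
  <!-- 'provided' scope as suggested in review; treat this as an assumption -->
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-server-tests</artifactId>
  <version>${project.version}</version>
  <scope>provided</scope>
</dependency>
```

Declaring the modules as dependencies only affects build ordering here; with {{provided}} scope they are not meant to be passed on to consumers of {{hadoop-yarn-project}}.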



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802353#comment-17802353
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

JiaLiangC commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1876166196

   @Hexiaoqiao 
   Test environment: CentOS 8 x86_64, 16GB RAM, SSD.
   Tested on Hadoop 3.3.6.
   The initial serial build took almost 3 hours because dependency downloads 
were slow. With a parallel build (-T2C), the initial build took about 1 hour, 
approximately 2 times faster.
   For subsequent builds, with dependencies already downloaded locally, the 
parallel build of Hadoop took 13 minutes, while the serial build took 37 
minutes.






[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802173#comment-17802173
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

Hexiaoqiao commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1875368262

   @JiaLiangC Thanks for your work and for involving me here. This is a very 
interesting improvement. How much build time does switching to a parallel 
build save? Also, besides the hadoop-yarn module, do any other modules need 
their dependencies declared explicitly? Thanks again.







[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801641#comment-17801641
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

JiaLiangC commented on code in PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#discussion_r1439125229


##
hadoop-yarn-project/pom.xml:
##
@@ -90,6 +91,56 @@
       <artifactId>hadoop-yarn-applications-catalog-webapp</artifactId>
       <type>war</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>

Review Comment:
   Yes, the scope here should be defined as 'provided'.








[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801640#comment-17801640
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

JiaLiangC commented on code in PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#discussion_r1439125035


##
hadoop-yarn-project/pom.xml:
##
@@ -90,6 +91,56 @@
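       <artifactId>hadoop-yarn-applications-catalog-webapp</artifactId>
       <type>war</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-applications-distributedshell</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-applications-unmanaged-am-launcher</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-server-tests</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-server-timelineservice-hbase-client</artifactId>
+      <version>${project.version}</version>
+      <exclusions>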

Review Comment:
   cd hadoop-yarn-project
   
   mvn clean -T2C -Pnative -Pdist -Dtar -Psrc -Pyarn-ui -Dzookeeper.version=3.7.2 -Dhbase.profile=2.0 -DskipTests -DskipITs install
   
   The purpose of adding exclusions here is to resolve version conflicts during 
compilation, as shown in the diagram. I did not encounter this conflict when 
testing with Hadoop 3.3.6; it only appears in the trunk branch.
   
![image](https://github.com/apache/hadoop/assets/18082602/4e558bee-6218-4036-baee-8cf26109c550)
   






[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2024-01-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801596#comment-17801596
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

steveloughran commented on code in PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#discussion_r1439080434


##
hadoop-yarn-project/pom.xml:
##
@@ -90,6 +91,56 @@
       <artifactId>hadoop-yarn-applications-catalog-webapp</artifactId>
       <type>war</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>

Review Comment:
   shouldn't all these be scoped as provided?



##
hadoop-yarn-project/pom.xml:
##
@@ -90,6 +91,56 @@
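       <artifactId>hadoop-yarn-applications-catalog-webapp</artifactId>
       <type>war</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-applications-distributedshell</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-applications-unmanaged-am-launcher</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-server-tests</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-yarn-server-timelineservice-hbase-client</artifactId>
+      <version>${project.version}</version>
+      <exclusions>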

Review Comment:
   this worries me. why is this exclusion needed?






[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2023-12-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801406#comment-17801406
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

JiaLiangC commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1872647348

   @Hexiaoqiao Could you help review this pr?







[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2023-12-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801202#comment-17801202
 ] 

ASF GitHub Bot commented on HADOOP-19019:
-

hadoop-yetus commented on PR #6373:
URL: https://github.com/apache/hadoop/pull/6373#issuecomment-1872173866

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 40s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  16m 58s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  15m  4s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  mvnsite  |   4m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 57s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 23s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  shadedclient  | 133m 44s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 23s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  17m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m  6s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  17m  6s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  mvnsite  |   5m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m  2s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 21s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  shadedclient  |  54m 22s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 238m 49s | [/patch-unit-hadoop-yarn-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/2/artifact/out/patch-unit-hadoop-yarn-project.txt) |  hadoop-yarn-project in the patch passed.  |
   | +1 :green_heart: |  unit  | 161m  3s |  |  hadoop-mapreduce-project in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 632m 56s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6373 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint |
   | uname | Linux d64615eda07c 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 941069621b58020f19fcee00bef8e97a61a6cf27 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6373/2/testReport/ |
   | Max. process+thread count | 2703 (vs. ulimit of 5500) |

[jira] [Commented] (HADOOP-19019) Parallel Maven Build Support for Apache Hadoop

2023-12-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801093#comment-17801093
 ] 

Xiaoqiao He commented on HADOOP-19019:
--

Thanks [~jialiang] for your work. I have moved this issue from HDFS to the COMMON module.

> Parallel Maven Build Support for Apache Hadoop
> --
>
> Key: HADOOP-19019
> URL: https://issues.apache.org/jira/browse/HADOOP-19019
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Reporter: caijialiang
> Priority: Major
> Labels: pull-request-available
> Attachments: patch11-HDFS-17287.diff
>
>
> Why compilation is slow: The Hadoop project has many modules, and because 
> they cannot be compiled in parallel, the build is slow. For instance, the 
> first compilation of Hadoop might take several hours, and even with the 
> dependencies already available in the local Maven repository, a subsequent 
> compilation can still take close to 40 minutes.
> How to solve it: Use {{mvn dependency:tree}} and {{maven-to-plantuml}} to 
> investigate the dependencies between the modules of this multi-module Maven 
> project and find the ones that prevent parallel compilation.
>  * Download {{maven-to-plantuml}}:
>  
> {{wget https://github.com/phxql/maven-to-plantuml/releases/download/v1.0/maven-to-plantuml-1.0.jar}}
>  * Generate a dependency tree:
>  
> {{mvn dependency:tree > dep.txt}}
>  * Generate a UML diagram from the dependency tree:
>  
> {{java -jar maven-to-plantuml-1.0.jar --input dep.txt --output dep.puml}}
> For more information, visit: [maven-to-plantuml GitHub 
> repository|https://github.com/phxql/maven-to-plantuml/tree/master].
> *Hadoop Parallel Compilation Submission Logic*
>  # Reasons for Parallel Compilation Failure
>  ** In sequential compilation, modules are built one at a time in order, so 
> every module that a later module needs has already been built and no errors 
> occur.
>  ** In parallel compilation, modules are built concurrently, and the build 
> order is determined only by the declared inter-module dependencies. If 
> Module A declares a dependency on Module B, then Module B is built before 
> Module A; modules with no declared dependency between them may be built at 
> the same time.
> When Hadoop is compiled in parallel, for example when building 
> {{hadoop-yarn-project}}, the compile-time dependencies between modules are 
> correct. The issue arises during the dist packaging stage, because {{dist}} 
> packages the output of all the other compiled modules.
> *Behavior of {{hadoop-yarn-project}} in Serial Compilation:*
>  ** Serial compilation builds the modules listed in the pom one by one, in 
> sequence. After all of them are built, it builds {{hadoop-yarn-project}} 
> itself. During the {{prepare-package}} phase, the {{maven-assembly-plugin}} 
> runs and repackages the module outputs according to the descriptor 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}}.
> *Behavior of {{hadoop-yarn-project}} in Parallel Compilation:*
>  ** Parallel compilation builds modules in dependency order. Modules that do 
> not declare dependencies on each other through {{dependency}} elements are 
> built in parallel. The dependencies declared in the pom of 
> {{hadoop-yarn-project}} are built first, and then {{hadoop-yarn-project}} 
> is built and runs its {{maven-assembly-plugin}}.
>  ** However, not all of the modules whose files are packaged by 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} are 
> declared as a {{dependency}} of {{hadoop-yarn-project}}. Therefore, when 
> {{hadoop-yarn-project}} is built and {{maven-assembly-plugin}} runs, some of 
> the required modules may not have been built yet, which causes errors in 
> parallel compilation.
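
The failure mode described above can be sketched in Python under a simplified model of a parallel build, in which a module becomes eligible as soon as all of its declared dependencies are built. The wave-based scheduler, the function names, and the shortened module names are illustrative assumptions; this is not Maven's actual scheduler or Hadoop's real module list.

```python
def parallel_build_order(declared_deps):
    """Group modules into waves. A module joins a wave once every one of its
    *declared* dependencies was built in an earlier wave (simplified model)."""
    remaining = {m: set(d) for m, d in declared_deps.items()}
    built, waves = set(), []
    while remaining:
        ready = sorted(m for m, d in remaining.items() if d <= built)
        if not ready:
            raise ValueError("dependency cycle among: %s" % sorted(remaining))
        waves.append(ready)
        built.update(ready)
        for m in ready:
            del remaining[m]
    return waves


def missing_at_assembly(declared_deps, dist_module, assembly_inputs):
    """Return the assembly inputs that are not guaranteed to be built
    when dist_module (the module running the assembly) starts."""
    built = set()
    for wave in parallel_build_order(declared_deps):
        if dist_module in wave:
            return sorted(set(assembly_inputs) - built)
        built.update(wave)
    raise KeyError(dist_module)


# Hypothetical, shortened module names used only for illustration.
declared = {
    "yarn-api": [],
    "yarn-common": ["yarn-api"],
    "yarn-client": ["yarn-common"],
    "yarn-server": ["yarn-common"],
    # The packaging module declares only part of what its assembly needs.
    "yarn-project": ["yarn-api", "yarn-common"],
}
assembly_inputs = ["yarn-api", "yarn-common", "yarn-client", "yarn-server"]

# Modules that may still be building when the assembly runs.
broken = missing_at_assembly(declared, "yarn-project", assembly_inputs)

# Declaring every assembly input as a dependency closes the gap.
fixed_decl = dict(declared, **{"yarn-project": assembly_inputs})
fixed = missing_at_assembly(fixed_decl, "yarn-project", assembly_inputs)
```

In this model, `broken` lists the two modules that share a wave with the packaging module, while `fixed` is empty because the packaging module is pushed to a later wave.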
> *Solution:*
>  ** Collect all modules referenced in 
> {{hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml}} 
> and declare them as dependencies in the pom of {{hadoop-yarn-project}}, so 
> that they are built before the assembly runs.
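
One way to find the modules that still need to be declared could be sketched in Python as follows. The sketch assumes the assembly descriptor lists modules as `groupId:artifactId` entries under `moduleSet`/`includes`/`include`. The two inline XML samples are abbreviated stand-ins rather than the real Hadoop files, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix, e.g. '{ns}include' -> 'include'."""
    return tag.rsplit("}", 1)[-1]


def assembly_includes(descriptor_xml):
    """Collect 'groupId:artifactId' entries from moduleSet/includes/include."""
    found = set()
    for node in ET.fromstring(descriptor_xml).iter():
        if local(node.tag) != "moduleSet":
            continue
        for child in node:
            if local(child.tag) != "includes":
                continue
            for inc in child:
                if local(inc.tag) == "include" and inc.text:
                    found.add(inc.text.strip())
    return found


def pom_dependencies(pom_xml):
    """Collect 'groupId:artifactId' from the pom's top-level <dependencies>."""
    deps = set()
    for child in ET.fromstring(pom_xml):
        if local(child.tag) != "dependencies":
            continue
        for dep in child:
            fields = {local(e.tag): (e.text or "").strip() for e in dep}
            deps.add("%s:%s" % (fields.get("groupId", ""),
                                fields.get("artifactId", "")))
    return deps


def undeclared_modules(descriptor_xml, pom_xml):
    """Modules packaged by the assembly but not declared as dependencies."""
    return sorted(assembly_includes(descriptor_xml) - pom_dependencies(pom_xml))


# Abbreviated sample inputs (assumptions, not the real Hadoop files).
DESCRIPTOR = """<assembly>
  <moduleSets>
    <moduleSet>
      <includes>
        <include>org.apache.hadoop:hadoop-yarn-api</include>
        <include>org.apache.hadoop:hadoop-yarn-client</include>
      </includes>
    </moduleSet>
  </moduleSets>
</assembly>"""

POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-yarn-api</artifactId>
    </dependency>
  </dependencies>
</project>"""

to_declare = undeclared_modules(DESCRIPTOR, POM)
```

For these samples, `to_declare` contains only `org.apache.hadoop:hadoop-yarn-client`, the one module the assembly packages without a matching `dependency` entry.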


