slfan1989 commented on PR #7337: URL: https://github.com/apache/hadoop/pull/7337#issuecomment-2623908838
@steveloughran @ayushtkn @Hexiaoqiao @jojochuang @aajisaka @cnauroth @szetszwo Upgrading JUnit from 4 to 5 is not only a technical upgrade opportunity but also a good chance for us to improve the test-related code and fix potential issues. I’d like to discuss whether the improvement strategies I’ve been considering are reasonable. 1. **Import Optimization** There are currently two ways to reference JUnit methods in the project: - The first way is by importing static methods, allowing us to directly use assertTrue in unit tests. - The second way is by importing `org.junit.Assert`, which requires us to use the fully qualified name (e.g., Assert.assertTrue) in unit tests. I hope to unify all references to the first method (static imports) in this update, but I'm unsure if this is feasible. 2. **Import * Optimization** I’ve noticed that many unit tests use the `import *` approach. Do we need to expand these imports in the unit tests? 3. **Addressing Checkstyle Issues, Incomplete JavaDocs, and Typos** There are several Checkstyle issues in the code, such as incomplete JavaDocs and typos. Should we also address these issues in this update? Some logs use `+` for string concatenation instead of placeholders. Some methods could also be optimized using lambda expressions. Should we address these issues in this update as well? 4. **Improvements for Large Modules** For example, the `hadoop-common` module, I’ve noticed, has many downstream dependencies, affecting several other modules, as shown below: ``` hadoop-azure hadoop-aws hadoop-distcp hadoop-azure-datalake hadoop-hdfs hadoop-hdfs-rbf hadoop-cos hadoop-aliyun hadoop-mapreduce-client-core ``` We can’t make all the changes at once because there are too many involved modules, and some test classes have dependencies on each other. I’ve found that JUnit 4 and JUnit 5 can coexist, meaning that a module can have both JUnit 4 and JUnit 5 test styles, and the tests will still run successfully. For very large modules, we can modify them incrementally, based on the package structure. For example, for `hadoop-common`, we can submit multiple PRs to complete the module’s upgrade step by step. Make sure that each PR submitted doesn’t include changes to more than 50 classes, as this will make the review process easier. HADOOP-19415. [JDK17] Upgrade JUnit from 4 to 5 in hadoop-common Part1. ``` cli、conf、constants、crypto、fs ``` HADOOP-19415. [JDK17] Upgrade JUnit from 4 to 5 in hadoop-common Part2. ``` ha、http ``` If we encounter unit tests that depend on other modules, we can leave them unchanged for now. Once the non-dependent parts of the other modules are upgraded, we can submit a single PR to complete the upgrade of the dependencies. Let’s illustrate with an example: The `FCStatisticsBaseTest` class in `hadoop-common` has dependencies on the following three modules: `hadoop-aws`, `hadoop-common`, and `hadoop-aliyun`. <img width="825" alt="image" src="https://github.com/user-attachments/assets/c6ea830a-3f57-4b95-8a01-3fffbcb8712b" /> Due to these dependencies, we will not upgrade this abstract class for now. Instead, we will parallelly upgrade the parts of these three modules(`hadoop-aws`, `hadoop-common`, and `hadoop-aliyun`) that have no dependencies. Once the other parts are upgraded, we will come back to upgrade the unit tests with dependencies. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
