Context: At present, most Java tests in the Hop codebase are unit tests. They use mocks to simulate interactions with external components such as persistence layers or messaging systems. Because mocks only approximate the behaviour of the real systems, tests built on them can pass while the code remains incompatible with the actual target.

My current use case shows why I advocate adopting Testcontainers to improve compatibility and stability. I am adding a new database connection type for CrateDB. CrateDB is largely compatible with PostgreSQL (it implements the PostgreSQL wire protocol), but it is tailored to purposes such as time series and blob storage, and it omits some PostgreSQL features that are not essential to those goals.

During development, I replicated PostgreSqlDatabaseMetaTest for CrateDB. That test suite verifies that each method of the API inherited from BaseDatabaseMeta generates the expected SQL statement for the target DBMS. It became apparent that some of the unit tests in CrateSqlDatabaseMetaTest, derived from the PostgreSQL tests, are irrelevant because CrateDB does not support sequences.
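
To make the limitation described above concrete, here is a minimal, self-contained sketch. It does not use Hop's classes: `SequenceSqlGapDemo`, `SqlTarget`, `NoSequenceTarget` and the helper methods are hypothetical names invented for illustration, and `NoSequenceTarget` is only a stand-in that mimics a target without sequence support (it is not CrateDB and does not reproduce its error messages). The sketch assumes the PostgreSQL-style `SELECT nextval('...')` form for reading a sequence.

```java
import java.sql.SQLException;
import java.util.Locale;

/** Hypothetical illustration only; none of these types are Hop classes. */
final class SequenceSqlGapDemo {

  /** Minimal stand-in for "something that can execute SQL", e.g. a JDBC connection. */
  interface SqlTarget {
    void execute(String sql) throws SQLException;
  }

  /** Generates PostgreSQL-style SQL for fetching the next value of a sequence. */
  static String nextSequenceValueSql(String sequenceName) {
    return "SELECT nextval('" + sequenceName + "')";
  }

  /** Stand-in for a target without sequence support; it only mimics the rejection. */
  static final class NoSequenceTarget implements SqlTarget {
    @Override
    public void execute(String sql) throws SQLException {
      if (sql.toLowerCase(Locale.ROOT).contains("nextval(")) {
        throw new SQLException("sequences are not supported by this target");
      }
    }
  }

  /** Unit-test style check: compares the generated text only. */
  static boolean generatesExpectedSql(String sequenceName, String expectedSql) {
    return expectedSql.equals(nextSequenceValueSql(sequenceName));
  }

  /** Integration-test style check: asks the target to run the statement. */
  static boolean executesOn(SqlTarget target, String sql) {
    try {
      target.execute(sql);
      return true;
    } catch (SQLException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String sql = nextSequenceValueSql("orders_seq");
    boolean stringCheck = generatesExpectedSql("orders_seq", "SELECT nextval('orders_seq')");
    boolean accepted = executesOn(new NoSequenceTarget(), sql);
    System.out.println("string check passes: " + stringCheck);
    System.out.println("target accepts it: " + accepted);
  }
}
```

The string comparison succeeds even though the stand-in target rejects the statement, which is the gap a mock-based test cannot reveal. In a real test, `SqlTarget` would be backed by a JDBC connection to a running database instead of an in-memory stand-in.
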
Proposal: Such tests incur unnecessary costs in time and resources. To address this, I propose introducing test classes that execute the aforementioned API against the actual target systems, rather than only validating the generated code. These tests would use Testcontainers to create disposable instances of those targets. This pull request, https://github.com/apache/hop/pull/3791 (see CrateDBDatabaseMetaIT), provides an example that uses the CrateDB Testcontainer and demonstrates its usefulness for this purpose.

Benefits:
- More reliable integrations
- Protection against regressions
- Disposable, short-lived instances of the components under test
- Comprehensive integration testing across diverse scenarios

Drawback:
- Potentially higher resource and time consumption on CI, although this can be optimized in most cases

Conclusion: This approach does not seek to replace the existing integration tests, which validate specific scenarios within a comprehensive suite spanning Hop and third-party components (DBMSs, message brokers, etc.). Instead, it complements them by validating early the communication between Hop and each individual component, one at a time, thereby improving system reliability and stability.
