[DISCUSS] Introduction of TestContainers for Integration Tests against actual DBMSs

Sergio De Lorenzis Sun, 14 Apr 2024 15:39:48 -0700

Context:

At present, the bulk of Java tests in the Hop codebase are unit tests.
These tests simulate interactions with external components like persistence
layers or messaging systems, using mocks. However, this reliance on unit
tests and mocks during development may not accurately reflect real-world
scenarios, potentially leading to compatibility issues with the actual
target systems.
Let me elaborate on my current use case and why I advocate for the adoption
of Testcontainers to enhance compatibility and stability. In my ongoing
task, I am implementing a new database connection to CrateDB. CrateDB is
compatible with PostgreSQL at the wire-protocol and SQL-dialect level, but
it is a separate engine tailored for specific purposes like time series and
blob storage. Consequently, some PostgreSQL features deemed non-essential
for CrateDB's goals are not available. During development, I
replicated the functionality of PostgreSqlDatabaseMetaTest for CrateDB.
This test suite verifies that each method of the API inherited from
BaseDatabaseMeta generates the expected SQL statement for the target
DBMS, without executing it. However, it became apparent that certain unit tests in
CrateSqlDatabaseMetaTest, derived from PostgreSQL tests, are irrelevant due
to CrateDB's lack of support for sequences.
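
To make the problem concrete, here is a minimal sketch of why a copied
string-comparison test keeps passing. The class and method names below are
hypothetical stand-ins, not Hop's actual API, and the SQL text is only
illustrative.

```java
/** Hypothetical stand-in for a PostgreSQL-style metadata class (not Hop's API). */
class PostgresLikeDialect {
  /** Builds the SQL text that would fetch the next value of a sequence. */
  String nextSequenceValueSql(String sequenceName) {
    return "SELECT nextval('" + sequenceName + "')";
  }
}

/** Hypothetical CrateDB-style class that inherits the generator unchanged. */
class CrateLikeDialect extends PostgresLikeDialect {}

final class GeneratedSqlCheck {
  /** The kind of check a mock-based unit test performs: it compares strings only. */
  static boolean generatesExpectedSql(PostgresLikeDialect dialect) {
    String expected = "SELECT nextval('orders_seq')";
    return expected.equals(dialect.nextSequenceValueSql("orders_seq"));
  }

  public static void main(String[] args) {
    // The check passes for both classes because no statement is ever sent
    // to a database, so an unsupported feature goes unnoticed.
    System.out.println(generatesExpectedSql(new PostgresLikeDialect()));
    System.out.println(generatesExpectedSql(new CrateLikeDialect()));
  }
}
```

Both checks succeed, yet the second one says nothing about whether the
target DBMS can actually run the statement.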


Proposal:
To address the inefficiency of such tests, which ultimately incur
unnecessary costs in terms of time and resources, I propose introducing
test classes that evaluate the execution of the aforementioned API against
their actual targets, beyond simply validating the expected code
generation. These tests would leverage Testcontainers to create disposable
instances of the actual targets.
For instance, pull request https://github.com/apache/hop/pull/3791 (see
CrateDBDatabaseMetaIT) provides an example of using the CrateDB
Testcontainer, demonstrating its utility for achieving my objectives.
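
As a rough illustration of the idea (not the code from the pull request), a
test helper could start a disposable CrateDB container and check whether a
statement really executes. This sketch assumes Docker is available, that the
Testcontainers core library and the PostgreSQL JDBC driver are on the test
classpath, and that the "crate" image, its ports 4200/5432 and the default
"crate" user apply; the class and method names are made up for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.wait.strategy.Wait;
import org.testcontainers.utility.DockerImageName;

/** Hypothetical helper: runs one statement against a throwaway CrateDB instance. */
final class CrateDbContainerSketch {

  /** Returns true if the statement executes, false if the database rejects it. */
  static boolean executes(String sql) {
    // Assumption: the official "crate" image; a real test should pin a version tag.
    try (GenericContainer<?> crate =
        new GenericContainer<>(DockerImageName.parse("crate:latest"))
            .withCommand("crate", "-Cdiscovery.type=single-node")
            .withExposedPorts(4200, 5432)
            .waitingFor(Wait.forHttp("/").forPort(4200))) {
      crate.start();

      // CrateDB accepts PostgreSQL wire-protocol clients on port 5432.
      String url =
          "jdbc:postgresql://" + crate.getHost() + ":" + crate.getMappedPort(5432) + "/crate";

      try (Connection connection = DriverManager.getConnection(url, "crate", "");
          Statement statement = connection.createStatement()) {
        try {
          statement.execute(sql);
          return true;
        } catch (SQLException rejected) {
          // The database itself refused the statement.
          return false;
        }
      } catch (SQLException connectionFailure) {
        // Connection problems are environment errors, not test results.
        throw new IllegalStateException("Could not connect to the container", connectionFailure);
      }
    } // Closing the container stops and removes it.
  }
}
```

Under these assumptions, executes("SELECT 1") should return true, while a
statement that relies on sequences would be expected to return false on
CrateDB, which is exactly the signal the string-comparison tests cannot
give.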

Benefits:
- Enhanced reliability of developed integrations
- Safeguarding against regressions
- Containers are disposable: each test run starts a short-lived instance of
the component and discards it afterwards
- Facilitates comprehensive integration testing with diverse scenarios

Drawback:
- Potential increase in resources and time consumption on CI, albeit
optimizable in most cases

Conclusion:
This proposed approach does not seek to replace existing integration tests,
which validate specific scenarios within a comprehensive suite encompassing
Hop and third-party components (DBMSs, Message brokers, etc.). Instead, it
complements these tests by offering early validation of communication
between Hop and individual components, one by one, thereby fostering
improved system reliability and stability.
