[ 
https://issues.apache.org/jira/browse/FLINK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846535#comment-17846535
 ] 

Lorenzo Affetti commented on FLINK-20392:
-----------------------------------------

[~m.orazow] and I found ourselves working on two issues under this umbrella:
 - https://issues.apache.org/jira/browse/FLINK-20398  (batch sql)
 - https://issues.apache.org/jira/browse/FLINK-20400  (streaming sql)

We contributed these two PRs:
 - [https://github.com/apache/flink/pull/24471]  (batch sql)
 - [https://github.com/apache/flink/pull/24776]  (streaming sql)

The "batch" PR makes use of the MiniClusterExtension while the "streaming" PR 
makes use of TestContainers and, specifically, FlinkContainers.
We think that the underlying purpose here (apart from porting tests from bash 
to Java) is to:
 - reproduce the behavior of the bash versions
 - adopt a homogeneous solution for (hopefully) every test port.

[~m.orazow] and I had an offline discussion mainly focused on the nature of 
these end-to-end tests: to the best of our understanding, they are meant 
to test the end-to-end functionality of a "close-to-reality" Flink 
cluster. Indeed, their current behavior (in bash) consists of starting a local 
Flink cluster and issuing `flink run` commands against it. Moreover, some tests 
(e.g. flink-end-to-end-tests/test-scripts/test_queryable_state_restart_tm.sh) 
also test failure handling by killing a random TaskManager. (As a side note, 
this umbrella does not have an issue for every existing e2e test; we volunteer 
to sync up its state in the near future.)

After that discussion, Muhammet and I believe that _the TestContainers 
approach is the one that best fits the idea behind this umbrella issue._

We list the pros and cons of both approaches below.

TestContainers:
 - PRO: the Flink cluster is as close as possible to reality (full TCP/IP stack 
involved, separate processes)
 - PRO: uses the Flink binaries in the Flink dist as the bash version does
 - PRO: handles to control the cluster are available -> a random TM can be 
killed to check the failure state
 - CON: running tests requires Docker images to be built (time required)

MiniClusterExtension:
 - PRO: the code is very clean: no need to deploy a separate JAR for the Flink 
app, as the app is coded in the test and the Flink cluster is registered in the 
current test environment
 - PRO: no further build steps required apart from the Java build (fast run)
 - CON: no handle for killing a TM (could be implemented by exposing 
MiniClusterResource from the extension)
 - CON: less "close-to-reality", as each service runs in a separate thread (not 
a separate process); however, an in-depth analysis of the implementation of 
MiniClusterResource should be conducted to understand how close this is to 
a real Flink cluster.
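
To make the "handle for killing a TM" point concrete, here is a minimal 
sketch of the kind of control handle a ported test would need, whichever 
approach backs it. Everything in it is hypothetical: ClusterHandle, 
FakeCluster and FailureInjection are illustrative names and are not part of 
FlinkContainers, MiniClusterExtension or MiniClusterResource. The in-memory 
FakeCluster only stands in for a real cluster so that the sketch runs without 
Docker or Flink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical control handle; not an existing Flink or Testcontainers API. */
interface ClusterHandle {
    /** Ids of the TaskManagers that are currently running. */
    List<String> runningTaskManagers();

    /** Stops the TaskManager with the given id. */
    void killTaskManager(String id);
}

/** In-memory stand-in for a cluster, used only to make the sketch runnable. */
final class FakeCluster implements ClusterHandle {
    private final List<String> taskManagers = new ArrayList<>();

    FakeCluster(int numTaskManagers) {
        for (int i = 0; i < numTaskManagers; i++) {
            taskManagers.add("taskmanager-" + i);
        }
    }

    @Override
    public List<String> runningTaskManagers() {
        // Defensive copy so callers cannot modify the internal state.
        return new ArrayList<>(taskManagers);
    }

    @Override
    public void killTaskManager(String id) {
        if (!taskManagers.remove(id)) {
            throw new IllegalArgumentException("Unknown or stopped TaskManager: " + id);
        }
    }
}

final class FailureInjection {
    private FailureInjection() {}

    /** Kills one randomly chosen TaskManager and returns its id. */
    static String killRandomTaskManager(ClusterHandle cluster, Random random) {
        List<String> running = cluster.runningTaskManagers();
        if (running.isEmpty()) {
            throw new IllegalStateException("No TaskManager left to kill");
        }
        String victim = running.get(random.nextInt(running.size()));
        cluster.killTaskManager(victim);
        return victim;
    }

    public static void main(String[] args) {
        ClusterHandle cluster = new FakeCluster(3);
        String victim = killRandomTaskManager(cluster, new Random());
        System.out.println("Killed " + victim
                + ", still running: " + cluster.runningTaskManagers());
    }
}
```

With TestContainers, such a handle could presumably be backed by stopping a 
container; with MiniClusterExtension it would first require exposing the 
underlying cluster, as noted in the CON above.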

*Wrapping up, we do think that if we want to mimic the current behavior of the 
tests as closely as possible, TestContainers is the way to go, at the cost of 
the Docker build time of the containers in CI.*

We would like to gather agreement from at least 2 people on one of the 2 
solutions, without strong opposition from anyone else, so that we can edit our 
PRs to match and continue with this massive port.

 

As they are active in the thread and experts on the topic, I would ask 
[~mapohl] and [~jark] to help us with this.

 

Thank you all for the support and helpful reviews so far.

> Migrating bash e2e tests to Java/Docker
> ---------------------------------------
>
>                 Key: FLINK-20392
>                 URL: https://issues.apache.org/jira/browse/FLINK-20392
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Test Infrastructure, Tests
>            Reporter: Matthias Pohl
>            Priority: Minor
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> starter
>
> This Jira issue serves as an umbrella ticket for single e2e test migration 
> tasks. This should enable us to migrate all bash-based e2e tests step-by-step.
> The goal is to utilize the e2e test framework (see 
> [flink-end-to-end-tests-common|https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common]).
>  Ideally, the test should use Docker containers as much as possible to 
> disconnect the execution from the environment. A good source to achieve that 
> is [testcontainers.org|https://www.testcontainers.org/].
> The related ML discussion is [Stop adding new bash-based e2e tests to 
> Flink|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stop-adding-new-bash-based-e2e-tests-to-Flink-td46607.html].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
