[ https://issues.apache.org/jira/browse/FLINK-25948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zichen Liu updated FLINK-25948:
-------------------------------
    Description:
Add .close() to BOTH the KDSAsyncClient and the underlying HTTPClient
for BOTH KDS & KDF
maintain a reference to the HTTPClient at all times in order to close it
need to find an appropriate time in the Sink/SinkWriter to close them

  was:
Intermittent failures introduced as part of merge (PR#18314: [FLINK-24228[connectors/firehose] - Unified Async Sink for Kinesis Firehose|https://github.com/apache/flink/pull/18314]):
 # Failures are intermittent and affect c. 1 in 7 builds on {{flink-ci.flink}} and {{flink-ci.flink-master-mirror}}.
 # The issue looks identical to the KinesaliteContainer startup issue (Appendix 1).
 # I have managed to reproduce the issue locally: if I start some parallel containers, keep them running, and then run {{KinesisFirehoseSinkITCase}}, c. 1 in 6 runs gives the error.
 # The errors look slightly different on {{flink-ci.flink-master-mirror}} than on {{flink-ci.flink}}, where they look the same as they do locally. I only hope it is a difference in logging/killing environment variables (and that there aren’t 2 distinct issues).

Appendix 1:
{code:java}
org.testcontainers.containers.ContainerLaunchException: Container startup failed
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:336)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:317)
at org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1066)
at ... 11 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:525)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:331)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
... 12 more
Caused by: org.rnorth.ducttape.TimeoutException: Timeout waiting for result with exception
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:54)
at
{code}


> KDS / KDF Sink should call .close() to clean up resources
> ---------------------------------------------------------
>
>                 Key: FLINK-25948
>                 URL: https://issues.apache.org/jira/browse/FLINK-25948
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Common
>    Affects Versions: 1.15.0
>            Reporter: Zichen Liu
>            Assignee: Ahmed Hamdy
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.15.0
>
>
> Add .close() to BOTH the KDSAsyncClient and the underlying HTTPClient
> for BOTH KDS & KDF
> maintain a reference to the HTTPClient at all times in order to close it
> need to find an appropriate time in the Sink/SinkWriter to close them

--
This message was sent by Atlassian Jira
(v8.20.1#820001)