[ https://issues.apache.org/jira/browse/FLINK-25948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zichen Liu updated FLINK-25948:
-------------------------------
    Description: 
Call .close() on BOTH the KDSAsyncClient and the underlying HTTPClient

for BOTH KDS & KDF

Maintain a reference to the HTTPClient at all times in order to close it

Need to find an appropriate time in the Sink/SinkWriter to close them

  was:
Intermittent failures were introduced by the merge of PR#18314 ([FLINK-24228[connectors/firehose] - Unified Async Sink for Kinesis Firehose|https://github.com/apache/flink/pull/18314]):
 # Failures are intermittent and affect c. 1 in 7 builds on {{flink-ci.flink}} and {{flink-ci.flink-master-mirror}}.
 # The issue looks identical to the KinesaliteContainer startup issue (Appendix 1).
 # I have managed to reproduce the issue locally: if I start some parallel containers, keep them running, and then run {{KinesisFirehoseSinkITCase}}, c. 1 in 6 runs gives the error.
 # The errors look slightly different on {{flink-ci.flink-master-mirror}} than on {{flink-ci.flink}}, which looks the same as local. I hope this is only a difference in logging/killing environment variables, and that there aren’t 2 distinct issues.

Appendix 1:
{code:java}
org.testcontainers.containers.ContainerLaunchException: Container startup failed

at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:336)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:317)
at org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1066)
at
... 11 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:525)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:331)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
... 12 more
Caused by: org.rnorth.ducttape.TimeoutException: Timeout waiting for result with exception
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:54)
at
{code}


> KDS / KDF Sink should call .close() to clean up resources
> ---------------------------------------------------------
>
>                 Key: FLINK-25948
>                 URL: https://issues.apache.org/jira/browse/FLINK-25948
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Common
>    Affects Versions: 1.15.0
>            Reporter: Zichen Liu
>            Assignee: Ahmed Hamdy
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.15.0
>
>
> Call .close() on BOTH the KDSAsyncClient and the underlying HTTPClient
> for BOTH KDS & KDF
> Maintain a reference to the HTTPClient at all times in order to close it
> Need to find an appropriate time in the Sink/SinkWriter to close them
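
The cleanup described above could look roughly like the sketch below. It is an illustration only, not the actual connector code: FakeHttpClient, FakeAsyncClient and SketchSinkWriter are hypothetical stand-ins for the SDK HTTP client, the KDS/KDF async client and the sink writer. It assumes the async client does not close an HTTP client that was handed to it, which is why the writer keeps its own reference and closes both.

```java
// Hypothetical stand-ins only; the real SDK and connector classes are not shown in this ticket.
class ClientCleanupSketch {

    /** Stand-in for the underlying HTTP client. */
    static class FakeHttpClient implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Stand-in for the KDS/KDF async client built on top of an HTTP client. */
    static class FakeAsyncClient implements AutoCloseable {
        final FakeHttpClient httpClient;
        boolean closed;

        FakeAsyncClient(FakeHttpClient httpClient) {
            this.httpClient = httpClient;
        }

        @Override
        public void close() {
            // Assumption: closing the async client leaves the supplied HTTP client open.
            closed = true;
        }
    }

    /** Stand-in for the sink writer that owns both clients. */
    static class SketchSinkWriter implements AutoCloseable {
        // Keep a reference to the HTTP client for the writer's whole lifetime.
        private final FakeHttpClient httpClient = new FakeHttpClient();
        private final FakeAsyncClient asyncClient = new FakeAsyncClient(httpClient);

        @Override
        public void close() {
            // Close the async client first; the finally block makes sure the
            // HTTP client is closed even if the first close() throws.
            try {
                asyncClient.close();
            } finally {
                httpClient.close();
            }
        }

        boolean allClosed() {
            return asyncClient.closed && httpClient.closed;
        }
    }

    public static void main(String[] args) {
        SketchSinkWriter writer = new SketchSinkWriter();
        writer.close();
        System.out.println("all clients closed: " + writer.allClosed());
    }
}
```

Here close() on the writer is taken as the "appropriate time"; whether that is the right hook in the real Sink/SinkWriter is the open question in this ticket.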



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
