[ https://issues.apache.org/jira/browse/FLINK-35815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hong Liang Teoh resolved FLINK-35815. ------------------------------------- Resolution: Fixed > KinesisProxySyncV2 doesn't always retry throttling exceptions. > --------------------------------------------------------------- > > Key: FLINK-35815 > URL: https://issues.apache.org/jira/browse/FLINK-35815 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis > Affects Versions: aws-connector-4.2.0, aws-connector-4.3.0, 1.19.1 > Reporter: Krzysztof Dziolak > Assignee: Aleksandr Pilipenko > Priority: Major > Labels: pull-request-available > Fix For: aws-connector-4.4.0 > > > *Problem:* > We have observed missing retrys on throttling for DescribeStreamSummary calls > from Kinesis. > {code:java} > org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute > application. > ... > Caused by: java.util.concurrent.CompletionException: > org.apache.flink.util.FlinkRuntimeException: Could not execute application. > ... > Caused by: org.apache.flink.util.FlinkRuntimeException: Could not execute > application. > ... > Caused by: org.apache.flink.client.program.ProgramInvocationException: The > main method caused an error: Error registering stream: <edited> > ... > Caused by: > org.apache.flink.streaming.connectors.kinesis.util.StreamConsumerRegistrarUtil$FlinkKinesisStreamConsumerRegistrarException: > Error registering stream: <edited> > ... > Caused by: > org.apache.flink.kinesis.shaded.software.amazon.awssdk.services.kinesis.model.LimitExceededException: > Rate exceeded for stream <edited>. (Service: Kinesis, Status Code: 400, > Request ID: efe4c2a9-3c3b-9c1c-b0ed-d9b05db93be2, Extended Request ID: > pSG6kwQXgPWD2S7YoPT4RKf+g8QbRBaxc0grhNz6juEoti/uGUQTzyqsfmFCLSHoM+u1ydHvqxzsv/0ICUid6aTAQdndy2EO) > ... > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxySyncV2.lambda$describeStreamSummary$0(KinesisProxySyncV2.java:91) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxySyncV2.invokeWithRetryAndBackoff(KinesisProxySyncV2.java:175) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxySyncV2.describeStreamSummary(KinesisProxySyncV2.java:90) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.fanout.StreamConsumerRegistrar.registerStreamConsumer(StreamConsumerRegistrar.java:92) > at > org.apache.flink.streaming.connectors.kinesis.util.StreamConsumerRegistrarUtil.registerStreamConsumers(StreamConsumerRegistrarUtil.java:121) > ... {code} > The same problem occurs both with LAZY and EAGER registration strategies. > > *Why does it get stuck?* > *[https://github.com/apache/flink-connector-aws/blob/c716ca439b2c8e6d4b5905a03c867c418e031688/flink-connector-aws/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/AwsV2Util.java#L77]* > The `isRecoverableException` check validates the cause of the exception only, > but it doesn't inspect the actual exception being evaluated for retriability. > In this particular case, LimitExceededException is thrown without wrappers > and it appears that this case is not handled correctly. -- This message was sent by Atlassian Jira (v8.20.10#820010)