hajimeni created FLINK-37918:
--------------------------------
Summary: Restore the ability to set an interval for GetRecords
calls to Kinesis shards.
Key: FLINK-37918
URL: https://issues.apache.org/jira/browse/FLINK-37918
Project: Flink
Issue Type: Improvement
Components: Connectors / Kinesis
Affects Versions: aws-connector-5.0.0
Reporter: hajimeni
h3. Background
The previous Flink Kinesis connector (flink-connector-kinesis) provided a
configuration parameter, SHARD_GETRECORDS_INTERVAL_MILLIS, which allowed users
to set a specific interval between GetRecords calls for each shard. This
functionality is absent in the new AWS Kinesis Streams connector
(flink-connector-aws-kinesis-streams).
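
For reference, a minimal sketch of how the interval was passed to the previous connector through its consumer properties. The key string is an assumption: it is what ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS is understood to resolve to, and should be checked against the connector source. The class and method names are illustrative only.

```java
import java.util.Properties;

// Illustrative helper, not part of any Flink connector.
final class LegacyGetRecordsIntervalConfig {

    // Assumed value of ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS
    // in flink-connector-kinesis; verify against the connector source.
    static final String SHARD_GETRECORDS_INTERVAL_MILLIS =
            "flink.shard.getrecords.intervalmillis";

    // Builds consumer properties that carry the per-shard GetRecords interval.
    static Properties withInterval(long intervalMillis) {
        if (intervalMillis < 0) {
            throw new IllegalArgumentException("intervalMillis must be >= 0");
        }
        Properties props = new Properties();
        props.setProperty(SHARD_GETRECORDS_INTERVAL_MILLIS, Long.toString(intervalMillis));
        return props;
    }
}
```

In the previous connector, a Properties object like this was handed to the consumer at construction time.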
h3. Problem
The lack of a configurable interval for GetRecords calls in the new connector
(KinesisStreamsSource) poses a significant challenge in scenarios with multiple
consumers reading from the same Kinesis stream. Without the ability to increase
the interval between GetRecords calls, consumers can easily exceed the AWS
Kinesis limit of five GetRecords calls per second per shard. This leads to
several issues:
* Wasted API Calls and Increased Costs: Continuous, rapid calls that are
likely to be throttled are inefficient and can lead to increased costs.
* Operational Instability: In a multi-tenant or multi-application environment,
the absence of this control makes it difficult to ensure stable and predictable
data consumption across all consumers.

AWS documentation recommends adjusting the frequency of GetRecords calls to
avoid these issues, especially when multiple consumers are involved. See the
AWS Kinesis Developer Guide:
https://docs.aws.amazon.com/streams/latest/dev/kinesis-low-latency.html
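
To make the limit concrete, here is a rough calculation of the smallest polling interval that keeps a shard within the five-calls-per-second limit. It assumes every consumer polls the shard at the same fixed interval and that the limit is shared evenly. The class and method names are hypothetical.

```java
// Illustrative arithmetic only; not part of any connector.
final class GetRecordsBudget {

    // Per-shard GetRecords limit cited in this issue.
    static final int CALLS_PER_SECOND_PER_SHARD = 5;

    // Smallest interval (ms) each consumer must wait between calls so that
    // `consumers` pollers together stay within the per-shard limit.
    // Calls per second = consumers * 1000 / interval, so
    // interval >= consumers * 1000 / limit, rounded up.
    static long minIntervalMillis(int consumers) {
        if (consumers < 1) {
            throw new IllegalArgumentException("consumers must be >= 1");
        }
        long numerator = consumers * 1000L;
        return (numerator + CALLS_PER_SECOND_PER_SHARD - 1) / CALLS_PER_SECOND_PER_SHARD;
    }
}
```

Under these assumptions a single consumer needs at least 200 ms between calls, and three consumers on the same stream need at least 600 ms each.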
h3. Feature Request
We request the re-introduction of a configuration option, similar to
SHARD_GETRECORDS_INTERVAL_MILLIS, in the flink-connector-aws-kinesis-streams
connector. This would allow users to effectively manage the rate of GetRecords
calls per shard, thereby preventing API throttling and ensuring the stability
and efficiency of Flink applications that consume data from Kinesis Data
Streams.
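
As a sketch of the requested behaviour, the per-shard pacing could look roughly like the following. This is not a proposal for the actual implementation inside KinesisStreamsSource; ShardPollPacer and its methods are hypothetical names, and timestamps are passed in explicitly so the logic stays independent of any clock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks the last GetRecords call per shard and reports
// how long to wait before the next call on that shard.
final class ShardPollPacer {

    private final long intervalMillis;
    private final Map<String, Long> lastCallMillis = new HashMap<>();

    ShardPollPacer(long intervalMillis) {
        if (intervalMillis < 0) {
            throw new IllegalArgumentException("intervalMillis must be >= 0");
        }
        this.intervalMillis = intervalMillis;
    }

    // Remaining wait (ms) before the next GetRecords call on this shard;
    // 0 if the shard has not been polled yet or the interval has elapsed.
    long remainingDelayMillis(String shardId, long nowMillis) {
        Long last = lastCallMillis.get(shardId);
        if (last == null) {
            return 0L;
        }
        return Math.max(0L, last + intervalMillis - nowMillis);
    }

    // Records that a GetRecords call was issued for this shard at nowMillis.
    void recordCall(String shardId, long nowMillis) {
        lastCallMillis.put(shardId, nowMillis);
    }
}
```

A reader loop would check remainingDelayMillis before each GetRecords call, defer the call if the result is non-zero, and call recordCall once the request is sent. With the interval set to 0 the pacer never delays, so a default of 0 would preserve the connector's current behaviour.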
--
This message was sent by Atlassian Jira
(v8.20.10#820010)