[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372558#comment-15372558 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 No problem, thank you! > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372515#comment-15372515 ] ASF GitHub Bot commented on FLINK-4018: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2071 > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372513#comment-15372513 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2071 Sry, I forgot to merge it. Will do now. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372134#comment-15372134 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 @rmetzger Will you have time to help merge this PR too? I think the changes are good to go. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367938#comment-15367938 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2071 No problem :) +1 to merge once travis is green > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367604#comment-15367604 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 @rmetzger Thanks for the quick review Robert. Comments are addressed, sorry for the sloppy execution on not validating input, should have remembered that :) > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367602#comment-15367602 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r70065967 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java --- @@ -155,6 +163,15 @@ public void run() { // we can close this consumer thread once we've reached the end of the subscribed shard break; } else { + if (fetchIntervalMillis != 0) { --- End diff -- `KinesisConfigUtil.validateConfiguration()` will check for a negative value, at the client before the tasks are submitted. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367588#comment-15367588 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r70064126 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java --- @@ -155,6 +163,15 @@ public void run() { // we can close this consumer thread once we've reached the end of the subscribed shard break; } else { + if (fetchIntervalMillis != 0) { + if (LOG.isDebugEnabled()) { + LOG.debug( + "Consumer {} of subtask {} is sleeping for {} milliseconds before fetching the next batch of records ...", --- End diff -- Makes sense, I'll remove it! > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367554#comment-15367554 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r70059792 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java --- @@ -155,6 +163,15 @@ public void run() { // we can close this consumer thread once we've reached the end of the subscribed shard break; } else { + if (fetchIntervalMillis != 0) { --- End diff -- Yup. Adding the validation to `KinesisConfigUtil.validateConfiguration()`, like the other values. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367550#comment-15367550 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r70059642 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java --- @@ -155,6 +163,15 @@ public void run() { // we can close this consumer thread once we've reached the end of the subscribed shard break; } else { + if (fetchIntervalMillis != 0) { + if (LOG.isDebugEnabled()) { + LOG.debug( + "Consumer {} of subtask {} is sleeping for {} milliseconds before fetching the next batch of records ...", --- End diff -- I wonder if this log statement is really necessary. It can lead to quite a lot of log entries just for sleeping. (+ there is this other DEBUG log entry on each fetch) > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367547#comment-15367547 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r70059586 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java --- @@ -155,6 +163,15 @@ public void run() { // we can close this consumer thread once we've reached the end of the subscribed shard break; } else { + if (fetchIntervalMillis != 0) { --- End diff -- There is no input type validation happening, right? So if a user sets a negative sleeping interval, it will probably fail. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367067#comment-15367067 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 @rmetzger Rebased + addressed your comment. Please review, thanks :) > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366118#comment-15366118 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2071 Great, thank you. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366114#comment-15366114 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 I'll try to update this PR tomorrow. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348502#comment-15348502 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2071 Yes, I would suggest to do this PR after the big one. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348482#comment-15348482 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2071 Thank you for reviewing this @rmetzger. Should I address your comment after the rework in https://github.com/apache/flink/pull/2131 is merged, and rebase this PR on that? > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348477#comment-15348477 ] ASF GitHub Bot commented on FLINK-4018: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r68420387 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumerThread.java --- @@ -105,6 +117,8 @@ public void run() { break; } else { + Thread.sleep(idleMillisBetweenFetches); --- End diff -- Agree. I'll change the default value of this to 0 in `KinesisConfigConstants`, and check value here before calling sleep. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348168#comment-15348168 ] ASF GitHub Bot commented on FLINK-4018: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/2071#discussion_r68386857 --- Diff: flink-streaming-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumerThread.java --- @@ -105,6 +117,8 @@ public void run() { break; } else { + Thread.sleep(idleMillisBetweenFetches); --- End diff -- I'm not sure if its a good idea to introduce this waiting time by default. Imagine we are consuming slowly because downstream operations are expensive. Then, we would introduce an artificial slowdown (in latency and throughput). I would suggest to: - Set the value by default to 0 - Only call sleep if the idle time is > 0. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Kinesis Connector, Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > Fix For: 1.1.0 > > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315316#comment-15315316 ] ASF GitHub Bot commented on FLINK-4018: --- GitHub user tzulitai opened a pull request: https://github.com/apache/flink/pull/2071 [FLINK-4018](streaming-connectors) Configurable idle time between getRecords requests to Kinesis shards Along with this new configuration and the already existing `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more control on the desired throughput behaviour for the Kinesis consumer. The default value for this new configuration is 500 milliseconds idle time. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tzulitai/flink FLINK-4018 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2071.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2071 commit 74fcb62f62007c0396d2ae870298f35ae5ce Author: Gordon TaiDate: 2016-06-04T04:43:30Z [FLINK-4018] Add configuration for idle time between get requests to Kinesis shards > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4018) Configurable idle time between getRecords requests to Kinesis shards
[ https://issues.apache.org/jira/browse/FLINK-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315290#comment-15315290 ] Tzu-Li (Gordon) Tai commented on FLINK-4018: We have a user who has tried out the connector and preparing for production next week with Flink & Kinesis. The only thing worrying is this issue. I'll PR within the next 24 hours for review. > Configurable idle time between getRecords requests to Kinesis shards > > > Key: FLINK-4018 > URL: https://issues.apache.org/jira/browse/FLINK-4018 > Project: Flink > Issue Type: Sub-task > Components: Streaming Connectors >Affects Versions: 1.1.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > > Currently, the Kinesis consumer is calling getRecords() right after finishing > previous calls. This results in easily reaching Amazon's limitation of 5 GET > requests per shard per second. Although the code already has backoff & retry > mechanism, this will affect other applications consuming from the same > Kinesis stream. > Along with this new configuration and the already existing > `KinesisConfigConstants.CONFIG_SHARD_RECORDS_PER_GET`, users will have more > control on the desired throughput behaviour for the Kinesis consumer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)