[ 
https://issues.apache.org/jira/browse/FLINK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715367#comment-15715367
 ] 

ASF GitHub Bot commented on FLINK-4574:
---------------------------------------

GitHub user tony810430 opened a pull request:

    https://github.com/apache/flink/pull/2925

    [FLINK-4574] [kinesis] Strengthen fetch interval implementation in Kinesis consumer

    I used a Timer to implement it.
    
    If "flink.shard.getrecords.intervalmillis" is left at its default value of
    0, the timer schedules ShardConsumerFetcher once and it runs forever.
    If "flink.shard.getrecords.intervalmillis" is greater than 0, the timer
    schedules ShardConsumerFetcher at a fixed rate using
    timer.scheduleAtFixedRate, so that consecutive calls start a fixed interval
    apart.
    However, if a getRecords call takes too long to finish on time,
    ShardConsumerFetcher logs a warning and drops the next, delayed task.
    
    Ideally:
    |----p1----|----p2----|----p3----|
    |=====>    |====>     |====>     |
       task1      task2      task3
    
    task2 is delayed by task1, so task2 is dropped:
    |----p1----|----p2----|----p3----|
    |=============>       |====>     |
       task1                 task3
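
The drop-if-delayed behaviour described above can be sketched with the standard `java.util.Timer` API. This is only an illustration, not the code from the pull request: the class `FixedRateFetchSketch`, the nested `FetchTask`, and the `maxTardinessMillis` threshold are hypothetical names, and the rule "drop a run that starts later than the threshold" is an assumption about how a delayed task might be detected. It relies on `TimerTask.scheduledExecutionTime()` to compare the planned start time with the actual one.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch; not the ShardConsumerFetcher from the pull request.
final class FixedRateFetchSketch {

    /** True when a run starts more than maxTardinessMillis after its scheduled time. */
    static boolean isTooLate(long nowMillis, long scheduledMillis, long maxTardinessMillis) {
        return nowMillis - scheduledMillis > maxTardinessMillis;
    }

    /** Timer task that skips its own run when an earlier run made it late. */
    static final class FetchTask extends TimerTask {
        private final long maxTardinessMillis;
        private final Runnable fetch;

        FetchTask(long maxTardinessMillis, Runnable fetch) {
            this.maxTardinessMillis = maxTardinessMillis;
            this.fetch = fetch;
        }

        @Override
        public void run() {
            // scheduledExecutionTime() is the time this run was supposed to start.
            if (isTooLate(System.currentTimeMillis(), scheduledExecutionTime(), maxTardinessMillis)) {
                System.err.println("Previous fetch overran its period; dropping this run");
                return;
            }
            fetch.run();
        }
    }

    /** Starts fixed-rate fetching; intervalMillis must be greater than 0. */
    static Timer start(long intervalMillis, long maxTardinessMillis, Runnable fetch) {
        Timer timer = new Timer("shard-fetcher", true); // daemon timer thread
        timer.scheduleAtFixedRate(new FetchTask(maxTardinessMillis, fetch), 0L, intervalMillis);
        return timer;
    }
}
```

A caller would invoke `FixedRateFetchSketch.start(intervalMillis, maxTardinessMillis, fetch)` and later call `cancel()` on the returned timer. The interval-0 case from the description (schedule once, run forever) is left out of this sketch.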

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tony810430/flink FLINK-4574

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2925
    
----
commit 44ef9b4df7e805a8f20b31ba55f511264820b7c1
Author: 魏偉哲 <[email protected]>
Date:   2016-12-02T09:15:28Z

    [FLINK-4574] Strengthen fetch interval implementation in Kinesis consumer

----


> Strengthen fetch interval implementation in Kinesis consumer
> ------------------------------------------------------------
>
>                 Key: FLINK-4574
>                 URL: https://issues.apache.org/jira/browse/FLINK-4574
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>    Affects Versions: 1.1.0
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Wei-Che Wei
>             Fix For: 1.2.0
>
>
> As pointed out by [~rmetzger], the current fetch interval implementation in 
> the {{ShardConsumer}} class of the Kinesis consumer can lead to much longer 
> intervals than the user specified. For example, if the specified fetch 
> interval is {{f}}, a {{getRecords()}} call takes {{x}} to complete, and 
> processing the fetched records for emitting takes {{y}}, then the actual 
> interval between fetches is {{f+x+y}}.
> The main problem is that we can never guarantee how much time has passed 
> since the last {{getRecords}} call, and thus cannot guarantee that returned 
> shard iterators will not have expired by the next time we use them, even if 
> we limit the user-given value for {{f}} to be no longer than the iterator 
> expiration time.
> I propose to improve this by using, per {{ShardConsumer}}, a 
> {{ScheduledExecutorService}} / {{Timer}} to do the fixed-interval fetching, 
> and a separate blocking queue that collects the fetched records for emitting.
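
The proposal quoted above (fixed-interval fetching plus a separate blocking queue) might look roughly like the following sketch. It is not Flink code: `QueuedFetchSketch`, `fetchOnce`, `drain`, and the `Supplier` standing in for a `getRecords()` call are all hypothetical. Only the `java.util.concurrent` calls are real APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of decoupling fetching from emitting.
final class QueuedFetchSketch<T> {
    private final BlockingQueue<T> queue;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "shard-fetcher");
                t.setDaemon(true);
                return t;
            });

    QueuedFetchSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Runs one fetch and hands every record to the queue. */
    void fetchOnce(Supplier<List<T>> fetch) {
        try {
            for (T record : fetch.get()) {
                queue.put(record); // blocks while the queue is full
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    /** Schedules fetchOnce at a fixed rate, independent of how long emitting takes. */
    void start(Supplier<List<T>> fetch, long intervalMillis) {
        scheduler.scheduleAtFixedRate(
                () -> fetchOnce(fetch), 0L, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Emitting side: removes and returns everything fetched so far. */
    List<T> drain() {
        List<T> out = new ArrayList<>();
        queue.drainTo(out);
        return out;
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With this split, the fetch cadence no longer includes the processing time {{y}}, as long as the emitting side keeps up. If the queue fills, `put` blocks and the next fetch is delayed, so the capacity has to be chosen with the iterator expiration time in mind.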



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
