[ https://issues.apache.org/jira/browse/FLINK-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858135#comment-15858135 ]

ASF GitHub Bot commented on FLINK-5701:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/3278
  
    I think we should allow users to disable the wait on flush, because it can
    substantially delay the confirmation of a checkpoint.
    If a user favors fast checkpoints over complete data in Kafka (for example,
    when a particular producer instance is used mostly for debugging purposes),
    we should allow them to do that. The overhead of making this configurable is
    very low, but the benefit for some users might be huge.
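
A minimal sketch of what such a switch could look like on the producer base class. The field and setter names below (flushOnCheckpoint, setFlushOnCheckpoint) are illustrative assumptions, not necessarily the API settled on in the pull request:

    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    public abstract class FlinkKafkaProducerBase<IN> extends RichSinkFunction<IN> {

        // If true (the default), snapshotState() waits until every pending
        // record has been acknowledged by Kafka before the checkpoint is
        // confirmed; if false, the checkpoint is confirmed immediately.
        private boolean flushOnCheckpoint = true;

        // User-facing switch: trade completeness of the data in Kafka for
        // faster checkpoint confirmation (e.g. for debugging-only producers).
        public void setFlushOnCheckpoint(boolean flush) {
            this.flushOnCheckpoint = flush;
        }
    }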


> FlinkKafkaProducer should check asyncException on checkpoints
> -------------------------------------------------------------
>
>                 Key: FLINK-5701
>                 URL: https://issues.apache.org/jira/browse/FLINK-5701
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector, Streaming Connectors
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Critical
>
> Reported in ML: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fink-KafkaProducer-Data-Loss-td11413.html
> The problem:
> The producer holds a {{pendingRecords}} value that is incremented on each 
> invoke() and decremented on each callback; it is used to check whether the 
> producer needs to sync on pending callbacks on checkpoints.
> On each checkpoint, we should consider the checkpoint succeeded only if, 
> after flushing, {{pendingRecords == 0}} and {{asyncException == null}} 
> (currently, we’re only checking {{pendingRecords}}).
> A quick fix for this is to check and rethrow async exceptions in the 
> {{snapshotState}} method both before the flush and after the flush has 
> brought {{pendingRecords}} down to 0.
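
A minimal sketch of that quick fix, assuming the producer records a send failure in an asyncException field, keeps pendingRecords as a counter, and has a flush() helper that blocks until the counter reaches 0; the helper name checkErroneous() and the exact snapshotState signature are illustrative, not necessarily the code merged for this issue:

    // Fields assumed on the producer sink (illustrative types):
    private volatile Exception asyncException;              // set by the Kafka send callback
    private final AtomicLong pendingRecords = new AtomicLong();
    private boolean flushOnCheckpoint = true;

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // 1) Rethrow any failure a send callback recorded since the last checkpoint.
        checkErroneous();

        if (flushOnCheckpoint) {
            // 2) Wait for all in-flight records to be acknowledged.
            flush();  // blocks until pendingRecords reaches 0

            if (pendingRecords.get() != 0) {
                throw new IllegalStateException(
                        "pendingRecords must be 0 after the flush, was " + pendingRecords.get());
            }

            // 3) A callback may have set asyncException while we were flushing,
            //    so check again before the checkpoint is confirmed.
            checkErroneous();
        }
    }

    // Rethrow the exception set by the Kafka send callback, if any.
    private void checkErroneous() throws Exception {
        Exception e = asyncException;
        if (e != null) {
            asyncException = null;
            throw new Exception("Failed to send data to Kafka: " + e.getMessage(), e);
        }
    }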



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
