Rafał Trójczak created FLINK-33729:
--------------------------------------

             Summary: Events are getting lost when an exception occurs within a 
processing function
                 Key: FLINK-33729
                 URL: https://issues.apache.org/jira/browse/FLINK-33729
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Pulsar
    Affects Versions: 1.15.3
            Reporter: Rafał Trójczak


We have a Flink job with a Pulsar source that reads from an input topic and a 
Pulsar sink that writes to an output topic. Flink and the Pulsar connector are 
both at version 1.15.3; Pulsar itself is at version 2.10.3.

Here is a simple project that is intended to reproduce this problem: 
[https://github.com/trojczak/flink-pulsar-connector-problem/]

All of my tests were run on a local Kubernetes cluster using the Flink 
Kubernetes Operator, with Pulsar running in local Docker. The same problem 
also occurred on a "normal" cluster.

Expected behavior: When an exception is thrown within the processing code (or 
a TaskManager pod is restarted for any other reason, e.g. an out-of-memory 
error), processing should resume from the last event sent to the output topic.

Actual behavior: The events before the failure are sent correctly to the 
output topic. After the failure, some events from the input topic are missing; 
from some point on, events are processed normally again until the next 
exception is thrown, and so on. In the end, of the 100 events that should be 
forwarded from the input topic to the output topic, only 40 arrive.
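
To pin down exactly which events are lost, the IDs read back from the output 
topic can be compared against the expected range. The following is only a 
minimal sketch: it assumes each event carries a sequential integer ID starting 
at 1, which the report does not state, and the class and method names 
(EventGapCheck, findMissing) are hypothetical and not part of the linked 
reproduction project.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for analysing a test run; not taken from the linked project.
class EventGapCheck {

    // Returns the IDs in [1, expectedCount] that are absent from outputIds,
    // in ascending order. Assumes events are numbered 1..expectedCount.
    static List<Integer> findMissing(int expectedCount, List<Integer> outputIds) {
        Set<Integer> seen = new HashSet<>(outputIds);
        List<Integer> missing = new ArrayList<>();
        for (int id = 1; id <= expectedCount; id++) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample data: 10 events expected, IDs 4..6 never reached the output.
        List<Integer> output = List.of(1, 2, 3, 7, 8, 9, 10);
        System.out.println("missing = " + findMissing(10, output));
    }
}
```

Running this against the IDs consumed from the output topic after a test would 
show whether the gaps line up with the points at which the exceptions were 
thrown. Duplicate IDs in the output are ignored by this check.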

--
This message was sent by Atlassian Jira
(v8.20.10#820010)