Rafał Trójczak created FLINK-33729: --------------------------------------
Summary: Events are getting lost when an exception occurs within a processing function Key: FLINK-33729 URL: https://issues.apache.org/jira/browse/FLINK-33729 Project: Flink Issue Type: Bug Components: Connectors / Pulsar Affects Versions: 1.15.3 Reporter: Rafał Trójczak We have a Flink job using a Pulsar source that reads from an input topic, and a Pulsar sink that is writing to an output topic. Both Flink and Pulsar connector are of version 1.15.3. The Pulsar version that I use is 2.10.3. Here is a simple project that is intended to reproduce this problem: [https://github.com/trojczak/flink-pulsar-connector-problem/] All of my tests were done on my local Kubernetes cluster using the Flink Kubernetes Operator and Pulsar is running on my local Docker. But the same problem occurred on a "normal" cluster. Expected behavior: When an exception is thrown within the code (or a TaskManager pod is restarted for any other reason, e.g. OOM exception), the processing should be picked up from the last event sent to the output topic. Actual behavior: The events before the failure are sent correctly to the output topic, next some of the events from the input topic are missing, then from some point the events are being processed normally until the next exception is thrown, and so on. Finally, from 100 events that should be sent from the input topic to the output topic, only 40 are sent. -- This message was sent by Atlassian Jira (v8.20.10#820010)