Hi Gordon, I was able to reproduce the data loss on a standalone Flink cluster
as well. A stripped-down version of our code is attached (eventFilter.zip, linked below).

Environment:
Flink standalone 1.3.0
Kafka 0.9

*What the code is doing:* (a minimal pipeline sketch follows this list)
-consume messages from the source Kafka topic ('event.filter.topic' property in
application.properties)
-group them by key
-analyze the events in a session window and filter out some messages
-send the remaining messages to the destination Kafka topic ('sep.http.topic'
property in application.properties)
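
For reference, here is a minimal sketch of the pipeline shape only, not the actual
job: the property values, the key extraction, and the window logic below are
placeholders (the real code reads them from application.properties and applies our
filtering rules). It assumes the Flink 1.3 connectors for Kafka 0.9
(FlinkKafkaConsumer09 / FlinkKafkaProducer09):

import java.util.Properties;

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
import org.apache.flink.util.Collector;

public class EventFilterJobSketch {

    public static void main(String[] args) throws Exception {
        // Placeholder values; the real job loads these from application.properties
        final String brokers = "localhost:9092";   // kafka.brokers
        final String sourceTopic = "events.in";    // event.filter.topic
        final String sinkTopic = "events.out";     // sep.http.topic
        final long sessionGapSec = 30;             // window.session.interval.sec

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing is needed so the Kafka consumer offsets and the producer
        // flush participate in Flink's fault tolerance.
        env.enableCheckpointing(10000);

        Properties consumerProps = new Properties();
        consumerProps.setProperty("bootstrap.servers", brokers);
        consumerProps.setProperty("group.id", "event-filter");

        // 1) consume messages from the source topic
        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer09<>(sourceTopic, new SimpleStringSchema(), consumerProps));

        // 2) group by key, 3) analyze the events in a session window and filter
        DataStream<String> remaining = events
                .keyBy(new KeySelector<String, String>() {
                    @Override
                    public String getKey(String event) {
                        return event; // placeholder: the real key is derived from the event payload
                    }
                })
                .window(ProcessingTimeSessionWindows.withGap(Time.seconds(sessionGapSec)))
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override
                    public void apply(String key, TimeWindow window,
                                      Iterable<String> input, Collector<String> out) {
                        // placeholder: the real logic drops some events;
                        // here everything passes through
                        for (String event : input) {
                            out.collect(event);
                        }
                    }
                });

        // 4) send the remaining messages to the destination topic
        FlinkKafkaProducer09<String> producer =
                new FlinkKafkaProducer09<>(brokers, sinkTopic, new SimpleStringSchema());
        producer.setFlushOnCheckpoint(true); // block checkpoints until pending records are acked
        producer.setLogFailuresOnly(false);  // fail the task on async send errors instead of only logging

        remaining.addSink(producer);

        env.execute("event-filter-sketch");
    }
}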

To build: 
./gradlew clean assemble

The jar needs the path to the 'application.properties' file to run.

Important properties in application.properties (example values below):
window.session.interval.sec
kafka.brokers
event.filter.topic --> source topic
sep.http.topic --> destination topic
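
For example, an application.properties along these lines (the broker list and
topic names here are made up):

window.session.interval.sec=30
kafka.brokers=localhost:9092
event.filter.topic=events.in
sep.http.topic=events.out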

To test:
-Use the 'EventGenerator' class to publish messages to the source Kafka topic
(a rough producer sketch is shown after these steps).
        The data published won't be filtered by the logic: if you publish 10
messages to the source topic,
        those 10 messages should be sent to the destination topic.

-Once Flink has received all the messages, bring down all Kafka brokers.

-Let the Flink job fail 2-3 times.

-Restart the Kafka brokers.

Note: Data loss isn't observed on every run; it shows up roughly 1 time in 4.
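
For anyone reproducing this without the attached zip, a generator along these
lines should work as a stand-in for EventGenerator (the broker list and topic
name are placeholders; the real values come from application.properties):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventGeneratorSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // kafka.brokers
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        String sourceTopic = "events.in"; // event.filter.topic

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 10; i++) {
            // messages shaped so the windowing logic passes them through unfiltered
            producer.send(new ProducerRecord<>(sourceTopic, "key-" + i, "event-" + i));
        }
        producer.flush();
        producer.close();
    }
}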

Thanks for all your help.

eventFilter.zip
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n14522/eventFilter.zip>