[ https://issues.apache.org/jira/browse/FLINK-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aljoscha Krettek updated FLINK-8086:
------------------------------------
    Fix Version/s: 1.4.0

> FlinkKafkaProducer011 can permanently fail in recovery through ProducerFencedException
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-8086
>                 URL: https://issues.apache.org/jira/browse/FLINK-8086
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Stefan Richter
>            Assignee: Piotr Nowojski
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> A chaos monkey test in a cluster environment can permanently bring down our
> FlinkKafkaProducer011.
> Typically, after a small number of randomly killed TMs, the data generator
> job is no longer able to recover from a checkpoint because of the following
> problem:
>
> org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an
> operation with an old epoch. Either there is a newer producer with the same
> transactionalId, or the producer's transaction has been expired by the broker.
>
> The problem is reproducible and happened for me in every run after the chaos
> monkey killed a couple of TMs.