What are the pros and cons of Kafka offset keeping vs. Flink offset keeping? Is one more reliable than the other? Personally, I prefer having Flink manage it, since offset tracking is intrinsically tied to Flink's checkpointing mechanism. But I'm interested to learn from others' experiences.
Thanks
Tim

On Thu, Feb 13, 2020, 12:39 AM Hegde, Mahendra <mahendra.he...@arity.com> wrote:

> Thanks Theo!
>
> *From: *"theo.diefent...@scoop-software.de" <theo.diefent...@scoop-software.de>
> *Date: *Thursday, 13 February 2020 at 12:13 AM
> *To: *"Hegde, Mahendra" <mahendra.he...@arity.com>, "user@flink.apache.org" <user@flink.apache.org>
> *Subject: *[External] AW: How Flink Kafka Consumer works when it restarts
>
> Hi Mahendra,
>
> Flink regularly creates checkpoints, and you can manually trigger
> savepoints. Both are managed and stored by Flink, and both contain the
> Kafka offsets.
>
> When restarting, you can configure the job to restart from the last
> checkpoint or savepoint.
>
> You can additionally configure Flink to commit the offsets to Kafka,
> again only on checkpoints. If you don't let Flink restart from an existing
> checkpoint or savepoint (which is where it looks first to restore the
> offsets), you can configure it to start from the committed offsets instead.
>
> Once the offsets are loaded from a checkpoint, a savepoint, or Kafka,
> Flink communicates directly with Kafka and polls messages starting from
> those offsets.
>
> Best regards
> Theo
>
> Sent from my Huawei phone
>
> -------- Original message --------
> From: "Hegde, Mahendra" <mahendra.he...@arity.com>
> Date: Wed, Feb 12, 2020, 17:50
> To: user@flink.apache.org
> Subject: How Flink Kafka Consumer works when it restarts
>
> Hi All,
>
> I am a bit confused about how the Flink Kafka consumer works.
> I read that Flink stores the Kafka message offsets in its checkpoints and
> uses them when it restarts.
>
> My question is: when exactly does Flink confirm successful consumption by
> committing offsets to the Kafka broker?
> And when a Flink job restarts, will it send the last offset available in
> the checkpoint to the Kafka broker to start consuming from that point?
> Or will the Kafka broker resume based on the last committed offset
> information it has?
> (I mean, who manages the actual offset here, the Kafka broker or the Flink
> client?)
>
> Thanks
> Mahendra
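
For anyone reading this thread later, the restart behaviour Theo describes in the quoted message can be sketched as a small precedence rule. This is only a toy model in plain Python, not Flink code: the function name `resolve_start_offsets`, its parameters, and the offset numbers are all made up for illustration, and the fallback to a reset policy for partitions without a committed offset is my assumption rather than something stated in the thread.

```python
from typing import Dict, List, Optional, Union

# A start position is either a concrete offset or a reset policy name.
Offset = Union[int, str]


def resolve_start_offsets(
    partitions: List[int],
    restored_state: Optional[Dict[int, int]],
    committed_in_kafka: Dict[int, int],
    reset_policy: str = "latest",
) -> Dict[int, Offset]:
    """Pick a start position per partition (hypothetical model, not Flink's API)."""
    if restored_state is not None:
        # Restart from a checkpoint or savepoint: the offsets stored by Flink
        # win, and the offsets committed to Kafka are not consulted.
        return {p: restored_state[p] for p in partitions if p in restored_state}
    # No checkpoint or savepoint to restore from: use the offsets committed to
    # Kafka, and (assumption) a reset policy where nothing was committed.
    return {p: committed_in_kafka.get(p, reset_policy) for p in partitions}


# Made-up example values.
savepoint = {0: 120, 1: 75}   # offsets stored in Flink's savepoint
committed = {0: 100}          # offsets last committed to Kafka

from_savepoint = resolve_start_offsets([0, 1], savepoint, committed)
fresh_start = resolve_start_offsets([0, 1], None, committed)
print(from_savepoint)
print(fresh_start)
```

In this model the offsets committed to Kafka only matter when the job starts without a checkpoint or savepoint, which is why the two sets of offsets can disagree without affecting a restored job.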