Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
Hi, I have noticed a strange behavior in one of our jobs: every once in a while the Kafka source checkpointing time becomes extremely large compared to what it usually is. (To be very specific, it is a Kafka source chained with a stateless map operator.) To be more specific checkpointing the offset

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Urs Schoenenberger
Hi Gyula, I don't know the cause, unfortunately, but we observed a similar issue on Flink 1.1.3. The problem seems to be gone after upgrading to 1.2.1. Which version are you running on? Urs On 12.07.2017 09:48, Gyula Fóra wrote: > Hi, > > I have noticed a strange behavior in one of our jobs: ev

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
Hi, we are using the latest, 1.3.1. Gyula On Wed, Jul 12, 2017 at 10:44, Urs Schoenenberger wrote: > Hi Gyula, > > I don't know the cause unfortunately, but we observed a similar issue > on Flink 1.1.3. The problem seems to be gone after upgrading to 1.2.1. > Which version are you run

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Stefan Richter
Hi, could you add some logging to figure out which method call introduces the delay? Best, Stefan > On 12.07.2017 at 11:37, Gyula Fóra wrote: > > Hi, > > We are using the latest 1.3.1 > > Gyula > > Urs Schoenenberger > ezt írta (időpon
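
A minimal sketch of the kind of logging suggested here, assuming the goal is to tell time spent waiting for a lock apart from time spent doing the snapshot work. The class name CheckpointTiming, the method timeLockAndWork, and the lock object are hypothetical names for illustration only. They are not part of Flink or of the poster's job.

```java
// Hypothetical helper, not a Flink API: measures how long it takes to
// acquire a lock and how long the work done under that lock takes.
final class CheckpointTiming {

    /** Returns {lockWaitMs, workMs} and prints both values. */
    static long[] timeLockAndWork(Object lock, Runnable snapshotWork) {
        long beforeLock = System.nanoTime();
        long afterLock;
        long afterWork;
        synchronized (lock) {
            // Reached only once the lock has been acquired.
            afterLock = System.nanoTime();
            snapshotWork.run();
            afterWork = System.nanoTime();
        }
        long lockWaitMs = (afterLock - beforeLock) / 1_000_000;
        long workMs = (afterWork - afterLock) / 1_000_000;
        System.out.println("lock wait: " + lockWaitMs + " ms, snapshot work: " + workMs + " ms");
        return new long[] {lockWaitMs, workMs};
    }

    public static void main(String[] args) {
        // Uncontended lock and trivial work: both numbers should be small.
        timeLockAndWork(new Object(), () -> { });
    }
}
```

Logging the two durations separately shows whether a slow checkpoint is dominated by the wait or by the work itself.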

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
Yes, that's definitely what I am about to do next, but I just thought maybe someone has seen this before. Will post info next time it happens. (Not guaranteed to happen soon, as it didn't happen for a long time before.) Gyula On Wed, Jul 12, 2017, 12:13 Stefan Richter wrote: > Hi, > > could you intro

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Stephan Ewen
Can it be that the checkpoint thread is waiting to grab the lock, which is held by the chain under backpressure? On Wed, Jul 12, 2017 at 12:23 PM, Gyula Fóra wrote: > Yes thats definitely what I am about to do next but just thought maybe > someone has seen this before. > > Will post info next ti
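
The hypothesis above can be illustrated with a small stand-alone simulation. This is a sketch under stated assumptions, not Flink code: one thread plays the source chain and holds a shared lock while a sleep stands in for an emit call blocked by backpressure, and the calling thread plays the checkpoint and measures how long it waits for the same lock. The names LockContentionDemo, checkpointLockWaitMillis, and checkpointLock are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical simulation, not Flink code: shows how a cheap operation
// can appear slow when it has to wait for a lock held elsewhere.
final class LockContentionDemo {

    /** Returns how many milliseconds the "checkpoint" waited for the lock. */
    static long checkpointLockWaitMillis(long holdMillis) throws InterruptedException {
        final Object checkpointLock = new Object();
        final CountDownLatch lockHeld = new CountDownLatch(1);

        Thread source = new Thread(() -> {
            synchronized (checkpointLock) {
                lockHeld.countDown();
                try {
                    // Stand-in for an emit call blocked by backpressure.
                    Thread.sleep(holdMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        source.start();
        lockHeld.await(); // the source thread now holds the lock

        long start = System.nanoTime();
        synchronized (checkpointLock) {
            // The simulated offset snapshot is trivial; the wait dominates.
        }
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        source.join();
        return waitedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("checkpoint waited " + checkpointLockWaitMillis(500) + " ms for the lock");
    }
}
```

In this simulation the reported wait grows with the time the source thread holds the lock, even though the work done under the lock by the checkpoint side is negligible.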

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
I have added logging that will help determine this as well; next time this happens I will post the results. (Although there doesn't seem to be high backpressure.) Thanks for the tips, Gyula On Wed, Jul 12, 2017 at 15:27, Stephan Ewen wrote: > Can it be that the checkpoint thread is wait

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread vinay patil
> with commit on checkpoints. Also I dont see this happen in other jobs. > > Any clue on what might cause this? > > Thanks :) > Gyula

Re: Why would a kafka source checkpoint take so long?

2017-07-13 Thread Stephan Ewen
> I have noticed a strange behavior in one of our jobs: every once in a while the Kafka source checkpointing time becomes extremely large compared to what it usually is. (To be very specific it is a kafka

Re: Why would a kafka source checkpoint take so long?

2017-07-13 Thread Vinay Patil
> jobs. > > Any clue on what might cause this? > > Thanks :) > Gyula

Re: Why would a kafka source checkpoint take so long?

2017-07-14 Thread Gyula Fóra
Hi, I saw this again yesterday. With the added logging, it looks like acquiring the lock took all the time. In this case it was pretty clear that the job started falling behind a few minutes before the checkpoint started, so backpressure seems to be the culprit. Thanks, Gyula Stephan Ewen e