Hi Yun,

Thanks for looking into it and forwarding it to the right place.

Med venlig hilsen / Best regards
Lasse Nedergaard

> On 22 Apr 2020, at 11:06, Yun Tang <myas...@live.com> wrote:
>
> Hi Lasse
>
> After debugging locally, I believe this is a bug in Flink (even in the latest
> version). However, the bug appears to be in the network stack, which I am not
> very familiar with, so the root cause is not easy to find directly. After
> discussing it with the network folks in Flink, we decided to first create
> FLINK-17322 [1] to track this problem, and the relevant owner will take a
> look at it.
>
> Thank you very much for reporting this bug.
>
> [1] https://issues.apache.org/jira/browse/FLINK-17322
>
> Best
> Yun Tang
>
> From: Yun Tang <myas...@live.com>
> Sent: Wednesday, April 22, 2020 1:43
> To: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
> Cc: user <user@flink.apache.org>
> Subject: Re: Latency tracking together with broadcast state can cause job failure
>
> Hi Lasse
>
> Really sorry for missing your reply. I'll run your project and look for the
> root cause during my daytime. And thanks to @Robert Metzger for the kind
> reminder.
>
> Best
> Yun Tang
>
> From: Robert Metzger <rmetz...@apache.org>
> Sent: Tuesday, April 21, 2020 20:01
> To: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
> Cc: Yun Tang <myas...@live.com>; user <user@flink.apache.org>
> Subject: Re: Latency tracking together with broadcast state can cause job failure
>
> Hey Lasse,
> has the problem been resolved?
>
> (I'm also responding to this to make sure the thread gets attention again :) )
>
> Best,
> Robert
>
>> On Wed, Apr 1, 2020 at 10:03 PM Lasse Nedergaard
>> <lassenedergaardfl...@gmail.com> wrote:
>> Hi
>>
>> I have attached a simple project with a test that reproduces the problem.
>> The usual fault is a mixed string, but you can also get an EOF exception.
>> Please let me know if you have any questions about the solution.
>>
>> Med venlig hilsen / Best regards
>> Lasse Nedergaard
>>
>> On 1 Apr 2020, at 09:15, Yun Tang <myas...@live.com> wrote:
>>
>> Hi Lasse
>>
>> I have never met this problem before, but can you share an exception stack
>> trace so that we can take a look? A simple project to reproduce it would
>> also be a good option.
>>
>> Best
>> Yun Tang
>>
>> From: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
>> Sent: Tuesday, March 31, 2020 19:10
>> To: user <user@flink.apache.org>
>> Subject: Latency tracking together with broadcast state can cause job failure
>>
>> Hi
>>
>> In both Flink 1.9.2 and 1.10 we have struggled with random deserialization
>> and index-out-of-range exceptions in one of our jobs. We also get
>> out-of-memory exceptions.
>> We have now identified latency tracking together with broadcast state as
>> the cause of the problem. When we do integration testing locally we don't
>> see any problem; it only fails when running on the cluster.
>> We have concluded that latency tracking packets sent over broadcast cause
>> the data stream to be corrupted, which causes the exceptions.
>> We are working on preparing a simple project on GitHub to reproduce the
>> problem so that the underlying problem can be solved.
>>
>> Has anyone else seen this kind of problem?
>>
>> Med venlig hilsen / Best regards
>> Lasse Nedergaard
>>