Hi Yun

Thanks for looking into it and forwarding it to the right place.


Med venlig hilsen / Best regards
Lasse Nedergaard


> On 22 Apr 2020 at 11:06, Yun Tang <myas...@live.com> wrote:
> 
> 
> Hi Lasse
> 
> After debugging locally, I believe this is a bug in Flink (even in the latest 
> version). However, the bug appears to be in the network stack, with which I am 
> not very familiar, so the root cause is not easy to find directly. After 
> discussing it with the network experts in Flink, we decided to first create 
> FLINK-17322 [1] to track this problem, and the relevant owner will take a look.
> 
> Thank you very much for reporting this bug.
> 
> [1] https://issues.apache.org/jira/browse/FLINK-17322
> 
> Best
> Yun Tang
> From: Yun Tang <myas...@live.com>
> Sent: Wednesday, April 22, 2020 1:43
> To: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
> Cc: user <user@flink.apache.org>
> Subject: Re: Latency tracking together with broadcast state can cause job 
> failure
>  
> Hi Lasse
> 
> I'm really sorry for missing your reply. I'll run your project and look for the 
> root cause during my daytime. Thanks also to @Robert Metzger for the kind reminder.
> 
> Best
> Yun Tang
> From: Robert Metzger <rmetz...@apache.org>
> Sent: Tuesday, April 21, 2020 20:01
> To: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
> Cc: Yun Tang <myas...@live.com>; user <user@flink.apache.org>
> Subject: Re: Latency tracking together with broadcast state can cause job 
> failure
>  
> Hey Lasse,
> has the problem been resolved?
> 
> (I'm also responding to this to make sure the thread gets attention again :) )
> 
> Best,
> Robert
> 
> 
>> On Wed, Apr 1, 2020 at 10:03 PM Lasse Nedergaard 
>> <lassenedergaardfl...@gmail.com> wrote:
>> Hi
>> 
>> I have attached a simple project with a test that reproduces the problem. The 
>> usual fault is a mixed string, but you can also get an EOF exception. 
>> Please let me know if you have any questions about the solution. 
>> 
>> Med venlig hilsen / Best regards
>> Lasse Nedergaard
>> 
>> 
>> On 1 Apr 2020 at 09:15, Yun Tang <myas...@live.com> wrote:
>> 
>> 
>> Hi Lasse
>> 
>> I have never encountered this problem before, but could you share an exception 
>> stack trace so that we can take a look? A simple project that reproduces it 
>> would also be a good option.
>> 
>> Best
>> Yun Tang
>> From: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
>> Sent: Tuesday, March 31, 2020 19:10
>> To: user <user@flink.apache.org>
>> Subject: Latency tracking together with broadcast state can cause job failure
>>  
>> Hi
>> 
>> In both Flink 1.9.2 and 1.10 we have struggled with random deserialization and 
>> index-out-of-range exceptions in one of our jobs. We also get out-of-memory 
>> exceptions. 
>> We have now identified latency tracking together with broadcast state as the 
>> cause of the problem. When we run integration tests locally we don't see any 
>> problem; it only fails when running on the cluster. 
>> We have concluded that latency-tracking packets sent over broadcast corrupt 
>> the data stream and cause the exceptions. 
>> We are preparing a simple project on GitHub to reproduce the problem so that 
>> the underlying issue can be solved. 
>> 
>> Has anyone else seen this kind of problem?
>> 
>> Med venlig hilsen / Best regards
>> Lasse Nedergaard
>> 
