Re: Re: NPE in JobManager

2017-01-20 Thread Dave Marion
Fixing my accumulator did the trick. I should note that the JobManager did not fail when I ran this previously against Flink 1.1.3. Thanks for the help! Dave > On January 20, 2017 at 8:45 AM Dave Marion <dlmar...@comcast.net> wrote: > > I do see that message in one of t

Re: Re: NPE in JobManager

2017-01-20 Thread Dave Marion
ike that one in the log of one of the TaskManagers > > "Failed to serialize accumulators for task." > > with an exception stack trace? > > > Stephan > > > > On Fri, Jan 20, 2017 at 2:10 PM, Dave Marion <dlmar...@comcast.
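
For anyone hitting the same "Failed to serialize accumulators for task." message: the thread does not say what was actually wrong with the custom accumulator, so the following is only a minimal sketch of one way to check an object outside Flink, assuming the problem is plain Java serialization. The classes GoodCounter and BadCounter and the method survivesRoundTrip are hypothetical names made up for illustration; they are not Flink classes or Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCheck {

    /** Hypothetical stand-in for a custom accumulator's state; not a Flink class. */
    public static class GoodCounter implements Serializable {
        private static final long serialVersionUID = 1L;
        long count;
    }

    /** Declares Serializable but holds a field whose type is not serializable. */
    public static class BadCounter implements Serializable {
        private static final long serialVersionUID = 1L;
        long count;
        Object lock = new Object(); // java.lang.Object does not implement Serializable
    }

    /** Returns true if the value can be written and read back with Java serialization. */
    public static boolean survivesRoundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
            return true;
        } catch (IOException | ClassNotFoundException e) {
            // NotSerializableException is a subclass of IOException and lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("GoodCounter: " + survivesRoundTrip(new GoodCounter()));
        System.out.println("BadCounter: " + survivesRoundTrip(new BadCounter()));
    }
}
```

Running a custom accumulator's value through a check like this in a unit test may surface a serialization problem before the job is submitted, rather than hours into a run.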

Re: Re: NPE in JobManager

2017-01-20 Thread Dave Marion
in a non-standard way, but the > JobManager should also catch that (log a warning or debug message) and simply > continue (not crash). > > I'll try to add a patch that the JobManager tolerates these kinds of > issues in the accumulators. > > Stephan

Fwd: Re: NPE in JobManager

2017-01-19 Thread Dave Marion
Noticed I didn't cc the user list. Original Message -- From: Dave Marion <dlmar...@comcast.net> To: Ted Yu <yuzhih...@gmail.com> Date: January 19, 2017 at 12:13 PM Subject: Re: NPE in JobManager That might take some time. Here is a hand typed top N lines. If that is

NPE in JobManager

2017-01-19 Thread Dave Marion
I'm running flink-1.1.4-bin-hadoop27-scala_2.11 and I'm running into an issue where, after 1 to 3 hours, the JobManager gets an NPE and shuts itself down. The failure is at JobManager$$updateAccumulators$1.apply(JobManager.scala:1790). I'm using a custom