Fixing my accumulator did the trick. I should note that the JobManager did not
fail when I ran this previously against Flink 1.1.3. Thanks for the help!
Dave
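
For anyone hitting the same "Failed to serialize accumulators for task." message, here is a minimal sketch of a pre-flight check. It assumes the accumulator is shipped with standard Java serialization, which this thread does not confirm. It is not the actual fix applied here. GoodCounter, BadCounter and SerializationCheck are hypothetical names used only for illustration, and nothing below uses the Flink API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical accumulator-like state: every field is serializable.
final class GoodCounter implements Serializable {
    private static final long serialVersionUID = 1L;
    long count;
}

// Hypothetical broken variant: declares Serializable but holds a field that is not.
final class BadCounter implements Serializable {
    private static final long serialVersionUID = 1L;
    long count;
    final Object lock = new Object(); // java.lang.Object itself is not Serializable
}

final class SerializationCheck {
    // Returns true if Java serialization can write the object, false otherwise.
    static boolean isSerializable(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSerializable(new GoodCounter()));
        System.out.println(isSerializable(new BadCounter()));
    }
}
```

Running a check like this against a custom accumulator instance in a unit test may surface a non-serializable field before the job is submitted, rather than hours into a run.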
> On January 20, 2017 at 8:45 AM Dave Marion <dlmar...@comcast.net> wrote:
>
> I do see that message in one of t
ike that one in the log of one of the TaskManagers
>
> "Failed to serialize accumulators for task."
>
> with an exception stack trace?
>
>
> Stephan
>
>
>
> On Fri, Jan 20, 2017 at 2:10 PM, Dave Marion <dlmar...@comcast.
in a non-standard way, but the
> JobManager should also catch that (log a warning or debug message) and simply
> continue (not crash).
>
> I'll try to add a patch that the JobManager tolerates these kinds of
> issues in the accumulators.
>
> Stephan
>
>
Noticed I didn't cc the user list.
-- Original Message --
From: Dave Marion <dlmar...@comcast.net>
To: Ted Yu <yuzhih...@gmail.com>
Date: January 19, 2017 at 12:13 PM
Subject: Re: NPE in JobManager
That might take some time. Here are the top N lines, typed by hand. If that is
I'm running flink-1.1.4-bin-hadoop27-scala_2.11 and I'm running into an issue
where, after some period of time (1 to 3 hours), the JobManager gets
an NPE and shuts itself down. The failure is at
JobManager$$updateAccumulators$1.apply(JobManager.scala:1790). I'm using a
custom