Mani Kolbe created BEAM-10481:
---------------------------------
Summary: MetricsAccumulator is not registering when resuming from
a checkpoint
Key: BEAM-10481
URL: https://issues.apache.org/jira/browse/BEAM-10481
Project: Beam
Issue Type: Bug
Components: runner-spark
Affects Versions: 2.22.0
Reporter: Mani Kolbe
I am running in an beam application in streaming mode with jobName and
checkPointDir configured. When I recover application from a planned stop, I am
getting failure.
I did some investigation and noticed that the accumulator is not getting
registered on recovering checkpoint scenario.
Correct me if I am wrong. If you see the code in the screenshot below from
MetricsAccumulator class on beam v2.22.0, you can see new instance of
MetricsContainerStepMapAccumulator is getting registered on line#64. But if a
recovered value is present, it constructs a new instance with the recovered
value (Line#78). But this new accumulator instance is not getting registered.
This is forcing Spark Driver to throw exception:
_java.lang.UnsupportedOperationException: Accumulator must be registered before
send to executor_
_!image-2020-07-14-10-53-16-863.png!_
--
This message was sent by Atlassian Jira
(v8.3.4#803005)