In my Structured Streaming job I update Spark accumulators in the
updateAcrossEvents method, but they always read 0 when I print them from
my StreamingQueryListener. Here's the relevant code:

.mapGroupsWithState(GroupStateTimeout.ProcessingTimeTimeout())(
        updateAcrossEvents
      )

The accumulators are incremented inside updateAcrossEvents. I have a
StreamingQueryListener that writes out the accumulator values in its
onQueryProgress method, but there the accumulators are ALWAYS ZERO!
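The listener side looks roughly like this (again a sketch; AccumulatorLogger
is an illustrative name, and eventCountAcc is assumed to be the accumulator
registered on the driver):

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

class AccumulatorLogger extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    // Runs on the driver after each micro-batch; in cluster mode this
    // always prints 0 for me.
    println(s"eventCount = ${eventCountAcc.value}")
  }
}

spark.streams.addListener(new AccumulatorLogger)
```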

When I added log statements inside updateAcrossEvents, I could see the
accumulators being incremented as expected.

This only happens in cluster mode; in local mode it works fine. That
suggests the accumulator updates made on the executors are not making it
back to the driver, or something along those lines.

Note: I've seen quite a few answers on the Web that say to perform an
"action". That's not the solution here; this is a stateful Structured
Streaming job. And yes, I am registering the accumulators with the
SparkContext.
