Hello,

I have a batch driver and a streaming driver that share the same functions
(Scala). I use accumulators (passed to the functions' constructors) to count
things.

In the batch driver, by doing this at the right point in the pipeline, I'm
able to retrieve the accumulator value and log it via log4j.
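For reference, here is a minimal sketch of the batch pattern that works for
me (class and path names are illustrative, using the Spark 2.x
AccumulatorV2 API; the older sc.accumulator behaves similarly):

import org.apache.log4j.LogManager
import org.apache.spark.sql.SparkSession
import org.apache.spark.util.LongAccumulator

// Hypothetical shared function: counts bad records via the accumulator
class RecordParser(badRecords: LongAccumulator) extends Serializable {
  def parse(line: String): String = {
    if (line.isEmpty) badRecords.add(1L)
    line.trim
  }
}

object BatchDriver {
  private val log = LogManager.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("BatchDriver").getOrCreate()
    val sc = spark.sparkContext

    val badRecords = sc.longAccumulator("badRecords")
    val parser = new RecordParser(badRecords)

    // The action forces evaluation, so by the next line the executors
    // have already merged their partial counts back into the driver.
    sc.textFile("hdfs:///input/path").map(parser.parse).count()

    log.info(s"badRecords = ${badRecords.value}")
  }
}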

In the streaming driver, doing the same yields nothing. That's probably
because the accumulators in the streaming driver are created empty, and the
code that logs them runs only once on the driver (while they are still
empty), when the StreamingContext is started and the DAG is built.
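To illustrate, this is roughly what I tried in the streaming driver
(simplified, reusing the hypothetical RecordParser from the batch sketch
above); the log line runs once during setup, before any batch has executed:

import org.apache.log4j.LogManager
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingDriver {
  private val log = LogManager.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext() // master/appName set via spark-submit
    val ssc = new StreamingContext(sc, Seconds(10))

    val badRecords = sc.longAccumulator("badRecords")
    val parser = new RecordParser(badRecords)

    ssc.socketTextStream("localhost", 9999).map(parser.parse).print()

    // Runs exactly once, while the DStream graph is being set up,
    // so it always logs 0:
    log.info(s"badRecords = ${badRecords.value}")

    ssc.start()
    ssc.awaitTermination()
  }
}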

I'm looking for a way to log the current value of my accumulators at every
batch period of my Spark Streaming driver. Ideally, I would also reset the
accumulators at each period, so that I only get the counts for that period
(see the sketch below for what I have in mind).
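Continuing the streaming sketch above, I imagine replacing the .print()
with something along these lines, though I don't know whether foreachRDD is
the right hook, or whether resetting an accumulator from the driver between
batches is safe:

// Log and zero the accumulator once per batch interval (assuming
// that is legal); the foreachRDD body itself runs on the driver.
ssc.socketTextStream("localhost", 9999)
  .map(parser.parse)
  .foreachRDD { rdd =>
    rdd.count() // any action; triggers the work for this batch
    log.info(s"badRecords this period = ${badRecords.value}")
    badRecords.reset() // AccumulatorV2.reset(), called on the driver
  }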

Any advice would be really appreciated.

Thanks,
Roberto
