Hello, I have a batch driver and a streaming driver that share the same functions (Scala). I use accumulators (passed into the functions' constructors) to count things.
In the batch driver, by reading the accumulator at the right point in the pipeline, I can retrieve its value and print it via log4j. Doing the same in the streaming driver logs nothing. I suspect this is because the code that prints the accumulators runs only once on the driver, when the StreamingContext is started and the DAG is built, at which point the accumulators are still empty.

What I'm looking for is a way to log, at every batch interval of my Spark Streaming driver, the current value of my accumulators. Ideally I would also reset them at the end of each interval, so that each log line reflects only the counts for that batch. Any advice would be really appreciated. Thanks, Roberto
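For reference, here is a minimal sketch of the pattern I have in mind. Names like `lines` and `countAcc` are placeholders for my real input DStream and accumulator, and this assumes the Spark 2.x `AccumulatorV2` API (`longAccumulator`), which exposes a driver-side `reset()`; on Spark 1.x one would use `Accumulable.setValue` instead:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingAccumulatorLog {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingAccumulatorLog")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Driver-side accumulator; can be reset between batches (AccumulatorV2)
    val countAcc = ssc.sparkContext.longAccumulator("perBatchCount")

    // Placeholder input; the real driver reads from its actual source
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      // This action runs on the executors and updates the accumulator
      rdd.foreach(_ => countAcc.add(1))

      // The rest of the foreachRDD body runs on the driver once per batch,
      // after the action above completes, so the value is current here
      // (swap println for a log4j logger in the real driver)
      println(s"Records this batch: ${countAcc.value}")

      // Reset so the next batch's count starts from zero
      countAcc.reset()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The key point of the sketch is that `foreachRDD`'s closure executes on the driver at every batch interval, so reading and resetting the accumulator there should give per-batch counts rather than a running total.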