It certainly makes sense for a single streaming job, but it is definitely non-trivial to make this useful to all Spark programs. If I had a long-running SparkContext and submitted a wide variety of jobs to it, the list of accumulators would grow very, very large. Maybe the solution is to paginate them and always sort by last update time.
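The paging idea above can be sketched as follows. This is not Spark UI code; it is a minimal hypothetical illustration, assuming each accumulator record carries a last-update timestamp (a field invented here for the example):

```python
from dataclasses import dataclass

@dataclass
class AccumulatorInfo:
    name: str
    value: int
    last_update_ms: int  # hypothetical timestamp of the most recent update


def page_accumulators(accums, page, page_size):
    """Sort accumulators by most recent update first, then return one page.

    This mirrors the proposal: a long-running SparkContext may hold many
    accumulators, so the UI would show them page by page, most recently
    updated on top.
    """
    ordered = sorted(accums, key=lambda a: a.last_update_ms, reverse=True)
    start = page * page_size
    return ordered[start:start + page_size]
```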
-- Reynold Xin

On October 16, 2014 at 12:11:00 PM, Sean McNamara (sean.mcnam...@webtrends.com) wrote:

Accumulators on the stage info page show the rolling lifetime value of each accumulator, as well as per-task values, which is handy. I think it would be useful to add another field to the "Accumulators" table that also shows the total for the stage you are looking at (basically just a merge of the accumulator values for the tasks in that stage). This would be useful for any job that is iterative (e.g. basically every Spark Streaming job). Does this idea make sense?

Separate but related question: from the operational side, I think it could be very useful to have an accumulators summary page. For example, we have a Spark Streaming job with many different stages, and it is difficult to navigate into each stage to pick out a trend. An accumulators page that allowed one to filter by stage description and/or accumulator name would be very useful.

Thoughts?

Thanks,

Sean