It certainly makes sense for a single streaming job, but it is definitely 
non-trivial to make this useful to all Spark programs. If I were to have a 
long-running SparkContext and submit a wide variety of jobs to it, the list of 
accumulators would grow very, very large. Maybe the solution is to paginate 
them and always sort by last update time.

-- 
Reynold Xin


On October 16, 2014 at 12:11:00 PM, Sean McNamara (sean.mcnam...@webtrends.com) 
wrote:

Accumulators on the stage info page show the rolling lifetime value of each 
accumulator, as well as per-task values, which is handy. I think it would be 
useful to add another field to the “Accumulators” table that also shows the 
total for the stage you are looking at (basically just a merge of the 
accumulators for the tasks in that stage). This would be useful for any 
iterative job (e.g. basically every Spark streaming job).
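To make the idea concrete, here is a minimal sketch of the proposed "stage total" column. This is not Spark code; the row shape and names are assumptions for illustration. In the real UI the inputs would be each task's accumulable updates for the stage, and the stage total is just the merge (here, a sum) of those per-task updates.

```python
# Hypothetical per-task accumulator updates for one stage:
# (task_id, accumulator_name, update_value). Names are illustrative.
from collections import defaultdict

task_updates = [
    (0, "recordsParsed", 120),
    (1, "recordsParsed", 95),
    (2, "recordsParsed", 110),
]

def stage_totals(updates):
    """Merge per-task accumulator updates into per-stage totals."""
    totals = defaultdict(int)
    for _task_id, name, value in updates:
        totals[name] += value
    return dict(totals)

print(stage_totals(task_updates))  # {'recordsParsed': 325}
```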

Does this idea make sense?  


Separate but related question: from the operational side, I think it could be 
very useful to have an accumulators summary page. For example, we have a Spark 
streaming job with many different stages, and it is difficult to navigate into 
each stage to pick out a trend. An accumulators page that allowed one to filter 
by stage description and/or accumulator name would be very useful.
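The filtering behavior on such a summary page could look roughly like the following. Again, this is a hypothetical sketch, not Spark code: the flat row shape (stage description, accumulator name, value) and the substring-matching semantics are assumptions.

```python
def filter_accumulators(rows, stage_desc=None, accum_name=None):
    """Filter (stage description, accumulator name, value) rows by
    optional substring matches on description and/or name."""
    return [
        (desc, name, value)
        for desc, name, value in rows
        if (stage_desc is None or stage_desc in desc)
        and (accum_name is None or accum_name in name)
    ]

# Illustrative rows from a streaming job with several stages.
rows = [
    ("map at StreamingApp.scala:42", "recordsParsed", 325),
    ("reduceByKey at StreamingApp.scala:57", "recordsParsed", 310),
    ("reduceByKey at StreamingApp.scala:57", "parseErrors", 4),
]

print(filter_accumulators(rows, accum_name="parseErrors"))
# [('reduceByKey at StreamingApp.scala:57', 'parseErrors', 4)]
```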

Thoughts?  


Thanks,  

Sean  
---------------------------------------------------------------------  
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org  
For additional commands, e-mail: dev-h...@spark.apache.org  
