Hi Spark committers,

I would like to discuss the possibility of changing the signatures of SparkContext's setJobGroup and clearJobGroup functions so that they return a replica of the SparkContext with the job group set/unset, instead of mutating the original context.

I am building a Spark job server, and I assign a job group before passing control to user-provided logic that uses the SparkContext to define and execute a job (very much like job-server). The issue is that I can't reliably know when to clear the job group, because user-defined code can use futures to submit multiple tasks in parallel. In fact, I even allow users to return a future from their function, on which the server can register callbacks to know when the user-defined job is complete.

Now, if I set the job group before passing control to the user function and wait on the future to complete so that I can clear the job group, I can no longer use that SparkContext for any other job in the meantime. This means I would have to lock on the SparkContext, which seems like a bad idea.

Therefore, my proposal is to return a new instance of SparkContext (a replica that differs only in having the job group set/unset) that can then be used safely in a concurrent environment. I am also happy for the calls to keep mutating the original SparkContext, so as not to break backward compatibility, as long as the returned SparkContext is not affected by later set/unset of job groups on the original SparkContext.
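To make the idea concrete, here is a rough sketch in plain Scala with no Spark dependency. ContextView, withJobGroup, withoutJobGroup and runJob are hypothetical names made up for illustration; none of them exist in Spark, and the sketch only models the "return a replica" behaviour described above, not how SparkContext would actually implement it.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for SparkContext; these names are NOT part of Spark.
final case class ContextView(appName: String, jobGroup: Option[String] = None) {
  // Returns a replica with the job group set; the receiver is left untouched.
  def withJobGroup(groupId: String): ContextView = copy(jobGroup = Some(groupId))

  // Returns a replica with the job group cleared.
  def withoutJobGroup: ContextView = copy(jobGroup = None)

  // Stand-in for submitting a job: reports the group the job would be tagged with.
  def runJob(name: String): String = {
    val group = jobGroup.getOrElse("<none>")
    s"$name@$group"
  }
}

object JobGroupSketch {
  // User-provided logic that fans work out on futures and returns a future.
  def userLogic(ctx: ContextView): Future[Seq[String]] =
    Future.sequence(Seq(Future(ctx.runJob("a")), Future(ctx.runJob("b"))))

  // Two requests share one context concurrently, each through its own replica.
  def demo(): (Seq[String], Seq[String], Option[String]) = {
    val shared = ContextView("server")
    val first  = userLogic(shared.withJobGroup("request-1"))
    val second = userLogic(shared.withJobGroup("request-2"))
    (Await.result(first, 10.seconds), Await.result(second, 10.seconds), shared.jobGroup)
  }
}
```

In this model the server never has to decide when to clear the job group: each request's replica carries its own group for as long as the user's futures hold on to it, and the shared context is never changed.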
Thoughts, please?

Thanks,
Aniket