[ https://issues.apache.org/jira/browse/SPARK-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-636.
--------------------------------
    Resolution: Incomplete

> Add mechanism to run system management/configuration tasks on all workers
> -------------------------------------------------------------------------
>
>                 Key: SPARK-636
>                 URL: https://issues.apache.org/jira/browse/SPARK-636
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Josh Rosen
>            Priority: Major
>              Labels: bulk-closed
>
> It would be useful to have a mechanism to run a task on all workers in order
> to perform system management tasks, such as purging caches or changing system
> properties. This is useful for automated experiments and benchmarking; I
> don't envision this being used for heavy computation.
>
> Right now, I can mimic this with something like
> {code}
> sc.parallelize(0 until numMachines, numMachines).foreach { _ => /* management task */ }
> {code}
> but this does not guarantee that every worker runs a task and requires my
> user code to know the number of workers.
>
> One sample use case is setup and teardown for benchmark tests. For example,
> I might want to drop cached RDDs, purge shuffle data, and call
> {{System.gc()}} between test runs. It makes sense to incorporate some of
> this functionality, such as dropping cached RDDs, into Spark itself, but it
> might be helpful to have a general mechanism for running ad-hoc tasks like
> {{System.gc()}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
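
For readers who land on this issue, the parallelize workaround quoted above could be sketched a little more fully as follows. This is only a best-effort illustration, not a mechanism Spark provides: the names RunOnExecutorsSketch, runBestEffort, and oversubscribe are hypothetical, and the executor count is estimated from sc.getExecutorMemoryStatus, which is assumed here to include one entry for the driver. As the issue notes, nothing guarantees that every worker runs a task, so the sketch returns the executor IDs that actually ran one, letting the caller check coverage.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

object RunOnExecutorsSketch {

  /** Best effort only: runs `task` once per small partition and returns the
    * IDs of the executors that ran at least one partition. Spark's scheduler
    * decides placement, so some executors may be missed.
    */
  def runBestEffort(sc: SparkContext, oversubscribe: Int = 4)(task: () => Unit): Set[String] = {
    // Assumption: the status map holds one entry per block manager,
    // including the driver's, so subtract one to estimate executors.
    val executorEstimate = math.max(sc.getExecutorMemoryStatus.size - 1, 1)

    // Launch more tasks than executors to raise the odds that each gets one.
    val numTasks = executorEstimate * oversubscribe

    sc.parallelize(0 until numTasks, numTasks)
      .mapPartitions { _ =>
        task()
        Iterator.single(SparkEnv.get.executorId)
      }
      .collect()
      .toSet
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("run-on-executors-sketch")
      .setIfMissing("spark.master", "local[2]") // local fallback for trying it out
    val sc = new SparkContext(conf)
    try {
      val ran = runBestEffort(sc)(() => System.gc())
      println(s"Task ran on executors: ${ran.mkString(", ")}")
    } finally {
      sc.stop()
    }
  }
}
```

Comparing the returned set with the executors expected in the cluster shows whether a given run reached all of them; if not, the caller can retry or raise oversubscribe. The task may run several times on the same executor, so it should be idempotent, as System.gc() or a cache purge would be.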