GitHub user JeetKunDoug opened a pull request: https://github.com/apache/spark/pull/21322
[SPARK-24225] Support closing AutoClosable objects in MemoryStore

This allows Broadcast Variables to be released properly.

## What changes were proposed in this pull request?

Broadcast variables, while usually used to broadcast data to executors, can also be used to control the scope and lifecycle of shared resources (e.g. connection pools). When creating and destroying those resources within a task is expensive, using a broadcast variable to keep them deserialized in memory for multiple tasks to share can substantially improve the efficiency of a Spark job.

In `MemoryStore`, check whether any values in a `DeserializedMemoryEntry` implement `AutoCloseable` and, if so, call `close` on those resources. This occurs in two places:

- `remove` of an individual item
- `clear` of the MemoryStore

## How was this patch tested?

Added tests to `MemoryStoreSuite` to check that resources are closed and that exceptions are handled properly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JeetKunDoug/spark handle-autoclosable-objects

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21322.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21322

----
commit f254f94fdc5e2648d7c1104bf5ec2355de7c6055
Author: Doug Rohrer <drohrer@...>
Date:   2018-05-14T16:24:00Z

    [SPARK-24225] Support closing AutoClosable objects in MemoryStore so Broadcast Variables can be released properly

----
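
For readers who want a concrete picture of the close-on-`remove` and close-on-`clear` behavior described in this pull request, here is a minimal sketch in Java. It is not the code from the patch: Spark's `MemoryStore` is written in Scala, and the class `CloseableStoreSketch`, its `put` method, and the choice to log and continue when `close` throws are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a store of deserialized values keyed by block id.
// Not Spark's MemoryStore; it only illustrates the closing behavior.
class CloseableStoreSketch {
    private final Map<String, Object[]> entries = new LinkedHashMap<>();

    // Stores the deserialized values for one block.
    void put(String blockId, Object... values) {
        entries.put(blockId, values);
    }

    // Removes one entry and closes any AutoCloseable values it held.
    boolean remove(String blockId) {
        Object[] values = entries.remove(blockId);
        if (values == null) {
            return false;
        }
        closeAll(values);
        return true;
    }

    // Closes AutoCloseable values in every entry, then empties the store.
    void clear() {
        for (Object[] values : entries.values()) {
            closeAll(values);
        }
        entries.clear();
    }

    private static void closeAll(Object[] values) {
        for (Object value : values) {
            if (value instanceof AutoCloseable) {
                try {
                    ((AutoCloseable) value).close();
                } catch (Exception e) {
                    // Assumption: a failing close() is reported and does not
                    // prevent the remaining values from being closed.
                    System.err.println("Failed to close " + value + ": " + e);
                }
            }
        }
    }
}
```

In this sketch, one resource that throws from `close` does not stop the others from being released. Whether the actual patch logs, rethrows, or aggregates such exceptions is not stated in the description above.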