GitHub user JeetKunDoug opened a pull request:

    https://github.com/apache/spark/pull/21322

    [SPARK-24225] Support closing AutoClosable objects in MemoryStore

    This allows Broadcast Variables to be released properly
    
    ## What changes were proposed in this pull request?
    
    Broadcast variables, while usually used to broadcast data to executors, can 
also be used to control the scope and lifecycle of shared resources (e.g. 
connection pools). When creating and destroying those resources within a task 
is expensive, using a broadcast variable to keep them deserialized in memory 
for multiple tasks to share can make a huge difference in the efficiency of a 
Spark job.
    
    In `MemoryStore`, check whether any entries in a `DeserializedMemoryEntry` 
implement `AutoCloseable` and, if so, call `close` on those resources. This 
happens in two places:
    - `remove` of an individual item
    - `clear` of the MemoryStore
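
    The behavior described above could look roughly like the following sketch. It is written in plain Java against `java.lang.AutoCloseable` and is not the code in this patch: `SketchMemoryStore`, its `put` method, and the `closeFailures` list are hypothetical names used only for illustration, and collecting exceptions instead of rethrowing them is an assumption about how failures might be handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a memory store; not Spark's actual MemoryStore. */
class SketchMemoryStore {
    // Block id -> deserialized values held for that block.
    private final Map<String, Object[]> entries = new LinkedHashMap<>();
    // Assumption: failures from close() are recorded here rather than rethrown.
    private final List<Exception> closeFailures = new ArrayList<>();

    void put(String blockId, Object... values) {
        entries.put(blockId, values);
    }

    /** Removes one block and closes any AutoCloseable values it held. */
    boolean remove(String blockId) {
        Object[] values = entries.remove(blockId);
        if (values == null) {
            return false;
        }
        closeAll(values);
        return true;
    }

    /** Drops every block, closing AutoCloseable values along the way. */
    void clear() {
        for (Object[] values : entries.values()) {
            closeAll(values);
        }
        entries.clear();
    }

    List<Exception> closeFailures() {
        return closeFailures;
    }

    private void closeAll(Object[] values) {
        for (Object value : values) {
            if (value instanceof AutoCloseable) {
                try {
                    ((AutoCloseable) value).close();
                } catch (Exception e) {
                    // One failing resource should not stop the rest from closing.
                    closeFailures.add(e);
                }
            }
        }
    }
}
```

    Values that do not implement `AutoCloseable` are simply dropped, so existing callers that store ordinary data would see no change in behavior under this sketch.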
    
    ## How was this patch tested?
    
    Added tests to `MemoryStoreSuite` to check that resources are closed 
and that exceptions are handled properly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JeetKunDoug/spark handle-autoclosable-objects

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21322.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21322
    
----
commit f254f94fdc5e2648d7c1104bf5ec2355de7c6055
Author: Doug Rohrer <drohrer@...>
Date:   2018-05-14T16:24:00Z

    [SPARK-24225] Support closing AutoClosable objects in MemoryStore so 
Broadcast Variables can be released properly

----


