GitHub user szhem opened a pull request:

    https://github.com/apache/spark/pull/19410

    [SPARK-22184][CORE][GRAPHX] GraphX fails in case of insufficient memory and 
checkpoints enabled

    ## What changes were proposed in this pull request?
    
    Fix for the [SPARK-22184](https://issues.apache.org/jira/browse/SPARK-22184) 
JIRA issue. This PR also includes the changes from the related #19373.
    
    When checkpointing is enabled, GraphX jobs can fail with 
`FileNotFoundException`.
    
    The failure occurs either during Pregel iterations or after Pregel 
completes, and only when memory is insufficient: checkpointed RDDs are evicted 
from memory and must be re-read from checkpoint files that have already been 
removed from disk.
    
    This PR proposes to preserve, during the iterations, all the checkpoints 
that the latest checkpoints of `messages` and `graph` depend on and, at the end 
of the Pregel iterations, all the checkpoints of `messages` and `graph` that 
the resulting `graph` depends on.
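
The retention rule described above can be illustrated with a small standalone model. This is only a sketch, not the code in this PR and not Spark's API: the `Checkpoint` class, `required_checkpoints`, `cleanup`, and the checkpoint names are all hypothetical, and the dependency graph is made up for illustration. It shows the idea that cleanup keeps the latest checkpoint plus everything it transitively depends on, and removes only the rest.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Checkpoint:
    """Hypothetical stand-in for a checkpointed RDD (not a Spark class)."""
    name: str
    # Checkpoints whose files may still be read when this one is recomputed.
    parents: List["Checkpoint"] = field(default_factory=list)


def required_checkpoints(latest: Checkpoint) -> Set[str]:
    """Names of `latest` and every checkpoint it transitively depends on."""
    required: Set[str] = set()
    stack = [latest]
    while stack:
        cp = stack.pop()
        if cp.name not in required:
            required.add(cp.name)
            stack.extend(cp.parents)
    return required


def cleanup(on_disk: Set[str], latest: Checkpoint) -> Set[str]:
    """Return the checkpoint files that survive cleanup.

    Only files that `latest` does not depend on are dropped.
    """
    return on_disk & required_checkpoints(latest)


# Illustrative dependency graph (assumed, not taken from the PR).
g0 = Checkpoint("graph-0")
g1 = Checkpoint("graph-1")
m1 = Checkpoint("messages-1", [g1])
g2 = Checkpoint("graph-2", [g1, m1])

on_disk = {"graph-0", "graph-1", "messages-1", "graph-2"}
kept = cleanup(on_disk, g2)
```

In this model `graph-0` is the only file removed, because nothing reachable from `graph-2` refers to it. Removing `graph-1` or `messages-1` as well would correspond to the early removal that this PR aims to prevent.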
    
    ## How was this patch tested?
    
    Unit tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/szhem/spark SPARK-22184-graphx-early-checkpoints-removal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19410
    
----
commit f2386b61a47abf19b8ca6cea7e0e5c7da9baf7d6
Author: Sergey Zhemzhitsky <szhemzhit...@gmail.com>
Date:   2017-09-27T21:33:18Z

    [SPARK-22150][CORE] preventing too early removal of checkpoints in case of 
dependant RDDs

commit aa2bedae74999694b0a9992986e85d3f9feab5f6
Author: Sergey Zhemzhitsky <szhemzhit...@gmail.com>
Date:   2017-10-02T13:10:48Z

    [SPARK-22150][CORE] checking whether two checkpoints have the same 
checkpointed RDD as their parent to prevent early removal

commit 6406aea3bc87c1f3a9460bbc2ae1af67d7c0c294
Author: Sergey Zhemzhitsky <szhemzhit...@gmail.com>
Date:   2017-10-02T13:22:19Z

    [SPARK-22150][CORE] respecting scala style settings

commit 4a55cda79e61e7eec67ae9545beb0c38eca7b11b
Author: Sergey Zhemzhitsky <szhemzhit...@gmail.com>
Date:   2017-10-02T14:43:27Z

    [SPARK-22184][CORE][GRAPHX] retain all the checkpoints the last one depends 
on

----


