GitHub user mallman commented on the issue:

    https://github.com/apache/spark/pull/19410
  
    Hi @szhem.
    
    I'm sorry I haven't been more responsive here. I can relate to your 
frustration, and I do want to help you make progress on this PR and merge it 
in. I have indeed been busy with other responsibilities, but I can rededicate 
time to reviewing this PR. 
    
    Of all the approaches you've proposed so far, I like the 
`ContextCleaner`-based one the best. Personally, I'm okay with setting 
`spark.cleaner.referenceTracking.cleanCheckpoints` to `true` by default for the 
next major Spark release and documenting this change of behavior in the release 
notes. However, that may not be okay with the senior maintainers. As an 
alternative, I wonder if we could instead create a new config just for graph 
RDD checkpoint cleaning, such as 
`spark.cleaner.referenceTracking.cleanGraphCheckpoints`, and set that to `true` 
by default. `PeriodicGraphCheckpointer` would then read that config value 
instead of `spark.cleaner.referenceTracking.cleanCheckpoints`.
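    
    A rough sketch of that lookup, assuming a plain `Map` stands in for 
`SparkConf` so it runs without Spark. The `cleanGraphCheckpoints` key is only 
the name proposed above and does not exist in Spark, and 
`shouldCleanGraphCheckpoints` is a hypothetical helper, not something 
`PeriodicGraphCheckpointer` has today:

```scala
// Sketch only: a plain Map stands in for SparkConf so this runs without Spark.
object GraphCheckpointCleaning {
  // Proposed key from this comment; it is NOT an existing Spark setting.
  val CleanGraphCheckpointsKey: String =
    "spark.cleaner.referenceTracking.cleanGraphCheckpoints"

  // Hypothetical helper: read the proposed key, defaulting to true when unset.
  def shouldCleanGraphCheckpoints(conf: Map[String, String]): Boolean =
    conf.get(CleanGraphCheckpointsKey).map(_.trim.toBoolean).getOrElse(true)
}
```

    With this shape, users who set nothing would get graph checkpoint cleaning, 
and setting the key to `false` would opt out without touching 
`spark.cleaner.referenceTracking.cleanCheckpoints`.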
    
    Would you be willing to open another PR with your `ContextCleaner`-based 
approach? I'm not suggesting you close this PR. We can treat the two PRs as 
alternative solutions for the same JIRA ticket and cross-reference them. If you 
do that, I will try to debug the problem with the retained checkpoints.
    
    Thank you.

