phaniarnab opened a new pull request, #1756: URL: https://github.com/apache/systemds/pull/1756
This patch adds the compiler flags and runtime support to checkpoint any Spark instruction that is marked for caching. During postprocessing of a marked instruction, we first persist the RDD in place and then store it in the local Lineage cache for reuse. This patch also fixes a bug introduced in the previous commit, which unpersisted the locally cached RDDs during rmvar. Future commits will add rewrites that mark Spark instructions for caching in a cost-based manner. Hyperparameter tuning of LmDS with 2.5k columns improves by 22x when the cpmm results are cached in the executors.
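For readers unfamiliar with the flow, here is a minimal, dependency-free sketch of the two behaviours described in this patch: persisting and caching a marked instruction's output during postprocessing, and leaving a cached RDD persisted during rmvar. It is a toy model, not the SystemDS code. `FakeRdd`, `LineageCache`, `postprocess`, `rmvar` and the lineage key string are all hypothetical names chosen for illustration, and `FakeRdd` only mimics the persisted/unpersisted state of a real Spark RDD.

```java
import java.util.HashMap;
import java.util.Map;

public class CheckpointSketch {
    // Hypothetical stand-in for a Spark RDD handle; it only tracks persisted state.
    static class FakeRdd {
        final String name;
        boolean persisted = false;
        FakeRdd(String name) { this.name = name; }
        FakeRdd persist() { persisted = true; return this; }
        FakeRdd unpersist() { persisted = false; return this; }
    }

    // Hypothetical local lineage cache: maps a lineage key to the cached RDD handle.
    static class LineageCache {
        private final Map<String, FakeRdd> entries = new HashMap<>();
        void put(String lineage, FakeRdd rdd) { entries.put(lineage, rdd); }
        FakeRdd probe(String lineage) { return entries.get(lineage); }
        boolean holds(FakeRdd rdd) { return entries.containsValue(rdd); }
    }

    // Postprocessing of an instruction: if it is marked for caching,
    // persist the output in place, then register it for reuse.
    static void postprocess(LineageCache cache, String lineage,
                            FakeRdd out, boolean markedForCaching) {
        if (!markedForCaching)
            return;
        out.persist();
        cache.put(lineage, out);
    }

    // Variable removal: only unpersist RDDs that the lineage cache does not hold
    // (assumed here to be the behaviour the bug fix restores).
    static void rmvar(LineageCache cache, FakeRdd rdd) {
        if (!cache.holds(rdd))
            rdd.unpersist();
    }

    public static void main(String[] args) {
        LineageCache cache = new LineageCache();
        FakeRdd cpmmOut = new FakeRdd("cpmm_out");
        postprocess(cache, "cpmm(X,y)", cpmmOut, true);
        rmvar(cache, cpmmOut);
        System.out.println("still persisted after rmvar: " + cpmmOut.persisted);
        System.out.println("reused from cache: " + (cache.probe("cpmm(X,y)") == cpmmOut));
    }
}
```

In this sketch, a later instruction with the same lineage key would get the persisted handle back from `probe` instead of recomputing it, which is the reuse that the hyperparameter tuning result relies on.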
