Repository: spark Updated Branches: refs/heads/branch-1.5 1e61ff4ca -> cb7a90ad5
[SPARK-14298][ML][MLLIB] LDA should support disable checkpoint ## What changes were proposed in this pull request? In the doc of [```checkpointInterval```](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala#L241), we told users that they can disable checkpoint by setting ```checkpointInterval = -1```. But we did not handle this situation for LDA actually, we should fix this bug. ## How was this patch tested? Existing tests. cc jkbradley Author: Yanbo Liang <yblia...@gmail.com> Closes #12089 from yanboliang/spark-14298. (cherry picked from commit 56af8e85cca056096fe4e765d8d287e0f9efc0d2) Signed-off-by: Joseph K. Bradley <jos...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cb7a90ad Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cb7a90ad Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cb7a90ad Branch: refs/heads/branch-1.5 Commit: cb7a90ad5539000f7cb8e92b40a1103aad9311a0 Parents: 1e61ff4 Author: Yanbo Liang <yblia...@gmail.com> Authored: Fri Apr 8 11:49:44 2016 -0700 Committer: Joseph K. Bradley <jos...@databricks.com> Committed: Mon Apr 11 16:18:42 2016 -0700 ---------------------------------------------------------------------- .../org/apache/spark/mllib/impl/PeriodicCheckpointer.scala | 6 ++++-- .../apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala | 3 ++- 2 files changed, 6 insertions(+), 3 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/cb7a90ad/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala ---------------------------------------------------------------------- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala index 72d3aab..e316cab 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala @@ -51,7 +51,8 @@ import org.apache.spark.storage.StorageLevel * - This class removes checkpoint files once later Datasets have been checkpointed. * However, references to the older Datasets will still return isCheckpointed = true. * - * @param checkpointInterval Datasets will be checkpointed at this interval + * @param checkpointInterval Datasets will be checkpointed at this interval. + * If this interval was set as -1, then checkpointing will be disabled. * @param sc SparkContext for the Datasets given to this checkpointer * @tparam T Dataset type, such as RDD[Double] */ @@ -88,7 +89,8 @@ private[mllib] abstract class PeriodicCheckpointer[T]( updateCount += 1 // Handle checkpointing (after persisting) - if ((updateCount % checkpointInterval) == 0 && sc.getCheckpointDir.nonEmpty) { + if (checkpointInterval != -1 && (updateCount % checkpointInterval) == 0 + && sc.getCheckpointDir.nonEmpty) { // Add new checkpoint before removing old checkpoints. checkpoint(newData) checkpointQueue.enqueue(newData) http://git-wip-us.apache.org/repos/asf/spark/blob/cb7a90ad/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala ---------------------------------------------------------------------- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala index 11a0595..20db608 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala @@ -69,7 +69,8 @@ import org.apache.spark.storage.StorageLevel * // checkpointed: graph4 * }}} * - * @param checkpointInterval Graphs will be checkpointed at this interval + * @param checkpointInterval Graphs will be checkpointed at this interval. + * If this interval was set as -1, then checkpointing will be disabled. * @tparam VD Vertex descriptor type * @tparam ED Edge descriptor type * --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org