Repository: spark
Updated Branches:
  refs/heads/master 502476e45 -> ea4aab7e8


[SPARK-12440][CORE] Avoid setCheckpoint warning when directory is not local

In the SparkContext method `setCheckpointDir`, a warning is issued when the 
Spark master is not local and the directory passed as the checkpoint dir 
appears to be local.

In practice, when relying on an HDFS configuration file and using a relative 
path for the checkpoint directory (an incomplete URI without the HDFS scheme, 
...), this warning should not be issued and can be confusing.
In this case, the checkpoint directory is successfully created and the 
checkpointing mechanism works as expected.
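
To illustrate why a relative path can look local, here is a minimal sketch of a
scheme-based check. It is a simplified stand-in written for this note, not
Spark's actual `Utils.nonLocalPaths`; the names `SchemeCheckSketch` and
`looksNonLocal` and the host `namenode` are hypothetical.

```scala
import java.net.URI

object SchemeCheckSketch {
  // Assumption: a path counts as "non-local" only when its URI carries a
  // scheme other than "file" or "local". A path without a scheme is
  // therefore treated as local, whatever the Hadoop default file system is.
  def looksNonLocal(path: String): Boolean =
    Option(new URI(path).getScheme) match {
      case None | Some("file") | Some("local") => false
      case Some(_) => true
    }

  def main(args: Array[String]): Unit = {
    println(looksNonLocal("hdfs://namenode:8020/checkpoints")) // true
    println(looksNonLocal("checkpoints/job1"))                 // false
  }
}
```

Under this kind of check, `checkpoints/job1` is reported as local even when the
cluster configuration would resolve it against HDFS.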

This PR uses the `FileSystem` instance created with the given directory and 
checks whether it is local.
(The rationale is that this same `FileSystem` instance is used to create the 
checkpoint dir anyway, so it can be reliably used to determine whether the 
directory is local.)
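
The idea described above could be sketched roughly as follows, assuming Hadoop's
`Path.getFileSystem` and `LocalFileSystem` are on the classpath. This is an
illustration only and is not taken from the diff in this message; the names
`FileSystemCheckSketch` and `resolvesToLocalFs` are hypothetical.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{LocalFileSystem, Path}

object FileSystemCheckSketch {
  // Resolve the directory through the Hadoop configuration, so that a path
  // without a scheme is interpreted against the configured default file
  // system, then test whether the resulting FileSystem is the local one.
  def resolvesToLocalFs(directory: String, hadoopConf: Configuration): Boolean = {
    val fs = new Path(directory).getFileSystem(hadoopConf)
    fs.isInstanceOf[LocalFileSystem]
  }
}
```

With a configuration whose default file system is `file:///`, a relative path
resolves to the local file system; with an HDFS default it would not, which is
the case the scheme-based check cannot distinguish.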

The warning is issued only if the directory's `FileSystem` is local, in 
addition to the existing conditions.

Author: pierre-borckmans <pierre.borckm...@realimpactanalytics.com>

Closes #10392 from pierre-borckmans/SPARK-12440_CheckpointDir_Warning_NonLocal.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ea4aab7e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ea4aab7e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ea4aab7e

Branch: refs/heads/master
Commit: ea4aab7e87fbcf9ac90f93af79cc892b56508aa0
Parents: 502476e
Author: pierre-borckmans <pierre.borckm...@realimpactanalytics.com>
Authored: Thu Dec 24 13:48:21 2015 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Thu Dec 24 13:48:21 2015 +0000

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/SparkContext.scala | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ea4aab7e/core/src/main/scala/org/apache/spark/SparkContext.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 67230f4..d506782 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -2073,8 +2073,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     // its own local file system, which is incorrect because the checkpoint files
     // are actually on the executor machines.
     if (!isLocal && Utils.nonLocalPaths(directory).isEmpty) {
-      logWarning("Checkpoint directory must be non-local " +
-        "if Spark is running on a cluster: " + directory)
+      logWarning("Spark is not running in local mode, therefore the checkpoint directory " +
+        s"must not be on the local filesystem. Directory '$directory' " +
+        "appears to be on the local filesystem.")
     }
 
     checkpointDir = Option(directory).map { dir =>


