Repository: spark
Updated Branches:
  refs/heads/master 2404d8e54 -> d8e14db84


[SPARK-18842][TESTS] De-duplicate paths in classpaths in processes for 
local-cluster mode in ReplSuite to work around the length limitation on Windows

## What changes were proposed in this pull request?

The `ReplSuite` tests hang on Windows because the executor command line exceeds 
the Windows length limit. The exception is shown below:

```
Spark context available as 'sc' (master = local-cluster[1,1,1024], app id = app-20161223114000-0000).
Spark session available as 'spark'.
Exception in thread "ExecutorRunner for app-20161223114000-0000/26995" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
        at java.lang.StringBuilder.append(StringBuilder.java:202)
        at java.lang.ProcessImpl.createCommandLine(ProcessImpl.java:194)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:167)
        at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)
```

The hang occurs because the executor launch keeps failing and is retried in an 
infinite loop. The launch fails because the tests build the classpath from URLs 
(via `getFile`), whereas some entries added afterward are normal local paths: 
`url.getFile` gives `/C:/a/b/c`, while the paths added later have the form 
`C:\a\b\c`.

As a result, many classpath entries are duplicated, because normal local paths 
and paths from URLs are mixed. The command line grows to as much as 40K, which 
exceeds the length limit (32K) on Windows.
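
The mismatch described above can be illustrated with a minimal, platform-independent sketch. It is not the code from this patch: `ClasspathDedupSketch` and `normalizeAndDedup` are hypothetical names, and the explicit `distinct` call is an assumption added for illustration (the patch itself only maps each entry through `getAbsolutePath`). The sketch shows that `getFile` on a `file:` URL keeps the leading slash, and that two spellings of the same location collapse into one entry once both are converted to absolute paths.

```scala
import java.io.File
import java.net.URI

object ClasspathDedupSketch {
  // Hypothetical helper: convert every entry to an absolute path so that
  // different spellings of one location compare equal, then drop repeats.
  def normalizeAndDedup(paths: Seq[String]): Seq[String] =
    paths.map(new File(_).getAbsolutePath).distinct

  def main(args: Array[String]): Unit = {
    // getFile returns the URL path as-is, i.e. "/C:/a/b/c" with a leading slash.
    val fromUrl = new URI("file:/C:/a/b/c").toURL.getFile
    println(fromUrl)

    // The same directory written once as a relative and once as an absolute path.
    val relative = "target" + File.separator + "classes"
    val absolute = new File(relative).getAbsolutePath
    val mixed = Seq(relative, absolute, relative)

    val before = mixed.mkString(File.pathSeparator)
    val after = normalizeAndDedup(mixed).mkString(File.pathSeparator)
    println(s"entries: ${mixed.size} -> ${normalizeAndDedup(mixed).size}")
    println(s"length:  ${before.length} -> ${after.length}")
  }
}
```

On Windows, the relevant property is that `new File("/C:/a/b/c").getAbsolutePath` and `new File("C:\\a\\b\\c").getAbsolutePath` are expected to yield the same string, which is what lets duplicate entries be recognised; the relative/absolute pair above stands in for that case so the sketch behaves the same on any platform.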

The full command line built here is - 
https://gist.github.com/HyukjinKwon/46af7946c9a5fd4c6fc70a8a0aba1beb

## How was this patch tested?

Manually via AppVeyor.

**Before**
https://ci.appveyor.com/project/spark-test/spark/build/395-find-path-issues

**After**
https://ci.appveyor.com/project/spark-test/spark/build/398-find-path-issues

Author: hyukjinkwon <gurwls...@gmail.com>

Closes #16398 from HyukjinKwon/SPARK-18842-more.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d8e14db8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d8e14db8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d8e14db8

Branch: refs/heads/master
Commit: d8e14db84f5ea752fbe92036209f67232b4dcc1f
Parents: 2404d8e
Author: hyukjinkwon <gurwls...@gmail.com>
Authored: Tue Dec 27 18:50:54 2016 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue Dec 27 18:50:54 2016 +0000

----------------------------------------------------------------------
 .../src/test/scala/org/apache/spark/repl/ReplSuite.scala           | 2 +-
 .../src/test/scala/org/apache/spark/repl/ReplSuite.scala           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d8e14db8/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
----------------------------------------------------------------------
diff --git a/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala b/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 26b8600..b3688c9 100644
--- a/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -44,7 +44,7 @@ class ReplSuite extends SparkFunSuite {
         }
       }
     }
-    val classpath = paths.mkString(File.pathSeparator)
+    val classpath = paths.map(new File(_).getAbsolutePath).mkString(File.pathSeparator)
 
     val oldExecutorClasspath = System.getProperty(CONF_EXECUTOR_CLASSPATH)
     System.setProperty(CONF_EXECUTOR_CLASSPATH, classpath)

http://git-wip-us.apache.org/repos/asf/spark/blob/d8e14db8/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
----------------------------------------------------------------------
diff --git a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 9262e93..55c9167 100644
--- a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -45,7 +45,7 @@ class ReplSuite extends SparkFunSuite {
         }
       }
     }
-    val classpath = paths.mkString(File.pathSeparator)
+    val classpath = paths.map(new File(_).getAbsolutePath).mkString(File.pathSeparator)
 
     val oldExecutorClasspath = System.getProperty(CONF_EXECUTOR_CLASSPATH)
     System.setProperty(CONF_EXECUTOR_CLASSPATH, classpath)

