spark git commit: [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

2017-05-09 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 272d2a10d -> 08e1b78f0


[SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

`ReplSuite.newProductSeqEncoder with REPL defined class` was flaky and frequently 
threw OOM exceptions. By analyzing the heap dump, we found the cause: each test 
case in `ReplSuite` creates its own REPL instance, which in turn creates a 
classloader and loads a lot of classes related to `SparkContext`. For more 
details, see https://github.com/apache/spark/pull/17833#issuecomment-298711435.

In this PR, we create a new test suite, `SingletonReplSuite`, which shares one 
REPL instance among all the test cases. We then move most of the tests from 
`ReplSuite` to `SingletonReplSuite`, to avoid creating many REPL instances and 
reduce the memory footprint.
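
The pattern is the standard ScalaTest one-instance-per-suite setup. Below is a 
minimal sketch of the idea; it is illustrative only, with a hypothetical `Repl` 
class standing in for the real interpreter, and is not the actual 
`SingletonReplSuite` code:

```scala
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// Hypothetical stand-in for the expensive REPL resource.
class Repl {
  def run(code: String): String = s"evaluated: $code"
  def close(): Unit = ()
}

class SingletonStyleSuite extends FunSuite with BeforeAndAfterAll {
  private var repl: Repl = _

  override def beforeAll(): Unit = {
    super.beforeAll()
    repl = new Repl // created once; one classloader shared by all tests
  }

  override def afterAll(): Unit = {
    try repl.close()
    finally super.afterAll()
  }

  test("first case reuses the shared REPL") {
    assert(repl.run("1 + 1").nonEmpty)
  }

  test("second case reuses the same instance") {
    assert(repl.run("println(42)").nonEmpty)
  }
}
```

Sharing the instance trades test isolation for memory: state can leak between 
cases, so the suite's tests must not depend on a fresh interpreter.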

Test-only change.

Author: Wenchen Fan 

Closes #17844 from cloud-fan/flaky-test.

(cherry picked from commit f561a76b2f895dea52f228a9376948242c3331ad)
Signed-off-by: Wenchen Fan 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/08e1b78f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/08e1b78f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/08e1b78f

Branch: refs/heads/branch-2.2
Commit: 08e1b78f01955c7151d9e984d392d45deced6e34
Parents: 272d2a1
Author: Wenchen Fan 
Authored: Wed May 10 00:09:35 2017 +0800
Committer: Wenchen Fan 
Committed: Wed May 10 00:11:25 2017 +0800

--
 .../main/scala/org/apache/spark/repl/Main.scala |   2 +-
 .../org/apache/spark/repl/SparkILoop.scala  |   9 +-
 .../scala/org/apache/spark/repl/ReplSuite.scala | 271 +---
 .../apache/spark/repl/SingletonReplSuite.scala  | 408 +++
 4 files changed, 412 insertions(+), 278 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/08e1b78f/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
--
diff --git a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
index 39fc621..b8b38e8 100644
--- a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
+++ b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
@@ -68,7 +68,7 @@ object Main extends Logging {
 
     if (!hasErrors) {
       interp.process(settings) // Repl starts and goes in loop of R.E.P.L
-      Option(sparkContext).map(_.stop)
+      Option(sparkContext).foreach(_.stop)
     }
   }
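
The one-line change above is a small idiom fix: both `map` and `foreach` invoke 
`stop` when the `Option` is non-empty, but `map` allocates a discarded 
`Option[Unit]` and suggests a transformation rather than a side effect. A 
self-contained illustration:

```scala
val ctx: Option[String] = Some("sc")

// Idiomatic: run a side effect, return Unit.
ctx.foreach(name => println(s"stopping $name"))

// Works, but builds an Option[Unit] that is immediately thrown away.
val discarded: Option[Unit] = ctx.map(name => println(s"stopping $name"))
```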
 

http://git-wip-us.apache.org/repos/asf/spark/blob/08e1b78f/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
--
diff --git a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
index 76a66c1..d1d25b7 100644
--- a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
+++ b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
@@ -86,15 +86,8 @@ class SparkILoop(in0: Option[BufferedReader], out: JPrintWriter)
     echo("Type :help for more information.")
   }
 
-  /** Add repl commands that needs to be blocked. e.g. reset */
-  private val blockedCommands = Set[String]()
-
-  /** Standard commands */
-  lazy val sparkStandardCommands: List[SparkILoop.this.LoopCommand] =
-    standardCommands.filter(cmd => !blockedCommands(cmd.name))
-
   /** Available commands */
-  override def commands: List[LoopCommand] = sparkStandardCommands
+  override def commands: List[LoopCommand] = standardCommands
 
   /**
    * We override `loadFiles` because we need to initialize Spark *before* the REPL
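
The deleted `blockedCommands` set was always empty, so filtering 
`standardCommands` against it was a no-op and the override can return 
`standardCommands` directly. A quick demonstration of the equivalence, using 
placeholder command names:

```scala
val standardCommands = List("help", "imports", "reset", "quit")
val blockedCommands = Set.empty[String]

// Filtering against an empty blocklist keeps every element.
assert(standardCommands.filter(cmd => !blockedCommands(cmd)) == standardCommands)
```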

http://git-wip-us.apache.org/repos/asf/spark/blob/08e1b78f/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
--
diff --git a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 121a02a..c7ae194 100644
--- a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -21,12 +21,12 @@ import java.io._
 import java.net.URLClassLoader
 
 import scala.collection.mutable.ArrayBuffer
-import org.apache.commons.lang3.StringEscapeUtils
+
 import org.apache.log4j.{Level, LogManager}
+
 import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
-import org.apache.spark.util.Utils
 
 class ReplSuite extends SparkFunSuite {

spark git commit: [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

2017-05-09 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 181261a81 -> f561a76b2


[SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

## What changes were proposed in this pull request?

`ReplSuite.newProductSeqEncoder with REPL defined class` was flaky and frequently 
threw OOM exceptions. By analyzing the heap dump, we found the cause: each test 
case in `ReplSuite` creates its own REPL instance, which in turn creates a 
classloader and loads a lot of classes related to `SparkContext`. For more 
details, see https://github.com/apache/spark/pull/17833#issuecomment-298711435.

In this PR, we create a new test suite, `SingletonReplSuite`, which shares one 
REPL instance among all the test cases. We then move most of the tests from 
`ReplSuite` to `SingletonReplSuite`, to avoid creating many REPL instances and 
reduce the memory footprint.
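
One way a single long-lived REPL can be driven from many test cases is through 
piped streams: each test writes source lines to the interpreter's stdin and 
asserts on the captured output. The sketch below shows only that plumbing, with 
a simple echo thread standing in for the real interpreter; the mechanism is an 
assumption for illustration, not the actual `SingletonReplSuite` code:

```scala
import java.io.{BufferedReader, InputStreamReader, PipedInputStream, PipedOutputStream, PrintWriter}

// Tests share this writer; every command goes to the same consumer.
val pipeOut = new PipedOutputStream()
val pipeIn = new BufferedReader(new InputStreamReader(new PipedInputStream(pipeOut)))
val input = new PrintWriter(pipeOut, true) // autoflush so lines arrive promptly

// Stand-in for the interpreter loop, reading commands line by line.
val consumer = new Thread(new Runnable {
  override def run(): Unit = {
    var line = pipeIn.readLine()
    while (line != null) {
      println(s"repl> $line")
      line = pipeIn.readLine()
    }
  }
})
consumer.setDaemon(true)
consumer.start()

input.println("val x = 1 + 1") // what an individual test case would send
```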

## How was this patch tested?

Test-only change.

Author: Wenchen Fan 

Closes #17844 from cloud-fan/flaky-test.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f561a76b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f561a76b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f561a76b

Branch: refs/heads/master
Commit: f561a76b2f895dea52f228a9376948242c3331ad
Parents: 181261a
Author: Wenchen Fan 
Authored: Wed May 10 00:09:35 2017 +0800
Committer: Wenchen Fan 
Committed: Wed May 10 00:09:35 2017 +0800

--
 .../main/scala/org/apache/spark/repl/Main.scala |   2 +-
 .../org/apache/spark/repl/SparkILoop.scala  |   9 +-
 .../scala/org/apache/spark/repl/ReplSuite.scala | 272 +
 .../apache/spark/repl/SingletonReplSuite.scala  | 408 +++
 4 files changed, 412 insertions(+), 279 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f561a76b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
--
diff --git a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
index 39fc621..b8b38e8 100644
--- a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
+++ b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala
@@ -68,7 +68,7 @@ object Main extends Logging {
 
     if (!hasErrors) {
       interp.process(settings) // Repl starts and goes in loop of R.E.P.L
-      Option(sparkContext).map(_.stop)
+      Option(sparkContext).foreach(_.stop)
     }
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f561a76b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
--
diff --git a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
index 76a66c1..d1d25b7 100644
--- a/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
+++ b/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala
@@ -86,15 +86,8 @@ class SparkILoop(in0: Option[BufferedReader], out: JPrintWriter)
     echo("Type :help for more information.")
   }
 
-  /** Add repl commands that needs to be blocked. e.g. reset */
-  private val blockedCommands = Set[String]()
-
-  /** Standard commands */
-  lazy val sparkStandardCommands: List[SparkILoop.this.LoopCommand] =
-    standardCommands.filter(cmd => !blockedCommands(cmd.name))
-
   /** Available commands */
-  override def commands: List[LoopCommand] = sparkStandardCommands
+  override def commands: List[LoopCommand] = standardCommands
 
   /**
    * We override `loadFiles` because we need to initialize Spark *before* the REPL

http://git-wip-us.apache.org/repos/asf/spark/blob/f561a76b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
--
diff --git a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 8fe2708..c7ae194 100644
--- a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -21,12 +21,12 @@ import java.io._
 import java.net.URLClassLoader
 
 import scala.collection.mutable.ArrayBuffer
-import org.apache.commons.lang3.StringEscapeUtils
+
 import org.apache.log4j.{Level, LogManager}
+
 import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
-import org.apache.spark.util.Utils
 
 class ReplSuite extends SparkFunSuite {
 
@@ -148,71 +148,6 @@ class