[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19687


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/19687#discussion_r149802015
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 ---
@@ -441,4 +443,28 @@ class ExpressionEncoderSuite extends PlanTest with 
AnalysisTest {
   }
 }
   }
+
+  /**
+   * Verify the size of scala.reflect.runtime.JavaUniverse.undoLog before 
and after `func` to
+   * ensure we don't leak Scala reflection garbage.
+   *
+   * @see 
org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects
+   */
+  private def verifyNotLeakingReflectionObjects[T](func: => T): T = {
+def undoLogSize: Int = {
+  import scala.reflect.runtime.{JavaUniverse, universe}
--- End diff --

No special reason. I changed to use the fully qualified class name now.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/19687#discussion_r149765467
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 ---
@@ -370,7 +372,7 @@ class ExpressionEncoderSuite extends PlanTest with 
AnalysisTest {
   private def encodeDecodeTest[T : ExpressionEncoder](
   input: T,
   testName: String): Unit = {
-test(s"encode/decode for $testName: $input") {
+testAndVerifyNotLeakingReflectionObjects(s"encode/decode for 
$testName: $input") {
   val encoder = implicitly[ExpressionEncoder[T]]
--- End diff --

This is to verify that `UnresolvedMapObjects` doesn't leak. ScalaReflection 
creates a function here but the function runs after creating the encoder: 
https://github.com/apache/spark/blob/6327ea570bf542983081c5d1d3ee7e6123365c8f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L280
 





---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19687#discussion_r149632280
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 ---
@@ -370,7 +372,7 @@ class ExpressionEncoderSuite extends PlanTest with 
AnalysisTest {
   private def encodeDecodeTest[T : ExpressionEncoder](
   input: T,
   testName: String): Unit = {
-test(s"encode/decode for $testName: $input") {
+testAndVerifyNotLeakingReflectionObjects(s"encode/decode for 
$testName: $input") {
   val encoder = implicitly[ExpressionEncoder[T]]
--- End diff --

here we will verify the memory leak, seems no need to create 
`testAndVerifyNotLeakingReflectionObjects`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-08 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19687#discussion_r149630182
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 ---
@@ -441,4 +443,28 @@ class ExpressionEncoderSuite extends PlanTest with 
AnalysisTest {
   }
 }
   }
+
+  /**
+   * Verify the size of scala.reflect.runtime.JavaUniverse.undoLog before 
and after `func` to
+   * ensure we don't leak Scala reflection garbage.
+   *
+   * @see 
org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects
+   */
+  private def verifyNotLeakingReflectionObjects[T](func: => T): T = {
+def undoLogSize: Int = {
+  import scala.reflect.runtime.{JavaUniverse, universe}
+  universe.asInstanceOf[JavaUniverse].undoLog.log.size
+}
+
+val previousUndoLogSize = undoLogSize
+val r = func
+assert(previousUndoLogSize == undoLogSize)
+r
+  }
+
+  private def testAndVerifyNotLeakingReflectionObjects(testName: 
String)(testFun: => Any) {
+test(testName) {
+  verifyNotLeakingReflectionObjects(testFun)
--- End diff --

Do we have to call `verifyNotLeakingReflectionObjects` twice? One is to 
call `implicit`. The other is here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-08 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19687#discussion_r149627183
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 ---
@@ -441,4 +443,28 @@ class ExpressionEncoderSuite extends PlanTest with 
AnalysisTest {
   }
 }
   }
+
+  /**
+   * Verify the size of scala.reflect.runtime.JavaUniverse.undoLog before 
and after `func` to
+   * ensure we don't leak Scala reflection garbage.
+   *
+   * @see 
org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects
+   */
+  private def verifyNotLeakingReflectionObjects[T](func: => T): T = {
+def undoLogSize: Int = {
+  import scala.reflect.runtime.{JavaUniverse, universe}
--- End diff --

Just out of curiosity why import here, and for one usage in the line below?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...

2017-11-07 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/19687

[SPARK-19644][SQL]Clean up Scala reflection garbage after creating Encoder

## What changes were proposed in this pull request?

Because of the memory leak issue in `scala.reflect.api.Types.TypeApi.<:<` 
(https://github.com/scala/bug/issues/8302), creating an encoder may leak memory.

This PR adds `cleanUpReflectionObjects` to clean up these leaking objects 
for methods calling `scala.reflect.api.Types.TypeApi.<:<`.

## How was this patch tested?

The updated unit tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-19644

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19687.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19687


commit c03811ff006058987fa8d5fb9f7d097b9acc9ac5
Author: Shixiong Zhu 
Date:   2017-11-08T00:33:55Z

Clean up Scala reflection garbage after creating Encoder




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org