Repository: spark
Updated Branches:
  refs/heads/branch-1.0 e3955643d -> 358e7e51c


[SPARK-2088] fix NPE in toString

After deserialization, the transient field creationSiteInfo is not backfilled 
with a default value, yet the toString method, which the serializer invokes, 
expects it to always be non-null. As a result, an NPE is thrown when toString 
is called on a deserialized RDD whose creationSiteInfo is still null.
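
A minimal Scala sketch of the failure mode (illustration only, not Spark code; 
the Demo class, its creationSite field, and roundTrip helper are made up): a 
@transient val is not re-initialized after a Java-serialization round trip, so 
dereferencing it throws an NPE, while wrapping the access in 
Option(...).getOrElse("") returns an empty string instead, mirroring the fix 
in the diff below.

import java.io._

// Illustration only: a @transient val is written out as null and is not
// re-initialized on deserialization, so dereferencing it afterwards NPEs.
class Demo extends Serializable {
  @transient val creationSite: String = "created at Demo.scala:42"

  // Unsafe version (analogous to the old getCreationSite):
  //   override def toString: String = creationSite.toString
  // Guarded version, analogous to the fix in RDD.getCreationSite:
  override def toString: String = Option(creationSite).getOrElse("").toString
}

object Demo {
  // Serialize and deserialize an object with plain Java serialization.
  private def roundTrip[T](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    println(new Demo().toString)             // "created at Demo.scala:42"
    println(roundTrip(new Demo()).toString)  // "" instead of an NPE
  }
}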

Author: Doris Xin <doris.s....@gmail.com>

Closes #1028 from dorx/toStringNPE and squashes the following commits:

f20021e [Doris Xin] unit test for toString after deserialization
6f0a586 [Doris Xin] Merge branch 'master' into toStringNPE
f47fecf [Doris Xin] Merge branch 'master' into toStringNPE
76199c6 [Doris Xin] [SPARK-2088] fix NPE in toString

(cherry picked from commit 83c226d454722d5dea186d48070fb98652d0dafb)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/358e7e51
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/358e7e51
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/358e7e51

Branch: refs/heads/branch-1.0
Commit: 358e7e51cc736223d36071b44b7ff853635fc6e7
Parents: e395564
Author: Doris Xin <doris.s....@gmail.com>
Authored: Thu Jun 12 12:53:07 2014 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Jun 12 12:53:38 2014 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/rdd/RDD.scala      | 2 +-
 core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/358e7e51/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index aa03e92..da4f4f8 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -1135,7 +1135,7 @@ abstract class RDD[T: ClassTag](
 
   /** User code that created this RDD (e.g. `textFile`, `parallelize`). */
   @transient private[spark] val creationSiteInfo = Utils.getCallSiteInfo
-  private[spark] def getCreationSite: String = creationSiteInfo.toString
+  private[spark] def getCreationSite: String = Option(creationSiteInfo).getOrElse("").toString
 
   private[spark] def elementClassTag: ClassTag[T] = classTag[T]
 

http://git-wip-us.apache.org/repos/asf/spark/blob/358e7e51/core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala b/core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
index e686068..fdbed45 100644
--- a/core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
@@ -24,7 +24,7 @@ import org.scalatest.FunSuite
 
 import org.apache.spark._
 import org.apache.spark.SparkContext._
-import org.apache.spark.rdd._
+import org.apache.spark.util.Utils
 
 class RDDSuite extends FunSuite with SharedSparkContext {
 
@@ -66,6 +66,13 @@ class RDDSuite extends FunSuite with SharedSparkContext {
     }
   }
 
+  test("serialization") {
+    val empty = new EmptyRDD[Int](sc)
+    val serial = Utils.serialize(empty)
+    val deserial: EmptyRDD[Int] = Utils.deserialize(serial)
+    assert(!deserial.toString().isEmpty())
+  }
+
   test("countApproxDistinct") {
 
     def error(est: Long, size: Long) = math.abs(est - size) / size.toDouble
