Repository: spark
Updated Branches:
  refs/heads/master 665545960 -> 1fbe2785d


[SPARK-15255][SQL] limit the length of name for cached DataFrame

## What changes were proposed in this pull request?

We use the tree string of a SparkPlan as the name of a cached DataFrame. That 
string can be very long, which can make the browser unresponsive. This PR 
limits the length of the name to 1024 characters.
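The patch truncates the name with Commons Lang's `StringUtils.abbreviate`, which cuts strings longer than the limit and appends `"..."`. A minimal self-contained sketch of the same behavior (the `abbreviate` helper and `cacheName` below are hypothetical stand-ins for illustration, not the Spark or Commons Lang implementations):

```scala
// Sketch of the truncation applied to the cached RDD's name.
// `abbreviate` mimics org.apache.commons.lang.StringUtils.abbreviate:
// a string longer than maxWidth is cut to (maxWidth - 3) chars plus "...".
object NameLimit {
  def abbreviate(s: String, maxWidth: Int): String =
    if (s.length <= maxWidth) s
    else s.take(maxWidth - 3) + "..."

  // Mirrors the patched logic: an explicit table name is used as-is,
  // otherwise the (possibly huge) plan tree string is truncated to 1024.
  def cacheName(tableName: Option[String], planString: String): String =
    tableName.map(n => s"In-memory table $n")
      .getOrElse(abbreviate(planString, 1024))

  def main(args: Array[String]): Unit = {
    val longPlan = "Project [a, b, c]\n" * 200 // simulated long tree string
    println(cacheName(None, longPlan).length)        // 1024
    println(cacheName(Some("users"), longPlan))      // In-memory table users
  }
}
```

With this cap, the storage tab of the web UI renders a bounded string instead of the full plan tree.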

## How was this patch tested?

Here is how the UI looks right now:

![ui](https://cloud.githubusercontent.com/assets/40902/15163355/d5640f9c-16bc-11e6-8655-809af8a4fed1.png)

Author: Davies Liu <dav...@databricks.com>

Closes #13033 from davies/cache_name.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1fbe2785
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1fbe2785
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1fbe2785

Branch: refs/heads/master
Commit: 1fbe2785dff53a9eae5f13809091de7520a1e1b2
Parents: 6655459
Author: Davies Liu <dav...@databricks.com>
Authored: Tue May 10 22:29:41 2016 -0700
Committer: Reynold Xin <r...@databricks.com>
Committed: Tue May 10 22:29:41 2016 -0700

----------------------------------------------------------------------
 .../spark/sql/execution/columnar/InMemoryTableScanExec.scala   | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/1fbe2785/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
index a36071a..009fbaa 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
@@ -19,6 +19,8 @@ package org.apache.spark.sql.execution.columnar
 
 import scala.collection.mutable.ArrayBuffer
 
+import org.apache.commons.lang.StringUtils
+
 import org.apache.spark.{Accumulable, Accumulator}
 import org.apache.spark.network.util.JavaUtils
 import org.apache.spark.rdd.RDD
@@ -177,7 +179,9 @@ private[sql] case class InMemoryRelation(
       }
     }.persist(storageLevel)
 
-    cached.setName(tableName.map(n => s"In-memory table $n").getOrElse(child.toString))
+    cached.setName(
+      tableName.map(n => s"In-memory table $n")
+        .getOrElse(StringUtils.abbreviate(child.toString, 1024)))
     _cachedColumnBuffers = cached
   }
 

