[GitHub] spark pull request #23018: [SPARK-26023][SQL] Dumping truncated plans and ge...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23018 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23018: [SPARK-26023][SQL] Dumping truncated plans and ge...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/23018#discussion_r232960406 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -469,7 +471,21 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { def treeString: String = treeString(verbose = true) def treeString(verbose: Boolean, addSuffix: Boolean = false): String = { -generateTreeString(0, Nil, new StringBuilder, verbose = verbose, addSuffix = addSuffix).toString +val writer = new StringBuilderWriter() +try { + treeString(writer, verbose, addSuffix, None) + writer.toString +} finally { + writer.close() +} + } + + def treeString( + writer: Writer, + verbose: Boolean, + addSuffix: Boolean, + maxFields: Option[Int]): Unit = { +generateTreeString(0, Nil, writer, verbose, "", addSuffix) --- End diff -- I would prefer to avoid overcomplicating the PR again, frankly speaking. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23018: [SPARK-26023][SQL] Dumping truncated plans and ge...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23018#discussion_r232883084 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -469,7 +471,21 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { def treeString: String = treeString(verbose = true) def treeString(verbose: Boolean, addSuffix: Boolean = false): String = { -generateTreeString(0, Nil, new StringBuilder, verbose = verbose, addSuffix = addSuffix).toString +val writer = new StringBuilderWriter() +try { + treeString(writer, verbose, addSuffix, None) + writer.toString +} finally { + writer.close() +} + } + + def treeString( + writer: Writer, + verbose: Boolean, + addSuffix: Boolean, + maxFields: Option[Int]): Unit = { +generateTreeString(0, Nil, writer, verbose, "", addSuffix) --- End diff -- If #22879 is merged first, we should add that function here. If this one is merged first, that PR better have the function. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23018: [SPARK-26023][SQL] Dumping truncated plans and ge...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/23018#discussion_r232871356 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -469,7 +471,21 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { def treeString: String = treeString(verbose = true) def treeString(verbose: Boolean, addSuffix: Boolean = false): String = { -generateTreeString(0, Nil, new StringBuilder, verbose = verbose, addSuffix = addSuffix).toString +val writer = new StringBuilderWriter() +try { + treeString(writer, verbose, addSuffix, None) + writer.toString +} finally { + writer.close() +} + } + + def treeString( + writer: Writer, + verbose: Boolean, + addSuffix: Boolean, + maxFields: Option[Int]): Unit = { +generateTreeString(0, Nil, writer, verbose, "", addSuffix) --- End diff -- How about add another function only save `nodeName`? I'll use it in another PR: https://github.com/apache/spark/pull/22879 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23018: [SPARK-26023][SQL] Dumping truncated plans and ge...
GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/23018 [SPARK-26023][SQL] Dumping truncated plans and generated code to a file ## What changes were proposed in this pull request? In the PR, I propose new method for debugging queries by dumping info about their execution to a file. It saves logical, optimized and physical plan similar to the `explain()` method + generated code. One of the advantages of the method over `explain` is it does not materializes full output as one string in memory which can cause OOMs. ## How was this patch tested? Added a few tests to `QueryExecutionSuite` to check positive and negative scenarios. You can merge this pull request into a Git repository by running: $ git pull https://github.com/MaxGekk/spark-1 truncated-plan-to-file Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23018.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23018 commit 090e8c25969f8e51d71c135c80f8e43f5c5101f9 Author: Maxim Gekk Date: 2018-11-12T21:28:45Z Dumping truncated plans to a file --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org