[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-08 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 @HeartSaVioR I've updated the description in SQLConf.scala. Is there some other documentation that should be updated

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 @HeartSaVioR I added tests for the default case and for a truncated plan.

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 Yes, I'm trying to figure out how to test with a changed SQLConf. Once I do that I'll fix it.

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 Ok, I've updated this PR so that the default behavior does not change - full plan strings are always printed. This should be fully backwards compatible. Plan strings will only

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-11-29 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 If you have an idea of what those use cases are I could take a look and see if there is an impact. If not, we could turn it off by default (set the max length to Long.Max

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 I added changes to QueryExecution in the latest commit to address the UI issue.

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237346220 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1610,6 +1610,12 @@ object SQLConf { "

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237344141 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/SizeLimitedWriter.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237344154 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/SizeLimitedWriter.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237344026 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala --- @@ -595,4 +596,14 @@ class TreeNodeSuite extends

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237343993 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -202,6 +202,26 @@ package object util extends Logging

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 Closing this pull request in favor of #23169

[GitHub] spark pull request #23076: [SPARK-26103][SQL] Added maxDepth to limit the le...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio closed the pull request at: https://github.com/apache/spark/pull/23076

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-11-28 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23169 @MaxGekk and @hvanhovell, this is an alternative solution for #23076. It limits overall plan length when generating the full string in memory, but not if a specific writer is passed

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread DaveDeCaprio
GitHub user DaveDeCaprio opened a pull request: https://github.com/apache/spark/pull/23169 [SPARK-26103][SQL] Limit the length of debug strings for query plans ## What changes were proposed in this pull request? The PR puts in a limit on the size of a debug string generated

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-19 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 Limiting on pure size makes sense. I'd be willing to implement that in a different PR. Now that we are using a writer it should be pretty easy to do that. That was harder in the Spark

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 In any case where we use truncatedString to limit the output columns, I think it also makes sense to limit the plan depth

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 I originally wrote this against 2.3.0 where the toString was always called as part of newExecutionId. I'm not sure if that still happens in master

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 The goal of this is to avoid creation of massive strings if toString gets called on a QueryExecution. In our use we have some large plans that generate plan strings in the hundreds

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 @MaxGekk and @hvanhovell, I think this PR is complementary to the PR for SPARK-26023, which recently allowed for file dumping of queries. It might make sense to only have

[GitHub] spark pull request #23076: [SPARK-26103][SQL] Added maxDepth to limit the le...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23076#discussion_r23462 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -454,8 +455,9 @@ case class InputAdapter(child

[GitHub] spark pull request #23076: [SPARK-26103][SQL] Added maxDepth to limit the le...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23076#discussion_r23348 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -701,3 +725,23 @@ abstract class TreeNode[BaseType

[GitHub] spark pull request #23076: [SPARK-26103][SQL] Added maxDepth to limit the le...

2018-11-18 Thread DaveDeCaprio
Github user DaveDeCaprio commented on a diff in the pull request: https://github.com/apache/spark/pull/23076#discussion_r23270 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -75,7 +78,7 @@ object CurrentOrigin

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-17 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue: https://github.com/apache/spark/pull/23076 This contribution is my original work and I license the work to the project under the project’s open source license

[GitHub] spark pull request #23076: [SPARK-26103][SQL] Added maxDepth to limit the le...

2018-11-17 Thread DaveDeCaprio
GitHub user DaveDeCaprio opened a pull request: https://github.com/apache/spark/pull/23076 [SPARK-26103][SQL] Added maxDepth to limit the length of text plans Nested query plans can get extremely large (hundreds of megabytes). ## What changes were proposed in this pull