[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16934

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...
GitHub user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16934#discussion_r101398714

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ---
    @@ -121,3 +121,25 @@ case class ExplainCommand(
           ("Error occurred during query planning: \n" + cause.getMessage).split("\n").map(Row(_))
       }
     }
    +
    +/** An explain command for users to see how a streaming batch is executed. */
    +case class StreamingExplainCommand(
    +    queryExecution: IncrementalExecution,
    +    extended: Boolean) extends RunnableCommand {
    +
    +  override val output: Seq[Attribute] =
    --- End diff --

    Is this required? Just asking because the one above doesn't have it.
[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...
GitHub user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16934#discussion_r101398532

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala ---
    @@ -277,10 +279,22 @@ class StreamSuite extends StreamTest {

       test("explain") {
         val inputData = MemoryStream[String]
    -    val df = inputData.toDS().map(_ + "foo")
    -    // Test `explain` not throwing errors
    -    df.explain()
    -    val q = df.writeStream.queryName("memory_explain").format("memory").start()
    +    val df = inputData.toDS().map(_ + "foo").groupBy("value").agg(count("*"))
    +
    +    // Test `df.explain`
    +    val explain = ExplainCommand(df.queryExecution.logical, extended = false)
    +    val explainString =
    +      spark.sessionState
    +        .executePlan(explain)
    +        .executedPlan
    +        .executeCollect()
    +        .map(_.getString(0))
    +        .mkString("\n")
    +    assert(explainString.contains("StateStoreRestore"))
    --- End diff --

    I would also check that this doesn't have a `LocalTableScan` but has a `StreamingRelation`.
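The check suggested in this comment could be sketched roughly as below. This is only an illustration: `ExplainChecks` and `checkStreamingExplain` are hypothetical names that do not exist in Spark or in `StreamSuite`, and the helper works on any plain string rather than on real `explain` output.

```scala
// Illustrative only: ExplainChecks and checkStreamingExplain are hypothetical
// helpers, not part of Spark or of StreamSuite.
object ExplainChecks {

  /** Returns the problems found in an explain string; an empty result means it passes. */
  def checkStreamingExplain(explainString: String): Seq[String] = {
    // Operators the reviewer expects to see in a streaming plan.
    val mustContain = Seq("StateStoreRestore", "StreamingRelation")
    // Operator that would indicate a per-batch plan was explained instead.
    val mustNotContain = Seq("LocalTableScan")

    val missing =
      mustContain.filterNot(name => explainString.contains(name)).map(name => s"missing $name")
    val unexpected =
      mustNotContain.filter(name => explainString.contains(name)).map(name => s"unexpected $name")

    missing ++ unexpected
  }
}
```

In the test itself this would probably reduce to two extra `assert` calls next to the existing `StateStoreRestore` assertion; collecting the problems into a list just makes a failure message easier to read.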
[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...
GitHub user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16934#discussion_r101387562

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -673,7 +673,7 @@ class StreamExecution(
         if (lastExecution == null) {
           "No physical plan. Waiting for data."
         } else {
    -      val explain = ExplainCommand(lastExecution.logical, extended = extended)
    +      val explain = ExplainCommand(lastExecution.logical, extended = extended, streaming = true)
    --- End diff --

    So this means that this code will always return an updated plan for the last batch showing which data files were read instead of just referring to it as a StreamingRelation. We wouldn't have the bug if we had just used `logicalPlan` instead of `lastExecution.logicalPlan`, right? Then the problem would be that the `logicalPlan` may contain errors though?
[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...
GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/16934

    [SPARK-19603][SS]Fix StreamingQuery explain command

    ## What changes were proposed in this pull request?

    `StreamingQuery.explain` doesn't show the correct streaming physical plan right now because `ExplainCommand` receives a runtime batch plan whose `logicalPlan.isStreaming` is always false. This PR adds a `streaming` parameter to `ExplainCommand` so that `StreamExecution` can specify that it's a streaming plan.

    ## How was this patch tested?

    The updated unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-19603

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16934

commit 3b6c86a5581df4bdb9a94eac095c9c1ee1363f47
Author: Shixiong Zhu
Date:   2017-02-15T01:04:21Z

    Fix StreamingQuery explain command
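To make the problem in the description concrete, here is a toy model of why an explicit flag helps. Everything in it is an assumption made for illustration: `Plan`, `Explain`, and the `(streaming)`/`(batch)` labels are simplified stand-ins, not Spark's classes or its actual explain output, and the real `ExplainCommand` may decide differently.

```scala
// Toy model only: these types are simplified stand-ins, not Spark's classes,
// and the header text produced by run() is invented for this sketch.
object ExplainSketch {

  // A plan that records whether it was built from a streaming source.
  final case class Plan(description: String, isStreaming: Boolean)

  // Stand-in for an explain command that takes an extra `streaming` flag.
  final case class Explain(plan: Plan, extended: Boolean = false, streaming: Boolean = false) {
    def run(): String = {
      // A per-batch plan reports isStreaming = false, so relying on the plan
      // alone would pick the batch path; the caller's flag overrides that.
      val kind = if (streaming || plan.isStreaming) "streaming" else "batch"
      s"== Physical Plan ($kind) ==\n${plan.description}"
    }
  }
}
```

With a plan whose `isStreaming` is false, `Explain(plan).run()` takes the batch path, while `Explain(plan, streaming = true).run()` takes the streaming path, which mirrors the situation the description attributes to `StreamExecution`.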