[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...

2017-02-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16934


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...

2017-02-15 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16934#discussion_r101398714
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ---
@@ -121,3 +121,25 @@ case class ExplainCommand(
       ("Error occurred during query planning: \n" + cause.getMessage).split("\n").map(Row(_))
   }
 }
+
+/** An explain command for users to see how a streaming batch is executed. */
+case class StreamingExplainCommand(
+    queryExecution: IncrementalExecution,
+    extended: Boolean) extends RunnableCommand {
+
+  override val output: Seq[Attribute] =
--- End diff --

is this required? Just asking because the one above doesn't have it.





[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...

2017-02-15 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16934#discussion_r101398532
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala ---
@@ -277,10 +279,22 @@ class StreamSuite extends StreamTest {
 
   test("explain") {
     val inputData = MemoryStream[String]
-    val df = inputData.toDS().map(_ + "foo")
-    // Test `explain` not throwing errors
-    df.explain()
-    val q = df.writeStream.queryName("memory_explain").format("memory").start()
+    val df = inputData.toDS().map(_ + "foo").groupBy("value").agg(count("*"))
+
+    // Test `df.explain`
+    val explain = ExplainCommand(df.queryExecution.logical, extended = false)
+    val explainString =
+      spark.sessionState
+        .executePlan(explain)
+        .executedPlan
+        .executeCollect()
+        .map(_.getString(0))
+        .mkString("\n")
+    assert(explainString.contains("StateStoreRestore"))
--- End diff --

I would also check that this doesn't have a `LocalTableScan` but has a 
`StreamingRelation`
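
A minimal sketch of what such a check might look like, written as a plain string predicate so it runs without Spark. The helper name `looksLikeStreamingPlan` and the sample plan text are hypothetical and were not captured from a real `explain` run; in the test itself the same two conditions would be asserted on `explainString`.

```scala
object ExplainChecks {
  /** True when the explain text mentions a streaming relation and no local table scan. */
  def looksLikeStreamingPlan(explainString: String): Boolean =
    explainString.contains("StreamingRelation") &&
      !explainString.contains("LocalTableScan")

  def main(args: Array[String]): Unit = {
    // Hypothetical, shortened plan text for illustration only.
    val streamingText = "StateStoreRestore [value]\n+- StreamingRelation MemoryStream[value]"
    val batchText = "StateStoreRestore [value]\n+- LocalTableScan [value]"

    assert(looksLikeStreamingPlan(streamingText))
    assert(!looksLikeStreamingPlan(batchText))
    println("explain checks passed")
  }
}
```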





[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...

2017-02-15 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16934#discussion_r101387562
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -673,7 +673,7 @@ class StreamExecution(
     if (lastExecution == null) {
       "No physical plan. Waiting for data."
     } else {
-      val explain = ExplainCommand(lastExecution.logical, extended = extended)
+      val explain = ExplainCommand(lastExecution.logical, extended = extended, streaming = true)
--- End diff --

So this means that this code will always return an updated plan for the 
last batch, showing which data files were read instead of just referring to 
the source as a `StreamingRelation`. We wouldn't have the bug if we had just 
used `logicalPlan` instead of `lastExecution.logical`, right? But then the 
problem would be that `logicalPlan` may contain errors?





[GitHub] spark pull request #16934: [SPARK-19603][SS]Fix StreamingQuery explain comma...

2017-02-14 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/16934

[SPARK-19603][SS]Fix StreamingQuery explain command

## What changes were proposed in this pull request?

`StreamingQuery.explain` doesn't show the correct streaming physical plan 
right now because `ExplainCommand` receives a runtime batch plan and its 
`logicalPlan.isStreaming` is always false.

This PR adds a `streaming` parameter to `ExplainCommand` so that 
`StreamExecution` can specify that the plan is a streaming plan.
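
The idea can be illustrated with a toy model, assuming simplified stand-in types. `Plan`, `StreamingSource`, `BatchScan`, and `ToyExplain` below are hypothetical and are not Spark's classes; the sketch only shows why a flag supplied by the caller helps when the plan itself reports `isStreaming = false`.

```scala
// Simplified stand-ins for illustration; these are NOT Spark's classes.
sealed trait Plan { def isStreaming: Boolean }
case class StreamingSource(name: String) extends Plan { val isStreaming = true }
case class BatchScan(files: Seq[String]) extends Plan { val isStreaming = false }

// Hypothetical explain command: `streaming` lets the caller state that the
// plan belongs to a streaming query even when the plan reports otherwise.
case class ToyExplain(plan: Plan, streaming: Boolean = false) {
  def run(): String = {
    val planner =
      if (streaming || plan.isStreaming) "incremental (streaming) planner"
      else "batch planner"
    s"planned with $planner: $plan"
  }
}

object ExplainSketch {
  def main(args: Array[String]): Unit = {
    // The plan of the last executed batch is a batch plan over concrete data.
    val lastBatch = BatchScan(Seq("part-0000"))

    // Without the flag, the command can only trust plan.isStreaming (false).
    println(ToyExplain(lastBatch).run())

    // With the flag, the caller forces the streaming path.
    println(ToyExplain(lastBatch, streaming = true).run())
  }
}
```

In this sketch the flag is combined with `plan.isStreaming` using a logical OR; that combination is an assumption made for the example, not a statement about how the patch decides.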

## How was this patch tested?

The updated unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-19603

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16934


commit 3b6c86a5581df4bdb9a94eac095c9c1ee1363f47
Author: Shixiong Zhu 
Date:   2017-02-15T01:04:21Z

Fix StreamingQuery explain command



