hvanhovell commented on a change in pull request #24654: [SPARK-27439][SQL] Explaining Dataset should show correct resolved plans
URL: https://github.com/apache/spark/pull/24654#discussion_r285889661
########## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ##########

```diff
@@ -498,12 +500,25 @@ class Dataset[T] private[sql](
    * @since 1.6.0
    */
   def explain(extended: Boolean): Unit = {
-    val explain = ExplainCommand(queryExecution.logical, extended = extended)
-    sparkSession.sessionState.executePlan(explain).executedPlan.executeCollect().foreach {
-      // scalastyle:off println
-      r => println(r.getString(0))
-      // scalastyle:on println
+    val qe = if (isStreaming) {
```

Review comment:

Can we put this into a method that is shared by both `Dataset` and the explain command, e.g.:

```scala
def explainQueryExecution(
    plan: LogicalPlan,
    queryExecution: => QueryExecution): QueryExecution = {
  if (plan.isStreaming) {
    // This is used only by explaining a `Dataset`/`DataFrame` created by `spark.readStream`,
    // so the output mode does not matter since there is no `Sink`.
    new IncrementalExecution(
      sparkSession,
      plan,
      OutputMode.Append(),
      "<unknown>",
      UUID.randomUUID,
      UUID.randomUUID,
      0,
      OffsetSeqMetadata(0, 0))
  } else {
    queryExecution
  }
}
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
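A side note on the suggested signature: the second parameter is by-name (`queryExecution: => QueryExecution`), so for a streaming plan the regular `QueryExecution` is never constructed at all. The sketch below is a minimal, self-contained illustration of that evaluation behavior using plain Scala; all names (`ByNameDemo`, `pickExecution`, `buildRegularExecution`) are hypothetical stand-ins, not Spark classes.

```scala
// Self-contained sketch (no Spark dependency) of the by-name parameter used in
// the suggested `explainQueryExecution` helper. The point: the expensive
// "regular execution" is built only when the streaming branch is not taken.
object ByNameDemo {
  var buildCount = 0

  // Stand-in for constructing a regular QueryExecution; the side effect lets
  // us observe whether (and how often) the by-name argument was evaluated.
  def buildRegularExecution(): String = {
    buildCount += 1
    "regular-execution"
  }

  // Mirrors the shape of the suggested helper: pick the incremental path for
  // streaming plans, otherwise evaluate the lazily-passed regular execution.
  def pickExecution(isStreaming: Boolean, regular: => String): String =
    if (isStreaming) "incremental-execution" else regular

  def main(args: Array[String]): Unit = {
    // Streaming branch: the by-name argument is never evaluated.
    assert(pickExecution(isStreaming = true, buildRegularExecution()) == "incremental-execution")
    assert(buildCount == 0)

    // Batch branch: the argument is evaluated exactly once.
    assert(pickExecution(isStreaming = false, buildRegularExecution()) == "regular-execution")
    assert(buildCount == 1)
    println("ok")
  }
}
```

Had the parameter been a plain `queryExecution: QueryExecution`, the call site would eagerly build the batch execution even for streaming Datasets, which is exactly what the by-name form avoids.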