Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14214 @mariobriggs Thanks for the information! > 1 can be eliminated because 'executedPlan' is a ' lazy val' on QueryExecution ? Yea indeed. Its being there can provide us debug info but on second thought it might not be worth it. So let's also skip it in the case of `ForeachSink`. > ... However this cannot be the solution since SparkListenerSQLExecutionStart is a public API already Yea we probably do not want to modify this public API; so what we did in this patch was, passing [3]'s `incrementalExecution` into the listener so we'll only initialize physical planning only once for [2] and [3]. > ... why not keep [1] and the change to [2] be the simple case of changing L52 to the following: new Dataset(data.sparkSession, data.queryExecution, implicitly[Encoder[T]]) This is great. If reviews decide that 2 rounds of physical planning is acceptable, then let's do it your way! > ... ConsoleSink ... but it is only for Debug purposes So maybe let us live with it.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org