wangshuo128 commented on a change in pull request #31968:
URL: https://github.com/apache/spark/pull/31968#discussion_r604647620



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -223,11 +224,18 @@ class Dataset[T] private[sql](
   @transient private[sql] val logicalPlan: LogicalPlan = {
     // For various commands (like DDL) and queries with side effects, we force query execution
     // to happen right away to let these side effects take place eagerly.
+    def eagerRun(plan: LogicalPlan): LogicalPlan = {
+      val relation =
+        LocalRelation(plan.output, withAction("command", queryExecution)(_.executeCollect()))
+      relation.setTagValue(Dataset.DATASET_EAGER_RUN_TAG, true)
+      relation
+    }
+
     val plan = queryExecution.analyzed match {
       case c: Command =>
-        LocalRelation(c.output, withAction("command", queryExecution)(_.executeCollect()))

Review comment:
   But when people run commands without an action in a Spark application, e.g. `sql("DROP TABLE t")`, this is the place where Spark implicitly adds a `collect` action to the DataFrame.
   If we don't trigger SQL execution here, we still need to add an action like `collect` somewhere in Spark for this case. Do you have any suggestions?
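
   To make the case concrete, here is a minimal sketch of the behavior described above. It assumes a local `SparkSession` with Spark on the classpath; the object name `EagerCommandSketch` and the view name `eager_sketch_v` are made up for illustration, and a temporary view is used instead of a table so nothing is written to disk. No action is called on the DataFrames returned by `sql(...)`, yet the commands should still take effect because they are run eagerly.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the PR under review.
object EagerCommandSketch {
  // Returns (view exists after CREATE, view exists after DROP).
  def run(): (Boolean, Boolean) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("eager-command-sketch")
      .getOrCreate()
    try {
      // Neither statement below is followed by collect(), show(), or count().
      spark.sql("CREATE TEMPORARY VIEW eager_sketch_v AS SELECT 1 AS id")
      val existsAfterCreate = spark.catalog.tableExists("eager_sketch_v")

      spark.sql("DROP VIEW eager_sketch_v")
      val existsAfterDrop = spark.catalog.tableExists("eager_sketch_v")

      (existsAfterCreate, existsAfterDrop)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (afterCreate, afterDrop) = run()
    println(s"exists after CREATE: $afterCreate, exists after DROP: $afterDrop")
  }
}
```

   If the eager execution in `logicalPlan` were removed without a replacement, user code like this would presumably need an explicit action after each `sql(...)` call for the side effect to happen.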






