Github user sohama4 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21316#discussion_r188143976

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1607,7 +1607,9 @@ class Dataset[T] private[sql](
    */
   @Experimental
   @InterfaceStability.Evolving
-  def reduce(func: (T, T) => T): T = rdd.reduce(func)
+  def reduce(func: (T, T) => T): T = withNewExecutionId {
--- End diff --

Thanks, that makes sense now that I have looked at the code for `foreach` and `foreachPartition`; I have put up a new version with this change. However, it wasn't immediately clear how the new function `withNewRDDExecutionId` would be beneficial over `withNewExecutionId`. Could you elaborate a little when you get the chance?
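For context on the pattern under discussion, here is a minimal, self-contained sketch of what "wrap an action in a fresh execution id" means. This is NOT Spark's actual `SQLExecution` implementation; the object and method names below are purely illustrative stand-ins. The idea is that an action like `reduce` allocates a new id, makes it visible (e.g. to UI/listener machinery) for the duration of the action, and restores the previous state afterwards:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical illustration of the scoped-execution-id pattern.
// In Spark, the real helper lives in SQLExecution and also posts
// start/end events so the query shows up in the SQL tab of the UI.
object ExecutionIdTracker {
  private val nextId = new AtomicLong(0)

  // The id currently in scope on this thread, if any.
  private val currentId = new ThreadLocal[Option[Long]] {
    override def initialValue(): Option[Long] = None
  }

  // Allocate a fresh id, make it visible while `body` runs,
  // then restore whatever was in scope before (supports nesting).
  def withNewExecutionId[A](body: => A): A = {
    val previous = currentId.get()
    currentId.set(Some(nextId.incrementAndGet()))
    try body
    finally currentId.set(previous)
  }

  def current: Option[Long] = currentId.get()
}

object Demo extends App {
  // No id in scope outside the wrapper.
  assert(ExecutionIdTracker.current.isEmpty)

  // Inside the wrapper, an id is visible to anything on this thread.
  val seen = ExecutionIdTracker.withNewExecutionId {
    ExecutionIdTracker.current
  }
  assert(seen.nonEmpty)

  // The id is cleared once the action completes.
  assert(ExecutionIdTracker.current.isEmpty)
}
```

Under this reading, wrapping `reduce` in such a helper is what lets the job triggered by the action be attributed to a query execution, which plain `rdd.reduce(func)` would not do on its own.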