Github user sohama4 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21316#discussion_r188143976

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1607,7 +1607,9 @@ class Dataset[T] private[sql](
    */
   @Experimental
   @InterfaceStability.Evolving
-  def reduce(func: (T, T) => T): T = rdd.reduce(func)
+  def reduce(func: (T, T) => T): T = withNewExecutionId {
--- End diff --

Thanks, that makes sense now that I have looked at the code for `foreach` and `foreachPartition`; I have put up a new version with this change. However, it wasn't immediately clear how the new function `withNewRDDExecutionId` would be beneficial over `withNewExecutionId`. Could you elaborate a little when you get the chance?
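For context on the pattern under discussion, here is a minimal, self-contained sketch of what "wrap an action in a fresh execution id" means. This is NOT Spark's actual `SQLExecution` implementation; the object and method names below are purely illustrative stand-ins. The idea is that an action like `reduce` allocates a new id, makes it visible (e.g. to UI/listener machinery) for the duration of the action, and restores the previous state afterwards:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical illustration of the scoped-execution-id pattern.
// In Spark, the real helper lives in SQLExecution and also posts
// start/end events so the query shows up in the SQL tab of the UI.
object ExecutionIdTracker {
  private val nextId = new AtomicLong(0)

  // The id currently in scope on this thread, if any.
  private val currentId = new ThreadLocal[Option[Long]] {
    override def initialValue(): Option[Long] = None
  }

  // Allocate a fresh id, make it visible while `body` runs,
  // then restore whatever was in scope before (supports nesting).
  def withNewExecutionId[A](body: => A): A = {
    val previous = currentId.get()
    currentId.set(Some(nextId.incrementAndGet()))
    try body
    finally currentId.set(previous)
  }

  def current: Option[Long] = currentId.get()
}

object Demo extends App {
  // No id in scope outside the wrapper.
  assert(ExecutionIdTracker.current.isEmpty)

  // Inside the wrapper, an id is visible to anything on this thread.
  val seen = ExecutionIdTracker.withNewExecutionId {
    ExecutionIdTracker.current
  }
  assert(seen.nonEmpty)

  // The id is cleared once the action completes.
  assert(ExecutionIdTracker.current.isEmpty)
}
```

Under this reading, wrapping `reduce` in such a helper is what lets the job triggered by the action be attributed to a query execution, which plain `rdd.reduce(func)` would not do on its own.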