[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...

WeichenXu123 Fri, 16 Mar 2018 06:15:04 -0700

Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/20806
  
    @viirya OK, but there is already a class in ML that uses 
`TypedImperativeAggregator`; see `Summarizer`.
    
    Also, have you benchmarked this PR against `df.rdd.treeAggregate`?
    They seem almost the same. Is there a difference that would yield a 
remarkable performance improvement?



