Qiping Li created SPARK-3920:
--------------------------------

             Summary: Add option to support aggregation using treeAggregate in decision tree
                 Key: SPARK-3920
                 URL: https://issues.apache.org/jira/browse/SPARK-3920
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
            Reporter: Qiping Li
             Fix For: 1.2.0


In [SPARK-3366|https://issues.apache.org/jira/browse/SPARK-3366], we used 
distributed aggregation to aggregate node stats, which can save computation and 
communication time when the shuffle size is very large. However, experiments have 
shown that if the shuffle size is not large enough (e.g., shallow trees), this 
causes some performance loss (greater than 20% in some cases). We should support 
both options for aggregation so that users can choose the proper one based on 
their needs.
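
To make the trade-off concrete, here is a minimal pure-Python sketch of the 
two merge shapes. It is not Spark code and does not use the MLlib 
implementation: the function names (flat_aggregate, tree_aggregate), the 
fan_in parameter, and the (count, sum) statistic are all hypothetical, chosen 
only for illustration. The flat variant merges every partition result in a 
single step, while the tree variant merges in several rounds. Each extra 
round stands in for the extra stage overhead that can dominate when the 
partial results are small.

```python
from functools import reduce


def seq_op(acc, value):
    """Fold one record into a partition-local (count, sum) accumulator."""
    count, total = acc
    return (count + 1, total + value)


def comb_op(a, b):
    """Merge two (count, sum) accumulators."""
    return (a[0] + b[0], a[1] + b[1])


def flat_aggregate(partitions, zero):
    """Merge all partition results in one step (a single merge round)."""
    partials = [reduce(seq_op, part, zero) for part in partitions]
    return reduce(comb_op, partials, zero), 1


def tree_aggregate(partitions, zero, fan_in=2):
    """Merge partition results level by level, fan_in partials at a time.

    Returns the merged result and the number of merge rounds performed.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    partials = [reduce(seq_op, part, zero) for part in partitions]
    rounds = 0
    while len(partials) > 1:
        # Each round combines groups of up to fan_in neighbouring partials.
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
        rounds += 1
    return (partials[0] if partials else zero), rounds


if __name__ == "__main__":
    # Eight hypothetical partitions holding the values 1..16.
    parts = [[2 * i + 1, 2 * i + 2] for i in range(8)]
    print(flat_aggregate(parts, (0, 0)))
    print(tree_aggregate(parts, (0, 0)))
```

Both variants return the same statistics; only the number of merge rounds 
differs. With many large partial results, the extra rounds spread the merge 
work out, but with few or small partials (as with shallow trees) the extra 
rounds are mostly overhead, which is consistent with the slowdown described 
above.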



