[ 
https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871635#comment-15871635
 ] 

Mahmoud Rawas commented on SPARK-16920:
---------------------------------------

It seems that there is no N^2 complexity issue. As for the stress test, I 
have added a guide on how to perform one, along with an explanation of the 
fix. Please review the following gist and let me know if you would like any 
changes.

https://gist.github.com/mhmoudr/3681668f0ae56ca70cd95c8602f963e1

> Investigate and fix issues introduced in SPARK-15858
> ----------------------------------------------------
>
>                 Key: SPARK-16920
>                 URL: https://issues.apache.org/jira/browse/SPARK-16920
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Vladimir Feinberg
>
> There were several issues with the PR resolving SPARK-15858; my comments 
> are available here:
> https://github.com/apache/spark/commit/393db655c3c43155305fbba1b2f8c48a95f18d93
> The two most important issues are:
> 1. The PR did not add a stress test proving it resolved the issue it was 
> supposed to (though I have no doubt the optimization made is indeed correct).
> 2. The PR introduced quadratic prediction time in terms of the number of 
> trees, which was previously linear. We need to investigate whether this 
> causes problems for large numbers of trees (say, 1000), add an appropriate 
> test, and then fix it.
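
The linear-versus-quadratic concern in item 2 can be checked empirically by 
doubling the number of trees and comparing the cost. The following is only a 
minimal sketch in plain Python, not Spark code and not the test from the gist 
above: the stump "trees", the function names, and the deliberately wasteful 
predict_quadratic are all hypothetical, and the cost is counted in stump 
evaluations rather than wall-clock time so the result is deterministic.

```python
import math


def make_thresholds(n):
    """Build n toy 'trees'; each is a stump described only by its threshold."""
    return [(i % 7) / 7.0 for i in range(n)]


def predict_linear(thresholds, x):
    """Average the stump votes in a single pass.

    Returns (prediction, number_of_stump_evaluations).
    """
    evals = 0
    total = 0.0
    for t in thresholds:
        total += 1.0 if x > t else 0.0
        evals += 1
    return total / len(thresholds), evals


def predict_quadratic(thresholds, x):
    """Same prediction, but every prefix sum is recomputed from scratch.

    This is a hypothetical example of an accidental O(n^2) loop, used only
    to show what the scaling check detects.
    """
    evals = 0
    total = 0.0
    for i in range(len(thresholds)):
        total = 0.0
        for t in thresholds[: i + 1]:
            total += 1.0 if x > t else 0.0
            evals += 1
    return total / len(thresholds), evals


def scaling_exponent(predict, n):
    """Estimate k in cost ~ n^k by comparing ensembles of size n and 2n."""
    _, cost_small = predict(make_thresholds(n), 0.5)
    _, cost_large = predict(make_thresholds(2 * n), 0.5)
    return math.log2(cost_large / cost_small)


if __name__ == "__main__":
    n_trees = 1000  # the ensemble size mentioned in the issue description
    print("linear exponent:   ", scaling_exponent(predict_linear, n_trees))
    print("quadratic exponent:", scaling_exponent(predict_quadratic, n_trees))
```

An exponent close to 1 indicates linear growth in the number of trees, while 
a value close to 2 indicates quadratic growth. A real test against Spark 
would time actual model predictions instead of counting evaluations, so it 
would need repeated runs and a tolerance to absorb timing noise.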



