[ https://issues.apache.org/jira/browse/SPARK-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995375#comment-14995375 ]
Joseph K. Bradley commented on SPARK-7130:
------------------------------------------

I think it reaches a little farther than that. The logic for deciding whether to sample is in BaggedPoint.scala, though you're correct that line 88 affects it. This JIRA description was a little out-of-date; I'll update it to indicate that the implementations are still shared.

> spark.ml RandomForest* should always do bootstrapping
> -----------------------------------------------------
>
>                 Key: SPARK-7130
>                 URL: https://issues.apache.org/jira/browse/SPARK-7130
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.4.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Currently, spark.ml RandomForest does not do bootstrapping if numTrees = 1.
> For consistency and a simpler API, it should always do bootstrapping. The
> current behavior is an artifact of the old API, in which RandomForest and
> DecisionTree share the same implementation. This change should happen after
> the implementation is moved to spark.ml (which we need to do so that the
> implementation can be generalized).
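
For readers following the thread, here is a minimal Python sketch of the behavior the issue describes. It is not the code in BaggedPoint.scala, and Spark's own sampling scheme may differ in detail. The function name `subsample_weights` and the `always_bootstrap` flag are hypothetical, introduced only to contrast the current numTrees = 1 case with the proposed behavior.

```python
import random
from collections import Counter


def subsample_weights(num_points, num_trees, always_bootstrap=False, seed=0):
    """Return one list of per-point weights for each tree.

    Illustrative only: a weight is the number of times a training point
    appears in that tree's sample.
    """
    rng = random.Random(seed)
    # Behavior described in the issue: bootstrapping is skipped when there
    # is a single tree. The hypothetical flag models the proposed change.
    bootstrap = always_bootstrap or num_trees > 1

    weights = []
    for _ in range(num_trees):
        if bootstrap:
            # Classic bootstrap: draw num_points indices with replacement
            # and count how often each index was drawn.
            counts = Counter(rng.choices(range(num_points), k=num_points))
            weights.append([counts.get(i, 0) for i in range(num_points)])
        else:
            # No sampling: every point is used exactly once.
            weights.append([1] * num_points)
    return weights


# One tree, current behavior: the tree sees the full, unsampled data set.
print(subsample_weights(5, 1))
# One tree, proposed behavior: the tree sees a bootstrap sample.
print(subsample_weights(5, 1, always_bootstrap=True))
```

With the flag off, a single-tree forest trains on exactly the same data as a plain decision tree, which is the inconsistency the issue wants to remove.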