[ 
https://issues.apache.org/jira/browse/SPARK-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237029#comment-14237029
 ] 

Kai Sasaki commented on SPARK-4607:
-----------------------------------

[~josephkb] I think each tree in the GradientBoostedTrees iterations is always 
trained on all of the training data. Is there any case where we need to 
subsample when building the RandomForest? The current GradientBoostedTrees code 
uses a RandomForest without subsampling. 

> Add random seed to GradientBoostedTrees
> ---------------------------------------
>
>                 Key: SPARK-4607
>                 URL: https://issues.apache.org/jira/browse/SPARK-4607
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Gradient Boosted Trees does not take a random seed, but it uses randomness if 
> the subsampling rate is < 1.  It should take a random seed parameter.
> This update will also help to make unit tests more stable by allowing 
> determinism (using a small set of fixed random seeds).
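
To illustrate the determinism point in the description above, here is a minimal sketch of seeded row subsampling. It is not Spark's implementation: the helper name `subsample` and its parameters are hypothetical, and it uses only the Python standard library. It shows that a fixed seed reproduces the same sample, and that a rate of 1.0 keeps every row, so the seed has no effect in that case.

```python
import random


def subsample(rows, rate, seed):
    """Keep each row independently with probability `rate` (Bernoulli sampling).

    Hypothetical helper for illustration only; not part of MLlib.
    """
    rng = random.Random(seed)  # private generator; global RNG state is untouched
    return [row for row in rows if rng.random() < rate]


data = list(range(100))

# Same seed, same rate: the sample is identical on every call.
assert subsample(data, 0.5, seed=42) == subsample(data, 0.5, seed=42)

# rate = 1.0 keeps every row, so the seed is irrelevant.
assert subsample(data, 1.0, seed=1) == subsample(data, 1.0, seed=2) == data
```

A unit test written against a small set of fixed seeds can then compare results exactly instead of tolerating run-to-run variation.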



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
