[ 
https://issues.apache.org/jira/browse/SPARK-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960034#comment-14960034
 ] 

Yanbo Liang commented on SPARK-9695:
------------------------------------

I agree that if users set a pipeline stage's seed, it should take priority over 
the pipeline's seed.
As for pipeline storage and loading, I think we should store both the pipeline's 
seed and each stage's seed so that the same results can be reproduced. This 
should be considered as part of the pipeline and stage storage/load tasks.
I think the assumption that the random number generator should not change 
behavior across Spark versions is reasonable.
I will try to submit an initial patch for this issue and look forward to your 
comments.
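
A rough sketch of the priority rule described above, in plain Python rather than the Spark API. The names `Stage` and `resolve_seeds` are hypothetical, and deriving per-stage seeds from a generator seeded with the pipeline seed is only one possible assumption; the issue does not specify how derived seeds would be produced.

```python
import random
from typing import Dict, List, Optional


class Stage:
    """Hypothetical stand-in for a pipeline stage that accepts a seed."""

    def __init__(self, name: str, seed: Optional[int] = None):
        self.name = name
        self.seed = seed  # None means the user did not set a seed


def resolve_seeds(pipeline_seed: int, stages: List[Stage]) -> Dict[str, int]:
    """Return the seed each stage would use.

    A seed set explicitly on a stage wins; otherwise the stage gets a
    seed derived from the pipeline seed (assumed derivation scheme).
    """
    rng = random.Random(pipeline_seed)
    resolved = {}
    for stage in stages:
        # Draw for every stage, even when unused, so that setting one
        # stage's seed does not shift the seeds derived for later stages.
        derived = rng.getrandbits(32)
        resolved[stage.name] = stage.seed if stage.seed is not None else derived
    return resolved
```

Persisting the returned mapping together with the pipeline seed is what would let a reloaded pipeline reproduce the same results.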

> Add random seed Param to ML Pipeline
> ------------------------------------
>
>                 Key: SPARK-9695
>                 URL: https://issues.apache.org/jira/browse/SPARK-9695
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> Note this will require some discussion about whether to make HasSeed the main 
> API for whether an algorithm takes a seed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
