[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022379#comment-16022379 ]
yuhao yang commented on SPARK-20082:
------------------------------------

Refer to https://issues.apache.org/jira/browse/SPARK-20767 for some insights shared by [~cezden]:

{quote}
Technical aspects:
1. The implementation of LDA fitting does not currently allow the coefficients pre-setting (private setter), as noted by a comment in the source code of OnlineLDAOptimizer.setLambda: "This is only used for testing now. In the future, it can help support training stop/resume".
2. The lambda matrix is always randomly initialized by the optimizer, which needs fixing for preset lambda matrix.
{quote}

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu D
>
> Some mllib models support an initialModel to start from and update it
> incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally
> update an existing model with batches of new documents.
> I suggest to add an initialModel as a start point for LDA.
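
To make the two technical points above concrete, here is a minimal pure-Python sketch of what a preset lambda matrix could look like. It is not Spark code and not the OnlineLDAOptimizer API: the names init_lambda, learning_rate and blend are hypothetical and exist only for illustration. The random branch draws from Gamma(shape=100, scale=1/100), and the step size uses the (tau0 + t)^(-kappa) schedule from online variational Bayes for LDA. The defaults tau0=1024 and kappa=0.51 are assumed here to mirror Spark's defaults.

```python
import random


def init_lambda(k, vocab_size, initial_lambda=None, seed=0):
    """Return a k x vocab_size topic-word matrix (hypothetical helper).

    If initial_lambda is given, start from a copy of it (warm start);
    otherwise fall back to a random Gamma(100, 1/100) initialization.
    """
    if initial_lambda is not None:
        if len(initial_lambda) != k or any(len(row) != vocab_size for row in initial_lambda):
            raise ValueError("initial_lambda must have shape k x vocab_size")
        # Copy so later updates do not mutate the caller's model.
        return [list(row) for row in initial_lambda]
    rng = random.Random(seed)
    # gammavariate(alpha, beta): alpha is the shape, beta is the scale.
    return [[rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
            for _ in range(k)]


def learning_rate(iteration, tau0=1024.0, kappa=0.51):
    """Step size rho_t = (tau0 + t)^(-kappa)."""
    return (tau0 + iteration) ** (-kappa)


def blend(lam, lam_hat, rho):
    """Online update: lambda <- (1 - rho) * lambda + rho * lambda_hat."""
    return [[(1.0 - rho) * a + rho * b for a, b in zip(row, row_hat)]
            for row, row_hat in zip(lam, lam_hat)]
```

With a preset matrix, an incremental update would call init_lambda(k, vocab_size, initial_lambda=previous_topics) and keep applying blend on new mini-batches. Whether the iteration counter should restart at 0 or resume from the earlier run is a design choice this sketch leaves open.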