[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

yuhao yang (JIRA) Tue, 23 May 2017 23:37:31 -0700

    [ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022379#comment-16022379 ]


yuhao yang commented on SPARK-20082:
------------------------------------

Refer to https://issues.apache.org/jira/browse/SPARK-20767 for some insights 
shared by [~cezden]:
{quote}
Technical aspects:
1. The LDA fitting implementation does not currently allow the coefficients 
to be preset (the setter is private), as noted by a comment in the source code 
of OnlineLDAOptimizer.setLambda: "This is only used for testing now. In the 
future, it can help support training stop/resume".
2. The lambda matrix is always randomly initialized by the optimizer, which 
would need to change to support a preset lambda matrix.
{quote}
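
To make the second point concrete, here is a minimal sketch (plain Python, 
not Spark code) of an initializer that uses a preset lambda matrix when one is 
supplied and falls back to random initialization otherwise. The function name 
init_lambda and its parameters are hypothetical, and the Gamma(100, 1/100) 
draw is an assumption modeled on common online variational Bayes LDA 
implementations, not a statement of what OnlineLDAOptimizer does internally.

```python
import random


def init_lambda(k, vocab_size, initial_lambda=None, gamma_shape=100.0, seed=None):
    """Return a k x vocab_size topic-word matrix as a list of rows.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if initial_lambda is not None:
        # Preset path: validate the shape, then copy so the caller's
        # matrix is not mutated by later updates.
        if len(initial_lambda) != k or any(len(row) != vocab_size for row in initial_lambda):
            raise ValueError("initial_lambda must have shape k x vocab_size")
        return [list(row) for row in initial_lambda]

    # Random path (assumed distribution): each entry is drawn from
    # Gamma(shape, scale = 1 / shape), which has mean 1.
    rng = random.Random(seed)
    return [
        [rng.gammavariate(gamma_shape, 1.0 / gamma_shape) for _ in range(vocab_size)]
        for _ in range(k)
    ]
```

With such a hook, an initialModel could be supported by passing the previous 
model's topic matrix as initial_lambda instead of always drawing a fresh one.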

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu D
>
> Some mllib models support an initialModel to start from and update it 
> incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally 
> update an existing model with batches of new documents.
> I suggest adding an initialModel as a start point for LDA.
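
For context on why an incremental update is plausible, the online variational 
Bayes scheme blends the current lambda with a per-batch estimate using a 
decaying weight. The sketch below is plain Python, not Spark code; the name 
blend_lambda is hypothetical, the tau0 and kappa defaults are assumed values, 
and lam_hat stands in for the estimate computed from a new batch of documents.

```python
def blend_lambda(lam, lam_hat, iteration, tau0=1024.0, kappa=0.51):
    """Blend the current topic matrix with a mini-batch estimate.

    Hypothetical helper for illustration only; it is not part of Spark.
    lam and lam_hat are lists of rows with identical shapes.
    """
    # Step size decays as more batches are seen, so later batches
    # move the model less than earlier ones.
    rho = (tau0 + iteration) ** (-kappa)
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row, row_hat)]
        for row, row_hat in zip(lam, lam_hat)
    ]
```

Starting from an existing model would then amount to seeding lam with that 
model's topic matrix (and, ideally, its iteration count) before processing new 
batches.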



