[ 
https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066816#comment-16066816
 ] 

Mathieu DESPRIEE commented on SPARK-20082:
------------------------------------------

I updated the PR.

Here is the approach:
- Only the online optimizer is supported; any use with the EM optimizer is
rejected. If incremental updates are also desirable for EM, I suggest we open
a separate JIRA for it, so we can take the time to discuss how to initialize
from an existing graph plus new documents.
- I added an {{initialModel}} parameter, which is used to initialize the doc
concentration and the topic matrix.
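
To make the two points above concrete, here is a minimal sketch in plain
Python. It is not the code from the PR and does not use the Spark API: the
names {{TopicModel}} and {{initial_state}}, the "online"/"em" strings, the
shape checks and the cold-start placeholder values are all assumptions made
for illustration. It only shows the intended behaviour: reject a warm start
with EM, otherwise copy the doc concentration and topic matrix from the
initial model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TopicModel:
    """Hypothetical stand-in for a trained LDA model (not a Spark class)."""
    doc_concentration: List[float]   # one entry per topic
    topic_matrix: List[List[float]]  # one row per topic, one column per term


def initial_state(
    k: int,
    vocab_size: int,
    optimizer: str,
    initial_model: Optional[TopicModel] = None,
) -> Tuple[List[float], List[List[float]]]:
    """Return the (doc_concentration, topic_matrix) that training starts from."""
    if initial_model is None:
        # Cold start: placeholder values only (assumption for this sketch),
        # a symmetric doc concentration and uniform topics.
        alpha = [1.0 / k] * k
        topics = [[1.0 / vocab_size] * vocab_size for _ in range(k)]
        return alpha, topics

    # Warm start is only accepted with the online optimizer.
    if optimizer != "online":
        raise ValueError("initialModel is only supported with the online optimizer")

    # Assumed sanity checks: the initial model must match k and the vocabulary.
    if len(initial_model.doc_concentration) != k:
        raise ValueError("initialModel doc concentration does not match k")
    if len(initial_model.topic_matrix) != k:
        raise ValueError("initialModel topic matrix does not match k")
    if any(len(row) != vocab_size for row in initial_model.topic_matrix):
        raise ValueError("initialModel topic matrix does not match the vocabulary size")

    # Copy so that further training does not mutate the initial model.
    alpha = list(initial_model.doc_concentration)
    topics = [list(row) for row in initial_model.topic_matrix]
    return alpha, topics
```

With a previous model for 2 topics over 3 terms,
{{initial_state(2, 3, "online", previous)}} returns copies of its doc
concentration and topic matrix, while {{initial_state(2, 3, "em", previous)}}
raises a {{ValueError}}.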

[~yuhaoyan], could you please check it?

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu DESPRIEE
>
> Some mllib models support an initialModel to start from and update it 
> incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally 
> update an existing model with batches of new documents.
> I suggest adding an initialModel as a starting point for LDA.


