[ https://issues.apache.org/jira/browse/MADLIB-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-1352:
------------------------------------
Fix Version/s: v1.18.0 (was: v2.0)
> Add warm start to LDA
> ----------------------
>
> Key: MADLIB-1352
> URL: https://issues.apache.org/jira/browse/MADLIB-1352
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Parallel Latent Dirichlet Allocation
> Reporter: Frank McQuillan
> Assignee: Himanshu Pandey
> Priority: Major
> Fix For: v1.18.0
>
>
> In LDA
> http://madlib.apache.org/docs/latest/group__grp__lda.html
> implement warm start so that training can pick up from where the last
> training run left off.
> I would suggest we model this on the warm start implemented in MLP
> http://madlib.apache.org/docs/latest/group__grp__nn.html
> since it will be the same general idea for LDA.
> The LDA interface will be:
> {code}
> lda_train( data_table,
> model_table,
> output_data_table,
> voc_size,
> topic_num,
> iter_num,
> alpha,
> beta,
> evaluate_every,
> perplexity_tol,
> warm_start -- new param
> )
> warm_start (optional)
> BOOLEAN, default: FALSE. Initialize weights with the coefficients from the
> last call of the training function. If set to true, weights will be
> initialized from the model_table generated by the previous run. Note that
> parameters voc_size and topic_num must remain constant between calls when
> warm_start is used. Other parameters can be changed for the warm start run.
> {code}
> Open questions
> 1) Validate this statement:
> {code}
> Note that parameters voc_size and topic_num must remain constant between
> calls when warm_start is used. Other parameters can be changed for the warm
> start run.
> {code}
> Notes
> 1) Depending on open question #1 above, validate user input to ensure that
> the user does not change any parameter that must stay the same as in the
> previous run.
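
The validation check in note 1 could look roughly like the Python sketch below. It is only an illustration and not MADlib code: the function name, the FIXED_PARAMS tuple, and the use of plain dicts for the parameters are all hypothetical, and the set of fixed parameters (voc_size and topic_num) is assumed from the proposed doc text, which open question #1 still has to confirm.

```python
# Hypothetical sketch: none of these names are MADlib APIs.
# Parameters assumed to be fixed across warm-start calls (pending open question #1).
FIXED_PARAMS = ("voc_size", "topic_num")


def check_warm_start_params(previous, current, warm_start=False):
    """Raise ValueError if a warm start changes a parameter assumed to be fixed.

    `previous` and `current` are plain dicts of training parameters. In a real
    implementation the previous values would presumably be read back from
    model_table rather than passed in.
    """
    if not warm_start:
        return  # cold start: nothing to compare against

    # Collect every fixed parameter whose value differs from the previous run.
    changed = [
        name for name in FIXED_PARAMS
        if previous.get(name) != current.get(name)
    ]
    if changed:
        raise ValueError(
            "warm_start requires unchanged values for: " + ", ".join(changed)
        )


# Example: changing iter_num is allowed, changing topic_num is rejected.
prev = {"voc_size": 103, "topic_num": 5, "iter_num": 10}
check_warm_start_params(
    prev, {"voc_size": 103, "topic_num": 5, "iter_num": 20}, warm_start=True
)
```

Reporting all offending parameters in one error, rather than stopping at the first, lets the user fix the call in a single pass. If open question #1 shows that other parameters must also stay fixed, only the tuple would need to change.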
--
This message was sent by Atlassian Jira
(v8.3.4#803005)