[ https://issues.apache.org/jira/browse/MADLIB-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-1352:
------------------------------------
Description:
In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
implement warm start so that training can pick up from where the previous
training run left off.
was:
In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
base the stopping criterion on perplexity rather than only on the number of
iterations. The suggested approach is to do what scikit-learn does:
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
evaluate_every : int, optional (default=0)
    How often to evaluate perplexity. Only used in the fit method. Set it to 0
    or a negative number to not evaluate perplexity in training at all.
    Evaluating perplexity can help you check convergence in the training
    process, but it will also increase total training time. Evaluating
    perplexity in every iteration might increase training time up to two-fold.
perp_tol : float, optional (default=1e-1)
    Perplexity tolerance in batch learning. Only used when evaluate_every is
    greater than 0.
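
As an illustration of how the two parameters quoted above could interact, here is a minimal Python sketch of a perplexity-based stopping rule. It is not MADlib code and not scikit-learn's implementation. The function name `fit_with_perplexity_stop` and the callables `em_step` and `perplexity` are hypothetical stand-ins for one training iteration and one perplexity evaluation; only the parameter names `evaluate_every` and `perp_tol` come from the quoted documentation.

```python
def fit_with_perplexity_stop(em_step, perplexity, max_iter=10,
                             evaluate_every=0, perp_tol=1e-1):
    """Run up to max_iter training iterations, stopping early on perplexity.

    em_step and perplexity are hypothetical zero-argument callables
    supplied by the caller. Returns the number of iterations actually run.
    """
    last = None
    for i in range(max_iter):
        em_step()
        # Evaluate only every `evaluate_every` iterations; 0 or a negative
        # value disables the check, so only max_iter limits the run.
        if evaluate_every > 0 and (i + 1) % evaluate_every == 0:
            current = perplexity()
            # Stop once two consecutive evaluations differ by less than perp_tol.
            if last is not None and abs(last - current) < perp_tol:
                return i + 1
            last = current
    return max_iter
```

With `evaluate_every=0` the loop reduces to the existing behaviour of stopping on the iteration count alone, so a change along these lines could stay backward compatible.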
> Add warm start to LDA
> ----------------------
>
> Key: MADLIB-1352
> URL: https://issues.apache.org/jira/browse/MADLIB-1352
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Parallel Latent Dirichlet Allocation
> Reporter: Frank McQuillan
> Priority: Major
> Fix For: v2.0
>
>
> In LDA
> http://madlib.apache.org/docs/latest/group__grp__lda.html
> implement warm start so that training can pick up from where the previous
> training run left off.
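
To make the warm start request concrete, here is a toy collapsed Gibbs sampler for LDA in plain Python that can resume from the topic assignments of an earlier run. It is only a sketch under stated assumptions: the issue does not say how MADlib would expose warm start, and the function `lda_gibbs` and its `init_assignments` parameter are hypothetical names chosen for this example.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, n_iters, alpha=0.1, beta=0.01,
              init_assignments=None, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative only).

    docs: list of documents, each a list of integer word ids.
    init_assignments: per-token topic assignments from an earlier run
        (warm start), or None to start from random assignments (cold start).
    Returns the per-token topic assignments after n_iters sweeps.
    """
    rng = random.Random(seed)
    if init_assignments is None:
        z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    else:
        # Copy so the caller's saved state is not modified.
        z = [list(row) for row in init_assignments]

    # Rebuild the count tables from the current assignments.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Sample a new topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z


docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
first = lda_gibbs(docs, n_topics=2, vocab_size=4, n_iters=5, seed=1)
# Warm start: continue for 5 more sweeps from where the first run stopped.
resumed = lda_gibbs(docs, n_topics=2, vocab_size=4, n_iters=5,
                    init_assignments=first, seed=2)
```

Because the count tables are rebuilt from the saved assignments, the assignments are the only state this sketch needs to carry between runs. A real implementation would also have to check that the vocabulary and number of topics match the earlier run.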
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)