[ https://issues.apache.org/jira/browse/MADLIB-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910557#comment-16910557 ]
Frank McQuillan commented on MADLIB-1351:
-----------------------------------------

[~nikhilkak] can you give some guidance to Himanshu, since it looks like you worked on that JIRA? Thanks.

> Add stopping criteria on perplexity to LDA
> ------------------------------------------
>
>                 Key: MADLIB-1351
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1351
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Parallel Latent Dirichlet Allocation
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Major
>             Fix For: v1.17
>
>
> In LDA
> http://madlib.apache.org/docs/latest/group__grp__lda.html
> make the stopping criterion perplexity-based rather than just a fixed number of iterations.
>
> The suggested approach is to do what scikit-learn does:
> https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
>
> evaluate_every : int, optional (default=0)
> How often to evaluate perplexity. Set it to 0 or a negative number to not
> evaluate perplexity in training at all. Evaluating perplexity can help you
> check convergence in the training process, but it will also increase total
> training time. Evaluating perplexity in every iteration might increase
> training time up to two-fold.
>
> perplexity_tol : float, optional (default=1e-1)
> Perplexity tolerance to stop iterating. Only used when evaluate_every is
> greater than 0.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
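For reference, the evaluate_every / perplexity_tol behavior described above could be sketched as the training loop below. This is only an illustration of the scikit-learn semantics, not MADlib code: train_iteration and compute_perplexity are hypothetical stand-ins for one training pass and a perplexity evaluation.

```python
def train_lda(train_iteration, compute_perplexity,
              max_iter=100, evaluate_every=5, perplexity_tol=1e-1):
    """Run up to max_iter passes, stopping early once the change in
    perplexity between evaluations drops below perplexity_tol.

    train_iteration   -- callable performing one training pass (hypothetical)
    compute_perplexity -- callable returning the current perplexity (hypothetical)
    """
    last_perplexity = None
    for it in range(1, max_iter + 1):
        train_iteration()
        # A non-positive evaluate_every disables the convergence check,
        # so the loop always runs the full max_iter iterations.
        if evaluate_every > 0 and it % evaluate_every == 0:
            perplexity = compute_perplexity()
            if (last_perplexity is not None
                    and abs(last_perplexity - perplexity) < perplexity_tol):
                return it, perplexity  # converged early
            last_perplexity = perplexity
    return max_iter, last_perplexity
```

Evaluating every iteration (evaluate_every=1) gives the tightest stopping check at the highest cost; larger values trade convergence granularity for speed, matching the trade-off noted in the scikit-learn docs.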