[ https://issues.apache.org/jira/browse/SPARK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076488#comment-15076488 ]

Sean Owen commented on SPARK-8555:
----------------------------------

Have a look at the MLlib-specific contribution guidelines:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-MLlib-specificContributionGuidelines

Generally speaking, there are a great many algorithms that could be 
implemented, and most are not that useful or widely used, so few really belong 
in MLlib itself. I'm not commenting on HDP specifically here, though I don't 
think it is that commonly used. The idea is that an algorithm should prove 
itself out externally first.

> Online Variational Inference for the Hierarchical Dirichlet Process
> -------------------------------------------------------------------
>
>                 Key: SPARK-8555
>                 URL: https://issues.apache.org/jira/browse/SPARK-8555
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: yuhao yang
>            Priority: Minor
>
> This task was created to explore the online HDP algorithm described in
> http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf.
> Major advantages of the algorithm: a single pass over the corpus, streaming 
> friendly, and automatic selection of K (the number of topics).
> Currently the scope is to support online HDP for topic modeling, i.e. 
> probably as an optimizer for LDA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
