GitHub user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/1269#issuecomment-69236610
  
    @akopich I had hoped to get this into MLlib, but after more consideration, 
I believe it is too complex. What would be ideal right now is a simple 
implementation of LDA, since that is all most users need. While 
generalizations like robust PLSA may outperform LDA with proper tuning, they are 
still something of a research area, and it may be better to go with LDA since it 
has been very widely tested and used.
    
    However, I am sure some users would want to use your implementation of 
Robust PLSA, so it would be valuable for you to make it available as a package 
for Spark.
    
    The best path right now, I believe, will be to create a simple PR with a 
minimal public API, where that API should be extensible with (a) extra 
parameters/features and (b) alternate optimization/learning algorithms.  I've 
posted a public design doc on the LDA JIRA 
[here](https://issues.apache.org/jira/browse/SPARK-1405), and I’m going to 
submit such a PR.  I would of course appreciate your feedback on it.  Thanks 
very much for your understanding.
    
    When we merge the initial LDA PR, @mengxr will be sure to include all of 
those who have participated as authors of Spark LDA PRs: @akopich @witgo 
@yinxusen @dlwh @EntilZha @jegonzal
    
    CC: @mengxr 


