[jira] [Commented] (SPARK-5566) Tokenizer for mllib package

yuhao yang (JIRA) Wed, 04 Feb 2015 06:40:18 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305172#comment-14305172
 ]


yuhao yang commented on SPARK-5566:
-----------------------------------

Actually, I believe much of the current code, such as Word2Vec and HashingTF, 
shares a similar data flow, so it would be best to take their common 
requirements into consideration.

> Tokenizer for mllib package
> ---------------------------
>
>                 Key: SPARK-5566
>                 URL: https://issues.apache.org/jira/browse/SPARK-5566
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>
> There exist tokenizer classes in the spark.ml.feature package and in the 
> LDAExample in the spark.examples.mllib package.  The Tokenizer in the 
> LDAExample is more advanced and should be made into a full-fledged public 
> class in spark.mllib.feature.  The spark.ml.feature.Tokenizer class should 
> become a wrapper around the new Tokenizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
