[jira] [Assigned] (SPARK-8703) Add CountVectorizer as a ml transformer to convert document to words count vector

2015-06-29 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-8703:
---

Assignee: Apache Spark

 Add CountVectorizer as a ml transformer to convert document to words count 
 vector
 -

 Key: SPARK-8703
 URL: https://issues.apache.org/jira/browse/SPARK-8703
 Project: Spark
  Issue Type: New Feature
  Components: ML
Reporter: yuhao yang
Assignee: Apache Spark
   Original Estimate: 24h
  Remaining Estimate: 24h

 Converts a text document to a sparse vector of token counts.
 I can further add an estimator to extract the vocabulary from a corpus if 
 that's appropriate.
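
For illustration only, here is a minimal sketch in plain Python of what the description above proposes: mapping a tokenized document to a sparse vector of token counts against a fixed vocabulary, plus an optional step that extracts a vocabulary from a corpus. This is not the Spark ML API or the code attached to this issue. The function names (count_vectorize, fit_vocabulary), the vocab_size parameter, the (size, indices, values) sparse layout, and the frequency-based ordering of the vocabulary are all assumptions made for the example.

```python
from collections import Counter


def count_vectorize(tokens, vocabulary):
    """Map a tokenized document to a sparse count vector.

    Returns (size, indices, values): only vocabulary terms that occur in
    the document get an entry. Tokens outside the vocabulary are ignored.
    The layout is an assumption for this sketch.
    """
    index = {term: i for i, term in enumerate(vocabulary)}
    counts = Counter(index[t] for t in tokens if t in index)
    indices = sorted(counts)
    values = [float(counts[i]) for i in indices]
    return len(vocabulary), indices, values


def fit_vocabulary(corpus, vocab_size=None):
    """Hypothetical estimator step: extract a vocabulary from a corpus.

    Terms are ordered by total corpus frequency (ties broken
    alphabetically) and truncated to vocab_size when it is given.
    """
    totals = Counter(token for doc in corpus for token in doc)
    ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:vocab_size]]


corpus = [["a", "b", "c"], ["a", "b", "b", "c", "a"]]
vocabulary = fit_vocabulary(corpus)
vector = count_vectorize(["a", "b", "b", "c", "a", "d"], vocabulary)
```

In this sketch the token "d" is dropped because it is not in the fitted vocabulary; whether unknown tokens should be ignored or reported is a design choice the issue does not settle.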



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-8703) Add CountVectorizer as a ml transformer to convert document to words count vector

2015-06-29 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-8703:
---

Assignee: (was: Apache Spark)

 Add CountVectorizer as a ml transformer to convert document to words count 
 vector
 -

 Key: SPARK-8703
 URL: https://issues.apache.org/jira/browse/SPARK-8703
 Project: Spark
  Issue Type: New Feature
  Components: ML
Reporter: yuhao yang
   Original Estimate: 24h
  Remaining Estimate: 24h

 Converts a text document to a sparse vector of token counts.
 I can further add an estimator to extract the vocabulary from a corpus if 
 that's appropriate.


