[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2017-10-27 Thread Weichen Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223120#comment-16223120
 ] 

Weichen Xu commented on SPARK-12375:


After coordinating with [~yuhaoyan], I will take this over. Thanks!

> VectorIndexer: allow unknown categories
> ---
>
> Key: SPARK-12375
> URL: https://issues.apache.org/jira/browse/SPARK-12375
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Joseph K. Bradley
> Assignee: yuhao yang
>
> Add an option for allowing unknown categories, probably via a parameter like 
> "allowUnknownCategories".
> If true, then handle unknown categories during transform by assigning them to 
> an extra category index.
> The API should resemble the API used for StringIndexer.
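
To illustrate the behaviour requested above, here is a minimal sketch in plain Python. It is not Spark code and not the implementation from either pull request; the function names and the `allow_unknown` flag are hypothetical, chosen only to mirror the "allowUnknownCategories" idea in the description. Categories seen during fitting get indices 0..k-1, and an unseen value is assigned the extra index k when the flag is set.

```python
from typing import Dict, List


def fit_category_map(values: List[float]) -> Dict[float, int]:
    """Map each distinct value seen at fit time to an index 0..k-1 (sorted order)."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def index_value(value: float, category_map: Dict[float, int], allow_unknown: bool) -> int:
    """Return the category index for `value`.

    Hypothetical behaviour, assumed for illustration:
    - known value: its fitted index
    - unknown value with allow_unknown=True: the extra index k = len(category_map)
    - unknown value with allow_unknown=False: raise an error
    """
    if value in category_map:
        return category_map[value]
    if allow_unknown:
        # All unseen categories share one additional bucket after the known ones.
        return len(category_map)
    raise ValueError(f"Unknown category value: {value}")


# Fit on a feature column with three distinct categories.
category_map = fit_category_map([2.0, 0.0, 5.0, 0.0])

# Known values keep their fitted indices; an unseen value lands in the extra bucket.
known_index = index_value(5.0, category_map, allow_unknown=True)
unknown_index = index_value(7.0, category_map, allow_unknown=True)
```

With the flag off, the same call raises an error for the unseen value. For comparison, StringIndexer exposes this kind of choice through its `handleInvalid` parameter.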



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2017-10-27 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222566#comment-16222566
 ] 

Apache Spark commented on SPARK-12375:

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/19588







[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2015-12-24 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071370#comment-15071370
 ] 

yuhao yang commented on SPARK-12375:


PR here: https://github.com/apache/spark/pull/10466

> VectorIndexer: allow unknown categories
> ---
>
> Key: SPARK-12375
> URL: https://issues.apache.org/jira/browse/SPARK-12375
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Joseph K. Bradley
> Assignee: Apache Spark






[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2015-12-16 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061263#comment-15061263
 ] 

yuhao yang commented on SPARK-12375:


Is anyone working on this? If not, I'll start on it.

> VectorIndexer: allow unknown categories
> ---
>
> Key: SPARK-12375
> URL: https://issues.apache.org/jira/browse/SPARK-12375
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Joseph K. Bradley


