[ 
https://issues.apache.org/jira/browse/SPARK-10513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961740#comment-14961740
 ] 

Joseph K. Bradley commented on SPARK-10513:
-------------------------------------------

Oh, I see, good question.  I think the root problem is that we probably should 
not allow empty feature names.  How about we change StringIndexer to rename the 
empty string to "0" or the next available integer (or something like that)?  We 
can document that behavior and note the behavior change in the release notes.
* Later, we can add a Param to allow other behaviors: filter out Rows with 
invalid values, or throw an Exception.
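
To make the proposal concrete, here is a rough sketch in plain Python.  It is 
not Spark code and not how StringIndexer is implemented; the function names 
and the "mode" argument are hypothetical and only illustrate the three 
behaviors mentioned above (rename, filter, throw).  It assumes "next available 
integer" means the smallest non-negative integer whose string form is not 
already used as a label.

```python
from typing import Iterable, List


def placeholder_for_empty(labels: Iterable[str]) -> str:
    """Return the smallest non-negative integer (as a string) not already used as a label."""
    existing = set(labels)
    candidate = 0
    while str(candidate) in existing:
        candidate += 1
    return str(candidate)


def rename_empty_labels(labels: List[str]) -> List[str]:
    """Replace every empty-string label with a single collision-free placeholder."""
    placeholder = placeholder_for_empty(labels)
    return [placeholder if label == "" else label for label in labels]


def handle_empty_labels(labels: List[str], mode: str = "rename") -> List[str]:
    """Hypothetical stand-in for a future Param controlling empty-label handling."""
    if mode == "rename":
        return rename_empty_labels(labels)
    if mode == "filter":
        # Drop the entries (rows) whose label is the empty string.
        return [label for label in labels if label != ""]
    if mode == "error":
        if "" in labels:
            raise ValueError("empty string is not a valid label")
        return list(labels)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # "0" is already taken, so the empty string becomes "1".
    print(rename_empty_labels(["a", "", "0", "b", ""]))  # ['a', '1', '0', 'b', '1']
```

Using one placeholder for all empty strings keeps them in a single category, 
which matches the idea of renaming the empty label rather than splitting it.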

If this sounds reasonable, I can make a JIRA for it.

> Springleaf Marketing Response
> -----------------------------
>
>                 Key: SPARK-10513
>                 URL: https://issues.apache.org/jira/browse/SPARK-10513
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>
> Apply ML pipeline API to Springleaf Marketing Response 
> (https://www.kaggle.com/c/springleaf-marketing-response)



