[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483552#comment-15483552 ]
Sean Owen commented on SPARK-17498:
-----------------------------------

This is more band-aid than anything. Really, the assumption is that you know all the labels in advance, or that if you're using this class, then the input does represent all known labels. If that's not true, then you need to do something else anyway. Mapping everything to an 'unknown' class isn't that useful.

> StringIndexer.setHandleInvalid should have another option 'new'
> ---------------------------------------------------------------
>
>                 Key: SPARK-17498
>                 URL: https://issues.apache.org/jira/browse/SPARK-17498
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Miroslav Balaz
>
> That will map an unseen label to the maximum known label index + 1. IndexToString would map
> that back to "<undef>", or NA if there is something like that in Spark.
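
For readers following the thread, the behaviour proposed in the issue can be illustrated with a minimal pure-Python sketch. It does not use Spark and is not how StringIndexer is implemented; the helper names `fit_index`, `to_index`, and `to_label` are hypothetical, and the only assumptions carried over are that labels are indexed by descending frequency and that an unseen label maps to "maximum known index + 1", which converts back to "<undef>".

```python
from collections import Counter

UNSEEN = "<undef>"  # placeholder string taken from the issue description


def fit_index(labels):
    """Return the known labels ordered by descending frequency.

    A label's index is its position in the returned list.
    """
    return [label for label, _ in Counter(labels).most_common()]


def to_index(vocab, label):
    """Map a label to its index; an unseen label gets max known index + 1."""
    try:
        return vocab.index(label)
    except ValueError:
        return len(vocab)  # one past the largest known index


def to_label(vocab, index):
    """Inverse mapping; any index outside the known range becomes UNSEEN."""
    return vocab[index] if 0 <= index < len(vocab) else UNSEEN


# "a" occurs 3 times, "b" twice, "c" once, so they get indices 0, 1, 2.
vocab = fit_index(["a", "b", "a", "c", "a", "b"])

# "d" was never seen during fitting, so it falls into the extra bucket.
indices = [to_index(vocab, x) for x in ["a", "c", "d"]]
restored = [to_label(vocab, i) for i in indices]
```

The sketch also shows the concern raised in the comment: every unseen label collapses into the same extra index, so distinct new labels can no longer be told apart downstream.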