[
https://issues.apache.org/jira/browse/SPARK-10513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959949#comment-14959949
]
Yanbo Liang edited comment on SPARK-10513 at 10/16/15 12:45 AM:
----------------------------------------------------------------
[~josephkb] For 4: If a StringType column contains the empty string ""
(not null), StringIndexer transforms it correctly, but OneHotEncoder throws an
exception because "" cannot be used as a feature name. I think we should
discuss whether it is legal for a categorical feature to contain "". If it is
not, should we filter out such values or replace "" with a user-specified
value?
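
To make the two options concrete, here is a rough sketch in plain Python. It
is not Spark code and does not use the StringIndexer or OneHotEncoder APIs.
The helper names (replace_empty, filter_empty, index_labels) and the
"__EMPTY__" placeholder are hypothetical, chosen only for illustration. The
frequency-based ordering loosely mimics how StringIndexer assigns indices;
the alphabetical tie-break is an assumption of this sketch.

```python
from collections import Counter


def replace_empty(values, placeholder="__EMPTY__"):
    """Option 1: replace "" with a user-specified placeholder (hypothetical default)."""
    return [placeholder if v == "" else v for v in values]


def filter_empty(values):
    """Option 2: drop "" values entirely."""
    return [v for v in values if v != ""]


def index_labels(values):
    """Map each label to an index, most frequent label first.

    Ties are broken alphabetically; that rule is an assumption of this sketch.
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


raw = ["a", "", "b", "a", "", "a"]

replaced = replace_empty(raw)   # "" becomes "__EMPTY__"
filtered = filter_empty(raw)    # "" rows are removed
labels = index_labels(replaced)

# After replacement, every label is a non-empty string, so each one can
# serve as a feature name in a later encoding step.
print(labels)
```

Replacing keeps every row but introduces an extra category, while filtering
shrinks the dataset. Which one is appropriate probably depends on whether ""
carries meaning in the data.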
was (Author: yanboliang):
[~josephkb] For 4: If a column of StringType has "" value (not null),
StringIndexer will transform it right, but OneHotEncoder will throw exception
caused of "" can not as a feature name. I think we should discuss that whether
it is legal that one category feature contains "" value, otherwise we should
filter these kinds of values or replaced "" with other user specified values?
> Springleaf Marketing Response
> -----------------------------
>
> Key: SPARK-10513
> URL: https://issues.apache.org/jira/browse/SPARK-10513
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Yanbo Liang
> Assignee: Yanbo Liang
>
> Apply ML pipeline API to Springleaf Marketing Response
> (https://www.kaggle.com/c/springleaf-marketing-response)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)