[
https://issues.apache.org/jira/browse/SPARK-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349907#comment-14349907
]
Apache Spark commented on SPARK-4588:
-
User 'mengxr' has created a pull request for this issue:
https://github.com/apache/spark/pull/4925
Add API for feature attributes
--
Key: SPARK-4588
URL: https://issues.apache.org/jira/browse/SPARK-4588
Project: Spark
Issue Type: Sub-task
Components: ML, MLlib
Reporter: Xiangrui Meng
Assignee: Sean Owen
Priority: Critical
Feature attributes (e.g., continuous vs. categorical type, feature names,
feature dimension, number of categories, and number of nonzeros, i.e.,
support) could be useful for ML algorithms.
In SPARK-3569, we added metadata to the schema, which can be used to store
feature attributes along with the dataset. We need to provide a wrapper over
the Metadata class for ML usage.
The design doc is available at
https://docs.google.com/document/d/1796XfSzFbZvGWFs0ky99AJhlqkOBRG1O2bUxK2N4Grk/edit?usp=sharing
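
As a rough illustration of the "wrapper over the Metadata class" idea described
above, here is a minimal sketch in plain Python. It is not the API proposed in
the pull request or the design doc: the class names, the metadata keys
("type", "name", "values"), and the use of a plain dict in place of Spark's
Metadata class are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Union


@dataclass(frozen=True)
class NumericAttribute:
    """Hypothetical wrapper for a continuous feature."""
    name: str

    def to_metadata(self) -> Dict[str, Any]:
        # A plain dict stands in for a schema metadata object.
        return {"type": "numeric", "name": self.name}


@dataclass(frozen=True)
class NominalAttribute:
    """Hypothetical wrapper for a categorical feature."""
    name: str
    values: Tuple[str, ...]

    @property
    def num_categories(self) -> int:
        # Number of categories is derived from the stored category labels.
        return len(self.values)

    def to_metadata(self) -> Dict[str, Any]:
        return {"type": "nominal", "name": self.name, "values": list(self.values)}


Attribute = Union[NumericAttribute, NominalAttribute]


def attribute_from_metadata(meta: Dict[str, Any]) -> Attribute:
    """Rebuild a typed attribute from its metadata dict (assumed key layout)."""
    kind = meta.get("type")
    if kind == "numeric":
        return NumericAttribute(meta["name"])
    if kind == "nominal":
        return NominalAttribute(meta["name"], tuple(meta["values"]))
    raise ValueError(f"unknown attribute type: {kind!r}")


# Round trip: typed attribute -> metadata dict -> typed attribute.
color = NominalAttribute("color", ("red", "green", "blue"))
restored = attribute_from_metadata(color.to_metadata())
```

The point of such a wrapper is that ML code reads typed fields such as
`num_categories` instead of looking up raw metadata keys, while the attributes
still travel with the dataset as ordinary schema metadata.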