[
https://issues.apache.org/jira/browse/FLINK-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sachin Goel updated FLINK-2379:
---
Description:
Design methods to evaluate statistics over a DataSet of vectors.
For continuous fields: minimum, maximum, mean, and variance.
For discrete fields: class counts, entropy, and Gini impurity.
Further statistical measures can also be supported, for example the correlation
between two series or the covariance matrix.
[These are the statistics Spark currently supports.]
was:
Design methods to evaluate statistics over dataset of vectors.
For continuous fields, Minimum, maximum, mean, variance.
For discrete fields, Class counts, Entropy, Gini Impurity.
Issue Type: New Feature (was: Bug)
Add methods to evaluate field wise statistics over DataSet of vectors.
--
Key: FLINK-2379
URL: https://issues.apache.org/jira/browse/FLINK-2379
Project: Flink
Issue Type: New Feature
Components: Machine Learning Library
Reporter: Sachin Goel
Design methods to evaluate statistics over a DataSet of vectors.
For continuous fields: minimum, maximum, mean, and variance.
For discrete fields: class counts, entropy, and Gini impurity.
Further statistical measures can also be supported, for example the correlation
between two series or the covariance matrix.
[These are the statistics Spark currently supports.]
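
To make the requested measures concrete, here is a minimal single-machine sketch in plain Python. It is not Flink code and not the proposed API: the function names (continuous_stats, discrete_stats, field_stats) and the discrete_fields parameter are hypothetical. It also assumes population variance (dividing by n) and base-2 entropy, since the issue specifies neither.

```python
import math
from collections import Counter


def continuous_stats(values):
    """Min, max, mean and population variance of one field, in a single pass.

    Uses Welford's running update for the mean and the sum of squared
    deviations. Dividing by n (population variance) is an assumption.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    lo = math.inf
    hi = -math.inf
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        lo = min(lo, x)
        hi = max(hi, x)
    if n == 0:
        raise ValueError("cannot compute statistics of an empty field")
    return {"min": lo, "max": hi, "mean": mean, "variance": m2 / n}


def discrete_stats(values):
    """Class counts, entropy (base 2, an assumption) and Gini impurity."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute statistics of an empty field")
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    gini = 1.0 - sum(p * p for p in probs)
    return {"class_counts": dict(counts), "entropy": entropy, "gini": gini}


def field_stats(vectors, discrete_fields=frozenset()):
    """Field-wise statistics over equal-length vectors.

    Fields whose index is in discrete_fields (hypothetical parameter)
    are treated as discrete; all other fields are treated as continuous.
    """
    columns = zip(*vectors)  # transpose rows into per-field columns
    return [
        discrete_stats(col) if i in discrete_fields else continuous_stats(col)
        for i, col in enumerate(columns)
    ]


if __name__ == "__main__":
    data = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]
    for index, stats in enumerate(field_stats(data, discrete_fields={1})):
        print(index, stats)
```

A distributed version would presumably compute per-partition partial results (count, mean, sum of squared deviations, class counts) and merge them, but that design is left open by the issue.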