[ 
https://issues.apache.org/jira/browse/FLINK-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Goel updated FLINK-2379:
-------------------------------
    Description: 
Design methods to evaluate field-wise statistics over a DataSet of vectors.
For continuous fields: minimum, maximum, mean, and variance.
For discrete fields: class counts, entropy, and Gini impurity.
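
As a rough illustration of the measures listed above, here is a minimal sketch in plain Python. It is not Flink's API: the function names (`field`, `continuous_stats`, `discrete_stats`) are hypothetical, the dataset is assumed to be an in-memory list of equal-length vectors, and the variance is assumed to be the population variance (dividing by n).

```python
import math
from collections import Counter


def field(vectors, index):
    """Extract one field (column) from a list of equal-length vectors."""
    return [v[index] for v in vectors]


def continuous_stats(values):
    """Minimum, maximum, mean and population variance of a continuous field."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population variance (divide by n, not n - 1).
    variance = sum((v - mean) ** 2 for v in values) / n
    return {"min": min(values), "max": max(values),
            "mean": mean, "variance": variance}


def discrete_stats(values):
    """Class counts, entropy (in bits) and Gini impurity of a discrete field."""
    counts = Counter(values)
    n = len(values)
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    gini = 1.0 - sum(p * p for p in probs)
    return {"class_counts": dict(counts), "entropy": entropy, "gini": gini}


# Hypothetical data: field 0 is continuous, field 1 is a discrete class label.
data = [[1.0, "a"], [2.0, "a"], [3.0, "b"], [4.0, "b"]]
print(continuous_stats(field(data, 0)))
print(discrete_stats(field(data, 1)))
```

A distributed version would presumably compute the same quantities from per-partition partial aggregates (counts, sums, sums of squares) rather than from a list held in memory.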

Further statistical measures could also be supported, for example the
correlation between two series or the covariance matrix.
[These are the measures Spark currently supports.]
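
The two additional measures mentioned above can be sketched the same way. This is again plain Python with hypothetical function names, not Flink or Spark code. It assumes the Pearson correlation coefficient and a population covariance matrix (dividing by n); neither choice is specified in the issue.

```python
import math


def correlation(xs, ys):
    """Pearson correlation between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_moment = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Undefined (division by zero) if either series is constant.
    return co_moment / (norm_x * norm_y)


def covariance_matrix(vectors):
    """Population covariance matrix of the fields of a list of vectors."""
    n = len(vectors)
    d = len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    return [
        [sum((v[i] - means[i]) * (v[j] - means[j]) for v in vectors) / n
         for j in range(d)]
        for i in range(d)
    ]


print(correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(covariance_matrix([[1.0, 2.0], [3.0, 6.0]]))
```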

  was:
Design methods to evaluate statistics over dataset of vectors.
For continuous fields, Minimum, maximum, mean, variance.
For discrete fields, Class counts, Entropy, Gini Impurity.

     Issue Type: New Feature  (was: Bug)

> Add methods to evaluate field wise statistics over DataSet of vectors.
> ----------------------------------------------------------------------
>
>                 Key: FLINK-2379
>                 URL: https://issues.apache.org/jira/browse/FLINK-2379
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Sachin Goel
>
> Design methods to evaluate field-wise statistics over a DataSet of vectors.
> For continuous fields: minimum, maximum, mean, and variance.
> For discrete fields: class counts, entropy, and Gini impurity.
> Further statistical measures could also be supported, for example the
> correlation between two series or the covariance matrix.
> [These are the measures Spark currently supports.]



