DB Tsai created SPARK-10597:
-------------------------------

             Summary: MultivariateOnlineSummarizer for weighted instances
                 Key: SPARK-10597
                 URL: https://issues.apache.org/jira/browse/SPARK-10597
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.5.0
            Reporter: DB Tsai

MultivariateOnlineSummarizer for weighted instances was implemented as a private API for SPARK-7685. That work added an online, numerically stable version of the unbiased variance estimator defined by reliability weights (https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Reliability_weights). We would like to make it a public API, since it has other use cases.

Currently, `count` returns the actual number of instances and ignores instance weights, whereas `numNonzeros` returns the weighted number of nonzeros. We need to decide how both should behave before making the API public.
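To make the `count` versus `numNonzeros` question concrete, here is a minimal single-feature sketch in Python of an online weighted summarizer that tracks both the unweighted and the weighted variant of each statistic. It is not Spark's implementation: the class name `WeightedOnlineSummarizer`, its field names, and the choice to skip zero-weight instances are all assumptions made for illustration. The variance uses the reliability-weights estimator from the linked article, sum(w_i * (x_i - mean)^2) / (V1 - V2 / V1) with V1 = sum(w_i) and V2 = sum(w_i^2), updated incrementally.

```python
from dataclasses import dataclass


@dataclass
class WeightedOnlineSummarizer:
    """Hypothetical single-feature sketch; not Spark's MultivariateOnlineSummarizer."""

    count: int = 0                  # unweighted number of instances seen
    weight_sum: float = 0.0         # V1 = sum of weights
    weight_square_sum: float = 0.0  # V2 = sum of squared weights
    mean: float = 0.0               # running weighted mean
    m2: float = 0.0                 # running sum of w * (x - mean)^2
    nnz: int = 0                    # unweighted number of nonzero values
    weighted_nnz: float = 0.0       # sum of weights of nonzero values

    def add(self, value: float, weight: float = 1.0) -> "WeightedOnlineSummarizer":
        if weight < 0.0:
            raise ValueError("weight must be non-negative")
        if weight == 0.0:
            # Assumption: a zero-weight instance contributes to nothing.
            return self
        self.count += 1
        self.weight_sum += weight
        self.weight_square_sum += weight * weight
        # Incremental weighted update of the mean and the squared deviations.
        delta = value - self.mean
        self.mean += (weight / self.weight_sum) * delta
        self.m2 += weight * delta * (value - self.mean)
        if value != 0.0:
            self.nnz += 1
            self.weighted_nnz += weight
        return self

    @property
    def variance(self) -> float:
        """Unbiased variance estimate under reliability weights."""
        if self.weight_sum == 0.0:
            return 0.0
        denom = self.weight_sum - self.weight_square_sum / self.weight_sum
        return self.m2 / denom if denom > 0.0 else 0.0


summarizer = WeightedOnlineSummarizer()
for x, w in [(0.0, 0.5), (2.0, 2.0), (4.0, 1.5)]:
    summarizer.add(x, w)
print(summarizer.count, summarizer.weight_sum)
print(summarizer.nnz, summarizer.weighted_nnz)
```

For these three instances the unweighted count is 3 while the weight sum is 4.0, and the unweighted nonzero count is 2 while the weighted one is 3.5. Exposing both variants under distinct names is one possible way to resolve the inconsistency; with unit weights the two agree and the variance reduces to the usual sample variance.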