[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088843#comment-13088843 ]
Phil Steitz commented on MATH-449: ---------------------------------- We could allow NaNs in the input vectors and skip updating the bivariate covariances for pairs including a NaN. > Storeless covariance > -------------------- > > Key: MATH-449 > URL: https://issues.apache.org/jira/browse/MATH-449 > Project: Commons Math > Issue Type: Improvement > Reporter: Patrick Meyer > Assignee: Phil Steitz > Fix For: 3.1 > > Attachments: MATH-449.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Currently there is no storeless version for computing the covariance. > However, Pebay (2008) describes algorithms for on-line covariance > computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have > provided a simple class for implementing this algorithm. It would be nice to > have this integrated into org.apache.commons.math.stat.correlation.Covariance. > {code} > //This code is granted for inclusion in the Apache Commons under the terms of > the ASL. > public class StorelessCovariance{ > private double deltaX = 0.0; > private double deltaY = 0.0; > private double meanX = 0.0; > private double meanY = 0.0; > private double N=0; > private Double covarianceNumerator=0.0; > private boolean unbiased=true; > public Covariance(boolean unbiased){ > this.unbiased = unbiased; > } > public void increment(Double x, Double y){ > if(x!=null & y!=null){ > N++; > deltaX = x - meanX; > deltaY = y - meanY; > meanX += deltaX/N; > meanY += deltaY/N; > covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY; > } > > } > public Double getResult(){ > if(unbiased){ > return covarianceNumerator/(N-1.0); > }else{ > return covarianceNumerator/N; > } > } > } > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira