[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088913#comment-13088913
 ] 

Patrick Meyer commented on MATH-449:
------------------------------------

I like that idea. It's probably the best way to handle it. However, looking 
back at the regular Covariance class, I see that it only provides for listwise 
deletion. Should we reconsider the treatment of missing data in Covariance and 
StorelessCovariance so that the two implementations are consistent? We should 
probably give the user an option for the treatment of missing data. The cov() 
function in R has an option for casewise (listwise) or pairwise deletion, but 
it looks like only casewise is available for the Pearson correlation. Missing 
data for Spearman's correlation is handled through the ranking procedure.
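
To make the two treatments concrete, here is a rough sketch of how a storeless
accumulator could offer both. It is only an illustration: the class name, the
use of Double.NaN to mark a missing value, and the boolean "pairwise" flag are
assumptions of mine, not existing Commons Math API. It applies the same update
rule as the class attached to this issue, kept separately for each pair of
variables.

```java
/**
 * Sketch of a storeless covariance matrix with a choice of missing-data
 * treatment. Hypothetical class: the name, the NaN-as-missing convention and
 * the "pairwise" flag are assumptions made for illustration only.
 */
class MissingDataCovarianceSketch {
    private final int dim;
    private final boolean pairwise;  // true: pairwise deletion, false: listwise
    private final long[][] n;        // rows used so far for the pair (i, j)
    private final double[][] meanI;  // running mean of variable i over those rows
    private final double[][] meanJ;  // running mean of variable j over those rows
    private final double[][] num;    // running covariance numerator for (i, j)

    MissingDataCovarianceSketch(int dim, boolean pairwise) {
        this.dim = dim;
        this.pairwise = pairwise;
        this.n = new long[dim][dim];
        this.meanI = new double[dim][dim];
        this.meanJ = new double[dim][dim];
        this.num = new double[dim][dim];
    }

    /** Adds one row of observations; Double.NaN marks a missing value. */
    void increment(double[] row) {
        if (row.length != dim) {
            throw new IllegalArgumentException("expected " + dim + " values");
        }
        if (!pairwise) {
            // Listwise deletion: one missing value discards the whole row.
            for (double v : row) {
                if (Double.isNaN(v)) {
                    return;
                }
            }
        }
        for (int i = 0; i < dim; i++) {
            for (int j = 0; j < dim; j++) {
                // Pairwise deletion: skip only the pairs that touch a missing value.
                if (Double.isNaN(row[i]) || Double.isNaN(row[j])) {
                    continue;
                }
                n[i][j]++;
                double dI = row[i] - meanI[i][j];
                double dJ = row[j] - meanJ[i][j];
                meanI[i][j] += dI / n[i][j];
                meanJ[i][j] += dJ / n[i][j];
                num[i][j] += ((n[i][j] - 1.0) / n[i][j]) * dI * dJ;
            }
        }
    }

    /** Unbiased covariance of variables i and j; NaN with fewer than two rows. */
    double getCovariance(int i, int j) {
        return n[i][j] < 2 ? Double.NaN : num[i][j] / (n[i][j] - 1.0);
    }

    /** Number of rows that contributed to the pair (i, j). */
    long getN(int i, int j) {
        return n[i][j];
    }
}
```

With pairwise deletion each entry can be based on a different number of rows,
so the resulting matrix is not guaranteed to be positive semi-definite. That
may be a reason to keep listwise deletion as the default and make pairwise
deletion opt-in.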

> Storeless covariance
> --------------------
>
>                 Key: MATH-449
>                 URL: https://issues.apache.org/jira/browse/MATH-449
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Patrick Meyer
>            Assignee: Phil Steitz
>             Fix For: 3.1
>
>         Attachments: MATH-449.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. 
> However, Pebay (2008) describes algorithms for on-line covariance 
> computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have 
> provided a simple class for implementing this algorithm. It would be nice to 
> have this integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> // This code is granted for inclusion in the Apache Commons under the terms of
> // the ASL.
> public class StorelessCovariance{
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N=0;
>     private double covarianceNumerator=0.0;
>     private boolean unbiased=true;
>     public StorelessCovariance(boolean unbiased){
>       this.unbiased = unbiased;
>     }
>     public void increment(Double x, Double y){
>         if(x!=null && y!=null){
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX/N;
>             meanY += deltaY/N;
>             covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
>         }
>         
>     }
>     public Double getResult(){
>         if(unbiased){
>             return covarianceNumerator/(N-1.0);
>         }else{
>             return covarianceNumerator/N;
>         }
>     }   
> }
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
