[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-26 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503910#comment-13503910
 ] 

Patrick Meyer commented on MATH-909:


According to the R documentation, the gamma and beta functions are C 
translations of the SLATEC Fortran subroutines, as you suspected. The 
incomplete gamma appears to have a different origin. The documentation also 
relates the pbeta function to the incomplete beta function of Abramowitz and 
Stegun, and it cites different sources depending on whether pbeta is central 
or non-central.

Central pbeta:

Didonato, A. and Morris, A., Jr, (1992) Algorithm 708: Significant digit 
computation of the incomplete beta function ratios, ACM Transactions on 
Mathematical Software, 18, 360–373. (See also
Brown, B. and Lawrence Levy, L. (1994) Certification of algorithm 708: 
Significant digit computation of the incomplete beta, ACM Transactions on 
Mathematical Software, 20, 393–397.)

Non-central pbeta:

Lenth, R. V. (1987) Algorithm AS226: Computing noncentral beta probabilities. 
Appl. Statist, 36, 241–244, incorporating
Frick, H. (1990)'s AS R84, Appl. Statist, 39, 311–2, and
Lam, M.L. (1995)'s AS R95, Appl. Statist, 44, 551–2.

As for test cases, I think we should add one, given the proposed work on the 
underlying incomplete beta function. It does not have to be specific to this 
issue, but having a test in place would be the safe choice.






> FDistribution NoBracketingException in BrentSolver
> --
>
> Key: MATH-909
> URL: https://issues.apache.org/jira/browse/MATH-909
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.0
> Reporter: Patrick Meyer
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> I get an exception when running the code below. The exception is:
> {code}
> function values at endpoints do not have different signs, endpoints: [0, 
> 1.002], values: [-0.025, -∞]
> {code}
> The problematic code:
> {code}
> double df1 = 10675;
> double df2 = 501725;
> FDistribution fDist = new FDistribution(df1, df2);
> System.out.println(fDist.inverseCumulativeProbability(0.025));//NoBracketingException
> {code}
> However, R returns the value 0.9733505. The R code is:
> {code}
> qf(p=.025, df1=10675, df2=501725)
> {code}
> I don't know enough about the FDistribution class to know the solution to the 
> exception, but I thought I would report it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-26 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503818#comment-13503818
 ] 

Patrick Meyer commented on MATH-909:


Ha, you're right! R is less accurate. I checked the value with Stata (code 
listed below) and the result was 0.97307795. I'm satisfied. CM returns a more 
accurate value.

{code}
display invF(10675, 501725, 0.025)
{code}
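
To put the comparison in numbers, the three quantiles quoted in this thread can 
be differenced directly. The sketch below only hard-codes those reported 
values; the class and method names are hypothetical and nothing here calls R, 
Stata, or Commons Math.

```java
// Illustrative only: QuantileGap is a hypothetical helper, not part of Commons Math.
final class QuantileGap {
    // Values quoted in this thread for p = 0.025, df1 = 10675, df2 = 501725.
    static final double R_VALUE = 0.9733505;
    static final double STATA_VALUE = 0.97307795;
    static final double CM_VALUE = 0.9730779455126357;

    // Absolute difference between two reported quantiles.
    static double absDiff(double a, double b) {
        return Math.abs(a - b);
    }

    public static void main(String[] args) {
        System.out.printf("|R - CM|     = %.3e%n", absDiff(R_VALUE, CM_VALUE));
        System.out.printf("|Stata - CM| = %.3e%n", absDiff(STATA_VALUE, CM_VALUE));
    }
}
```

With these inputs the Stata and CM values agree to the eight decimal places 
that Stata printed, while the R value already differs in the fourth decimal 
place.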



[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-26 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503773#comment-13503773
 ] 

Patrick Meyer commented on MATH-909:


I tested it with the most recent version and I also got 0.9730779455126357. The 
exception no longer occurs, but the result still seems to be too different from 
the value reported by R.



[jira] [Comment Edited] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-26 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503773#comment-13503773
 ] 

Patrick Meyer edited comment on MATH-909 at 11/26/12 1:17 PM:
--

I tested it with the most recent version and I also got 0.9730779455126357. The 
exception no longer occurs, but the result still seems to be too different from 
the value reported by R.

  was (Author: meyerjp):
I tested it with the most recent version and I also got 0.9730779455126357. 
The exception no longer occurs, but the result still seems to be to different 
from the value reported by R.
  


[jira] [Updated] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-25 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-909:
---

Description: 
I get an exception when running the code below. The exception is:

{code}
function values at endpoints do not have different signs, endpoints: [0, 
1.002], values: [-0.025, -∞]
{code}

The problematic code:

{code}
double df1 = 10675;
double df2 = 501725;
FDistribution fDist = new FDistribution(df1, df2);
System.out.println(fDist.inverseCumulativeProbability(0.025));//NoBracketingException
{code}

However, R returns the value 0.9733505. The R code is:
{code}
qf(p=.025, df1=10675, df2=501725)
{code}

I don't know enough about the FDistribution class to know the solution to the 
exception, but I thought I would report it.






[jira] [Created] (MATH-909) FDistribution NoBracketingException in BrentSolver

2012-11-25 Thread Patrick Meyer (JIRA)
Patrick Meyer created MATH-909:
--

 Summary: FDistribution NoBracketingException in BrentSolver
 Key: MATH-909
 URL: https://issues.apache.org/jira/browse/MATH-909
 Project: Commons Math
  Issue Type: Bug
Affects Versions: 3.0
Reporter: Patrick Meyer


I get an exception when running the code below. The exception is:

{code}
function values at endpoints do not have different signs, endpoints: [0, 
1.002], values: [-0.025, -∞]
{code}

The problematic code:

{code}
double df1 = 10675;
double df2 = 501725;
FDistribution fDist = new FDistribution(df1, df2);   
System.out.println(fDist.inverseCumulativeProbability(0.025));//NoBracketingException
{code}

However, R returns the value 0.9733505. The R code is:
{code}
qf(p=.025, df1=10675, df2=501725)
{code}

I don't know enough about the FDistribution class to know the solution to the 
exception, but I thought I would report it.
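
For context, the message in the first block is the kind of error a bracketing 
root finder reports when the function has the same sign at both ends of the 
search interval, so a sign change inside the interval is not guaranteed. The 
following is a minimal, self-contained sketch of that check using plain 
bisection. It is not the Commons Math BrentSolver code; the class name, method 
names, and the toy function standing in for cdf(x) - p are all assumptions made 
for illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration; not the Commons Math implementation.
final class BracketDemo {
    // True when f(lo) and f(hi) have strictly opposite signs.
    static boolean isBracketed(DoubleUnaryOperator f, double lo, double hi) {
        double flo = f.applyAsDouble(lo);
        double fhi = f.applyAsDouble(hi);
        return (flo < 0 && fhi > 0) || (flo > 0 && fhi < 0);
    }

    // Plain bisection; refuses to run when the endpoints do not bracket a root.
    static double bisect(DoubleUnaryOperator f, double lo, double hi, double tol) {
        if (!isBracketed(f, lo, hi)) {
            throw new IllegalArgumentException(
                "function values at endpoints do not have different signs");
        }
        boolean increasing = f.applyAsDouble(lo) < 0;
        while (hi - lo > tol) {
            double mid = 0.5 * (lo + hi);
            double fmid = f.applyAsDouble(mid);
            if ((fmid < 0) == increasing) {
                lo = mid;   // sign at mid matches sign at lo: root is above mid
            } else {
                hi = mid;   // otherwise the root is at or below mid
            }
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        // Toy stand-in for cdf(x) - p with p = 0.025: a logistic curve centred at 1.
        DoubleUnaryOperator g = x -> 1.0 / (1.0 + Math.exp(-50.0 * (x - 1.0))) - 0.025;
        System.out.println(bisect(g, 0.0, 2.0, 1e-10));

        // Mimics the reported endpoint values [-0.025, -Infinity]: no sign change.
        DoubleUnaryOperator bad = x -> x == 0.0 ? -0.025 : Double.NEGATIVE_INFINITY;
        System.out.println(isBracketed(bad, 0.0, 1.002)); // false
    }
}
```

In the reported case both endpoint values are negative, so a check of this kind 
fails before any iteration starts.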




[jira] [Commented] (MATH-449) Storeless covariance

2011-08-22 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088913#comment-13088913
 ] 

Patrick Meyer commented on MATH-449:


I like that idea. It's probably the best way to handle it. However, in looking 
back at the regular Covariance class, it only provides for listwise deletion. 
Should we reconsider treatment of missing data in Covariance and 
StorelessCovariance so that the implementations are similar? We should probably 
give the user an option for treatment of missing data. The cov() function in R 
has an option for casewise or pairwise deletion but it looks like only casewise 
is available for the Pearson correlation. Missing data for Spearman's 
correlation is handled through the ranking procedure.
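
As a rough illustration of the two options, the sketch below computes the 
sample covariance of one pair of variables under listwise (casewise) and 
pairwise deletion. It assumes missing values are coded as NaN, uses a simple 
two-pass computation, and all names are hypothetical; it is not the Covariance 
API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of listwise vs. pairwise deletion; NaN marks a missing value.
final class DeletionDemo {
    // True when any variable in the row is missing.
    static boolean hasMissing(double[] row) {
        for (double v : row) {
            if (Double.isNaN(v)) {
                return true;
            }
        }
        return false;
    }

    // Unbiased sample covariance of columns i and j.
    // listwise = true  drops a row if ANY column is missing;
    // listwise = false drops a row only if column i or column j is missing.
    // Needs at least two kept rows; otherwise the result is not meaningful.
    static double covariance(double[][] data, int i, int j, boolean listwise) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : data) {
            boolean drop = listwise
                ? hasMissing(row)
                : (Double.isNaN(row[i]) || Double.isNaN(row[j]));
            if (!drop) {
                kept.add(row);
            }
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (double[] row : kept) {
            meanX += row[i];
            meanY += row[j];
        }
        meanX /= kept.size();
        meanY /= kept.size();
        double sum = 0.0;
        for (double[] row : kept) {
            sum += (row[i] - meanX) * (row[j] - meanY);
        }
        return sum / (kept.size() - 1);
    }

    public static void main(String[] args) {
        double[][] data = {
            {1, 2, 1},
            {2, 4, Double.NaN},   // third variable missing
            {3, 7, 3},
            {4, 8, 4},
        };
        System.out.println("pairwise: " + covariance(data, 0, 1, false));
        System.out.println("listwise: " + covariance(data, 0, 1, true));
    }
}
```

With this toy data, pairwise deletion uses all four rows for the first two 
variables, while listwise deletion discards the second row because the third 
variable is missing, so the two estimates differ.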

> Storeless covariance
> 
>
> Key: MATH-449
> URL: https://issues.apache.org/jira/browse/MATH-449
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Patrick Meyer
> Assignee: Phil Steitz
> Fix For: 3.1
>
> Attachments: MATH-449.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. 
> However, Pebay (2008) describes algorithms for on-line covariance 
> computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have 
> provided a simple class for implementing this algorithm. It would be nice to 
> have this integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> // This code is granted for inclusion in the Apache Commons under the terms of the ASL.
> public class StorelessCovariance {
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N = 0;
>     private Double covarianceNumerator = 0.0;
>     private boolean unbiased = true;
>
>     // The constructor name must match the class name.
>     public StorelessCovariance(boolean unbiased) {
>         this.unbiased = unbiased;
>     }
>
>     // Skips the pair when either value is missing (null).
>     public void increment(Double x, Double y) {
>         if (x != null && y != null) {
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX / N;
>             meanY += deltaY / N;
>             covarianceNumerator += ((N - 1.0) / N) * deltaX * deltaY;
>         }
>     }
>
>     public Double getResult() {
>         if (unbiased) {
>             return covarianceNumerator / (N - 1.0);
>         } else {
>             return covarianceNumerator / N;
>         }
>     }
> }
> {code}





[jira] [Commented] (MATH-449) Storeless covariance

2011-08-22 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088719#comment-13088719
 ] 

Patrick Meyer commented on MATH-449:


I like all of these ideas. When I wrote the patch, I didn't know if forcing a 
square matrix was preferred, so I wrote it more generally. A square matrix is 
fine with me. 

Incrementing the full vector of new values is definitely the safest way to do 
it. However, it forces the user into listwise deletion if a case has any 
missing data. The more granular version allows a user to implement pairwise 
deletion. Neither option is a great way to handle missing data, but do we want 
to force one approach on the user? Is there a way to increment the full vector 
of values and account for missing data on one or more variables?
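
One possible answer to that last question, sketched under assumptions: accept 
the full vector on every increment, but keep a separate count, pair of means, 
and co-moment sum for each pair of variables, and update only the pairs whose 
two values are both present. That gives pairwise deletion through a single 
full-vector call, at the cost of storing per-pair state. Missing values are 
assumed to be NaN and every name below is hypothetical; this is not the patch 
attached to the issue.

```java
// Hypothetical sketch: full-vector increments with per-pair state, so a missing
// value (NaN) only affects the pairs that involve that variable.
final class PairwiseStorelessCovariance {
    private final int dim;
    private final long[][] n;           // rows seen with both i and j present
    private final double[][] meanI;     // running mean of variable i over those rows
    private final double[][] meanJ;     // running mean of variable j over those rows
    private final double[][] comoment;  // running sum of cross-deviations

    PairwiseStorelessCovariance(int dim) {
        this.dim = dim;
        this.n = new long[dim][dim];
        this.meanI = new double[dim][dim];
        this.meanJ = new double[dim][dim];
        this.comoment = new double[dim][dim];
    }

    // Updates the upper triangle (i <= j) using the same one-pass formula as
    // the StorelessCovariance class in the issue description.
    void increment(double[] row) {
        if (row.length != dim) {
            throw new IllegalArgumentException("expected " + dim + " values");
        }
        for (int i = 0; i < dim; i++) {
            for (int j = i; j < dim; j++) {
                if (Double.isNaN(row[i]) || Double.isNaN(row[j])) {
                    continue;
                }
                n[i][j]++;
                double k = n[i][j];
                double dx = row[i] - meanI[i][j];
                double dy = row[j] - meanJ[i][j];
                meanI[i][j] += dx / k;
                meanJ[i][j] += dy / k;
                comoment[i][j] += ((k - 1.0) / k) * dx * dy;
            }
        }
    }

    // Number of rows that contributed to the (i, j) entry.
    long getN(int i, int j) {
        return n[Math.min(i, j)][Math.max(i, j)];
    }

    // Unbiased covariance for the pair; needs at least two contributing rows.
    double getCovariance(int i, int j) {
        int a = Math.min(i, j);
        int b = Math.max(i, j);
        return comoment[a][b] / (n[a][b] - 1.0);
    }
}
```

A caller would feed each row with `increment(row)` and read entries with 
`getCovariance(i, j)`. A known trade-off of pairwise deletion is that different 
entries can be based on different numbers of rows, which `getN(i, j)` exposes.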

Thanks,
Patrick





[jira] [Commented] (MATH-449) Storeless covariance

2011-08-21 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088390#comment-13088390
 ] 

Patrick Meyer commented on MATH-449:


These changes sound fine to me. I'd be happy to add the Javadoc once these 
changes are made. Do I just add the Javadoc comments to the class files? Will 
Subversion pick up changes that only touch comments?

Thanks,
Patrick





[jira] [Updated] (MATH-449) Storeless covariance

2011-08-17 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-449:
---

Attachment: MATH-449.patch

This patch includes three new classes: StorelessCovariance.java, 
StorelessCovarianceMatrix.java, and StorelessCovarianceTest.java. For the test 
cases, I used the same data as in CovarianceTest.java. However, I loosened the 
tolerance to 10E-7 because the tests failed on the Longley data when using 
10E-9.

> Storeless covariance
> 
>
> Key: MATH-449
> URL: https://issues.apache.org/jira/browse/MATH-449
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Patrick Meyer
> Fix For: 3.1
>
> Attachments: MATH-449.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h




[jira] [Closed] (MATH-647) MATH-449

2011-08-17 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer closed MATH-647.
--

Resolution: Not A Problem

Error. This issue should not have been created.

> MATH-449
> 
>
> Key: MATH-647
> URL: https://issues.apache.org/jira/browse/MATH-647
> Project: Commons Math
>  Issue Type: Bug
>Reporter: Patrick Meyer
>






[jira] [Created] (MATH-647) MATH-449

2011-08-17 Thread Patrick Meyer (JIRA)
MATH-449


 Key: MATH-647
 URL: https://issues.apache.org/jira/browse/MATH-647
 Project: Commons Math
  Issue Type: Bug
Reporter: Patrick Meyer








[jira] [Commented] (MATH-449) Storeless covariance

2011-06-15 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050013#comment-13050013
 ] 

Patrick Meyer commented on MATH-449:


I agree. A new class would be best. Now that I am more familiar with Commons 
Math, my code should be changed to use double primitives instead of Double 
objects. That seems more consistent with the other descriptive statistics in 
Commons Math.
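
A sketch of what that change might look like, assuming the same update formula 
as the class in the issue description but with primitive fields, a long 
counter, and a constructor whose name matches the class. The class name is 
hypothetical and this is an illustration, not the code that was eventually 
committed.

```java
// Hypothetical primitive-based variant of the class from the issue description.
final class StorelessCovarianceSketch {
    private double meanX;
    private double meanY;
    private long n;
    private double covarianceNumerator;
    private final boolean unbiased;

    StorelessCovarianceSketch(boolean unbiased) {
        this.unbiased = unbiased;
    }

    // One-pass update: deviations are taken from the means BEFORE this observation.
    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX;
        double deltaY = y - meanY;
        meanX += deltaX / n;
        meanY += deltaY / n;
        covarianceNumerator += ((n - 1.0) / n) * deltaX * deltaY;
    }

    long getN() {
        return n;
    }

    // Divides by n - 1 (unbiased) or n (biased); needs n >= 2 for the unbiased case.
    double getResult() {
        return unbiased ? covarianceNumerator / (n - 1.0) : covarianceNumerator / n;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 7, 8};
        StorelessCovarianceSketch cov = new StorelessCovarianceSketch(true);
        for (int i = 0; i < x.length; i++) {
            cov.increment(x[i], y[i]);
        }
        System.out.println(cov.getResult()); // 3.5
    }
}
```

With primitives there is no null to test for, so a caller who needs to skip 
missing values would have to do so before calling increment, for example by 
checking for NaN.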

> Storeless covariance
> 
>
> Key: MATH-449
> URL: https://issues.apache.org/jira/browse/MATH-449
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Patrick Meyer
> Fix For: 3.0
>
> Original Estimate: 168h
> Remaining Estimate: 168h




[jira] [Commented] (MATH-449) Storeless covariance

2011-06-15 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049943#comment-13049943
 ] 

Patrick Meyer commented on MATH-449:


I've added the comment to the code. If you have better language for the 
comment, please send it to me and I will include it.

Do you have any suggestions for how to best integrate this code into the 
Covariance class? It's not so easy given that the class allows for computation 
of a covariance matrix.





[jira] [Updated] (MATH-449) Storeless covariance

2011-06-15 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-449:
---

Description: 
Currently there is no storeless version for computing the covariance. However, 
Pebay (2008) describes algorithms for on-line covariance computations, 
[http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a 
simple class for implementing this algorithm. It would be nice to have this 
integrated into org.apache.commons.math.stat.correlation.Covariance.

{code}
// This code is granted for inclusion in the Apache Commons under the terms of the ASL.

public class StorelessCovariance {

    private double deltaX = 0.0;
    private double deltaY = 0.0;
    private double meanX = 0.0;
    private double meanY = 0.0;
    private double N = 0;
    private Double covarianceNumerator = 0.0;
    private boolean unbiased = true;

    // The constructor name must match the class name.
    public StorelessCovariance(boolean unbiased) {
        this.unbiased = unbiased;
    }

    // Skips the pair when either value is missing (null).
    public void increment(Double x, Double y) {
        if (x != null && y != null) {
            N++;
            deltaX = x - meanX;
            deltaY = y - meanY;
            meanX += deltaX / N;
            meanY += deltaY / N;
            covarianceNumerator += ((N - 1.0) / N) * deltaX * deltaY;
        }
    }

    public Double getResult() {
        if (unbiased) {
            return covarianceNumerator / (N - 1.0);
        } else {
            return covarianceNumerator / N;
        }
    }
}
{code}

  was:
Currently there is no storeless version for computing the covariance. However, 
Pebay (2008) describes algorithms for on-line covariance computations, 
[http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a 
simple class for implementing this algorithm. It would be nice to have this 
integrated into org.apache.commons.math.stat.correlation.Covariance.

{code}
public class StorelessCovariance{

private double deltaX = 0.0;
private double deltaY = 0.0;
private double meanX = 0.0;
private double meanY = 0.0;
private double N=0;
private Double covarianceNumerator=0.0;
private boolean unbiased=true;

public Covariance(boolean unbiased){
this.unbiased = unbiased;
}

public void increment(Double x, Double y){
if(x!=null & y!=null){
N++;
deltaX = x - meanX;
deltaY = y - meanY;
meanX += deltaX/N;
meanY += deltaY/N;
covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
}

}

public Double getResult(){
if(unbiased){
return covarianceNumerator/(N-1.0);
}else{
return covarianceNumerator/N;
}
}   
}
{code}






[jira] Commented: (MATH-473) Frequency: new option: NON-sorted

2011-01-13 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981476#action_12981476
 ] 

Patrick Meyer commented on MATH-473:


I too like Phil's idea for decoupling the interface and implementation.

> Frequency: new option: NON-sorted
> -
>
> Key: MATH-473
> URL: https://issues.apache.org/jira/browse/MATH-473
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 1.0, 1.1, 1.2, 2.0, 2.1
> Reporter: Dan Checkoway
> Fix For: 3.0
>
>
> I have a request for enhancement on org.apache.commons.math.stat.Frequency.  
> I would like to be able to specify that the backing map NOT be sorted.  
> Right now it uses TreeMap.  I would like to have the option of specifying 
> that sorting is not important, and would in fact hinder performance, and a 
> plain old HashMap should be used instead.
> i.e. constructor such as:
> public Frequency(boolean sorted);
> If sorted is true, use a TreeMap.  If sorted is false, use a HashMap.  Is 
> this feasible?  I'd be happy to contribute a patch if that would help.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MATH-473) Frequency: new option: NON-sorted

2011-01-13 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981240#action_12981240
 ] 

Patrick Meyer commented on MATH-473:


I think you might encounter problems with the cumulative count and percentages 
if there is no sorting. Your data may only be nominal and thus have no need for 
sorting, but I think the changes would require more than switching between a 
TreeMap and HashMap. Specifically, thought would need to be given to the 
cumulative count and percentage methods. In terms of performance, I don't think 
there would be a perceptible difference between a TreeMap and a HashMap.

Also, I think you could simply write your own class that implements the 
Comparable interface. That way you could define any type of sorting you would 
like.
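
To make the cumulative-count concern concrete, here is a minimal sketch that uses only java.util. It is not the Commons Math Frequency class; the name OrderedCounts and its methods are hypothetical. A cumulative count is only defined relative to an ordering of the keys, which a TreeMap provides and a HashMap does not. Supplying a Comparator (or using keys that implement Comparable) lets the caller choose that ordering.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration; not the Commons Math Frequency class.
class OrderedCounts<T> {
    // The comparator fixes the order that "cumulative" refers to.
    private final TreeMap<T, Long> counts;

    OrderedCounts(Comparator<? super T> order) {
        this.counts = new TreeMap<>(order);
    }

    void add(T value) {
        counts.merge(value, 1L, Long::sum);
    }

    // Sum of the counts of all keys ordered at or before 'value'.
    long cumulativeCount(T value) {
        long total = 0;
        for (long c : counts.headMap(value, true).values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> levels = Arrays.asList("low", "medium", "high");
        Comparator<String> bySeverity = Comparator.comparingInt(levels::indexOf);
        OrderedCounts<String> severity = new OrderedCounts<>(bySeverity);
        OrderedCounts<String> alphabetical =
                new OrderedCounts<>(Comparator.<String>naturalOrder());
        for (String s : new String[] {"high", "low", "low", "medium"}) {
            severity.add(s);
            alphabetical.add(s);
        }
        System.out.println(severity.cumulativeCount("low"));      // 2
        System.out.println(alphabetical.cumulativeCount("low"));  // 3 ("high" sorts first)
    }
}
```

The same data gives different cumulative counts under the two orderings, which is why an unsorted backing map would leave the cumulative methods without a well-defined answer.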

> Frequency: new option: NON-sorted
> -
>
> Key: MATH-473
> URL: https://issues.apache.org/jira/browse/MATH-473
> Project: Commons Math
>  Issue Type: Improvement
>Reporter: Dan Checkoway
>
> I have a request for enhancement on org.apache.commons.math.stat.Frequency.  
> I would like to be able to specify that the backing map NOT be sorted.
> Right now it uses TreeMap.  I would like to have the option of specifying 
> that sorting is not important, and would in fact hinder performance, and a 
> plain old HashMap should be used instead.
> i.e. constructor such as:
> public Frequency(boolean sorted);
> If sorted is true, use a TreeMap.  If sorted is false, use a HashMap.  Is 
> this feasible?  I'd be happy to contribute a patch if that would help.




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-09 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: MATH-448.patch

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: MATH-448.patch, MATH-448.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Commented: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968555#action_12968555
 ] 

Patrick Meyer commented on MATH-448:


All right, maybe I have it right this time. I used Eclipse to generate the 
patch and it seems to have the right information in it. The latest patch file 
(MATH-448.patch) is now attached to this issue.

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: MATH-448.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: MATH-448.patch

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: MATH-448.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: (was: Frequency.diff)

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: MATH-448.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Commented: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967252#action_12967252
 ] 

Patrick Meyer commented on MATH-448:


OK, I uploaded a new patch that was created using the "Export Diff Patch" 
option in Netbeans. Is this the format you need? (This is the first patch I've 
tried to submit - thanks for your patience.)

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: Frequency.diff
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: Frequency.diff

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: Frequency.diff
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: (was: Frequency.diff)

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: Frequency.diff
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Updated: (MATH-448) Frequency get number of unique values

2010-12-06 Thread Patrick Meyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Meyer updated MATH-448:
---

Attachment: Frequency.diff

> Frequency get number of unique values
> -
>
> Key: MATH-448
> URL: https://issues.apache.org/jira/browse/MATH-448
> Project: Commons Math
>  Issue Type: New Feature
>Reporter: Patrick Meyer
>Priority: Minor
> Attachments: Frequency.diff
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> It is often useful to know the number of unique elements in a frequency 
> table. Could you add a simple method that returns the size of freqTable. It 
> seems like it would be as simple as:
> {code}
> public int getUniqueCount(){
>  return freqTable.size();
> }
> {code}
> Given that freqTable is private, there is no way to extend the class and add 
> this method. Thanks!




[jira] Created: (MATH-449) Storeless covariance

2010-12-01 Thread Patrick Meyer (JIRA)
Storeless covariance


 Key: MATH-449
 URL: https://issues.apache.org/jira/browse/MATH-449
 Project: Commons Math
  Issue Type: Improvement
Reporter: Patrick Meyer


Currently there is no storeless version for computing the covariance. However, 
Pebay (2008) describes algorithms for on-line covariance computations, 
[http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a 
simple class for implementing this algorithm. It would be nice to have this 
integrated into org.apache.commons.math.stat.correlation.Covariance.

{code}
public class StorelessCovariance{

    private double deltaX = 0.0;
    private double deltaY = 0.0;
    private double meanX = 0.0;
    private double meanY = 0.0;
    private double N=0;
    private double covarianceNumerator=0.0;
    private boolean unbiased=true;

    // The constructor name must match the class name.
    public StorelessCovariance(boolean unbiased){
        this.unbiased = unbiased;
    }

    // Updates the running means and the co-moment with one (x, y) pair.
    // Pairs with a null value are skipped.
    public void increment(Double x, Double y){
        if(x!=null && y!=null){
            N++;
            // Deviations are taken from the means of the previous N-1 pairs.
            deltaX = x - meanX;
            deltaY = y - meanY;
            meanX += deltaX/N;
            meanY += deltaY/N;
            covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
        }
    }

    // Divides the co-moment by N-1 (unbiased) or by N (biased).
    public Double getResult(){
        if(unbiased){
            return covarianceNumerator/(N-1.0);
        }else{
            return covarianceNumerator/N;
        }
    }
}
{code}
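
As a sanity check on the update rule, the sketch below compares a one-pass accumulation against a conventional two-pass covariance on a small data set. It is a self-contained illustration under my own assumptions: the class name RunningCovariance, the twoPass helper, and the use of primitive double arguments are hypothetical and not part of Commons Math.

```java
// Hypothetical sketch: one-pass covariance update checked against a two-pass computation.
class RunningCovariance {
    private double meanX;
    private double meanY;
    private double numerator;  // running sum of co-deviations
    private long n;

    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX;  // deviation from the mean of the previous n-1 values
        double deltaY = y - meanY;
        meanX += deltaX / n;
        meanY += deltaY / n;
        numerator += ((n - 1.0) / n) * deltaX * deltaY;
    }

    double getResult(boolean unbiased) {
        return unbiased ? numerator / (n - 1.0) : numerator / n;
    }

    // Reference implementation: compute the means first, then sum the products of deviations.
    static double twoPass(double[] x, double[] y, boolean unbiased) {
        int len = x.length;
        double mx = 0.0;
        double my = 0.0;
        for (int i = 0; i < len; i++) {
            mx += x[i];
            my += y[i];
        }
        mx /= len;
        my /= len;
        double sum = 0.0;
        for (int i = 0; i < len; i++) {
            sum += (x[i] - mx) * (y[i] - my);
        }
        return unbiased ? sum / (len - 1.0) : sum / len;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 9};
        RunningCovariance rc = new RunningCovariance();
        for (int i = 0; i < x.length; i++) {
            rc.increment(x[i], y[i]);
        }
        // Both lines should print the same value up to rounding (11.5 / 3 for this data).
        System.out.println(rc.getResult(true));
        System.out.println(twoPass(x, y, true));
    }
}
```

For this data the co-moment accumulates as 0, 1, 4, 11.5 over the four increments, so the unbiased result is 11.5/3 and the biased result is 11.5/4.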




[jira] Created: (MATH-448) Frequency get number of unique values

2010-12-01 Thread Patrick Meyer (JIRA)
Frequency get number of unique values
-

 Key: MATH-448
 URL: https://issues.apache.org/jira/browse/MATH-448
 Project: Commons Math
  Issue Type: New Feature
Reporter: Patrick Meyer
Priority: Minor


It is often useful to know the number of unique elements in a frequency table. 
Could you add a simple method that returns the size of freqTable. It seems like 
it would be as simple as:

{code}
public int getUniqueCount(){
 return freqTable.size();
}
{code}

Given that freqTable is private, there is no way to extend the class and add 
this method. Thanks!
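
For illustration only, here is a minimal stand-in that shows the idea: when the table is a map from value to count, the number of unique values is just the map's size. The class SimpleFrequency and its methods are hypothetical and use only java.util; this is not the Commons Math Frequency class or its internals.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a frequency table; not the Commons Math Frequency class.
class SimpleFrequency<T extends Comparable<T>> {
    // Private, as in the request above, so subclasses cannot reach it directly.
    private final TreeMap<T, Long> freqTable = new TreeMap<>();

    void addValue(T value) {
        freqTable.merge(value, 1L, Long::sum);
    }

    long getCount(T value) {
        return freqTable.getOrDefault(value, 0L);
    }

    // The number of distinct values seen so far is the number of keys in the table.
    int getUniqueCount() {
        return freqTable.size();
    }

    public static void main(String[] args) {
        SimpleFrequency<String> f = new SimpleFrequency<>();
        f.addValue("a");
        f.addValue("b");
        f.addValue("a");
        System.out.println(f.getUniqueCount());  // 2
        System.out.println(f.getCount("a"));     // 2
    }
}
```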
