[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver
[ https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503910#comment-13503910 ]

Patrick Meyer commented on MATH-909:

According to the R documentation, the gamma and beta functions are C translations of the SLATEC Fortran subroutines, as you suspected. The incomplete gamma appears to have a different origin.

According to the R documentation, the pbeta function is related to the incomplete beta function of Abramowitz and Stegun. They cite two different sources for the function depending on whether it is a central or non-central pbeta.

Central pbeta: Didonato, A. and Morris, A., Jr, (1992) Algorithm 708: Significant digit computation of the incomplete beta function ratios, ACM Transactions on Mathematical Software, 18, 360–373. (See also Brown, B. and Lawrence Levy, L. (1994) Certification of algorithm 708: Significant digit computation of the incomplete beta, ACM Transactions on Mathematical Software, 20, 393–397.)

Non-central pbeta: Lenth, R. V. (1987) Algorithm AS226: Computing noncentral beta probabilities. Appl. Statist, 36, 241–244, incorporating Frick, H. (1990)'s AS R84, Appl. Statist, 39, 311–2, and Lam, M.L. (1995)'s AS R95, Appl. Statist, 44, 551–2.

As far as test cases go, I think we should include a test case, given the proposed work on the underlying incomplete beta function. The test case does not have to be specific to this issue, but it would be safe to include a test.

> FDistribution NoBracketingException in BrentSolver
> --------------------------------------------------
>
> Key: MATH-909
> URL: https://issues.apache.org/jira/browse/MATH-909
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.0
> Reporter: Patrick Meyer
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> I get an exception when running the code below. The exception is
> {code}
> function values at endpoints do not have different signs, endpoints: [0, 1.002], values: [-0.025, -∞]
> {code}
> The problematic code:
> {code}
> double df1 = 10675;
> double df2 = 501725;
> FDistribution fDist = new FDistribution(df1, df2);
> System.out.println(fDist.inverseCumulativeProbability(0.025)); // NoBracketingException
> {code}
> However, R returns the value 0.9733505. The R code is:
> {code}
> qf(p=.025, df1=10675, df2=501725)
> {code}
> I don't know enough about the FDistribution class to know the solution to the exception, but I thought I would report it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver
[ https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503818#comment-13503818 ]

Patrick Meyer commented on MATH-909:

Ha, you're right! R is less accurate. I checked the value with Stata (code listed below) and the result was 0.97307795. I'm satisfied. CM returns a more accurate value.
{code}
display invF(10675, 501725, 0.025)
{code}
[jira] [Commented] (MATH-909) FDistribution NoBracketingException in BrentSolver
[ https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503773#comment-13503773 ]

Patrick Meyer commented on MATH-909:

I tested it with the most recent version and I also got 0.9730779455126357. The exception no longer occurs, but the result still seems to be too different from the value reported by R.
[jira] [Comment Edited] (MATH-909) FDistribution NoBracketingException in BrentSolver
[ https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503773#comment-13503773 ]

Patrick Meyer edited comment on MATH-909 at 11/26/12 1:17 PM:

I tested it with the most recent version and I also got 0.9730779455126357. The exception no longer occurs, but the result still seems to be too different from the value reported by R.

was (Author: meyerjp):
I tested it with the most recent version and I also got 0.9730779455126357. The exception no longer occurs, but the result still seems to be to different from the value reported by R.
[jira] [Updated] (MATH-909) FDistribution NoBracketingException in BrentSolver
[ https://issues.apache.org/jira/browse/MATH-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Meyer updated MATH-909:

Description:
I get an exception when running the code below. The exception is
{code}
function values at endpoints do not have different signs, endpoints: [0, 1.002], values: [-0.025, -∞]
{code}
The problematic code:
{code}
double df1 = 10675;
double df2 = 501725;
FDistribution fDist = new FDistribution(df1, df2);
System.out.println(fDist.inverseCumulativeProbability(0.025)); // NoBracketingException
{code}
However, R returns the value 0.9733505. The R code is:
{code}
qf(p=.025, df1=10675, df2=501725)
{code}
I don't know enough about the FDistribution class to know the solution to the exception, but I thought I would report it.

was:
I get an exception when running the code below. The exception is
{code}
function values at endpoints do not have different signs, endpoints: [0, 1.002], values: [-0.025, -∞]
{code}
The problematic code:
{code}
double df1 = 10675;
double df2 = 501725;
FDistribution fDist = new FDistribution(df1, df2);
System.out.println(fDist.inverseCumulativeProbability(0.025)); // NoBracketingException
{code}
However, R returns the value 0.9733505. The R code is:
{code}
qf(p=.025, df1=10675, df2=501725)
{code}
I don't know enough about the FDistribution class to know the solution to the exception, but I thought I would report it.
[jira] [Created] (MATH-909) FDistribution NoBracketingException in BrentSolver
Patrick Meyer created MATH-909:

Summary: FDistribution NoBracketingException in BrentSolver
Key: MATH-909
URL: https://issues.apache.org/jira/browse/MATH-909
Project: Commons Math
Issue Type: Bug
Affects Versions: 3.0
Reporter: Patrick Meyer

I get an exception when running the code below. The exception is
{code}
function values at endpoints do not have different signs, endpoints: [0, 1.002], values: [-0.025, -∞]
{code}
The problematic code:
{code}
double df1 = 10675;
double df2 = 501725;
FDistribution fDist = new FDistribution(df1, df2);
System.out.println(fDist.inverseCumulativeProbability(0.025)); // NoBracketingException
{code}
However, R returns the value 0.9733505. The R code is:
{code}
qf(p=.025, df1=10675, df2=501725)
{code}
I don't know enough about the FDistribution class to know the solution to the exception, but I thought I would report it.
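The message in the report above comes from a bracketing check: a bracketing root finder can only start when the function takes opposite signs at the two endpoints. The following is a minimal sketch of that check, not the Commons Math BrentSolver. The class `BracketingSketch`, its methods, and both stand-in functions are hypothetical, and plain bisection is used in place of Brent's method. The second function only reproduces the reported endpoint values (-0.025 and -∞); why FDistribution produced them is not established here.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; names are hypothetical. */
final class BracketingSketch {

    /** True when f(lo) and f(hi) have strictly opposite signs. */
    static boolean isBracketing(DoubleUnaryOperator f, double lo, double hi) {
        double fLo = f.applyAsDouble(lo);
        double fHi = f.applyAsDouble(hi);
        return (fLo < 0 && fHi > 0) || (fLo > 0 && fHi < 0);
    }

    /** Plain bisection that refuses to start without a sign change. */
    static double bisect(DoubleUnaryOperator f, double lo, double hi, double tol) {
        if (!isBracketing(f, lo, hi)) {
            throw new IllegalArgumentException(
                "function values at endpoints do not have different signs");
        }
        boolean increasing = f.applyAsDouble(lo) < 0;
        while (hi - lo > tol) {
            double mid = 0.5 * (lo + hi);
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0) {
                return mid;
            }
            // Keep the half-interval that still contains the sign change.
            if ((fMid < 0) == increasing) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        double p = 0.025;
        // Hypothetical stand-in for cdf(x) - p, with cdf(x) = x^2 on [0, 1].
        DoubleUnaryOperator wellBehaved = x -> x * x - p;
        System.out.println(bisect(wellBehaved, 0.0, 1.0, 1e-12));

        // Stand-in that evaluates to -infinity at the upper endpoint,
        // mimicking the reported values [-0.025, -infinity].
        DoubleUnaryOperator broken =
            x -> x < 1.0 ? x * x - p : Double.NEGATIVE_INFINITY;
        System.out.println(isBracketing(broken, 0.0, 1.002));
    }
}
```

With both endpoint values negative, no sign change exists, so any solver that validates the bracket first will reject the interval before iterating.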
[jira] [Commented] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088913#comment-13088913 ]

Patrick Meyer commented on MATH-449:

I like that idea. It's probably the best way to handle it. However, in looking back at the regular Covariance class, it only provides for listwise deletion. Should we reconsider treatment of missing data in Covariance and StorelessCovariance so that the implementations are similar? We should probably give the user an option for treatment of missing data. The cov() function in R has an option for casewise or pairwise deletion but it looks like only casewise is available for the Pearson correlation. Missing data for Spearman's correlation is handled through the ranking procedure.

> Storeless covariance
> --------------------
>
> Key: MATH-449
> URL: https://issues.apache.org/jira/browse/MATH-449
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Patrick Meyer
> Assignee: Phil Steitz
> Fix For: 3.1
> Attachments: MATH-449.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. However, Pebay (2008) describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a simple class for implementing this algorithm. It would be nice to have this integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> //This code is granted for inclusion in the Apache Commons under the terms of the ASL.
> public class StorelessCovariance{
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N=0;
>     private Double covarianceNumerator=0.0;
>     private boolean unbiased=true;
>
>     // unbiased: divide by N-1 when true, by N when false
>     public StorelessCovariance(boolean unbiased){
>         this.unbiased = unbiased;
>     }
>
>     // Pairs with a missing (null) value are skipped.
>     public void increment(Double x, Double y){
>         if(x!=null && y!=null){
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX/N;
>             meanY += deltaY/N;
>             covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
>         }
>     }
>
>     public Double getResult(){
>         if(unbiased){
>             return covarianceNumerator/(N-1.0);
>         }else{
>             return covarianceNumerator/N;
>         }
>     }
> }
> {code}
[jira] [Commented] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088719#comment-13088719 ]

Patrick Meyer commented on MATH-449:

I like all of these ideas. When I wrote the patch, I didn't know if forcing a square matrix was preferred, so I wrote it more generally. A square matrix is fine with me.

Incrementing the full vector of new values is definitely the safest way to do it. However, it forces the user into listwise deletion if a case has any missing data. The more granular version allows a user to implement pairwise deletion. Neither option is a great way to handle missing data, but do we want to force one approach on the user? Is there a way to increment the full vector of values and account for missing data on one or more variables?

Thanks,
Patrick
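The difference between listwise and pairwise deletion raised in this comment can be sketched as follows. This is an illustration only, assuming missing values are coded as NaN; `MissingDataSketch` and its methods are hypothetical names and are not part of Commons Math or of MATH-449.patch. A simple two-pass covariance is used so that the deletion rule is the only thing that varies.

```java
/** Illustrative sketch only; names are hypothetical. */
final class MissingDataSketch {

    /** Listwise deletion: keep a case only if every variable is observed. */
    static boolean[] listwise(double[]... vars) {
        boolean[] keep = new boolean[vars[0].length];
        for (int i = 0; i < keep.length; i++) {
            keep[i] = true;
            for (double[] v : vars) {
                if (Double.isNaN(v[i])) {
                    keep[i] = false;
                }
            }
        }
        return keep;
    }

    /** Pairwise deletion: only the two variables being paired must be observed. */
    static boolean[] pairwise(double[] x, double[] y) {
        return listwise(x, y);
    }

    /** Two-pass unbiased covariance of x and y over the kept cases. */
    static double covariance(double[] x, double[] y, boolean[] keep) {
        int n = 0;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (keep[i]) {
                n++;
                sumX += x[i];
                sumY += y[i];
            }
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double numerator = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (keep[i]) {
                numerator += (x[i] - meanX) * (y[i] - meanY);
            }
        }
        return numerator / (n - 1.0);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 9};
        double[] z = {5, Double.NaN, 7, 8}; // second case is missing on z only

        // Pairwise: cov(x, y) uses all four cases (about 3.83).
        System.out.println(covariance(x, y, pairwise(x, y)));
        // Listwise: the second case is dropped for every pair (about 5.33).
        System.out.println(covariance(x, y, listwise(x, y, z)));
    }
}
```

The same pair of variables yields two different estimates depending on the rule, which is why the choice matters when a full vector of values is incremented at once.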
[jira] [Commented] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088390#comment-13088390 ]

Patrick Meyer commented on MATH-449:

These changes sound fine to me. I'd be happy to add the javadoc once these changes are made. Do I just add the javadoc comments to the class files? Will subversion pick up the changes on comments?

Thanks,
Patrick
[jira] [Updated] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Meyer updated MATH-449:

Attachment: MATH-449.patch

This patch includes three new classes, StorelessCovariance.java, StorelessCovarianceMatrix.java, and StorelessCovarianceTest.java. For the test cases, I used the same data as in CovarianceTest.java. However, I reduced the accuracy to 10E-7 because the tests failed the Longley data when using 10E-9.
[jira] [Closed] (MATH-647) MATH-449
[ https://issues.apache.org/jira/browse/MATH-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Meyer closed MATH-647.

Resolution: Not A Problem

Error. This issue should not have been created.

> MATH-449
> --------
>
> Key: MATH-647
> URL: https://issues.apache.org/jira/browse/MATH-647
> Project: Commons Math
> Issue Type: Bug
> Reporter: Patrick Meyer
[jira] [Created] (MATH-647) MATH-449
MATH-449

Key: MATH-647
URL: https://issues.apache.org/jira/browse/MATH-647
Project: Commons Math
Issue Type: Bug
Reporter: Patrick Meyer
[jira] [Commented] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050013#comment-13050013 ]

Patrick Meyer commented on MATH-449:

I agree. A new class would be best. Now that I am more familiar with commons math, my code should be changed to use double primitive types instead of Double objects. That use seems more consistent with other descriptive statistics in math.
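A sketch of what the primitive-double version mentioned in this comment might look like, using the same running update as the snippet in the issue description. The class name `StorelessCovarianceSketch` is hypothetical, and this is not the code from MATH-449.patch. One consequence of the change is assumed here rather than taken from the thread: with primitives a missing value can no longer be passed as null, so callers would have to skip such pairs themselves.

```java
/** Illustrative sketch only; not the class attached to MATH-449. */
final class StorelessCovarianceSketch {
    private double meanX;
    private double meanY;
    private long n;
    private double covarianceNumerator;
    private final boolean unbiased;

    /** @param unbiased divide by n - 1 when true, by n when false */
    StorelessCovarianceSketch(boolean unbiased) {
        this.unbiased = unbiased;
    }

    /** Folds one (x, y) pair into the running means and co-moment. */
    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX; // deviation from the previous mean of x
        double deltaY = y - meanY; // deviation from the previous mean of y
        meanX += deltaX / n;
        meanY += deltaY / n;
        covarianceNumerator += ((n - 1.0) / n) * deltaX * deltaY;
    }

    /** Covariance of the pairs seen so far; NaN until enough pairs arrive. */
    double getResult() {
        return unbiased ? covarianceNumerator / (n - 1.0)
                        : covarianceNumerator / n;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 9};
        StorelessCovarianceSketch cov = new StorelessCovarianceSketch(true);
        for (int i = 0; i < x.length; i++) {
            cov.increment(x[i], y[i]);
        }
        System.out.println(cov.getResult());
    }
}
```

No observation is stored: each pair only updates two means, a count, and the accumulated numerator.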
[jira] [Commented] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049943#comment-13049943 ]

Patrick Meyer commented on MATH-449:

I've added the comment to the code. If you have better language for the comment, please send it to me and I will include it.

Do you have any suggestions for how to best integrate this code into the Covariance class? It's not so easy given that the class allows for computation of a covariance matrix.
[jira] [Updated] (MATH-449) Storeless covariance
[ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Meyer updated MATH-449:

Description:
Currently there is no storeless version for computing the covariance. However, Pebay (2008) describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a simple class for implementing this algorithm. It would be nice to have this integrated into org.apache.commons.math.stat.correlation.Covariance.
{code}
//This code is granted for inclusion in the Apache Commons under the terms of the ASL.
public class StorelessCovariance{
    private double deltaX = 0.0;
    private double deltaY = 0.0;
    private double meanX = 0.0;
    private double meanY = 0.0;
    private double N=0;
    private Double covarianceNumerator=0.0;
    private boolean unbiased=true;

    // unbiased: divide by N-1 when true, by N when false
    public StorelessCovariance(boolean unbiased){
        this.unbiased = unbiased;
    }

    // Pairs with a missing (null) value are skipped.
    public void increment(Double x, Double y){
        if(x!=null && y!=null){
            N++;
            deltaX = x - meanX;
            deltaY = y - meanY;
            meanX += deltaX/N;
            meanY += deltaY/N;
            covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
        }
    }

    public Double getResult(){
        if(unbiased){
            return covarianceNumerator/(N-1.0);
        }else{
            return covarianceNumerator/N;
        }
    }
}
{code}

was:
Currently there is no storeless version for computing the covariance. However, Pebay (2008) describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a simple class for implementing this algorithm. It would be nice to have this integrated into org.apache.commons.math.stat.correlation.Covariance.
{code}
public class StorelessCovariance{
    private double deltaX = 0.0;
    private double deltaY = 0.0;
    private double meanX = 0.0;
    private double meanY = 0.0;
    private double N=0;
    private Double covarianceNumerator=0.0;
    private boolean unbiased=true;

    // unbiased: divide by N-1 when true, by N when false
    public StorelessCovariance(boolean unbiased){
        this.unbiased = unbiased;
    }

    // Pairs with a missing (null) value are skipped.
    public void increment(Double x, Double y){
        if(x!=null && y!=null){
            N++;
            deltaX = x - meanX;
            deltaY = y - meanY;
            meanX += deltaX/N;
            meanY += deltaY/N;
            covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
        }
    }

    public Double getResult(){
        if(unbiased){
            return covarianceNumerator/(N-1.0);
        }else{
            return covarianceNumerator/N;
        }
    }
}
{code}
[jira] Commented: (MATH-473) Frequency: new option: NON-sorted
[ https://issues.apache.org/jira/browse/MATH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981476#action_12981476 ]

Patrick Meyer commented on MATH-473:

I too like Phil's idea for decoupling the interface and implementation.

> Frequency: new option: NON-sorted
> ---------------------------------
>
> Key: MATH-473
> URL: https://issues.apache.org/jira/browse/MATH-473
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 1.0, 1.1, 1.2, 2.0, 2.1
> Reporter: Dan Checkoway
> Fix For: 3.0
>
> I have a request for enhancement on org.apache.commons.math.stat.Frequency. I would like to be able to specify that the backing map NOT be sorted. Right now it uses TreeMap. I would like to have the option of specifying that sorting is not important, and would in fact hinder performance, and a plain old HashMap should be used instead.
> i.e. constructor such as:
> public Frequency(boolean sorted);
> If sorted is true, use a TreeMap. If sorted is false, use a HashMap. Is this feasible? I'd be happy to contribute a patch if that would help.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MATH-473) Frequency: new option: NON-sorted
[ https://issues.apache.org/jira/browse/MATH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981240#action_12981240 ]

Patrick Meyer commented on MATH-473:

I think you might encounter problems with the cumulative count and percentages if there is no sorting. Your data may only be nominal and thus have no need for sorting, but I think the changes would require more than switching between a TreeMap and HashMap. Specifically, thought would need to be given to the cumulative count and percentage methods.

In terms of performance, I don't think there would be a perceptible difference between a TreeMap and a HashMap. Also, I think you could simply write your own class that implements the Comparable interface. That way you could define any type of sorting you would like.
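To illustrate the concern about cumulative counts in this comment, here is a minimal sketch of how a cumulative count could be computed over a sorted and an unsorted frequency table. It is not the implementation of org.apache.commons.math.stat.Frequency; the class `CumulativeFrequencySketch` and its methods are hypothetical, and integer keys are assumed for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; names are hypothetical. */
final class CumulativeFrequencySketch {

    /** Adds one occurrence of each value to the table. */
    static void count(Map<Integer, Long> table, int... values) {
        for (int v : values) {
            table.merge(v, 1L, Long::sum);
        }
    }

    /** Sorted table: the keys up to v are available directly as a head view. */
    static long cumulativeSorted(TreeMap<Integer, Long> table, int v) {
        long total = 0;
        for (long c : table.headMap(v, true).values()) {
            total += c;
        }
        return total;
    }

    /** Unsorted table: every entry must be visited and its key compared. */
    static long cumulativeUnsorted(Map<Integer, Long> table, int v) {
        long total = 0;
        for (Map.Entry<Integer, Long> e : table.entrySet()) {
            if (e.getKey() <= v) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Long> sorted = new TreeMap<>();
        Map<Integer, Long> unsorted = new HashMap<>();
        count(sorted, 3, 1, 2, 2, 3, 2, 2);
        count(unsorted, 3, 1, 2, 2, 3, 2, 2);

        System.out.println(cumulativeSorted(sorted, 2));
        System.out.println(cumulativeUnsorted(unsorted, 2));
    }
}
```

Both versions still rely on an ordering of the keys, so dropping the sorted map changes how the cumulative methods must be written rather than removing the need for comparable values.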
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: MATH-448.patch > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: MATH-448.patch, MATH-448.patch > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968555#action_12968555 ] Patrick Meyer commented on MATH-448: All right, maybe I have it right this time. I used Eclipse to generate the patch and it seems to have the right information in it. The latest patch file (MATH-448.patch) is now attached to this issue. > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: MATH-448.patch > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: MATH-448.patch > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: MATH-448.patch > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: (was: Frequency.diff) > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: MATH-448.patch > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967252#action_12967252 ] Patrick Meyer commented on MATH-448: OK, I uploaded a new patch that was created using the "Export Diff Patch" option in Netbeans. Is this the format you need? (This is the first patch I've tried to submit - thanks for your patience.) > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: Frequency.diff > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: Frequency.diff > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: Frequency.diff > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: (was: Frequency.diff) > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: Frequency.diff > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MATH-448) Frequency get number of unique values
[ https://issues.apache.org/jira/browse/MATH-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Meyer updated MATH-448: --- Attachment: Frequency.diff > Frequency get number of unique values > - > > Key: MATH-448 > URL: https://issues.apache.org/jira/browse/MATH-448 > Project: Commons Math > Issue Type: New Feature >Reporter: Patrick Meyer >Priority: Minor > Attachments: Frequency.diff > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > It is often useful to know the number of unique elements in a frequency > table. Could you add a simple method that returns the size of freqTable. It > seems like it would be as simple as: > {code} > public int getUniqueCount(){ > return freqTable.size(); > } > {code} > Given that freqTable is private, there is no way to extend the class and add > this method. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MATH-449) Storeless covariance
Storeless covariance
Key: MATH-449
URL: https://issues.apache.org/jira/browse/MATH-449
Project: Commons Math
Issue Type: Improvement
Reporter: Patrick Meyer

Currently there is no storeless version for computing the covariance. However, Pebay (2008) describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf]. I have provided a simple class implementing this algorithm. It would be nice to have this integrated into org.apache.commons.math.stat.correlation.Covariance.

{code}
public class StorelessCovariance {

    // Deviations of the latest pair from the running means (before the update).
    private double deltaX = 0.0;
    private double deltaY = 0.0;

    // Running means of x and y.
    private double meanX = 0.0;
    private double meanY = 0.0;

    // Number of pairs seen so far; kept as a double to avoid integer division.
    private double N = 0;

    // Running sum of cross-products of deviations.
    private double covarianceNumerator = 0.0;

    // If true, divide by N-1; otherwise divide by N.
    private boolean unbiased = true;

    public StorelessCovariance(boolean unbiased) {
        this.unbiased = unbiased;
    }

    // Adds one (x, y) pair; pairs with a null member are skipped.
    public void increment(Double x, Double y) {
        if (x != null && y != null) {
            N++;
            deltaX = x - meanX;
            deltaY = y - meanY;
            meanX += deltaX / N;
            meanY += deltaY / N;
            covarianceNumerator += ((N - 1.0) / N) * deltaX * deltaY;
        }
    }

    public Double getResult() {
        if (unbiased) {
            return covarianceNumerator / (N - 1.0);
        } else {
            return covarianceNumerator / N;
        }
    }
}
{code}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
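As a sanity check on the update rule in the class proposed above, the following sketch compares the one-pass (storeless) result with a conventional two-pass computation. The class name OnlineCovarianceSketch and its static methods are hypothetical and exist only for this illustration; they are not part of Commons Math. The update uses the same recurrence as the proposed class: the deviation from the old mean, scaled by (n-1)/n.

```java
// Hypothetical illustration only; not part of org.apache.commons.math.
public class OnlineCovarianceSketch {

    /** One-pass covariance using the same recurrence as the proposed class. */
    static double onlineCovariance(double[] x, double[] y, boolean unbiased) {
        double meanX = 0.0;
        double meanY = 0.0;
        double numerator = 0.0;
        double n = 0.0;
        for (int i = 0; i < x.length; i++) {
            n++;
            double deltaX = x[i] - meanX; // deviation from the old mean
            double deltaY = y[i] - meanY;
            meanX += deltaX / n;
            meanY += deltaY / n;
            numerator += ((n - 1.0) / n) * deltaX * deltaY;
        }
        return unbiased ? numerator / (n - 1.0) : numerator / n;
    }

    /** Textbook two-pass covariance: compute the means first, then the cross-products. */
    static double twoPassCovariance(double[] x, double[] y, boolean unbiased) {
        int n = x.length;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double numerator = 0.0;
        for (int i = 0; i < n; i++) {
            numerator += (x[i] - meanX) * (y[i] - meanY);
        }
        return unbiased ? numerator / (n - 1) : numerator / n;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {2.0, 4.0, 6.0, 8.0};
        System.out.println(onlineCovariance(x, y, true));
        System.out.println(twoPassCovariance(x, y, true));
    }
}
```

For this small data set the cross-product sum is 10, so the unbiased estimate is 10/3 under both approaches, up to floating-point rounding. The one-pass version never stores the observations, which is the point of the request.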
[jira] Created: (MATH-448) Frequency get number of unique values
Frequency get number of unique values
Key: MATH-448
URL: https://issues.apache.org/jira/browse/MATH-448
Project: Commons Math
Issue Type: New Feature
Reporter: Patrick Meyer
Priority: Minor

It is often useful to know the number of unique elements in a frequency table. Could you add a simple method that returns the size of freqTable? It seems like it would be as simple as:

{code}
public int getUniqueCount(){
    return freqTable.size();
}
{code}

Given that freqTable is private, there is no way to extend the class and add this method. Thanks!

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
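The request above can be pictured with a small stand-in class. This is only a sketch under assumptions: FrequencySketch, addValue and getCount here are hypothetical and simplified to String keys, and only the freqTable field name and the proposed getUniqueCount method come from the issue text. It shows why the method has to live inside the class when the map is private.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a frequency table; not the Commons Math Frequency class.
public class FrequencySketch {

    // Private, as in the issue description, so subclasses cannot reach it.
    private final TreeMap<String, Long> freqTable = new TreeMap<>();

    /** Records one observation of the given value. */
    public void addValue(String v) {
        freqTable.merge(v, 1L, Long::sum);
    }

    /** Number of times the value has been observed (0 if never). */
    public long getCount(String v) {
        return freqTable.getOrDefault(v, 0L);
    }

    /** The method proposed in the issue: the number of distinct values seen. */
    public int getUniqueCount() {
        return freqTable.size();
    }

    public static void main(String[] args) {
        FrequencySketch f = new FrequencySketch();
        for (String s : new String[] {"a", "b", "a", "c", "a"}) {
            f.addValue(s);
        }
        System.out.println(f.getUniqueCount()); // distinct values: a, b, c
        System.out.println(f.getCount("a"));
    }
}
```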