Re: [R] Correlation discrepancy
Dividing by 8 gives the biased estimator of the covariance. R's cov() function computes the unbiased estimator, which divides by (sample size) - 1.

Regards,
Kohta

--
View this message in context: http://r.789695.n4.nabble.com/Correlation-discrepancy-tp3762457p3762491.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
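As a small illustration of the two estimators, here is a sketch using the x and y posted in this thread. The names n, unbiased and biased are chosen only for this example. Since both estimators share the same sum of cross-products, the divide-by-n value can be recovered from cov() by rescaling with (n-1)/n:

```r
x <- c(44, 46, 46, 47, 45, 43, 45, 44)
y <- c(44, 43, 41, 41, 46, 48, 44, 43)
n <- length(x)                       # 8 paired observations

unbiased <- cov(x, y)                # cov() divides by n - 1
biased   <- unbiased * (n - 1) / n   # same cross-product sum, divided by n

print(unbiased)  # -2.428571
print(biased)    # -2.125
```

Going the other way, a divide-by-n result multiplied by n/(n-1) should reproduce what cov() reports.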
Re: [R] Correlation discrepancy
Dear Mr Dimitris and Mr Harding,

By mistake I typed my colleague's name (i.e. Ashok) while thanking you. Please excuse me for that.

Regards

Vincy
Re: [R] Correlation discrepancy
Dear Mr. Dimitris and Mr Harding,

Thanks a lot for your guidance. It will be interesting to find out how Excel deals with this formula. I will try it. Thanks again.

Regards

Ashok
Re: [R] Correlation discrepancy
In addition, something has gone wrong, Vincy, with your data x,y between evaluating cov(x,y) and evaluating your explicit formula. If I repeat your commands:

  x = c(44,46,46,47,45,43,45,44)
  y = c(44,43,41,41,46,48,44,43)
  cov(x, y)
  # [1] -2.428571
  sum((x-mean(x))*(y-mean(y)))/8
  # [1] -2.125

which has the right sign and, when changed to incorporate the correct denominator (n-1 = 7) as suggested by Dimitris:

  sum((x-mean(x))*(y-mean(y)))/7
  # [1] -2.428571

gives exact agreement. With regard to your second formula, this should correspondingly be:

  sum(x*y)/7 - (mean(x)*mean(y))*8/7
  # [1] -2.428571

again agreeing exactly. Your result:

>> covariance = sum((x-mean(x))*(y-mean(y)))/8   # no. of paired obs. = 8
>>
>> or
>>
>> covariance = sum(x*y)/8-(mean(x)*mean(y))
>>
>> gives
>>
>> covariance = 2.125

agrees in numerical magnitude with the "1/8" form, but has the wrong sign. Or maybe you simply mis-typed "-2.125" as "2.125".

Hoping this helps,
Ted.

E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 23-Aug-11 Time: 12:38:36
-- XFMail --
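The factor 8/7 in the corrected second formula comes from expanding the centred sum of cross-products. A short derivation follows, writing n for the sample size and \bar{x}, \bar{y} for the sample means (notation introduced here for illustration, not used in the thread):

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_{i=1}^{n} x_i y_i - \bar{y}\sum_{i=1}^{n} x_i
     - \bar{x}\sum_{i=1}^{n} y_i + n\,\bar{x}\,\bar{y} \\
  &= \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}, \\[4pt]
\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  &= \frac{1}{n-1}\sum_{i=1}^{n} x_i y_i - \frac{n}{n-1}\,\bar{x}\,\bar{y}.
\end{align*}
```

With n = 8 the last line is sum(x*y)/7 - mean(x)*mean(y)*8/7. For the data in this thread, sum(x*y) = 15733 and 8 * 45 * 43.75 = 15750, so the numerator is -17 and the result is -17/7 = -2.428571.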
Re: [R] Correlation discrepancy
Well, you don't have the correct denominator, i.e., n-1, with n denoting the sample size. Have a look at the *Details* section of the online help file for cov(), and try also

  sum((x-mean(x))*(y-mean(y)))/7
  cov(x, y)

I hope it helps.

Best,
Dimitris

On 8/23/2011 1:18 PM, Vincy Pyne wrote:
> Dear R list,
>
> I have one very elementary question regarding correlation between two variables.
>
> x = c(44,46,46,47,45,43,45,44)
> y = c(44,43,41,41,46,48,44,43)
>
> > cov(x, y)
> [1] -2.428571
>
> However, if I try to calculate the covariance using the formula as
>
> covariance = sum((x-mean(x))*(y-mean(y)))/8   # no. of paired obs. = 8
>
> or
>
> covariance = sum(x*y)/8-(mean(x)*mean(y))
>
> gives
>
> covariance = 2.125
>
> I am not able to figure out where I am going wrong w.r.t. the covariance formula. Kindly guide.
>
> Regards
>
> Vincy

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
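To avoid hard-coding the 7, the n-1 denominator can be taken from the data. The following is only a sketch: cov_by_hand is a hypothetical helper name and is not part of base R. It is compared against cov() on the vectors from the original question:

```r
# Hypothetical helper (not a base R function): sample covariance by hand,
# using the n - 1 denominator described in the Details section of ?cov.
cov_by_hand <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  sum((x - mean(x)) * (y - mean(y))) / (n - 1)
}

x <- c(44, 46, 46, 47, 45, 43, 45, 44)
y <- c(44, 43, 41, 41, 46, 48, 44, 43)

# Compare with the built-in, allowing for floating-point rounding
agrees <- isTRUE(all.equal(cov_by_hand(x, y), cov(x, y)))
print(agrees)  # TRUE
```

all.equal() is used rather than == because the two computations may differ in the last few floating-point digits.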