Re: [R] Sampling in R
Dear Seyit, You might consider the boot package in this situation. Here is an example: require(boot) DF<-data.frame(Xvar,Yvar) temp <- boot(DF, function(DF,d){ S <- DF[d,] cor(S$Xvar,S$Yvar) }, R = 5000) temp$t0 # [1] 0.617422 max(temp$t) # [1] 0.7783784 hist(temp$t) Once you load the boot package, take a look at ?boot. HTH, Jorge On Tue, Apr 21, 2009 at 4:53 AM, Seyit Ali Kayis wrote: > > Dear R users, > > I need to do sampling without replacement (bootstraps). I have two > variables (Xvar, Yvar). > I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am > doing 5 sampling, > and in each sampling calculating correlations, saving, sorting and > getting 95% cutt off point (0.1351877). > I am getting maximum value as 0.3507219 (much smaller than correlation of > my original data). > I repeated the sampling a couple of time and none of them produced a > correlation > coefficient higher than my original data set. However, if I sort out my > Xvar and Yvar and > obtain correlation it is 0.9657125 which is much higher than correlation > for my original data. > I am doing sampling in another program and getting at least 1% higher > correlation than mine. > Now I am getting confused with sampling(random data) in R. My data and > codes for the scenario above are > in the attached file. I want to understand where I am making a mistake. Any > comment is deeply appreciated. > > Kind Regards > > Seyit Ali > > > > -- > Dr. Seyit Ali KAYIS > Selcuk University > Faculty of Agriculture > Kampus, Konya, TURKEY > >s_a_ka...@yahoo.com,s_a_ka...@hotmail.com > Tell: +90 332 223 2830 Mobile: +90 535 587 1139 Fax: +90 332 241 0108 > > Greetings from Konya, TURKEY >http://www.ziraat.selcuk.edu.tr/skayis/ > > -- > > > > > > > _ > No-one wants to be lonely this Autumn Find someone to snuggle up with > > > Fchannel%2Findex%2Easpx%3Ftrackingid%3D1048628&_t=773568480&_r=nzWINDOWSliveMAILemailTAGLINES&_m=EXT > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling in R: Please read my email from attached text.
You need to sample pairs rather than sampling individually within Xvar and Yvar. You also generally sample with replacement. If you sample without replacement for the length of the data, then you just get the same set. On Apr 21, 2009, at 3:54 AM, skayis selcuk wrote: David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling in R
When you shuffle the observations independently, you are performing a permutation test (though for this you only need to shuffle one side of the pairs). When you sort the observations you are doing something ridiculous that has no statistical meaning that I know. I'm not very familiar with bootstrap CI's, but I think the idea is to sample the PAIRS of data WITH replacement: http://lmgtfy.com/?q=bootstrap+correlation (first link is to a good overview by David Howell) On Tue, Apr 21, 2009 at 7:25 AM, Seyit Ali Kayis wrote: > > Dear R users, > > I need to do sampling without replacement (bootstraps). I have two variables > (Xvar, Yvar). > I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am > doing 5 sampling, > and in each sampling calculating correlations, saving, sorting and getting > 95% cutt off point (0.1351877). > I am getting maximum value as 0.3507219 (much smaller than correlation of my > original data). > I repeated the sampling a couple of time and none of them produced a > correlation > coefficient higher than my original data set. However, if I sort out my Xvar > and Yvar and > obtain correlation it is 0.9657125 which is much higher than correlation for > my original data. > I am doing sampling in another program and getting at least 1% higher > correlation than mine. > Now I am getting confused with sampling(random data) in R. My data and codes > for the scenario above are below > > > Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, > > 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, > > 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, > > 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, > > 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, > > 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, > > 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, > > 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, > > 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, > > 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, > 0.4576271,0.5471698,0.4745763,0.4821429) > > Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, > > 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, > > 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, > > 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, > > 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, > > 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, > > 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, > > 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, > > 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, > > 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) > > my.cor<-cor(Xvar, Yvar) > print(my.cor) > > nperm<-4 > Perm.Cor<-NULL > > for (iperm in 1:nperm) { > XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE) > YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE) > perm.cor<-cor(XvarNew, YvarNew) > Perm.Cor<-c(Perm.Cor, perm.cor) > } > print(max(Perm.Cor)) > XvarSorted<-sort(Xvar, decreasing=TRUE) > YvarSorted<-sort(Yvar, decreasing=TRUE) > max.cor<-cor
Re: [R] Sampling in R
Seyit Ali Kayis wrote: Dear R users, I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar). I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 5 sampling, and in each sampling calculating correlations, saving, sorting and getting 95% cutt off point (0.1351877). I am getting maximum value as 0.3507219 (much smaller than correlation of my original data). I repeated the sampling a couple of time and none of them produced a correlation coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and obtain correlation it is 0.9657125 which is much higher than correlation for my original data. I am doing sampling in another program and getting at least 1% higher correlation than mine. Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are below Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, 0.4576271,0.5471698,0.4745763,0.4821429) Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) my.cor<-cor(Xvar, Yvar) print(my.cor) nperm<-4 Perm.Cor<-NULL for (iperm in 1:nperm) { XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE) YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE) perm.cor<-cor(XvarNew, YvarNew) Perm.Cor<-c(Perm.Cor, perm.cor) } print(max(Perm.Cor)) XvarSorted<-sort(Xvar, decreasing=TRUE) YvarSorted<-sort(Yvar, decreasing=TRUE) max.cor<-cor(XvarSorted, YvarSorted) print(max.cor) if(mat.cor>0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=TRUE) if(mat.cor<0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=FALSE) T95<-Perm.Cor.Sorted[(nperm+1)*0.05]# 95% treshold value T99<-Perm.Cor.Sorted[(nperm+1)*0.01]# 99% treshold value I want to understand where I am making a mistake. Any comment is deeply appreciated. Well, if you are permuting Xvar and Yvar separately or sorting them (separately), then you cannot expect to get the same correlation again. Look at the formula and make an example for yoursel
[R] Sampling in R
Dear R users, I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar). I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 5 sampling, and in each sampling calculating correlations, saving, sorting and getting 95% cutt off point (0.1351877). I am getting maximum value as 0.3507219 (much smaller than correlation of my original data). I repeated the sampling a couple of time and none of them produced a correlation coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and obtain correlation it is 0.9657125 which is much higher than correlation for my original data. I am doing sampling in another program and getting at least 1% higher correlation than mine. Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are below Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, 0.4576271,0.5471698,0.4745763,0.4821429) Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) my.cor<-cor(Xvar, Yvar) print(my.cor) nperm<-4 Perm.Cor<-NULL for (iperm in 1:nperm) { XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE) YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE) perm.cor<-cor(XvarNew, YvarNew) Perm.Cor<-c(Perm.Cor, perm.cor) } print(max(Perm.Cor)) XvarSorted<-sort(Xvar, decreasing=TRUE) YvarSorted<-sort(Yvar, decreasing=TRUE) max.cor<-cor(XvarSorted, YvarSorted) print(max.cor) if(mat.cor>0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=TRUE) if(mat.cor<0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=FALSE) T95<-Perm.Cor.Sorted[(nperm+1)*0.05]# 95% treshold value T99<-Perm.Cor.Sorted[(nperm+1)*0.01]# 99% treshold value I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali -- Dr. Seyit Ali KAYIS Selcuk University Faculty of Agriculture Kampus, Ko
[R] Sampling in R
Dear R users, I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar). I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 5 sampling, and in each sampling calculating correlations, saving, sorting and getting 95% cutt off point (0.1351877). I am getting maximum value as 0.3507219 (much smaller than correlation of my original data). I repeated the sampling a couple of time and none of them produced a correlation coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and obtain correlation it is 0.9657125 which is much higher than correlation for my original data. I am doing sampling in another program and getting at least 1% higher correlation than mine. Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are in the attached file. I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali -- Dr. Seyit Ali KAYIS Selcuk University Faculty of Agriculture Kampus, Konya, TURKEY s_a_ka...@yahoo.com,s_a_ka...@hotmail.com Tell: +90 332 223 2830 Mobile: +90 535 587 1139 Fax: +90 332 241 0108 Greetings from Konya, TURKEY http://www.ziraat.selcuk.edu.tr/skayis/ -- _ No-one wants to be lonely this Autumn Find someone to snuggle up with Fchannel%2Findex%2Easpx%3Ftrackingid%3D1048628&_t=773568480&_r=nzWINDOWSliveMAILemailTAGLINES&_m=EXTXvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, 0.4576271,0.5471698,0.4745763,0.4821429) Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) my.cor<-cor(Xvar, Yvar) print(my.cor) nperm<-4 Perm.Cor<-NULL for (iperm in 1:nperm) { XvarNew<-sample(Xvar, size=length(Xvar
[R] Sampling in R: Please read my email from attached text.
(Sorry for multiple posting. Seems to be my msg is not distributed in my previous emails) Dear R users, I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar). I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 5 sampling, and in each sampling calculating correlations, saving, sorting and getting 95% cutt off point (0.1351877). I am getting maximum value as 0.3507219 (much smaller than correlation of my original data). I repeated the sampling a couple of time and none of them produced a correlation coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and obtain correlation it is 0.9657125 which is much higher than correlation for my original data. I am doing sampling in another program and getting at least 1% higher correlation than mine. Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are in the attached file. I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, 0.4576271,0.5471698,0.4745763,0.4821429) Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) my.cor<-cor(Xvar, Yvar) print(my.cor) nperm<-4 Perm.Cor<-NULL for (iperm in 1:nperm) { XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE) YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE) perm.cor<-cor(XvarNew, YvarNew) Perm.Cor<-c(Perm.Cor, perm.cor) } print(max(Perm.Cor)) XvarSorted<-sort(Xvar, decreasing=TRUE) YvarSorted<-sort(Yvar, decreasing=TRUE) max.cor<-cor(XvarSorted, YvarSorted) print(max.cor) if(mat.cor>0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=TRUE) if(mat.cor<0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=FALSE) T95<-Perm.Cor.Sorted[(nperm+1)*0.05]# 95% treshold value T99<-Perm.Cor.Sorted[(nperm+1)*0.01]# 99% treshold value __ R-help@r-project.org mailing list h
[R] Sampling in R
(Sorry for multiple posting. Seems to be my msg is not distributed in my previous emails) Dear R users, I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar). I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 5 sampling, and in each sampling calculating correlations, saving, sorting and getting 95% cutt off point (0.1351877). I am getting maximum value as 0.3507219 (much smaller than correlation of my original data). I repeated the sampling a couple of time and none of them produced a correlation coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and obtain correlation it is 0.9657125 which is much higher than correlation for my original data. I am doing sampling in another program and getting at least 1% higher correlation than mine. Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are in the attached file. I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091, 0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.133,0.1818182,0.1702128, 0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.467,0.4464286,0.362069,0.4285714,0.4561404, 0.4736842,0.4545455,0.417,0.4181818,0.4590164,0.517,0.5423729,0.483,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098, 0.517,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.458,0.3877551,0.4375,0.4117647,0.4313725,0.533,0.3962264, 0.3548387,0.5272727,0.4137931,0.3928571,0.467,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.433,0.4,0.4285714, 0.4516129,0.5090909,0.4464286,0.4642857,0.417,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545, 0.5925926,0.5769231,0.5818182,0.444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692, 0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308, 0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019, 0.4576271,0.5471698,0.4745763,0.4821429) Yvar<-c(0.2553191,0.4107143,0.5660377,0.389,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814, 0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.117,0.1785714, 0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271, 0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635, 0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863, 0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.472,0.5,0.3521127,0.4393939,0.3787879, 0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.367,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462, 0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104, 0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.367,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387, 0.4821429,0.5,0.557377,0.433,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322) my.cor<-cor(Xvar, Yvar) print(my.cor) nperm<-4 Perm.Cor<-NULL for (iperm in 1:nperm) { XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE) YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE) perm.cor<-cor(XvarNew, YvarNew) Perm.Cor<-c(Perm.Cor, perm.cor) } print(max(Perm.Cor)) XvarSorted<-sort(Xvar, decreasing=TRUE) YvarSorted<-sort(Yvar, decreasing=TRUE) max.cor<-cor(XvarSorted, YvarSorted) print(max.cor) if(mat.cor>0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=TRUE) if(mat.cor<0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=FALSE) T95<-Perm.Cor.Sorted[(nperm+1)*0.05]# 95% treshold value T99<-Perm.Cor.Sorted[(nperm+1)*0.01]# 99% treshold value __ R-help@r-project.org mailing list h
[R] Sampling in R
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sampling in R
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.