Re: [R] Urgent Help needed
try this:

    t0 = read.table("datatest.txt", header=TRUE)
    X.mean = ave(t0[,1], as.factor(t0[,3]))

You do the rest for Y.mean and make them into a data.frame or whatever.

HTH,

Weiwei

On 8/16/07, AbouEl-Makarim Aboueissa <[EMAIL PROTECTED]> wrote:
> Dear All:
>
> Urgent help is needed.
>
> I have a data set in matrix format of three columns: X, Y and index of four
> groups (1,2,3,4). What I need to do is the following;
>
> 1- How I can subtract the sample mean of each group indexed 1,2,3,4 from the
> corresponding data values of this group and create new columns say
> X-sample mean and Y-sample mean? I tried to use the "tapply" but I have some
> difficulties to restore the new data
>
> 2- How I can use the "tapply" if possible or any other R-function to find the
> correlation coefficient between the X and Y columns for each group indexed
> 1,2,3,4.? Could not use the "tapply".
>
> I attached part of the data as txt file.
>
> Thank you so much for your attention to this matter, and I look forward to
> hear from you soon.
>
> Regards,
>
> Abou
>
> Data:
>
>        x        y index
> 15807.24     12.5     4
> 15752.51     33.5     4
> 12893.76      1.5     3
>  8426.88     22.2     3
>  5706.24      333     3
>  3982.08      560     2
>  3642.62      670     2
>   295.68      124     1
>   215.40      104     1
>   195.40      204     1
>  4240.21     22.4     2
>  1222.72     45.9     2
>  1142.26     23.6     2
>    63.00     90.1     1
>  1216.00     82.4     2
>  2769.60      111     2
>  1790.46     34.7     2
>    26.10    26.10     1
> 19676.83     0.99     4
> 10920.60      203     3
>  6144.00       46     3
>  4534.48  4534.48     3
>     4.00       65     4
> 29500.00       56     4
> 17100.00       77     4
>  9000.00      435     3
>  6300.00       84     3
>  3962.88      334     2
>  5690.00      653     3
>  3736.00      233     2
>  2750.00       22     2
>  1316.00      345     2
>  4595.00  4595.00     3
>  5928.00       45     3
>  2645.70     0.00     2
>  2580.24      454     2
>  6547.34  6547.34     3
>  1615.68        5     2
>   194.06       55     1
>   184.80        6     1
>    82.94       44     1
> 16649.00       56     4
>  4500.00       74     3
>  1600.00      744     2
>
> AbouEl-Makarim Aboueissa, Ph.D.
> Assistant Professor of Statistics
> Department of Mathematics & Statistics
> University of Southern Maine
> 96 Falmouth Street
> P.O. Box 9300
> Portland, ME 04104-9300
>
> Tel: (207) 228-8389
> Email: [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> Office: 301C Payson Smith
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
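The ave() idea above can be carried through for both columns. What follows is only a minimal sketch under stated assumptions: the data frame `toy` and the column names `x.ctr` and `y.ctr` are made up for illustration and are not the poster's data. It relies on base R's ave(), which by default returns the group mean for every row, aligned with the original row order.

```r
# Hypothetical toy data in the same layout as the thread: x, y, group index.
toy <- data.frame(
  x     = c(10, 20, 30, 100, 200, 300),
  y     = c(1, 2, 6, 5, 7, 9),
  index = c(1, 1, 1, 2, 2, 2)
)

# ave() gives each row the mean of its own group, so the result has the
# same length and order as the input and can be subtracted directly.
toy$x.ctr <- toy$x - ave(toy$x, toy$index)
toy$y.ctr <- toy$y - ave(toy$y, toy$index)

print(toy)
```

Because the centered values stay in the original row order, no merge or re-sorting step should be needed with this approach.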
Re: [R] Urgent Help needed
For the 2nd item, perhaps:

    by(df[,1:2], df$index, FUN=cor)

where df is your data.frame.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

On 16/08/07, AbouEl-Makarim Aboueissa <[EMAIL PROTECTED]> wrote:
> 2- How I can use the "tapply" if possible or any other R-function to find
> the correlation coefficient between the X and Y columns for each group
> indexed 1,2,3,4.? Could not use the "tapply".
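To make the shape of the by() result concrete, here is a small sketch on made-up data (the data frame `toy` and its values are assumptions for illustration, chosen so the correlations are easy to check by eye). Each group yields an ordinary 2 x 2 correlation matrix.

```r
# Hypothetical toy data: in group 1, y rises with x; in group 2, y falls with x.
toy <- data.frame(
  x     = c(1, 2, 3, 4, 1, 2, 3, 4),
  y     = c(2, 4, 6, 8, 8, 6, 4, 2),
  index = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# One correlation matrix per group; columns 1:2 are x and y.
cors <- by(toy[, 1:2], toy$index, FUN = cor)
print(cors)

# Individual groups can be pulled out by position (groups are in sorted order).
m1 <- cors[[1]]   # 2 x 2 matrix for the first group
m2 <- cors[[2]]   # 2 x 2 matrix for the second group
```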
Re: [R] Urgent Help needed
On Thu, 2007-08-16 at 12:33 -0400, AbouEl-Makarim Aboueissa wrote:
> 1- How I can subtract the sample mean of each group indexed 1,2,3,4
> from the corresponding data values of this group and create new columns
> say X-sample mean and Y-sample mean? I tried to use the "tapply" but I
> have some difficulties to restore the new data
>
> 2- How I can use the "tapply" if possible or any other R-function to
> find the correlation coefficient between the X and Y columns for each
> group indexed 1,2,3,4.? Could not use the "tapply".

I might be tempted to take the following approach:

If your data is a matrix, coerce it to a data frame first. Let's call that
'DF'.

> str(DF)
'data.frame':   44 obs. of  3 variables:
 $ x    : num  15807 15753 12894 8427 5706 ...
 $ y    : num  12.5 33.5 1.5 22.2 333 560 670 124 104 204 ...
 $ index: int  4 4 3 3 3 2 2 1 1 1 ...

Now use split() to break up the data frame into a list of 4 sub-dataframes,
based upon the index value. We can use scale() within a lapply() loop to
center the 'x' and 'y' columns for each sub-dataframe:

    DF.ctr <- lapply(split(DF[, -3], DF$index), scale, scale = FALSE)

> str(DF.ctr)
List of 4
 $ 1: num [1:8, 1:2] 138.5 58.2 38.2 -94.2 -131.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:8] "8" "9" "10" "14" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 157.2 81.7
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 2: num [1:16, 1:2] 1469 1129 1727 -1291 -1371 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:16] "6" "7" "11" "12" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 2513 230
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 3: num [1:13, 1:2] 5879 1413 -1308 3906 -870 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:13] "3" "4" "5" "20" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 7014 1352
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 4: num [1:7, 1:2] -6262 -6317 -2393 17931 7431 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:7] "1" "2" "19" "23" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 22069 43
  .. ..- attr(*, "names")= chr [1:2] "x" "y"

Now, create a new single DF comprised of the sub-dataframes from DF.ctr:

    DF.new <- do.call(rbind, DF.ctr)

Define colnames:

    colnames(DF.new) <- c("x-mean", "y-mean")

> str(DF.new)
 num [1:44, 1:2] 138.5 58.2 38.2 -94.2 -131.1 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:44] "8" "9" "10" "14" ...
  ..$ : chr [1:2] "x-mean" "y-mean"

Now, use merge() to join DF and DF.new by the rownames:

    DF.final <- merge(DF, DF.new, by = "row.names")

> DF.final
   Row.names        x      y index      x-mean      y-mean
1          1 15807.24  12.50     4 -6262.12857  -30.498571
2         10   195.40 204.00     1    38.22750  122.350000
3         11  4240.21  22.40     2  1726.93188 -208.037500
4         12  1222.72  45.90     2 -1290.55812 -184.537500
5         13  1142.26  23.60     2 -1371.01812 -206.837500
6         14    63.00  90.10     1   -94.17250    8.450000
7         15  1216.00  82.40     2 -1297.27812 -148.037500
8         16  2769.60 111.00     2   256.32188 -119.437500
9         17  1790.46  34.70     2  -722.81812 -195.737500
10
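The merged output above comes back ordered by row name as text ("1", "10", "11", ...) rather than in the original row order. The sketch below repeats the split/scale/rbind/merge steps on made-up data and then re-sorts by the numeric row name. The data frame `toy`, its values, and the column names `x.ctr` and `y.ctr` are assumptions for illustration only, and the final ordering step is an addition that is not part of the reply above.

```r
# Hypothetical toy data with interleaved groups, so row order matters.
toy <- data.frame(
  x     = c(10, 20, 30, 100, 200, 300, 40, 50, 60, 70),
  y     = c(1, 2, 6, 5, 7, 9, 3, 8, 4, 10),
  index = c(2, 1, 2, 1, 2, 1, 1, 2, 1, 2)
)

# Center x and y within each group; scale() returns one matrix per group
# and keeps the original row names.
ctr <- lapply(split(toy[, -3], toy$index), scale, scale = FALSE)

# Stack the per-group matrices and name the centered columns.
ctr.all <- do.call(rbind, ctr)
colnames(ctr.all) <- c("x.ctr", "y.ctr")

# Join on row names, then restore the original order, because merge()
# sorts the row names as character strings.
out <- merge(toy, ctr.all, by = "row.names")
out <- out[order(as.integer(as.character(out$Row.names))), ]
print(out)
```

Syntactic column names such as `x.ctr` are used here instead of "x-mean" so the columns can be accessed with `$` without backticks.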
Re: [R] Urgent Help needed
Thanks all.

It works.

Just one more thing: if you look at this output,

> by(data1[,2:3], data1[,4], cor)[1]
$`1`
          X         Y
X 1.0000000 0.4400451
Y 0.4400451 1.0000000

Q. How can I pick just the value of the correlation, 0.4400451, from this
output and call it, say, corxy?

Once again, thank you all so much for your help.

Abou

>>> "Henrique Dallazuanna" <[EMAIL PROTECTED]> 8/16/2007 2:05 PM >>>
For the 2nd item, perhaps:

    by(df[,1:2], df$index, FUN=cor)

where df is your data.frame.
Re: [R] Urgent Help needed
Hi, try this:

    by(df[,1:2], df$index, FUN=function(x)cor(x[1],x[2]))

--
Henrique Dallazuanna

On 16/08/07, AbouEl-Makarim Aboueissa <[EMAIL PROTECTED]> wrote:
> Thanks all.
>
> It works.
>
> Just one more thing: if you look at this output,
>
> > by(data1[,2:3], data1[,4], cor)[1]
> $`1`
>           X         Y
> X 1.0000000 0.4400451
> Y 0.4400451 1.0000000
>
> Q. How can I pick just the value of the correlation, 0.4400451, from this
> output and call it, say, corxy?
>
> Once again, thank you all so much for your help.
>
> Abou
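For a plain number rather than a matrix, two options are sketched below on made-up data. Only the name `corxy` comes from the question; the data frame `toy`, its values, and `corxy.all` are assumptions for illustration. The first option indexes the off-diagonal entry of one group's correlation matrix; the second passes plain vectors to cor() so that each group yields a single value.

```r
# Hypothetical toy data: correlation is 1 in group 1 and -1 in group 2.
toy <- data.frame(
  x     = c(1, 2, 3, 4, 1, 2, 3, 4),
  y     = c(2, 4, 6, 8, 8, 6, 4, 2),
  index = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Option 1: take row 1, column 2 of the first group's correlation matrix.
cors  <- by(toy[, 1:2], toy$index, FUN = cor)
corxy <- cors[[1]][1, 2]

# Option 2: one number per group. g[[1]] and g[[2]] extract the columns as
# plain vectors, so cor() returns a scalar for each group.
corxy.all <- sapply(split(toy, toy$index), function(g) cor(g[[1]], g[[2]]))

print(corxy)
print(corxy.all)
```

The second form returns a named numeric vector with one entry per group, which is convenient when all four correlations are wanted at once.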