Re: [R] Urgent Help needed

2007-08-16 Thread Weiwei Shi
Try this:

t0 = read.table("datatest.txt", header=TRUE)
X.mean = ave(t0[,1], as.factor(t0[,3]))

You can do the same for Y.mean and then make them into a data.frame or whatever.
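
A minimal sketch of how the rest might go, assuming the columns are in the order x, y, index as in the posted data. The toy data frame and the column names x.ctr and y.ctr are made up for illustration. ave() returns the group mean repeated for every row of that group, so the centered value is the original value minus that mean:

```r
# Toy stand-in for read.table("datatest.txt", header=TRUE); values are made up
t0 <- data.frame(x = c(1, 3, 10, 20, 30),
                 y = c(2, 4, 5, 5, 8),
                 index = c(1, 1, 2, 2, 2))

# ave() returns each row's group mean, aligned with the original rows
X.mean <- ave(t0[, 1], as.factor(t0[, 3]))
Y.mean <- ave(t0[, 2], as.factor(t0[, 3]))

# Subtract the group means to get the centered columns
t0$x.ctr <- t0[, 1] - X.mean
t0$y.ctr <- t0[, 2] - Y.mean
print(t0)
```

Because ave() keeps the original row order, the new columns can be added directly without any merging step.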

HTH,

Weiwei

On 8/16/07, AbouEl-Makarim Aboueissa <[EMAIL PROTECTED]> wrote:
> Dear All:
>
> Urgent help is needed.
>
>
> I have a data set in matrix format with three columns: X, Y, and an index of
> four groups (1,2,3,4). What I need to do is the following:
>
> 1- How can I subtract the sample mean of each group (indexed 1,2,3,4) from the
> corresponding data values of that group and create new columns, say
> X-sample mean and Y-sample mean? I tried to use "tapply", but I have had
> difficulty storing the results back with the data.
>
>
> 2- How can I use "tapply", if possible, or any other R function to find the
> correlation coefficient between the X and Y columns for each group indexed
> 1,2,3,4? I could not get "tapply" to work.
>
>
> I attached part of the data as txt file.
>
>
> Thank you so much for your attention to this matter, and I look forward to 
> hearing from you soon.
>
> Regards,
>
> Abou
>
>
> Data:
> 
> x          y         index
> 15807.24   12.5      4
> 15752.51   33.5      4
> 12893.76   1.5       3
> 8426.88    22.2      3
> 5706.24    333       3
> 3982.08    560       2
> 3642.62    670       2
> 295.68     124       1
> 215.40     104       1
> 195.40     204       1
> 4240.21    22.4      2
> 1222.72    45.9      2
> 1142.26    23.6      2
> 63.00      90.1      1
> 1216.00    82.4      2
> 2769.60    111       2
> 1790.46    34.7      2
> 26.10      26.10     1
> 19676.83   0.99      4
> 10920.60   203       3
> 6144.00    46        3
> 4534.48    4534.48   3
> 40000.00   65        4
> 29500.00   56        4
> 17100.00   77        4
> 9000.00    435       3
> 6300.00    84        3
> 3962.88    334       2
> 5690.00    653       3
> 3736.00    233       2
> 2750.00    22        2
> 1316.00    345       2
> 4595.00    4595.00   3
> 5928.00    45        3
> 2645.70    0.00      2
> 2580.24    454       2
> 6547.34    6547.34   3
> 1615.68    5         2
> 194.06     55        1
> 184.80     6         1
> 82.94      44        1
> 16649.00   56        4
> 4500.00    74        3
> 1600.00    744       2
>
> ==
> AbouEl-Makarim Aboueissa, Ph.D.
> Assistant Professor of Statistics
> Department of Mathematics & Statistics
> University of Southern Maine
> 96 Falmouth Street
> P.O. Box 9300
> Portland, ME 04104-9300
>
> Tel: (207) 228-8389
> Email: [EMAIL PROTECTED]
>   [EMAIL PROTECTED]
> Office: 301C Payson Smith
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>


-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III



Re: [R] Urgent Help needed

2007-08-16 Thread Henrique Dallazuanna
For the 2nd item, perhaps:

by(df[,1:2], df$index, FUN=cor)

where df is your data.frame.
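
A small sketch of what this returns, on made-up data with the same x, y, index layout (the values and the names res, r1 and r2 are only for illustration). by() gives one 2x2 correlation matrix per group, and the x-y correlation is the off-diagonal element of each:

```r
# Made-up data: two groups with columns x, y, index as in the question
df <- data.frame(x = c(1, 2, 3, 4, 1, 2, 3, 4),
                 y = c(2, 4, 6, 8, 8, 6, 4, 2),
                 index = c(1, 1, 1, 1, 2, 2, 2, 2))

# One 2x2 correlation matrix per group
res <- by(df[, 1:2], df$index, FUN = cor)
print(res)

# The x-y correlation is the off-diagonal element of each matrix
r1 <- res[[1]]["x", "y"]
r2 <- res[[2]]["x", "y"]
```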

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] Urgent Help needed

2007-08-16 Thread Marc Schwartz
On Thu, 2007-08-16 at 12:33 -0400, AbouEl-Makarim Aboueissa wrote:
> [...]


I might be tempted to take the following approach:

If your data is a matrix, coerce it to a data frame first. Let's call
that 'DF'.

> str(DF)
'data.frame':   44 obs. of  3 variables:
 $ x: num  15807 15753 12894  8427  5706 ...
 $ y: num  12.5 33.5 1.5 22.2 333 560 670 124 104 204 ...
 $ index: int  4 4 3 3 3 2 2 1 1 1 ...


Now use split() to break up the data frame into a list of 4
sub-dataframes, based upon the index value.  We can use scale() within a
lapply() loop to center the 'x' and 'y' columns for each sub-dataframe:


DF.ctr <- lapply(split(DF[, -3], DF$index), scale, scale = FALSE)
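
The same call can be tried on a toy data frame to see what each piece does. This is only a sketch: the values are made up, and only the column layout (x, y, index) follows the posted data.

```r
# Made-up data with the same column layout as DF (x, y, index)
DF <- data.frame(x = c(1, 3, 10, 20, 30),
                 y = c(2, 4, 5, 5, 8),
                 index = c(1, 1, 2, 2, 2))

# split() breaks the x/y columns into one sub-data-frame per index value;
# scale(..., scale = FALSE) subtracts each column's mean without dividing by the sd
DF.ctr <- lapply(split(DF[, -3], DF$index), scale, scale = FALSE)

# Each element is a centered matrix; the group means are kept as an attribute
print(DF.ctr[["2"]])
print(attr(DF.ctr[["2"]], "scaled:center"))
```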


> str(DF.ctr)
List of 4
 $ 1: num [1:8, 1:2]  138.5   58.2   38.2  -94.2 -131.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:8] "8" "9" "10" "14" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 157.2  81.7
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 2: num [1:16, 1:2]  1469  1129  1727 -1291 -1371 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:16] "6" "7" "11" "12" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 2513  230
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 3: num [1:13, 1:2]  5879  1413 -1308  3906  -870 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:13] "3" "4" "5" "20" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 7014 1352
  .. ..- attr(*, "names")= chr [1:2] "x" "y"
 $ 4: num [1:7, 1:2] -6262 -6317 -2393 17931  7431 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:7] "1" "2" "19" "23" ...
  .. ..$ : chr [1:2] "x" "y"
  ..- attr(*, "scaled:center")= Named num [1:2] 22069    43
  .. ..- attr(*, "names")= chr [1:2] "x" "y"


Now, create a new single DF comprised of the sub-dataframes from DF.ctr:

DF.new <- do.call(rbind, DF.ctr)


Define colnames:

colnames(DF.new) <- c("x-mean", "y-mean")


> str(DF.new)
 num [1:44, 1:2]  138.5   58.2   38.2  -94.2 -131.1 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:44] "8" "9" "10" "14" ...
  ..$ : chr [1:2] "x-mean" "y-mean"
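
As a quick check on toy data (made-up values, same x, y, index layout), the row names of the original data frame survive split(), scale() and rbind(), which is what makes it possible to line the centered values up with the original rows again:

```r
# Made-up data with the groups deliberately interleaved
DF <- data.frame(x = c(10, 1, 20, 3, 30),
                 y = c(5, 2, 5, 4, 8),
                 index = c(2, 1, 2, 1, 2))

DF.ctr <- lapply(split(DF[, -3], DF$index), scale, scale = FALSE)
DF.new <- do.call(rbind, DF.ctr)
colnames(DF.new) <- c("x-mean", "y-mean")

# Rows come out grouped by index, but each keeps its original row name
print(rownames(DF.new))

# So the centered values can be put back in the original order by name
print(DF.new[rownames(DF), ])
```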


Now, use merge() to join DF and DF.new by the rownames:

DF.final <- merge(DF, DF.new, by = "row.names")

> DF.final
  Row.names        x      y index      x-mean      y-mean
1         1 15807.24  12.50     4 -6262.12857  -30.498571
2        10   195.40 204.00     1    38.22750  122.350000
3        11  4240.21  22.40     2  1726.93188 -208.037500
4        12  1222.72  45.90     2 -1290.55812 -184.537500
5        13  1142.26  23.60     2 -1371.01812 -206.837500
6        14    63.00  90.10     1   -94.17250    8.450000
7        15  1216.00  82.40     2 -1297.27812 -148.037500
8        16  2769.60 111.00     2   256.32188 -119.437500
9        17  1790.46  34.70     2  -722.81812 -195.737500
10

Re: [R] Urgent Help needed

2007-08-16 Thread AbouEl-Makarim Aboueissa
Thanks all.

It works.

Just one more thing: if you look at this output,

> by(data1[,2:3], data1[,4], cor)[1]
$`1`
          X         Y
X 1.0000000 0.4400451
Y 0.4400451 1.0000000

Q. How can I pick out just the correlation value 0.4400451 from this output 
and call it, say, corxy?


Once again, thank you all so much for your help.


Abou





Re: [R] Urgent Help needed

2007-08-16 Thread Henrique Dallazuanna
Hi, try this:

by(df[,1:2], df$index, FUN=function(x) cor(x[,1], x[,2]))
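
If a plain named vector is easier to work with, one possible alternative (not the only way) is split() with sapply(). The sketch below uses made-up data; the column names x, y, index and the result names corxy and corxy1 are assumptions for illustration:

```r
# Made-up data: two groups with columns x, y, index
df <- data.frame(x = c(1, 2, 3, 4, 1, 2, 3, 4),
                 y = c(2, 4, 6, 8, 8, 6, 4, 2),
                 index = c(1, 1, 1, 1, 2, 2, 2, 2))

# Named numeric vector with one x-y correlation per group
corxy <- sapply(split(df, df$index), function(d) cor(d$x, d$y))
print(corxy)

# Pick out a single group's value by its index label
corxy1 <- corxy[["1"]]
```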

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
