On Sep 1, 2010, at 10:42 AM, David Winsemius wrote:


On Sep 1, 2010, at 10:35 AM, Olga Lyashevska wrote:

Dear all,

I have a dataframe:
df <- data.frame(a=c(1,2,3),b=c(4,5,6),c=c(7,8,9),d=c(10,11,12))

I want to obtain a new dataframe with columns a and b being standardized
((x-mean(x))/sd(x)); the other two columns (c,d) I want to leave
unchanged. What is the best way to achieve this? I have been trying to
use subscripts but did not succeed so far.

> df[ , 1:2] <- scale(df[ , 1:2])
> df
  a  b c  d
1 -1 -1 7 10
2  0  0 8 11
3  1  1 9 12

I suspect you might have tried (df-mean(df))/sd(df) and gotten unsatisfactory results; I know I did. If you really want to persist and do it from first principles, so to speak, or perhaps as "homework", then consider the sweep operation. It takes an array (first argument), a margin (second argument), and a vector of summary statistics of lower dimension (third argument), and applies a function ("-" by default) that combines each slice along that margin with the corresponding statistic. You wanted to work on columns, so this would accomplish the subtraction of the column means followed by division by the column standard deviations:

> sweep(as.matrix(df[ , 1:2]), 2L, colMeans(df[ , 1:2])) # using the default "-" operator
      a  b
[1,] -1 -1
[2,]  0  0
[3,]  1  1
> sweep(sweep(df[ , 1:2], 2L, colMeans(df[ , 1:2])), 2L, sapply(df[ , 1:2], sd), "/") # per-column sd
   a  b
1 -1 -1
2  0  0
3  1  1

(Your test columns happened to have a standard deviation of 1 already, so they only needed to be centered. sweep() is how scale() does its work, and the two help pages cross-reference each other.)
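
Because the test columns above already have sd 1, the division step is not really exercised. A minimal sketch on made-up data (df2, cols, and the values are hypothetical, not from the original question) shows both sweep() steps doing real work and checks the result against scale():

```r
# Hypothetical data: columns a and b have different means and sds
df2 <- data.frame(a = c(2, 4, 9), b = c(10, 20, 60), c = c(7, 8, 9))

cols <- c("a", "b")                 # columns to standardize
m <- as.matrix(df2[, cols])

# Step 1: subtract each column's mean ("-" is the default FUN)
centered <- sweep(m, 2L, colMeans(m))

# Step 2: divide each column by its own standard deviation
standardized <- sweep(centered, 2L, apply(m, 2L, sd), "/")

# scale() carries extra "scaled:center"/"scaled:scale" attributes,
# so compare the numbers only
stopifnot(isTRUE(all.equal(standardized, scale(m), check.attributes = FALSE)))

# Write the standardized columns back; column c is left untouched
df2[, cols] <- standardized
```

Computing the sds with apply(m, 2L, sd) rather than sd(m) is deliberate: in current versions of R, sd() on a matrix treats it as one long vector rather than working column by column.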

This is probably a good time to reference Burns's The R Inferno, which has an entry for sweep (p. 57) as well as tips regarding the drop=FALSE maneuver (p. 54), which I tried first for this problem, but it "didn't work".
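
For readers who have not met drop=FALSE, here is a small sketch of what the argument does when subscripting a data frame. It is only an illustration on made-up data (df3 is hypothetical) and not a reconstruction of the attempt mentioned above:

```r
# Hypothetical data frame for illustration
df3 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

one_col_vec <- df3[, 1]                # default drop = TRUE: a bare numeric vector
one_col_df  <- df3[, 1, drop = FALSE]  # still a one-column data frame

is.null(dim(one_col_vec))  # TRUE: the dimensions were dropped
dim(one_col_df)            # 3 1

# sweep() needs an object with dimensions, so keep them when
# selecting a single column
sweep(as.matrix(one_col_df), 2L, colMeans(one_col_df))
```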
--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
