On 12-12-07 7:27 AM, Dimitri Liakhovitski wrote:
Dear R-ers,
my task is simple: to assign cases to desired groupings based on the
combined values on 2 variables. I can think of 3 methods of doing it.
Method 1 seems to me pretty r-like, but it requires a lot of lines of code
- onerous.
Since your groups are so regular, you can compute the groups directly.
Convert each column to a factor (this might have happened automatically,
depending on your data and options), then use as.integer to convert to a
numeric value.
So a simple solution would be
mydata$mygroup.m4 <- with(mydata,
    4*(2 - as.integer(factor(sex))) +
    as.integer(factor(age)))
It would be a little simpler if you wanted the sex factor in alphabetical
order; then you wouldn't need to subtract from 2.
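
As a rough sketch of what that looks like (the column names mygroup.alpha
and mygroup.lev are made up for illustration, and the second variant with
an explicit levels= argument is an alternative not mentioned above):

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# factor() sorts levels alphabetically, so "f" -> 1 and "m" -> 2.
# With that ordering "f" gets groups 1-4 and "m" gets groups 5-8.
mydata$mygroup.alpha <- with(mydata,
    4 * (as.integer(factor(sex)) - 1) + as.integer(factor(age)))

# Alternative: state the level order explicitly, which keeps "m" first
# (groups 1-4) without subtracting from 2.
mydata$mygroup.lev <- with(mydata,
    4 * (as.integer(factor(sex, levels = c("m", "f"))) - 1) +
    as.integer(factor(age)))
```

Both rely on age taking exactly the values 1 to 4 within each sex; with
gaps or extra categories the multiplier 4 would have to change.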
If your real data wasn't so regular, another approach would be to set up
a matrix, indexed by sex and age, that gives the desired group number.
That is somewhat like your "groupings" solution; I'm not sure it would
be preferable to what you did.
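
A minimal sketch of the matrix idea, assuming the same example data; the
names lookup and mygroup.m5 are invented here, and the entries of the
matrix can be any group numbers you like:

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Rows are sexes, columns are ages; entries are the desired group numbers.
lookup <- matrix(1:8, nrow = 2, byrow = TRUE,
                 dimnames = list(c("m", "f"), as.character(1:4)))

# Index with a two-column character matrix: each row is one
# (row name, column name) pair, so no loop over cases is needed.
mydata$mygroup.m5 <- lookup[cbind(as.character(mydata$sex),
                                  as.character(mydata$age))]
```

The as.character() calls are there so the lookup goes by dimnames whether
or not sex was turned into a factor when the data frame was created.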
Duncan Murdoch
Method 2 is a loop, so not very good - as it loops through all rows of
mydata.
Method 3 is a loop but loops through fewer lines, so it seems to me more
efficient.
Can you please tell me:
1. Which of my methods is more efficient?
2. Is there maybe an even more efficient r-like way of doing it?
Imagine - "mydata" is actually a very tall data frame.
Thanks a lot!
Dimitri
### My Data:
mydata<-data.frame(sex=rep(c(rep("m",4),rep("f",4)),2),age=rep(c(1:4,1:4),2))
(mydata)
### My desired assignments (in column "mygroup")
groupings<-data.frame(sex=c(rep("m",4),rep("f",4)),age=c(1:4,1:4),mygroup=1:8)
(groupings)
# No, I don't need a solution where the last column of "groupings" is
# stacked twice and bound to "mydata"
# Method 1 of assigning to groups - requires a lot of lines of code:
mydata$mygroup.m1<-NA
mydata[(mydata$sex %in% "m")&(mydata$age %in% 1),"mygroup.m1"]<-1
mydata[(mydata$sex %in% "m")&(mydata$age %in% 2),"mygroup.m1"]<-2
mydata[(mydata$sex %in% "m")&(mydata$age %in% 3),"mygroup.m1"]<-3
mydata[(mydata$sex %in% "m")&(mydata$age %in% 4),"mygroup.m1"]<-4
mydata[(mydata$sex %in% "f")&(mydata$age %in% 1),"mygroup.m1"]<-5
mydata[(mydata$sex %in% "f")&(mydata$age %in% 2),"mygroup.m1"]<-6
mydata[(mydata$sex %in% "f")&(mydata$age %in% 3),"mygroup.m1"]<-7
mydata[(mydata$sex %in% "f")&(mydata$age %in% 4),"mygroup.m1"]<-8
(mydata)
# Method 2 of assigning to groups - very "loopy":
mydata$mygroup.m2<-NA
for(i in 1:nrow(mydata)){ # i<-1
  mysex<-mydata[i,"sex"]
  myage<-mydata[i,"age"]
  mydata[i,"mygroup.m2"]<-groupings[(groupings$sex %in% mysex)&(groupings$age %in% myage),"mygroup"]
}
(mydata)
# Method 3 of assigning to groups - also "loopy", but less than Method 2:
mydata$mygroup.m3<-NA
for(i in 1:nrow(groupings)){ # i<-1
  mysex<-groupings[i,"sex"]
  myage<-groupings[i,"age"]
  mydata[(mydata$sex %in% mysex)&(mydata$age %in% myage),"mygroup.m3"]<-groupings[i,"mygroup"]
}
(mydata)
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.