Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread arun
Hi,

In your Method 2 and Method 3, you are already using the groupings data. If that is the 
case, could you use merge() (base R) or join() from library(plyr)? See ?merge and ?join.
library(plyr)
join(mydata, groupings, by = c("sex", "age"), type = "inner")
#   sex age mygroup
#1    m   1       1
#2    m   2       2
#3    m   3       3
#4    m   4       4
#5    f   1       5
#6    f   2       6
#7    f   3       7
#8    f   4       8
#9    m   1       1
#10   m   2       2
#11   m   3       3
#12   m   4       4
#13   f   1       5
#14   f   2       6
#15   f   3       7
#16   f   4       8
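
If plyr is not available, the same lookup can be sketched in base R with match(). This is only a sketch under the assumption that pasting sex and age together gives a unique key for each row of groupings; the names key_data and key_groups are made up for illustration. Rows of mydata with no counterpart in groupings would get NA.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))
groupings <- data.frame(sex = c(rep("m", 4), rep("f", 4)),
                        age = c(1:4, 1:4),
                        mygroup = 1:8)

# Build one text key per row from both variables (hypothetical helper names)
key_data   <- paste(mydata$sex, mydata$age)
key_groups <- paste(groupings$sex, groupings$age)

# match() returns, for each row of mydata, the row of groupings with the same key
idx <- match(key_data, key_groups)
mydata$mygroup <- groupings$mygroup[idx]
```

This keeps the rows of mydata in their original order and involves no explicit loop.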
A.K.



- Original Message -
From: Dimitri Liakhovitski 
To: r-help 
Cc: 
Sent: Friday, December 7, 2012 7:27 AM
Subject: [R] Assigning cases to groupings based on the values of several 
variables

Dear R-ers,

My task is simple: to assign cases to desired groupings based on the
combined values of 2 variables. I can think of 3 methods of doing it.
Method 1 seems pretty R-like to me, but it requires a lot of lines of code,
which is onerous.
Method 2 is a loop over all rows of mydata, so not very good.
Method 3 is also a loop, but it loops through the (fewer) rows of groupings,
so it seems more efficient to me.
Can you please tell me:
1. Which of my methods is more efficient?
2. Is there maybe an even more efficient r-like way of doing it?
Imagine - "mydata" is actually a very tall data frame.
Thanks a lot!
Dimitri

### My Data:
mydata<-data.frame(sex=rep(c(rep("m",4),rep("f",4)),2),age=rep(c(1:4,1:4),2))
(mydata)

### My desired assignments (in column "mygroup")
groupings<-data.frame(sex=c(rep("m",4),rep("f",4)),age=c(1:4,1:4),mygroup=1:8)
(groupings)

# No, I don't need a solution where the last column of "groupings" is
# stacked twice and bound to "mydata"

# Method 1 of assigning to groups - requires a lot of lines of code:
mydata$mygroup.m1<-NA
mydata[(mydata$sex %in% "m")&(mydata$age %in% 1),"mygroup.m1"]<-1
mydata[(mydata$sex %in% "m")&(mydata$age %in% 2),"mygroup.m1"]<-2
mydata[(mydata$sex %in% "m")&(mydata$age %in% 3),"mygroup.m1"]<-3
mydata[(mydata$sex %in% "m")&(mydata$age %in% 4),"mygroup.m1"]<-4
mydata[(mydata$sex %in% "f")&(mydata$age %in% 1),"mygroup.m1"]<-5
mydata[(mydata$sex %in% "f")&(mydata$age %in% 2),"mygroup.m1"]<-6
mydata[(mydata$sex %in% "f")&(mydata$age %in% 3),"mygroup.m1"]<-7
mydata[(mydata$sex %in% "f")&(mydata$age %in% 4),"mygroup.m1"]<-8
(mydata)

# Method 2 of assigning to groups - very "loopy":
mydata$mygroup.m2<-NA
for(i in 1:nrow(mydata)){  # i<-1
  mysex<-mydata[i,"sex"]
  myage<-mydata[i,"age"]
  mydata[i,"mygroup.m2"]<-groupings[(groupings$sex %in% mysex)&
                                    (groupings$age %in% myage),"mygroup"]
}
(mydata)

# Method 3 of assigning to groups - also "loopy", but less than Method 2:
mydata$mygroup.m3<-NA
for(i in 1:nrow(groupings)){  # i<-1
  mysex<-groupings[i,"sex"]
  myage<-groupings[i,"age"]
  mydata[(mydata$sex %in% mysex)&
         (mydata$age %in% myage),"mygroup.m3"]<-groupings[i,"mygroup"]
}
(mydata)

-- 
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Dimitri Liakhovitski
Wow, Arun, I think I really like this solution. It allows me to create
irregular groupings and is very parsimonious.
Thank you very much!
Dimitri
On Fri, Dec 7, 2012 at 8:09 AM, arun  wrote:

> HI,
>
> In your method2 and method3, you are using the groupings data.  If that is
> the case, is it possible for you to use ?merge() or ?join() from
> library(plyr)
>  join(mydata,groupings,by=c("sex","age"),type="inner")
> #   sex age mygroup
> #1    m   1       1
> #2    m   2       2
> #3    m   3       3
> #4    m   4       4
> #5    f   1       5
> #6    f   2       6
> #7    f   3       7
> #8    f   4       8
> #9    m   1       1
> #10   m   2       2
> #11   m   3       3
> #12   m   4       4
> #13   f   1       5
> #14   f   2       6
> #15   f   3       7
> #16   f   4       8
> A.K.


-- 
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>



Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Dimitri Liakhovitski
My example data indeed looks regular, but in reality neither the data nor
the assignments are regular.
E.g., sometimes all females would land in one grouping and males of
different ages will land in different groupings.
So, I am afraid the "with" solution won't work.
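
To illustrate the irregular case, here is a small base R sketch using merge() with a lookup table in which all females share one group. The table irregular and the helper column row_id are made up for this example, not taken from my real data. Note that merge() sorts its result by the key columns by default, so a row id is carried along to restore the original order.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Hypothetical irregular assignment: males split by age, all females in group 5
irregular <- data.frame(sex = c(rep("m", 4), rep("f", 4)),
                        age = c(1:4, 1:4),
                        mygroup = c(1:4, rep(5L, 4)))

# Remember the original row order, because merge() reorders rows
mydata$row_id <- seq_len(nrow(mydata))

# all.x = TRUE keeps every row of mydata; unmatched rows would get NA
merged <- merge(mydata, irregular, by = c("sex", "age"), all.x = TRUE)
merged <- merged[order(merged$row_id), ]
```

Changing the assignments then only means editing the lookup table, not the code.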
Dimitri

On Fri, Dec 7, 2012 at 7:54 AM, Duncan Murdoch wrote:

> mydata$mygroup.m4 <- with(mydata,
>  4*(2-as.integer(factor(sex)))
>  + as.integer(factor(age)))
>
>


-- 
Dimitri Liakhovitski
gfk.com 



Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Duncan Murdoch

On 12-12-07 7:27 AM, Dimitri Liakhovitski wrote:

> Dear R-ers,
>
> My task is simple: to assign cases to desired groupings based on the
> combined values on 2 variables. I can think of 3 methods of doing it.
> Method 1 seems to me pretty r-like, but it requires a lot of lines of code
> - onerous.


Since your groups are so regular, you can compute the groups directly. 
Convert each column to a factor (this might have happened automatically, 
depending on your data and options), then use as.integer to convert to a 
numeric value.


So a simple solution would be

mydata$mygroup.m4 <- with(mydata,
 4*(2-as.integer(factor(sex)))
 + as.integer(factor(age)))

It would be a little simpler if you wanted the sex factor in alphabetical 
order; then you wouldn't need to subtract from 2.
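
As a sketch of that difference, assuming the example data from the original post (the names sex_code, age_code, mygroup_alpha and mygroup_m4 are only for illustration):

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# factor() sorts levels alphabetically, so "f" is level 1 and "m" is level 2
sex_code <- as.integer(factor(mydata$sex))
age_code <- as.integer(factor(mydata$age))

# Alphabetical variant: f -> groups 1..4, m -> groups 5..8
mygroup_alpha <- 4 * (sex_code - 1) + age_code

# Variant that subtracts from 2: m -> groups 1..4, f -> groups 5..8
mygroup_m4 <- 4 * (2 - sex_code) + age_code
```

Both variants rely on there being exactly 4 age levels per sex; with irregular groupings the arithmetic no longer applies.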


If your real data wasn't so regular, another approach would be to set up 
a matrix, indexed by sex and age, that gives the desired group number. 
That is somewhat like your "groupings" solution; I'm not sure it would 
be preferable to what you did.
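
A minimal sketch of the matrix idea, assuming the group numbers from the example (any values could go into the cells; the name lookup is made up here). Indexing a matrix with a two-column character matrix picks one cell per row by dimnames:

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Rows are sexes, columns are ages; each cell holds the desired group number
lookup <- matrix(1:8, nrow = 2, byrow = TRUE,
                 dimnames = list(c("m", "f"), as.character(1:4)))

# One (row name, column name) pair per case, looked up in a single step
mydata$mygroup.m5 <- lookup[cbind(as.character(mydata$sex),
                                  as.character(mydata$age))]
```

A sex/age combination that is missing from the dimnames would raise a subscript error rather than return NA, so the matrix has to cover every combination that occurs in the data.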


Duncan Murdoch

