Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Duncan Murdoch

On 12-12-07 7:27 AM, Dimitri Liakhovitski wrote:

Dear R-ers,

my task is simple: to assign cases to desired groupings based on the
combined values on 2 variables. I can think of 3 methods of doing it.
Method 1 seems to me pretty r-like, but it requires a lot of lines of code
- onerous.


Since your groups are so regular, you can compute the groups directly. 
Convert each column to a factor (this might have happened automatically, 
depending on your data and options), then use as.integer to convert to a 
numeric value.


So a simple solution would be

mydata$mygroup.m4 <- with(mydata,
 4*(2-as.integer(factor(sex)))
 + as.integer(factor(age)))

It would be a little simpler if you wanted the sex factor in alphabetical 
order; then you wouldn't need to subtract from 2.
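
To make the arithmetic above concrete, here is a small sketch that rebuilds the example data from the original post and applies the formula step by step. The names sex_code, age_code and mygroup.alpha are introduced only for illustration, and the alphabetical variant is one reading of the remark above, not code from the thread.

```r
# Example data as in the original post (quotes restored).
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# factor(sex) has levels c("f", "m"), so as.integer() gives f = 1, m = 2.
sex_code <- as.integer(factor(mydata$sex))
age_code <- as.integer(factor(mydata$age))

# 2 - sex_code maps m -> 0 and f -> 1; multiplying by 4 shifts the
# female rows into groups 5..8 while males stay in groups 1..4.
mydata$mygroup.m4 <- 4 * (2 - sex_code) + age_code

# Assumed alphabetical variant (f gets 1..4, m gets 5..8): no "2 -" needed.
mydata$mygroup.alpha <- 4 * (sex_code - 1) + age_code
```

This only works because every sex has the same four ages and the group numbers run consecutively; it is not a general lookup.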


If your real data wasn't so regular, another approach would be to set up 
a matrix, indexed by sex and age, that gives the desired group number. 
That is somewhat like your groupings solution; I'm not sure it would 
be preferable to what you did.
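
A minimal sketch of the matrix idea, assuming a hypothetical irregular assignment in which all females share one group. The names group_lookup, idx and mygroup.m5 are made up for this illustration and do not come from the thread.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Lookup matrix: rows are sex, columns are age, entries are group numbers.
# The values are deliberately irregular (hypothetical): all females -> 5.
group_lookup <- matrix(c(1, 2, 3, 4,
                         5, 5, 5, 5),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(sex = c("m", "f"),
                                       age = as.character(1:4)))

# Indexing a matrix with a two-column character matrix selects one cell
# per row (row name, column name), so no explicit loop is needed.
idx <- cbind(as.character(mydata$sex), as.character(mydata$age))
mydata$mygroup.m5 <- group_lookup[idx]
```

The matrix needs one cell for every sex/age combination that occurs in the data; a combination missing from the dimnames would raise a subscript error rather than return NA.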


Duncan Murdoch


Method 2 is a loop, so not very good - as it loops through all rows of
mydata.
Method 3 is a loop but loops through fewer lines, so it seems to me more
efficient.
Can you please tell me:
1. Which of my methods is more efficient?
2. Is there maybe an even more efficient r-like way of doing it?
Imagine - mydata is actually a very tall data frame.
Thanks a lot!
Dimitri

### My Data:
mydata <- data.frame(sex=rep(c(rep("m",4),rep("f",4)),2),age=rep(c(1:4,1:4),2))
(mydata)

### My desired assignments (in column mygroup)
groupings <- data.frame(sex=c(rep("m",4),rep("f",4)),age=c(1:4,1:4),mygroup=1:8)
(groupings)

# No, I don't need a solution where the last column of groupings is
# stacked twice and bound to mydata

# Method 1 of assigning to groups - requires a lot of lines of code:
mydata$mygroup.m1 <- NA
mydata[(mydata$sex %in% "m") & (mydata$age %in% 1), "mygroup.m1"] <- 1
mydata[(mydata$sex %in% "m") & (mydata$age %in% 2), "mygroup.m1"] <- 2
mydata[(mydata$sex %in% "m") & (mydata$age %in% 3), "mygroup.m1"] <- 3
mydata[(mydata$sex %in% "m") & (mydata$age %in% 4), "mygroup.m1"] <- 4
mydata[(mydata$sex %in% "f") & (mydata$age %in% 1), "mygroup.m1"] <- 5
mydata[(mydata$sex %in% "f") & (mydata$age %in% 2), "mygroup.m1"] <- 6
mydata[(mydata$sex %in% "f") & (mydata$age %in% 3), "mygroup.m1"] <- 7
mydata[(mydata$sex %in% "f") & (mydata$age %in% 4), "mygroup.m1"] <- 8
(mydata)

# Method 2 of assigning to groups - very loopy:
mydata$mygroup.m2 <- NA
for(i in 1:nrow(mydata)){  # i <- 1
   mysex <- mydata[i, "sex"]
   myage <- mydata[i, "age"]
   mydata[i, "mygroup.m2"] <- groupings[(groupings$sex %in% mysex) &
                                        (groupings$age %in% myage), "mygroup"]
}
(mydata)

# Method 3 of assigning to groups - also loopy, but less than Method 2:
mydata$mygroup.m3 <- NA
for(i in 1:nrow(groupings)){  # i <- 1
   mysex <- groupings[i, "sex"]
   myage <- groupings[i, "age"]
   mydata[(mydata$sex %in% mysex) & (mydata$age %in% myage),
          "mygroup.m3"] <- groupings[i, "mygroup"]
}
(mydata)



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Dimitri Liakhovitski
My example data indeed looks regular, but in reality neither the data nor
the assignments are regular.
E.g., sometimes all females would land in one grouping and males of
different ages will land in different groupings.
So, I am afraid the with solution won't work.
Dimitri

On Fri, Dec 7, 2012 at 7:54 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 mydata$mygroup.m4 <- with(mydata,
  4*(2-as.integer(factor(sex)))
  + as.integer(factor(age)))




-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/



Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread Dimitri Liakhovitski
Wow, Arun I think I really like this solution. It allows me to create
irregular groupings and is very parsimonious.
Thank you very much!
Dimitri
On Fri, Dec 7, 2012 at 8:09 AM, arun smartpink...@yahoo.com wrote:

 Hi,

 In your method2 and method3, you are using the groupings data.  If that is
 the case, is it possible for you to use ?merge() or ?join() from
 library(plyr)
  join(mydata,groupings,by=c("sex","age"),type="inner")
 #    sex age mygroup
 #1    m   1   1
 #2    m   2   2
 #3    m   3   3
 #4    m   4   4
 #5    f   1   5
 #6    f   2   6
 #7    f   3   7
 #8    f   4   8
 #9    m   1   1
 #10   m   2   2
 #11   m   3   3
 #12   m   4   4
 #13   f   1   5
 #14   f   2   6
 #15   f   3   7
 #16   f   4   8
 A.K.







-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/



Re: [R] Assigning cases to groupings based on the values of several variables

2012-12-07 Thread arun
Hi,

In your method2 and method3, you are using the groupings data.  If that is the 
case, is it possible for you to use ?merge() or ?join() from library(plyr)
 join(mydata,groupings,by=c("sex","age"),type="inner")
 #  sex age mygroup
#1    m   1   1
#2    m   2   2
#3    m   3   3
#4    m   4   4
#5    f   1   5
#6    f   2   6
#7    f   3   7
#8    f   4   8
#9    m   1   1
#10   m   2   2
#11   m   3   3
#12   m   4   4
#13   f   1   5
#14   f   2   6
#15   f   3   7
#16   f   4   8
A.K.
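
For readers without plyr, the same lookup can be sketched in base R. The irregular mygroup values below are hypothetical (all females in one group, as described earlier in the thread), and key_data and key_groups are names made up for this illustration.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Hypothetical irregular assignment: males split by age, females together.
groupings <- data.frame(sex = c(rep("m", 4), rep("f", 4)),
                        age = c(1:4, 1:4),
                        mygroup = c(1, 2, 3, 4, 5, 5, 5, 5))

# Option 1: merge() joins on both key columns. By default it sorts the
# result by the keys, so the original row order of mydata is not kept.
merged <- merge(mydata, groupings, by = c("sex", "age"))

# Option 2: build one combined key per row and look it up with match().
# This keeps the row order of mydata and yields NA for unmatched rows.
key_data   <- paste(mydata$sex, mydata$age, sep = "_")
key_groups <- paste(groupings$sex, groupings$age, sep = "_")
mydata$mygroup <- groupings$mygroup[match(key_data, key_groups)]
```

The match() variant assumes each sex/age pair appears only once in groupings and that the separator does not occur inside the key values.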



