Dear R-help:

I have a data frame (df1) with elements a, b, and c that identify a unique
set of conditions of interest; l and m identify other conditions; and x and
y are responses.

df1 <- data.frame(a = c(1,1,1,2,2,2,3,3,3), b = c(10,10,10,20,20,20,30,30,30),
                  c = c(100,100,100,200,200,200,300,300,300), 
                  l = runif(9), m = runif(9), 
                  x = c(1,2,2,2,2,1,2,2,1), y = c(3,2,1,3,2,1,3,2,1))
df1

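I want to select one row from df1 for each combination of df1$a, df1$b, and
df1$c: the row with the maximum df1$x for that combination and then, in the
case of a tie, the maximum df1$y among the tied rows. The selected rows
should go into a new data frame, like this (l and m come from runif(), so
their values differ from run to run):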

  a  b   c         l           m x y
1 1 10 100 0.2222679 0.351739848 2 2
2 2 20 200 0.2219270 0.002530816 2 3
3 3 30 300 0.1260224 0.820658343 2 3

My method is as follows:

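# Maximum x within each (a, b, c) combination.
max.by.x <- aggregate(list(x = df1$x),
                      list(a = df1$a, b = df1$b, c = df1$c),
                      max)

# Start from a one-row template so df2 has the same columns as df1;
# the loop overwrites row 1 and appends the remaining rows.
df2 <- df1[1,]
for ( i in seq_len(nrow(max.by.x)) ) {
   # Rows of the current combination that attain its maximum x.
   index <- which(df1$a == max.by.x$a[i] &
                  df1$b == max.by.x$b[i] &
                  df1$c == max.by.x$c[i] &
                  df1$x == max.by.x$x[i])
   # Among those rows, the first one with the largest y.
   index2 <- which.max(df1$y[index])
   df2[i,] <- df1[index[index2],]
}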

df2

This seems to work, but for real data with 12000 rows it is really slow.
Does anyone have any ideas for improvement (e.g. vectorizing what is done
in the loop)?
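
To make the question concrete, here is a minimal sketch of what a
vectorized version might look like. It is untested on the real data and
rests on assumptions: it sorts with order() so that the preferred row comes
first within each combination, then keeps the first row per combination
with duplicated(). The names ord, sorted, and df3 and the set.seed() call
are illustrative only.

```r
# Same example data as above. The seed is an assumption, added only so
# that l and m are reproducible in this sketch.
set.seed(1)
df1 <- data.frame(a = c(1,1,1,2,2,2,3,3,3), b = c(10,10,10,20,20,20,30,30,30),
                  c = c(100,100,100,200,200,200,300,300,300),
                  l = runif(9), m = runif(9),
                  x = c(1,2,2,2,2,1,2,2,1), y = c(3,2,1,3,2,1,3,2,1))

# Sort by combination, then by x descending, then by y descending.
# order() breaks remaining ties by original position, so the earliest
# row wins, as which.max() does in the loop.
ord <- order(df1$a, df1$b, df1$c, -df1$x, -df1$y)
sorted <- df1[ord, ]

# Keep only the first row of each (a, b, c) combination.
df3 <- sorted[!duplicated(sorted[, c("a", "b", "c")]), ]
rownames(df3) <- NULL
df3
```

Because this avoids both the per-combination which() scans and growing df2
one row at a time, it may scale better to 12000 rows, but that would need
to be checked by timing both versions on the real data.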

With best wishes and kind regards I am

Sincerely,

Corey A. Moffet
Rangeland Scientist

USDA-ARS
Northwest Watershed Research Center
800 Park Blvd, Plaza IV, Suite 105
Boise, ID 83712-7716
Voice: (208) 422-0718
FAX:   (208) 334-1502
