On 28-Feb-11 15:51:17, JonC wrote:
> Hi, I am trying to simultaneously remove duplicate variables from two
> or more variables in a small R data.frame. I am trying to reproduce
> the SAS statements from a Proc Sort with Nodupkey for those familiar
> with SAS.
>
> Here's my example data :
>
> test <- read.csv("test.csv", sep=",", as.is=TRUE)
>> test
>       date var1 var2 num1 num2
> 1 28/01/11    a    1  213   71
> 2 28/01/11    b    1  141   47
> 3 28/01/11    c    2  867  289
> 4 29/01/11    a    2  234   78
> 5 29/01/11    b    2  666  222
> 6 29/01/11    c    2  912  304
> 7 30/01/11    a    3  417  139
> 8 30/01/11    b    3  108   36
> 9 30/01/11    c    2  288   96
>
> I am trying to obtain the following, where duplicates of date AND var2
> are removed from the above data.frame.
>
>       date var1 var2 num1 num2
> 28/01/2011    a    1  213   71
> 28/01/2011    c    2  867  289
> 29/01/2011    a    2  234   78
> 30/01/2011    c    2  288   96
> 30/01/2011    a    3  417  139
>
> If I use the !duplicated function with one variable everything works
> fine.
> However I wish to remove duplicates of both Date and var2.
>
> test[!duplicated(test$date),]
>         date var1 var2 num1 num2
> 1 0011-01-28    a    1  213   71
> 4 0011-01-29    a    2  234   78
> 7 0011-01-30    a    3  417  139
>
> test2 <- test[!duplicated(test$date),!duplicated(test$var2),]
> Error in `[.data.frame`(test, !duplicated(test$date),
> !duplicated(test$var2), : undefined columns selected
> I got different errors when using the unique() function.
>
> Can anybody solve this ?
>
> Thanks in advance.
> Jon

The following gives what you state you wish to obtain (though not
quite in the same order of rows). Call the original dataframe 'df':

  df
  #       date var1 var2 num1 num2
  # 1 28/01/11    a    1  213   71
  # 2 28/01/11    b    1  141   47
  # 3 28/01/11    c    2  867  289
  # 4 29/01/11    a    2  234   78
  # 5 29/01/11    b    2  666  222
  # 6 29/01/11    c    2  912  304
  # 7 30/01/11    a    3  417  139
  # 8 30/01/11    b    3  108   36
  # 9 30/01/11    c    2  288   96

  ix <- which(duplicated(data.frame(df$date, df$var2)))
  ix
  # [1] 2 5 6 8

  df[-ix,]
  #       date var1 var2 num1 num2
  # 1 28/01/11    a    1  213   71
  # 3 28/01/11    c    2  867  289
  # 4 29/01/11    a    2  234   78
  # 7 30/01/11    a    3  417  139
  # 9 30/01/11    c    2  288   96

Does this help?
Ted.

PS I'm posting this from a temporarily subscribed alternative
address (for testing purposes) instead of my usual
ted.hard...@wlandres.net

--------------------------------------------------------------------
E-Mail: (Ted Harding) <e...@wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 28-Feb-11                                       Time: 16:19:59
------------------------------ XFMail ------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
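
For anyone who finds this thread later, here is a variation on the same idea, offered as a sketch rather than the definitive way to do it. The call in the question fails because the second index of `[` on a data.frame selects columns, so a length-9 logical vector there produces "undefined columns selected". Passing both key columns to duplicated() as one data frame avoids that. The sketch also indexes with the logical vector directly instead of going through which() and a negative index, since `df[-ix, ]` returns zero rows when `ix` is empty (no duplicates at all). The names `dup` and `dedup` are mine, and the data frame is rebuilt inline from the values in the post so that no test.csv is needed; the final ordering step assumes the dates are in day/month/two-digit-year form, as they appear in the post.

```r
# Rebuild the example data from the thread (values copied from the post).
df <- data.frame(
  date = rep(c("28/01/11", "29/01/11", "30/01/11"), each = 3),
  var1 = rep(c("a", "b", "c"), times = 3),
  var2 = c(1, 1, 2, 2, 2, 2, 3, 3, 2),
  num1 = c(213, 141, 867, 234, 666, 912, 417, 108, 288),
  num2 = c(71, 47, 289, 78, 222, 304, 139, 36, 96),
  stringsAsFactors = FALSE
)

# TRUE for each row whose (date, var2) pair already occurred in an earlier row.
dup <- duplicated(df[, c("date", "var2")])

# Keep the first occurrence of each pair. Logical indexing still works
# when no row is duplicated, unlike df[-which(dup), ].
dedup <- df[!dup, ]

# Optional: sort by date, then var2, to match the row order asked for.
# Assumes dd/mm/yy dates, as in the post.
dedup <- dedup[order(as.Date(dedup$date, format = "%d/%m/%y"), dedup$var2), ]

rownames(dedup)
# [1] "1" "3" "4" "9" "7"
```

The rows kept are the same ones as in the reply above (1, 3, 4, 7, 9); only the final ordering differs.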