Hi, I am trying to remove rows from a small R data.frame that are duplicated on two or more variables simultaneously. For those familiar with SAS, I am trying to reproduce the behaviour of Proc Sort with the Nodupkey option.
Here's my example data:

    test <- read.csv("test.csv", sep=",", as.is=TRUE)
    > test
          date var1 var2 num1 num2
    1 28/01/11    a    1  213   71
    2 28/01/11    b    1  141   47
    3 28/01/11    c    2  867  289
    4 29/01/11    a    2  234   78
    5 29/01/11    b    2  666  222
    6 29/01/11    c    2  912  304
    7 30/01/11    a    3  417  139
    8 30/01/11    b    3  108   36
    9 30/01/11    c    2  288   96

I am trying to obtain the following, where rows that duplicate both date AND var2 are removed from the above data.frame:

          date var1 var2 num1 num2
    28/01/2011    a    1  213   71
    28/01/2011    c    2  867  289
    29/01/2011    a    2  234   78
    30/01/2011    c    2  288   96
    30/01/2011    a    3  417  139

If I use !duplicated() with one variable, everything works fine:

    test[!duplicated(test$date),]
            date var1 var2 num1 num2
    1 0011-01-28    a    1  213   71
    4 0011-01-29    a    2  234   78
    7 0011-01-30    a    3  417  139

However, I wish to remove duplicates of both date and var2, and when I try that I get an error:

    test2 <- test[!duplicated(test$date),!duplicated(test$var2),]
    Error in `[.data.frame`(test, !duplicated(test$date), !duplicated(test$var2),  :
      undefined columns selected

I got different errors when using the unique() function.

Can anybody solve this?

Thanks in advance.

Jon
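
For reference, here is a minimal sketch of one way to drop rows that repeat the same date and var2 combination. It relies on the fact that duplicated() also accepts a data.frame, so the two key columns can be passed together. In the failing call above, the second logical vector sits in the column position of `[`, which is the likely cause of the "undefined columns selected" error. The data are rebuilt inline rather than read from test.csv, and the names keep and test2_sorted are only illustrative. The final ordering step is an assumption about the desired row order, added to mimic the sorted output of Proc Sort.

```r
# Rebuild the example data inline so the sketch does not depend on test.csv.
test <- data.frame(
  date = rep(c("28/01/11", "29/01/11", "30/01/11"), each = 3),
  var1 = rep(c("a", "b", "c"), times = 3),
  var2 = c(1, 1, 2, 2, 2, 2, 3, 3, 2),
  num1 = c(213, 141, 867, 234, 666, 912, 417, 108, 288),
  num2 = c(71, 47, 289, 78, 222, 304, 139, 36, 96),
  stringsAsFactors = FALSE
)

# duplicated() on a data.frame flags rows whose combination of the
# selected columns has already appeared earlier in the data.
keep <- !duplicated(test[, c("date", "var2")])

# One logical vector in the row position, nothing in the column position.
test2 <- test[keep, ]

# Optional (assumed) step: order by the key, parsing the dd/mm/yy strings
# as dates so the sort is chronological rather than alphabetical.
key_date <- as.Date(test2$date, format = "%d/%m/%y")
test2_sorted <- test2[order(key_date, test2$var2), ]

print(test2_sorted)
```

The first occurrence of each date and var2 pair is the one that is kept, so the result depends on the row order of the input.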