RE: [R] extract rows in dataframe with duplicated column values

Tiago R Magalhaes Fri, 18 Mar 2005 10:21:43 -0800

Thank you very much to Andy Liaw, Rob J Goedman and Marc Schwartz for taking the time to solve my problem. I've learned on many other occasions from useful tips coming from all three of them, and it just happened once again. You've got to love this mailing list...

subset(x, a %in% a[duplicated(a)])

works in all cases and is the simplest, but as always, all of the solutions helped me understand R's concepts and functions a little better.

I would suggest including this in the help page for duplicated.
Also useful might be:

subset(x, !a %in% a[duplicated(a)])

which gives all rows whose value of a is not duplicated.
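
As a quick check of the two idioms, here is a minimal sketch that uses the example data frame from the original question quoted further down. The names dups and uniq are only illustrative.

```r
# Example data from the original question
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# Rows whose value of a occurs more than once
dups <- subset(x, a %in% a[duplicated(a)])

# Rows whose value of a occurs exactly once
uniq <- subset(x, !a %in% a[duplicated(a)])

print(dups)
print(uniq)
```

The two results partition x: every row lands in exactly one of them.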

Again, thanks for all the help on this mailing list.

Here's one more possibility:

 > subset(x, a %in% a[duplicated(a)])
  a  b
2 2 10
3 2 10
4 3 10
5 3 10
6 3 10

HTH,

Marc Schwartz

On Thu, 2005-03-17 at 22:25 -0500, Liaw, Andy wrote:

 OK, strike one...

 Here's my second try:

 > cnt <- table(x[,1])
 > v <- as.numeric(names(cnt[cnt > 1]))
 > v
 [1] 2 3
 > x[x[,1] %in% v, ]
   a  b
 2 2 10
 3 2 10
 4 3 10
 5 3 10
 6 3 10

 Andy
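
The table() approach above converts the names back with as.numeric(), which assumes the column is numeric. A hedged sketch of the same idea for a character column follows; the data frame y and the names cnt and dup_ids are made up for illustration.

```r
# Hypothetical data with a character key column
y <- data.frame(id = c("a", "b", "b", "c"), val = 1:4,
                stringsAsFactors = FALSE)

# Count occurrences of each key
cnt <- table(y$id)

# Keep the keys that occur more than once, as character strings
dup_ids <- names(cnt)[cnt > 1]

# Select every row whose key is among the duplicated ones
y_dups <- y[y$id %in% dup_ids, ]
print(y_dups)
```

Leaving the names as character strings avoids the numeric conversion entirely, at the cost of relying on %in% to compare the column against strings.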

 > From: Liaw, Andy
 >
 > Does this work for you?
 >
 > > x[table(x[,1]) > 1,]
 >   a  b
 > 2 2 10
 > 3 2 10
 > 5 3 10
 > 6 3 10
 >
 > Andy
 >
 > > From: Tiago R Magalhaes
 > >
 > > Hi
 > >
 > > I want to extract all the rows in a data frame that have duplicates
 > > for a given column.
 > > I would expect this question to come up pretty often, but I have
 > > searched the archives and surprisingly couldn't find anything.
 > > The best I can come up with is:
 > >
 > > x <- data.frame(a=c(1,2,2,3,3,3), b=10)
 > > xdup1 <- duplicated(x[,1])
 > > xdup2 <- duplicated(x[,1][nrow(x):1])[nrow(x):1]
 > > xAllDups <- x[(xdup1+xdup2)!=0,]
 > >
 > > This seems to work, but it's so convoluted that I'm sure there's a
 > > better method.
 > > Thanks for any help and enlightenment
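
The reverse-and-reverse-back step in the original question can also be written with the fromLast argument of duplicated(), which exists in current versions of R (it may not have been available when this thread was written). A minimal sketch, with all_dups as an illustrative name:

```r
# Example data from the original question
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# Flag a row if its value of a repeats an earlier row
# or is repeated by a later row
is_dup <- duplicated(x$a) | duplicated(x$a, fromLast = TRUE)

all_dups <- x[is_dup, ]
print(all_dups)
```

This selects the same rows as the xAllDups construction above, without reversing the vector by hand.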



______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
