[R] dataframe subset
I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dataframe subset
On 2/8/2006 9:21 AM, Bernhard Baumgartner wrote: I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? It's actually very easy. Assume your dataframe is df, then subset(df, x %in% y) will give you what you want (assuming there is no column y in the dataframe). Duncan Murdoch Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dataframe subset
Bernhard Baumgartner wrote: I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? mydata - data.frame(X = sample(1:10, 1, replace=TRUE), Y = sample(c(2,9,10), 1, replace=TRUE)) newdata - mydata[mydata$X %in% unique(mydata$Y),] ?%in% -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] R: dataframe subset
Dear Bernhard, if I understand correctly your question may be you want something like df-data.frame(x=sample(1:10, 100, repl=T), y=sample(1:5, 100, repl=T)) subset(df, x%in%y) Regards, Stefano -Messaggio originale- Da: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] conto di Bernhard Baumgartner Inviato: 08 February 2006 15:22 A: r-help@stat.math.ethz.ch Oggetto: [R] dataframe subset I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dataframe subset
Sounds like you may need no use match(). On Wed, 2006-02-08 at 15:21 +0100, Bernhard Baumgartner wrote: I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dataframe subset
Hi something like xx-data.frame(x=sample(1:10,100,replace=T)) y-c(2,5,8) xx[xx$x%in%y,] HTH Petr On 8 Feb 2006 at 15:21, Bernhard Baumgartner wrote: From: Bernhard Baumgartner [EMAIL PROTECTED] Organization: Universitaet Regensburg To: r-help@stat.math.ethz.ch Date sent: Wed, 08 Feb 2006 15:21:46 +0100 Priority: normal Subject:[R] dataframe subset I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Petr Pikal [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] dataframe subset
Thanks to all, the %in% function solved my problem! Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dataframe subset
Here's one way, x - data.frame(V=c(1,1,1,1,2,2,4,4,4,9,10,10,10,10,10)) y - data.frame(V=c(2,9,10)) xy - merge(x,y,all=FALSE) Pay close attention to what happens if you have duplicate values in y, say y - data.frame(V=c(2,9,10,10)) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernhard Baumgartner Sent: Wednesday, February 08, 2006 9:22 AM To: r-help@stat.math.ethz.ch Subject: [R] dataframe subset I have a dataframe with a column, say x consisting of values, each value appearing different times, e.g. x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.: y: 2,9,10,... I need a subset of the dataframe: all rows where x is equal to one of the values in y. Currently I use a loop for this, but because x and y are large this is very slow. Is there any idea how to solve this problem faster? Thank you, Bernhard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html