Re: [R] Using apply for logical conditions
Wow, thanks for all the excellent (and fast) responses. That's really helped. Sorry I didn't supply a cut-and-paste-able example (noted for future reference), but your examples caught the essence of my problem.

I ended up opting for the apply/any solution, but I'll bear the Reduce function in mind.

Thanks,
Alastair

--
View this message in context: http://r.789695.n4.nabble.com/Using-apply-for-logical-conditions-tp2310929p2311079.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using apply for logical conditions
Just for fun, here are another couple of versions that work for data frames.

In place of Reduce with "|":

    do.call(pmax, c(mydata, na.rm = TRUE)) > 0

and in place of Reduce with "&":

    do.call(pmin, c(mydata, na.rm = TRUE)) > 0

Cheers,
Bert Gunter
Genentech Nonclinical Biostatistics

On Mon, Aug 2, 2010 at 2:28 PM, Joshua Wiley wrote:
> For the sake of my own curiosity, I compared several of these options,
> but in case others are interested.
>
>     > boolean <- c(TRUE, FALSE, FALSE)
>     > set.seed(1)
>     > mydata <- data.frame(X = sample(boolean, 10^7, replace = TRUE),
>     +                      Y = sample(boolean, 10^7, replace = TRUE),
>     +                      Z = sample(boolean, 10^7, replace = TRUE))
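The pmax/pmin idea above can be tried on a small case. This is only a sketch: the 20-row data frame is made up for illustration (it is not the benchmark data), and it contains no NA values, so the comparison with Reduce() says nothing about how the two approaches differ when NAs are present.

```r
# Small illustrative data frame (hypothetical; not the 10^7-row benchmark data).
set.seed(1)
boolean <- c(TRUE, FALSE, FALSE)
mydata <- data.frame(X = sample(boolean, 20, replace = TRUE),
                     Y = sample(boolean, 20, replace = TRUE),
                     Z = sample(boolean, 20, replace = TRUE))

# c() on a data frame plus a named value gives a plain list of the columns
# with na.rm appended; do.call() passes that list as the arguments of pmax()/pmin().
any_true <- do.call(pmax, c(mydata, na.rm = TRUE)) > 0  # row-wise "or"
all_true <- do.call(pmin, c(mydata, na.rm = TRUE)) > 0  # row-wise "and"

# Reference results from the Reduce() approach discussed in the thread.
any_ref <- Reduce(`|`, mydata)
all_ref <- Reduce(`&`, mydata)
```

The parallel maximum of a row is positive exactly when at least one entry is TRUE, and the parallel minimum is positive only when every entry is TRUE, which is why the comparison with 0 recovers "or" and "and".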
Re: [R] Using apply for logical conditions
Reduce() is really amazingly fast! Even with a much larger number of columns, it is still in the same ballpark (and much more readable):

    > boolean <- c(TRUE, rep(FALSE, 10^3))
    > a <- matrix(sample(boolean, 10^7, replace = TRUE), 10^4, 10^3)
    > b <- data.frame(a)
    > system.time({opt4 <- rowSums(a, na.rm = TRUE) > 0})
       user  system elapsed
      0.129   0.001   0.131
    > system.time({opt2 <- Reduce('|', b)})
       user  system elapsed
      0.190   0.109   0.303

and:

    > boolean <- c(TRUE, rep(FALSE, 10^4))
    > a <- matrix(sample(boolean, 10^7, replace = TRUE), 10^3, 10^4)
    > b <- data.frame(a)
    > system.time({opt4 <- rowSums(a, na.rm = TRUE) > 0})
       user  system elapsed
      0.082   0.001   0.083
    > system.time({opt2 <- Reduce('|', b)})
       user  system elapsed
      0.205   0.001   0.209

It seems that Reduce('+') pretty much makes rowSums obsolete, except that Reduce works on lists, and converting a matrix to a data.frame takes ages.
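The claimed equivalence between rowSums and Reduce('+') can be checked on a small case. The following is a sketch with a made-up 10 x 5 matrix and no NA values; it compares results only and makes no claim about timings.

```r
# Small illustrative matrix (hypothetical sizes, chosen for a quick check).
set.seed(1)
a <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), nrow = 10, ncol = 5)
b <- data.frame(a)  # Reduce() needs the columns as list elements

sums_rowsums <- rowSums(a)      # works directly on the matrix
sums_reduce  <- Reduce(`+`, b)  # adds the columns pairwise, left to right

any_rowsums <- rowSums(a) > 0   # row-wise "or" via counting
any_reduce  <- Reduce(`|`, b)   # row-wise "or" via the logical operator
```

Both routes give one value per row; the difference is the input each accepts, a matrix for rowSums() and a list of columns for Reduce().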
Re: [R] Using apply for logical conditions
On Mon, Aug 2, 2010 at 2:08 PM, Michael Lachmann wrote:
> Reduce() is much nicer, but I usually use
>
> rowSums(A) > 0 for 'or', and
> rowSums(A) == ncol(A) for 'and'.
>
> Which works slightly faster.

I compared several of these options for the sake of my own curiosity; here are the timings in case others are interested.

    > boolean <- c(TRUE, FALSE, FALSE)
    >
    > set.seed(1)
    > mydata <- data.frame(X = sample(boolean, 10^7, replace = TRUE),
    +                      Y = sample(boolean, 10^7, replace = TRUE),
    +                      Z = sample(boolean, 10^7, replace = TRUE))
    >
    > system.time(opt1 <- apply(mydata, 1, any))
       user  system elapsed
     147.26    0.42  148.56
    > system.time(opt2 <- Reduce('|', mydata))
       user  system elapsed
       0.33    0.00    0.35
    > system.time(opt3 <- as.logical(rowSums(mydata, na.rm = TRUE)))
       user  system elapsed
       0.25    0.00    0.27
    > system.time(opt4 <- rowSums(mydata, na.rm = TRUE) > 0)
       user  system elapsed
       0.25    0.00    0.25
    >
    > identical(opt1, opt2)
    [1] TRUE
    > identical(opt1, opt3)
    [1] TRUE
    > identical(opt1, opt4)
    [1] TRUE
    >
    > rm(boolean, mydata, opt1, opt2, opt3, opt4)

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] Using apply for logical conditions
Yes, you must do the conversion. The reason is that Reduce requires its argument x to be a vector, and a matrix is seen as a vector obtained by columnwise concatenation, e.g.

    > Reduce("+", matrix(1:6, nrow = 3))
    [1] 21
    > Reduce("+", 1:6)
    [1] 21

The data frame is seen as a list whose elements are the columns of the frame. Whence one concludes that the f argument must be vectorized for Reduce to work on the columns of the data frame as you expect, e.g.

    > Reduce(min, data.frame(a = 1:3, b = 4:6))
    [1] 1

but

    > Reduce(pmin, data.frame(a = 1:3, b = 4:6))
    [1] 1 2 3

Cheers,
Bert Gunter
Genentech Nonclinical Biostatistics

On Mon, Aug 2, 2010 at 2:08 PM, Michael Lachmann wrote:
> Reduce() is much nicer, but I usually use
>
> rowSums(A) > 0 for 'or', and
> rowSums(A) == ncol(A) for 'and'.
>
> Which works slightly faster.
>
> I noticed, though, that Reduce() doesn't work on matrices. Is there an
> alternative for matrices, or do you have to convert the matrix first to a
> data.frame, and then use Reduce?
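The point about matrices can be illustrated with a logical matrix. This is a minimal sketch with a made-up 3 x 2 matrix; the explicit list of columns built with lapply() is one possible alternative to the data frame conversion, not something proposed in the thread.

```r
# Hypothetical 3 x 2 logical matrix; columns are (T, F, F) and (F, F, T).
m <- matrix(c(TRUE, FALSE, FALSE,
              FALSE, FALSE, TRUE), nrow = 3)

# On the bare matrix, Reduce() folds over all six elements and returns one value.
collapsed <- Reduce(`|`, m)

# Converting to a data frame (a list of columns) gives one result per row.
row_or <- Reduce(`|`, as.data.frame(m))

# Alternative without a data frame: build the list of columns explicitly.
cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
row_or2 <- Reduce(`|`, cols)
```

Because `|` is vectorized, each step combines two whole columns, which is the property the message above identifies as necessary for the column-wise use of Reduce().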
Re: [R] Using apply for logical conditions
Reduce() is much nicer, but I usually use

    rowSums(A) > 0 for 'or', and
    rowSums(A) == ncol(A) for 'and',

which works slightly faster.

I noticed, though, that Reduce() doesn't work on matrices. Is there an alternative for matrices, or do you have to convert the matrix first to a data.frame, and then use Reduce?
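The rowSums approach can be sketched on a tiny matrix, taking ncol(A) as the column count. The matrix A below is made up for illustration and has no NA values; with NAs present, rowSums() returns NA for the affected rows unless na.rm = TRUE is given, so the comparisons would need more care.

```r
# Hypothetical logical matrix with three rows: all TRUE, all FALSE, mixed.
A <- cbind(X = c(TRUE, FALSE, TRUE),
           Y = c(TRUE, FALSE, FALSE),
           Z = c(TRUE, FALSE, TRUE))

# TRUE counts as 1 and FALSE as 0, so the row sum is the number of TRUEs.
row_or  <- rowSums(A) > 0         # at least one TRUE in the row
row_and <- rowSums(A) == ncol(A)  # every entry in the row is TRUE
```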
Re: [R] Using apply for logical conditions
In addition to Reduce(), you can take a look at ?any for '|' and ?all for '&'.

Josh

On Mon, Aug 2, 2010 at 1:43 PM, Allan Engelhardt wrote:
> `|` is a binary operator which is why the apply will not work. See
>
> help("Reduce")

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] Using apply for logical conditions
Alastair wrote:
> Hi,
>
> I've got some boolean data in a data.frame in the form:
>
>     X Y Z A B C
> [1] T T F T F F
> [2] F T T F F F
> .
> .
> .
>
> What I want to do is create a new column which is the logical disjunction
> of several of the columns. Just like:
>
> new.column <- data$X | data$Y | data$Z
>
> However I don't want to hard code the particular columns into the
> expression like that. I've tried using apply row wise with `|` as the
> function:
>
> columns <- c(X,Y,Z)
> apply(data[,columns], 1,`|`)

Please provide *reproducible* examples. I cannot run any of your code since you don't give us the objects X, Y, or Z. An easy way to do this is to use ?dput on the objects we need to run your code, e.g., your data.frame.

Does this do what you want?

    df1 <- data.frame(x = sample(c(TRUE, FALSE), 10, replace = TRUE),
                      y = sample(c(TRUE, FALSE), 10, replace = TRUE),
                      z = sample(c(TRUE, FALSE), 10, replace = TRUE))
    columns <- c("x", "y", "z")
    apply(df1[columns], 1, any)
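The apply/any suggestion can be extended to store the result as a new column, which is what the original question asked for. This is a sketch only: the extra column w and the names any_xyz and all_xyz are invented for illustration.

```r
# Hypothetical data; w is an extra column that is deliberately not selected.
set.seed(1)
df1 <- data.frame(x = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  y = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  z = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  w = sample(c(TRUE, FALSE), 10, replace = TRUE))

# Column names as a character vector, so nothing is hard coded in the expression.
columns <- c("x", "y", "z")

# apply() passes each row of the selected columns to any() / all().
df1$any_xyz <- apply(df1[columns], 1, any)
df1$all_xyz <- apply(df1[columns], 1, all)
```

Changing the contents of columns is then enough to change which columns take part in the row-wise test.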
Re: [R] Using apply for logical conditions
`|` is a binary operator which is why the apply will not work. See

    help("Reduce")

For example,

    set.seed(1)
    data <- data.frame(A = runif(10) > 0.5, B = runif(10) > 0.5, C = runif(10) > 0.5)
    Reduce(`|`, data)
    # [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

Hope this helps.

Allan

On 02/08/10 21:35, Alastair wrote:
> Hi,
>
> I've got some boolean data in a data.frame in the form:
>
>     X Y Z A B C
> [1] T T F T F F
> [2] F T T F F F
> .
> .
> .
>
> What I want to do is create a new column which is the logical disjunction
> of several of the columns. Just like:
>
> new.column <- data$X | data$Y | data$Z
>
> However I don't want to hard code the particular columns into the
> expression like that. I've tried using apply row wise with `|` as the
> function:
>
> columns <- c(X,Y,Z)
> apply(data[,columns], 1,`|`)
>
> This doesn't seem to do what I would have expected, does anyone have any
> advice how to use the apply or similar function to perform a boolean
> operation on each row (and a specific subset of the columns) in a data
> frame?
>
> Thanks,
> Alastair
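The Reduce() suggestion can be combined with a column subset chosen by name, which addresses the "specific subset of the columns" part of the question. A sketch under assumptions: the data frame dat, the selection vector columns, and the new column name any_selected are all hypothetical.

```r
# Hypothetical data frame with four logical columns.
set.seed(1)
dat <- data.frame(A = runif(10) > 0.5,
                  B = runif(10) > 0.5,
                  C = runif(10) > 0.5,
                  D = runif(10) > 0.5)

# Select columns by name; dat[columns] is still a data frame, i.e. a list of columns.
columns <- c("A", "B", "C")

# Reduce() applies the binary `|` to the first two columns, then to that
# result and the third column, and so on.
dat$any_selected <- Reduce(`|`, dat[columns])

# The hand-written expression from the question, for comparison.
by_hand <- dat$A | dat$B | dat$C
```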